Materials Modeling at the Molecular Level

 

 

 

Lecture 2: Efficiency and Error Estimation

 

 

 

 

1. Review of Lecture 1 and Lab 1

 

2. More efficient neighbor finding

 

3. Estimating errors

 

 

 

 

 

 

Laura Frink

 

Sandia National Laboratories

Computational Materials Science Dept.

lfrink@cs.sandia.gov

 

Outline

 

lecture 1: An introduction to molecular modeling

 

lab 1: Getting to know MD and MC Codes

 

lecture 2: How to write efficient MC and MD codes

 

lab 2: Implementation of routines for improved efficiency and error estimation

 

lecture 3: What to do with an MD or MC Code ????

 

lab 3: Computing materials properties

 

Last Time

 

1. Overview of Materials Modeling in general;

and Molecular simulation in particular.

 

2. Introduction to Monte Carlo and Molecular Dynamics algorithms.

 

3. Got a feel for the components of MD and MC codes.

 

4. Computed pressures along two isotherms

(one sub-critical, one super-critical)

 

 

 

 

 

 

 

 

5. Found that the programs run slowly ... especially the MC code.

 

 

This Time

 

1. Improving code performance

 

2. Estimating errors

 

Force (MD) /Energy (MC) Computation

90% of the Effort

 

 

Simplest Approach

 

Loop over all particles in simulation volume

MD -- O(N2) -- have moved all N particles at once.

MC -- O(N) -- have only moved one particle at a time.

 

MD

loop over i (1,N)

loop over j (1,N)[

if (i¹ j) calculate fij

]

 

MC

loop over j (1,N)[

if (i¹ j) calculate uij

]

 

 

Better Approach: a factor of 2 !!

 

MD: Apply Newton’s 3rd Law :

for every action there is an equal and opposite reaction. Or fij =- fji.

 

loop over i (1,N-1)

loop over j (i+1,N)[

calculate fij

set fji =-fij

]

 

MC: ???

 

 

Best Approaches:

 

Reduce the size of the loop over particles by keeping track of particles that are within the cut-off distance of interest.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Verlet Neighbor Lists

 

rc = cutoff distance

d = ‘skin’ thickness

rn = rc + d = neighbor cutoff distance

 

 

 

 

 

 

 

 

 

 

 

MD: force calculation - O(N) operation

MC: energy calculation - O(1) operation

 

How to set up neighbor lists - MD

 

 

 

 

 

 

nlist = [2, 3, 4, 3, 4, 4, 7, 7]

nlistpnt = [1, 4, 6, 7, 7, 8, 9, 9]

 

particle 1 2 3 4 5 6 7

 

do I=1, N-1 (N = # of atoms)

do j=i+1, N

compute r2 (with periodic boundary conditions)

if (r2 < rn2) then

add "j" to list of atom "i"

endif

enddo

enddo

 

What to add to the code

 

(1) Main routine :

 

call force

call thermo

 

(2) Integrate routine :

 

update v (1/2 a timestep)

update x (full step)

call force

 

(3) Force and energy routines

 

(old)

do i=1,natoms-1

do j=i+1, natoms

compute distance between atoms i,j

 

(new)

do i=1,natoms-1

do k=nlistpnt(i), nlistpnt(i+1)-1

j=nlist(k)

compute distance between atoms i,j

 

Things to think about

 

How do we choose M (reneighboring frequency) and d (skin size) ?

 

 

 

 

What is "correct" reneighboring criterion ?

 

 

 

 

If you run with and without neighbor lists, will you get the same answer ?

 

 

 

 

Cautions:

 

do not overflow nlist

consider the case where the number of neighbors is zero !!

 

 

List Radius

Update interval

Time (N=256)

Time (N=500)

       

no list

 

3.33

10.0

2.60

6

2.24

4.93

2.70

13

2.17

4.55

2.90

26

2.28

4.51

3.10

43

2.47

4.79

3.43

83

2.89

 

3.50

100

 

5.86

 

Time = CPU time per step in seconds, (Thompson, 1983)

 

Neighbor Lists in MC

 

 

need to know all information for all the particles because we operate on particles one at a time

 

(1) Main

 

call neighbor

call config (initial energy)

 

in main loop:

every M steps -> call neighbor

 

(2) Energy and Config:

 

(old) do j=1, N

 

(new) do ilist=1,nlist_length (i)

j=nlist(i,ilist)

 

 

Linked Cells

 

 

 

 

 

 

 

 

 

 

 

regular grid of bins ibin_x = Nbins_x*x(i)/Lx

 

 

 

 

 

 

 

 

 

 

 

 

 

More or less efficient than neighbor lists ?

Linked Cells and neighbor lists

 

 

 

 

 

 

 

 

 

 

regular grid of bins ibin_x = Nbins_x*x(i)/Lx

 

 

 

 

 

Statistical Errors

 

 

 

For a simulation of infinite length ...

 

 

For a simulation of finite length, tB with correlated samples characterized by a correlation length of

 

 

 

Need to establish ...in the limit tB gets large

 

 

Strategy : Break a simulation up into blocks and take block averages

 

For a block of length tB, where we can calculate

 

 

 

Plot vs tB, and find the long time limit

 

 

 

 

 

Lab 2 : Computational Efficiency and Error Estimation.

 

A new MC code will be provided which has neighbor lists already implmemented. This code also has the basic coding for block averages for estimating errors in the potential energy.

 

 

1. First let’s see what we can expect from the neighbor lists. In the new MC code provided, set up the following parameters:

 

use: nmoves=5000 neq=1000 nsubavg= 100

nac = 1 ncx=ncy=20 xa(1)=ya(1) = 0.5

pshift = 0.1

rs2 = 0.8 kTinitial/e =1.0 rc /s= 2.5

 

A) tabulate the time in the energy routine and the total run time as a function of the parameter Nneigh (this is the number of moves between updates of the neighbor list). Compare your results with the old program where neighbor lists were not used.

 

What value of Nneigh gives the minimum run time ?

 

What is happening at high and low values of Nneigh to get the minimum you observe?

 

 

B) Using the value of Nneigh at the minimum run time in part A, compute the ratio Nneigh/N. Use this ratio to determine Nneigh, and tabulate the time in the energy routine and the total run time as a function of the number of particles, N. Include in your table the run times for the old program without the neighbor lists.

 

In the old and new programs, what is the expected scaling behavior of the energy routine, and the total run time as N increases ?

 

Do you observe these trends? At what values of N ?

 

If the limiting scaling is taken to be time = A Nb, what are the constants A and b for the scaling of energy routines and total run time for the old and new programs ?

 

 

2. Implement neighbor lists in the molecular dynamics program. This will entail 1) writing the subroutine ‘neighbor’ discussed in class, and modifying the loops in the force, energy, and pressure routines. All the other necessary code will be provided.

 

 

3. The MC code provided calculates the statistical inefficiency as a function of the length of block averages in the program.

 

Test the sensitivity of the standard deviation of the energy (i.e. the error in the average), and the statistical inefficiency (the number of steps between uncorrelated moves) to the number of MC moves and the state point.

 

start with

use: nmoves=1,100,000 neq=100,000 nsubavg= 10000

nac = 1 ncx=ncy=10 xa(1)=ya(1) = 0.5

pshift = 0.1

rs2 = 0.8 kTinitial/e =0.7 rc /s= 2.5

 

and vary the density, temperature, number of moves, and equilibration

 

 

A few more fun things to do ...

1. implement linked cells and/or the hybrid linked cell/neighbor list method.

2. implement error calculation in the MD code.

3. implement error calculation for pressure.

4. investigate ‘equilibrium’ with both MC and MD codes.