automatic kpoint guess update #1233

Merged: stevengj merged 1 commit into master from kpoint_guess_update on May 29, 2020

Conversation

stevengj
Collaborator

Implements strategy 1 from #1223.

@oskooi
Collaborator

oskooi commented May 28, 2020

This PR seems to speed up the get_eigenmode_coefficients calculations relative to the master branch, as demonstrated by the example and results below for an SOI strip waveguide in 3d. The speedup increases with the number of frequencies and reaches nearly 25% for the largest test case, which involves 800 frequency points. The tests were performed using a single Intel Xeon processor at 4.2 GHz.

import meep as mp
import argparse

def main(args):
    resolution = 20 # pixels/unit length (1 um)

    h = args.hh
    w = args.w

    nSi = 3.45
    Si = mp.Medium(index=nSi)
    nSiO2 = 1.45
    SiO2 = mp.Medium(index=nSiO2)

    sxy = 4
    sz = 4
    cell_size = mp.Vector3(sxy,sxy,sz)

    # input waveguide
    geometry = [mp.Block(material=Si,
                         center=mp.Vector3(),
                         size=mp.Vector3(mp.inf,w,h))]

    # substrate
    geometry.append(mp.Block(material=SiO2,
                             center=mp.Vector3(),
                             size=mp.Vector3(mp.inf,mp.inf,0.5*(sz-h))))

    dpml = 1.0
    boundary_layers = [mp.PML(dpml)]

    # mode frequency
    fcen = 1/1.55

    # source width
    df = 0.2*fcen

    sources = [mp.EigenModeSource(src=mp.GaussianSource(fcen, fwidth=df),
                                  component=mp.Ey,
                                  size=mp.Vector3(0,sxy-2*dpml,sz-2*dpml),
                                  center=mp.Vector3(-0.5*sxy+dpml),
                                  eig_match_freq=True,
                                  eig_parity=mp.ODD_Y,
                                  eig_kpoint=mp.Vector3(1.5,0,0),
                                  eig_resolution=32)]

    symmetries = [mp.Mirror(mp.Y,-1)]

    sim = mp.Simulation(resolution=resolution,
                        cell_size=cell_size,
                        boundary_layers=boundary_layers,
                        geometry=geometry,
                        sources=sources,
                        symmetries=symmetries)

    nfreq = args.nfreq
    flux_mon = sim.add_flux(fcen, df, nfreq, mp.FluxRegion(center=mp.Vector3(0.5*sxy-dpml), size=mp.Vector3(0,sxy-2*dpml,sz-2*dpml)))

    sim.run(until_after_sources=100)

    res = sim.get_eigenmode_coefficients(flux_mon, [1], eig_parity=mp.ODD_Y)
    sim.print_times()

    print("mpb-time:, {}".format(sim.time_spent_on(6)[0]))

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('-hh', type=float, default=0.22, help='waveguide height (default: 0.22 um)')
    parser.add_argument('-w', type=float, default=0.50, help='waveguide width (default: 0.50 um)')
    parser.add_argument('-nfreq', type=int, default=100, help='number of frequencies')
    args = parser.parse_args()
    main(args)

mpb_time_comparison

@stevengj
Collaborator Author

stevengj commented May 28, 2020

25% isn't very much. How many Newton steps is MPB taking to converge?
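
For context on the Newton steps being asked about: when a mode is requested at a fixed frequency, the solver has to search for the wavevector k whose band frequency matches the target, and the quality of the starting k affects how many steps that takes. The following is a toy sketch only, not MPB's implementation. The dispersion relation `omega`, the constants `N_EFF` and `K_CUT`, and the function `newton_find_k` are all hypothetical and chosen purely for illustration.

```python
import math

N_EFF = 2.0   # hypothetical index-like constant of the toy model
K_CUT = 1.0   # hypothetical cutoff-like constant of the toy model

def omega(k):
    """Toy dispersion relation omega(k); not a real waveguide band."""
    return math.sqrt(k * k + K_CUT * K_CUT) / N_EFF

def group_velocity(k):
    """Analytic derivative d(omega)/dk of the toy dispersion relation."""
    return k / (N_EFF * N_EFF * omega(k))

def newton_find_k(omega_target, k_guess, tol=1e-10, max_steps=50):
    """Find k with omega(k) == omega_target; return (k, number of Newton steps)."""
    k = k_guess
    for step in range(max_steps):
        residual = omega(k) - omega_target
        if abs(residual) < tol:
            return k, step
        k -= residual / group_velocity(k)  # Newton update using the slope d(omega)/dk
    raise RuntimeError("Newton iteration did not converge")

# Same target frequency, two different starting guesses.
k_near, steps_near = newton_find_k(1.0, k_guess=1.7)
k_far, steps_far = newton_find_k(1.0, k_guess=0.2)
print("near guess:", k_near, "after", steps_near, "steps")
print("far guess: ", k_far, "after", steps_far, "steps")
```

Both runs converge to the same k in this toy model, but the poorer guess needs more steps. In a real mode solver each step is a full eigensolve, which is why a better k-point guess can matter for run time.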

@oskooi
Collaborator

oskooi commented May 28, 2020

The total number of Newton iterations as a function of the number of frequencies for the two cases is shown below. Similar to the wall-clock time, this PR consistently reduces the iteration count by ~25% relative to master.

mpb_iters_comparison

@stevengj
Collaborator Author

How many Newton steps is that per frequency?

@oskooi
Collaborator

oskooi commented May 28, 2020

As shown in the plot below, for this test case master takes ~26 iterations per frequency and this PR takes ~19 iterations per frequency, a reduction of roughly 25%.

mpb_iters_comparison2

More significantly, if the number of bands (the bands parameter of get_eigenmode_coefficients) is doubled from [1] to [1,2], then for a test case involving just 50 frequencies the get_eigenmode_coefficients calculation finishes in around 62 s with this PR, whereas on master it ran for more than 1.5 hours. The master run had to be aborted because MPB was failing to converge and printed only repetitive output:

...
MPB solved for frequency_2(2.12841,0,0) = 1.03181 after 25 iters
    iteration   50: trace = 0.1703736127590456 (3.61041e-10% change)
MPB solved for frequency_2(0.187237,0,0) = 0.38548 after 72 iters
MPB solved for frequency_2(1.84788,0,0) = 0.972774 after 25 iters
MPB solved for frequency_2(0.163297,0,0) = 0.382337 after 36 iters
MPB solved for frequency_2(2.12841,0,0) = 1.03181 after 25 iters
MPB solved for frequency_2(0.187237,0,0) = 0.38548 after 34 iters
MPB solved for frequency_2(1.84788,0,0) = 0.972774 after 25 iters
MPB solved for frequency_2(0.163297,0,0) = 0.382337 after 36 iters
MPB solved for frequency_2(2.12841,0,0) = 1.03181 after 25 iters
MPB solved for frequency_2(0.187237,0,0) = 0.38548 after 34 iters
MPB solved for frequency_2(1.84788,0,0) = 0.972774 after 25 iters
MPB solved for frequency_2(0.163297,0,0) = 0.382337 after 36 iters
MPB solved for frequency_2(2.12841,0,0) = 1.03181 after 25 iters
MPB solved for frequency_2(0.187237,0,0) = 0.38548 after 34 iters
MPB solved for frequency_2(1.84788,0,0) = 0.972774 after 25 iters
MPB solved for frequency_2(0.163297,0,0) = 0.382337 after 36 iters
MPB solved for frequency_2(2.12841,0,0) = 1.03181 after 25 iters
...
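
A stall like this can be flagged automatically. The sketch below assumes log lines have exactly the format shown above. `revisited_kpoints` is a hypothetical helper (not part of Meep or MPB) that reports k-points which are solved again after a different k-point was tried in between, which is the cycling pattern visible in the output.

```python
import re

# Matches lines such as:
#   MPB solved for frequency_2(2.12841,0,0) = 1.03181 after 25 iters
SOLVED = re.compile(r"MPB solved for frequency_\d+\(([^)]*)\) = \S+ after \d+ iters")

def revisited_kpoints(log_text):
    """Return k-points that are solved again after a different k-point was tried."""
    seen = set()
    revisited = []
    previous = None
    for match in SOLVED.finditer(log_text):
        kpoint = match.group(1)
        # Back-to-back solves at the same k are ordinary refinement, so only
        # count a repeat when another k-point was solved in between.
        if kpoint != previous and kpoint in seen and kpoint not in revisited:
            revisited.append(kpoint)
        seen.add(kpoint)
        previous = kpoint
    return revisited

stalled_log = """\
MPB solved for frequency_2(2.12841,0,0) = 1.03181 after 25 iters
    iteration   50: trace = 0.1703736127590456 (3.61041e-10% change)
MPB solved for frequency_2(0.187237,0,0) = 0.38548 after 72 iters
MPB solved for frequency_2(1.84788,0,0) = 0.972774 after 25 iters
MPB solved for frequency_2(0.163297,0,0) = 0.382337 after 36 iters
MPB solved for frequency_2(2.12841,0,0) = 1.03181 after 25 iters
MPB solved for frequency_2(0.187237,0,0) = 0.38548 after 34 iters
"""

print(revisited_kpoints(stalled_log))  # ['2.12841,0,0', '0.187237,0,0']
```

A non-empty result is only a heuristic hint that the search over k is cycling; it does not explain why.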

For comparison, the output from this PR seemed fine:

...
Dominant planewave for band 2: (0.919242,-0.000000,0.000000)
MPB solved for frequency_2(0.923283,0,0) = 0.688611 after 26 iters
MPB solved for frequency_2(0.923282,0,0) = 0.688611 after 1 iters
Dominant planewave for band 2: (0.923282,-0.000000,0.000000)
MPB solved for frequency_2(0.927321,0,0) = 0.691245 after 28 iters
MPB solved for frequency_2(0.927321,0,0) = 0.691244 after 1 iters
Dominant planewave for band 2: (0.927321,-0.000000,0.000000)
MPB solved for frequency_2(0.931359,0,0) = 0.693878 after 27 iters
MPB solved for frequency_2(0.931358,0,0) = 0.693878 after 1 iters
Dominant planewave for band 2: (0.931358,-0.000000,0.000000)
MPB solved for frequency_2(0.935395,0,0) = 0.696511 after 30 iters
MPB solved for frequency_2(0.935395,0,0) = 0.696511 after 1 iters
Dominant planewave for band 2: (0.935395,-0.000000,0.000000)
MPB solved for frequency_2(0.939431,0,0) = 0.699144 after 26 iters
MPB solved for frequency_2(0.93943,0,0) = 0.699144 after 1 iters
Dominant planewave for band 2: (0.939430,-0.000000,0.000000)
MPB solved for frequency_2(0.943466,0,0) = 0.701778 after 27 iters
MPB solved for frequency_2(0.943465,0,0) = 0.701777 after 1 iters
Dominant planewave for band 2: (0.943465,-0.000000,0.000000)
MPB solved for frequency_2(0.9475,0,0) = 0.704411 after 27 iters
MPB solved for frequency_2(0.9475,0,0) = 0.704411 after 1 iters
Dominant planewave for band 2: (0.947500,-0.000000,0.000000)
MPB solved for frequency_2(0.951534,0,0) = 0.707044 after 42 iters
MPB solved for frequency_2(0.951534,0,0) = 0.707044 after 6 iters
MPB solved for frequency_2(0.951534,0,0) = 0.707044 after 2 iters
MPB solved for frequency_2(0.951534,0,0) = 0.707044 after 2 iters
MPB solved for frequency_2(0.951534,0,0) = 0.707044 after 1 iters
Dominant planewave for band 2: (0.951534,-0.000000,0.000000)
...

Thus, beyond the speedup, this PR appears to provide an important fix that makes calculations involving higher-order modes feasible.
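
Iteration counts like the ones quoted in this thread can be tallied from log output in the format above. This is a minimal sketch, assuming that each "Dominant planewave" line closes the group of "MPB solved" lines belonging to one frequency (an inference from the excerpt, not a documented guarantee). `tally_iterations` is a hypothetical helper.

```python
import re

# Captures the iteration count from lines such as:
#   MPB solved for frequency_2(0.923283,0,0) = 0.688611 after 26 iters
SOLVED = re.compile(r"MPB solved for frequency_\d+\([^)]*\) = \S+ after (\d+) iters")

def tally_iterations(log_text):
    """Return one list of iteration counts per frequency.

    Assumption: a 'Dominant planewave' line marks the end of one frequency.
    """
    groups, current = [], []
    for line in log_text.splitlines():
        match = SOLVED.search(line)
        if match:
            current.append(int(match.group(1)))
        elif line.startswith("Dominant planewave") and current:
            groups.append(current)
            current = []
    if current:  # solves not yet followed by a 'Dominant planewave' line
        groups.append(current)
    return groups

sample_log = """\
MPB solved for frequency_2(0.923283,0,0) = 0.688611 after 26 iters
MPB solved for frequency_2(0.923282,0,0) = 0.688611 after 1 iters
Dominant planewave for band 2: (0.923282,-0.000000,0.000000)
MPB solved for frequency_2(0.927321,0,0) = 0.691245 after 28 iters
MPB solved for frequency_2(0.927321,0,0) = 0.691244 after 1 iters
Dominant planewave for band 2: (0.927321,-0.000000,0.000000)
MPB solved for frequency_2(0.931359,0,0) = 0.693878 after 27 iters
MPB solved for frequency_2(0.931358,0,0) = 0.693878 after 1 iters
Dominant planewave for band 2: (0.931358,-0.000000,0.000000)
"""

groups = tally_iterations(sample_log)
total = sum(sum(g) for g in groups)
print("solves per frequency:", [len(g) for g in groups])  # [2, 2, 2]
print("iterations per frequency:", total / len(groups))   # 28.0
```

On this excerpt each frequency takes two solves at nearby k-points, with nearly all of the iterations spent in the first one.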

@stevengj
Collaborator Author

stevengj commented May 29, 2020

That's still more iterations than I would expect; there might be something wrong with the scaling in the k-point guess here?

@stevengj
Collaborator Author

> for a test case involving just 50 frequencies the get_eigenmode_coefficients calculation for this PR finishes in around 62 s but for master requires more than 1.5 hours.

Hmm, that's good news, although I'm not sure why it was so slow before.

@stevengj stevengj merged commit a647d6f into master May 29, 2020
@stevengj stevengj deleted the kpoint_guess_update branch May 29, 2020 02:35
@oskooi
Collaborator

oskooi commented May 29, 2020

In case we want to look into why the mode solver was failing to converge without this patch, the following test case can be used. It is based on the original example above and involves a single mode (bands=[2]) and just 6 frequencies (nfreq=6).

When the number of frequencies is <6, the mode solver converges with or without this patch.
When the number of frequencies is >=6, the mode solver fails to converge without this patch.
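
To see which frequencies the nfreq=5 and nfreq=6 cases sample, here is a small sketch. It assumes that add_flux(fcen, df, nfreq, ...) places nfreq evenly spaced points on [fcen-df/2, fcen+df/2], an assumption based on the Meep documentation. `flux_frequencies` is a hypothetical helper, not a Meep function.

```python
def flux_frequencies(fcen, df, nfreq):
    """Evenly spaced frequencies on [fcen - df/2, fcen + df/2] (assumed convention)."""
    if nfreq == 1:
        return [fcen]
    fmin = fcen - 0.5 * df
    step = df / (nfreq - 1)
    return [fmin + i * step for i in range(nfreq)]

# Same values as in the test script: 1.55 um center wavelength, 20% bandwidth.
fcen = 1 / 1.55
df = 0.2 * fcen

for nfreq in (5, 6):
    freqs = flux_frequencies(fcen, df, nfreq)
    print(nfreq, [round(f, 4) for f in freqs])
```

Both grids span the same band. Only the spacing and the interior points differ, so this listing does not by itself explain the change in convergence behavior at nfreq=6.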

import meep as mp
import argparse

def main(args):
    resolution = args.res

    h = 0.22        # waveguide height
    w = 0.50        # waveguide width

    nSi = 3.45
    Si = mp.Medium(index=nSi)
    nSiO2 = 1.45
    SiO2 = mp.Medium(index=nSiO2)

    sxy = 4
    sz = 4
    cell_size = mp.Vector3(sxy,sxy,sz)

    # input waveguide and substrate
    geometry = [mp.Block(material=Si,
                         center=mp.Vector3(),
                         size=mp.Vector3(mp.inf,w,h)),
                mp.Block(material=SiO2,
                         center=mp.Vector3(0,0,-0.5*sz+0.25*(sz-h)),
                         size=mp.Vector3(mp.inf,mp.inf,0.5*(sz-h)))]

    dpml = 1.0
    boundary_layers = [mp.PML(dpml)]

    # mode frequency
    fcen = 1/1.55

    # source width
    df = 0.2*fcen

    sources = [mp.EigenModeSource(src=mp.GaussianSource(fcen, fwidth=df),
                                  component=mp.Ey,
                                  size=mp.Vector3(0,sxy-2*dpml,sz-2*dpml),
                                  center=mp.Vector3(-0.5*sxy+dpml),
                                  eig_match_freq=True,
                                  eig_parity=mp.ODD_Y,
                                  eig_kpoint=mp.Vector3(1.5,0,0))]

    symmetries = [mp.Mirror(mp.Y,-1)]

    sim = mp.Simulation(resolution=resolution,
                        cell_size=cell_size,
                        boundary_layers=boundary_layers,
                        geometry=geometry,
                        sources=sources,
                        symmetries=symmetries)

    nfreq = args.nfreq
    flux_mon = sim.add_flux(fcen, df, nfreq, mp.FluxRegion(center=mp.Vector3(0.5*sxy-dpml), size=mp.Vector3(0,sxy-2*dpml,sz-2*dpml)))

    sim.run(until_after_sources=100)

    res = sim.get_eigenmode_coefficients(flux_mon, [2], eig_parity=mp.ODD_Y)
    sim.print_times()

    print("mpb-time:, {}".format(sim.time_spent_on(6)[0]))

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('-res', type=int, default=20, help='resolution (pixels/um)')
    parser.add_argument('-nfreq', type=int, default=6, help='number of frequencies')
    args = parser.parse_args()
    main(args)

@stevengj
Collaborator Author

This PR certainly changes the errors slightly because it changes the starting guess, but the accuracy should(?) be no worse than before — that is, the digits that are changing should all be incorrect digits anyway.

(No matter how much you decrease the tolerance, at some point MPB will hit an accuracy limit, but this accuracy limit shouldn't(?) be changed by this PR.)

bencbartlett pushed a commit to bencbartlett/meep that referenced this pull request Sep 9, 2021