
Write Analyzer for Naive vs CNN #1

Open
jkschin opened this issue Mar 3, 2022 · 1 comment
jkschin commented Mar 3, 2022

  1. decoder.py has too much code bloat. It needs to be refactored out into an analyzer.
  2. This analyzer takes a given image ID (plus the relevant num_cities, file paths, etc.) and outputs all the routes available from a particular node. See figure below.
  3. Similarly, it should be able to highlight exactly which edges were pruned. Specifically, if the CNN pruned some edges, and that particular edge was in the near-optimal solution, this edge needs to be highlighted.
  4. The analyzer should output a summary image (to be defined) and the individual routes that allow a deep dive (figure below).

Fig 1
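To pin down the interface, here is a minimal sketch of what the analyzer's core could look like. The function names (`routes_from_node`, `pruned_edges_in_solution`) and the adjacency-matrix / tour representations are assumptions, not code from `decoder.py`:

```python
from typing import List, Set, Tuple

Edge = Tuple[int, int]

def routes_from_node(adj: List[List[int]], node: int) -> List[int]:
    """Return all nodes reachable in one hop from `node` in the adjacency matrix."""
    return [j for j, connected in enumerate(adj[node]) if connected]

def pruned_edges_in_solution(tour: List[int], pruned: Set[Edge]) -> List[Edge]:
    """Return the edges of the near-optimal tour that the CNN pruned.

    These are exactly the edges the analyzer should highlight (item 3 above).
    """
    edges = list(zip(tour, tour[1:] + tour[:1]))  # close the tour back to the start
    return [e for e in edges if e in pruned or (e[1], e[0]) in pruned]
```

The summary image and per-route figures would then be rendered from the outputs of these two functions.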

@jkschin jkschin self-assigned this Mar 3, 2022
jkschin commented Mar 5, 2022

Relevant code for this analysis was added in commit 8363468.

All experiments were verified to be running on a Xeon-P8 full node so there's no resource contention. We used our postprocessing method to extract the edges from the CNN output.

Previously, we ran the experiments in parallel with 96 // 2 - 1 = 47 parallel tasks. This resulted in the following test samples using a 10K cost edge:

[10072, 10088, 10098, 10137, 10413, 10421, 10535, 10552, 10677, 10678, 10691, 10907, 10984]
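As a sanity check on the task count, the worker-count rule above (half the logical cores of the 96-core Xeon-P8 node, minus one for the parent process) can be written out. The function name is hypothetical:

```python
def n_workers(logical_cores: int) -> int:
    # Half the logical cores, reserving one worker slot for the parent.
    return logical_cores // 2 - 1

# On a 96-core node this gives 47 parallel tasks.
```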

We previously noticed some stochasticity in parallel runs, so we ran all subsequent experiments in serial to isolate that effect. True enough, running the same 13 samples again left the following test samples using a 10K cost edge:

[10413, 10677, 10678, 10691, 10984]

Now, this got me really curious about this stochasticity. Intuitively, it makes sense: since we limit the solver to 30 seconds, perhaps it got stuck in a local minimum with that 10K cost edge and we didn't give it a chance to jump out. I ran the 5 samples above again, and this left the following test samples using a 10K cost edge:

[10678, 10691]

It reduced again. To be sure I was seeing the right thing, I copy-pasted the exact output from the terminal below:

# You can see that 10413 had a 10K edge used in the first try, and no 10K edge used in the second try
# In general, if the "10K" field has a cost above 10K, a 10K edge is used.
Naive: False | Adj Matrix Size: 5936 | Adj Matrix Shape: (200, 200)
ID: 10413 Orig-Z: 6552 NN-Z: 7094 10K: 16412

Naive: False | Adj Matrix Size: 5936 | Adj Matrix Shape: (200, 200)
ID: 10413 Orig-Z: 6552 NN-Z: 6542 10K: 6542
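The "cost above 10K implies a 10K edge was used" check can be automated over these log lines. This is a sketch that assumes the exact line format shown above; the regex and function name are mine, not from the repo:

```python
import re

LINE_RE = re.compile(
    r"ID:\s*(?P<id>\d+)\s+Orig-Z:\s*(?P<orig>\d+)\s+NN-Z:\s*(?P<nn>\d+)\s+10K:\s*(?P<tenk>\d+)"
)

def uses_10k_edge(line: str, penalty: int = 10_000) -> bool:
    """True if the '10K' objective implies at least one penalty edge was used."""
    m = LINE_RE.search(line)
    if m is None:
        raise ValueError(f"unrecognized line: {line!r}")
    return int(m.group("tenk")) > penalty
```

Running this over both runs of sample 10413 above flags the first run and not the second.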

I ran [10678, 10691] one last time, and both of them used a 10K cost edge. Is this stochasticity, or a local minimum due to the search? The current algorithm is quite liberal and allows a particular node to visit up to the fourth-nearest city, so I find it hard to believe the solver is stuck in a local minimum.

I increased the Google OR-Tools time limit from 30 seconds to 60 seconds, and true enough, both these samples had the 10K edge removed. However, the naive method still performed slightly better (by 2 and 20 units).

I then tried another parameter: what if the 10K penalty is too low? I set it to 1M, and both samples [10678, 10691] still ended up using a pruned arc. The efficacy of raising the penalty is thus limited.
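For context on why raising the penalty need not remove the edge: the pruned edges stay in the cost matrix with a large cost, so a time-limited search can still route through them. A minimal sketch of this soft-pruning step, with hypothetical names and a plain list-of-lists cost matrix:

```python
from typing import List, Set, Tuple

def penalize_pruned(
    cost: List[List[int]],
    pruned: Set[Tuple[int, int]],
    penalty: int = 10_000,
) -> List[List[int]]:
    """Return a copy of the cost matrix with pruned edges set to a large penalty.

    The solver can still traverse a pruned edge; the penalty only makes it
    unattractive. That is why bumping the penalty from 10K to 1M does not
    guarantee the edge disappears when the search is cut off at a time limit.
    """
    out = [row[:] for row in cost]  # copy; leave the original matrix intact
    for i, j in pruned:
        out[i][j] = penalty
        out[j][i] = penalty
    return out
```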

The conclusion here is that we should re-run the entire analysis with a 60-second solver limit, in parallel.
