Neural Sketch project

  • learning to learn sketches for programs

# TODO:

  • Get regexes working
  • figure out correct objective for learning holes - completed-ish - no
  • start by training holes on pretrain
  • have lower likelihood of holes
  • make hole probability vary with depth
  • fuss with correct way to generate sketches
  • Use more sophisticated nn architectures (perhaps only after getting RobustFill to work)
  • Try other domains??
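The "make hole probability vary with depth" item could start from a sketch like this. All names and the tuple-based tree representation are hypothetical, not the project's actual code:

```python
import random

def hole_probability(depth, base_p=0.3, decay=0.5):
    """Probability of replacing a subtree at `depth` with a HOLE.

    Deeper subtrees are smaller and cheaper to fill in later, so the
    probability decays geometrically with depth (base_p and decay are
    illustrative values to tune).
    """
    return base_p * (decay ** depth)

def sketchify(tree, depth=0):
    """Recursively replace subtrees of an (op, *children) tuple tree
    with a HOLE marker, with depth-dependent probability."""
    if random.random() < hole_probability(depth):
        return "<HOLE>"
    if isinstance(tree, tuple):  # internal node
        op, *children = tree
        return (op,) + tuple(sketchify(c, depth + 1) for c in children)
    return tree  # leaf
```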

# New TODO:

  • use fully supervised version of holes
  • more sophisticated nn architectures

# TODO for fully supervised (at main_supervised.py):

  • implement first pass at make_holey_supervised
  • test first pass at make_holey_supervised
  • have some sort of enumeration
  • slightly dynamic batching so I don't take only the top-1 sketch, and instead softmax the candidates together as Luke suggested
  • get rid of index in line where sketches are created
  • make scores be pytorch tensor
  • perhaps convert to EC domain for enumeration?? - actually not too hard ...
  • Kevin's continuation representation (see notes)
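The "softmax them together" idea might be prototyped like this. The scores are made-up log-likelihoods; in the real code they would be PyTorch tensors:

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of log-scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Instead of keeping only the top-1 sketch, weight the top-k
# candidate sketches by their normalized scores and combine.
sketch_scores = [-1.2, -2.5, -3.1]  # hypothetical log-likelihoods
weights = softmax(sketch_scores)
```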

# TODO for PROBPROG 2018 submission:

  • make figures:
    • One graphic of model for model section
    • One example of data
    • One example of model outputs or something
    • One results graph, if I can swing it?
    • Use regex figure from nampi submission?
  • possibly parallelize for evaluation (or not)
  • write paper
    • show examples and stuff
    • explain technical process
    • write intro
    • remember to show

# TODO for ICLR submission:

  • refactor/generalize evaluation
  • beam search/best-first search
  • multiple holes (using EC language)
  • build up Syntax-checker LSTM

# TODO for DEEPCODER for ICLR:

  • dataset/how to train
  • primitive set for deepcoder
  • implement the program.flatten() method for nns (super easy)
  • implement parsing from string (not bad either; use a stack machine)
  • implement training (pretrain and makeholey)
  • implement evaluation
  • test out reversing nn implementation
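The flatten/parse pair could be prototyped over prefix token sequences with an explicit stack, roughly like this. Primitive names and arities are placeholders, not deepcoder's actual DSL:

```python
# Each primitive maps to its arity; anything unlisted is a leaf
# (constants, variables, or a HOLE token).
ARITY = {"map": 2, "plus": 2, "inc": 1}

def flatten(tree):
    """Prefix-order token list for a nested (op, *children) tuple."""
    if isinstance(tree, tuple):
        op, *children = tree
        return [op] + [t for c in children for t in flatten(c)]
    return [tree]

def parse(tokens):
    """Rebuild the tree from a prefix token list with an explicit
    stack of (op, wanted_arity, children_so_far) frames."""
    stack = [("<root>", 1, [])]
    for tok in tokens:
        n = ARITY.get(tok, 0)
        if n:
            stack.append((tok, n, []))
            continue
        node = tok
        while True:  # pop every frame this leaf completes
            op, need, kids = stack[-1]
            kids.append(node)
            if len(kids) < need:
                break
            stack.pop()
            if op == "<root>":
                return kids[0]
            node = (op, *kids)
    raise ValueError("incomplete token sequence")
```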

# ACTUAL TODO for DEEPCODER ICLR Training:

  • DSL weights (email dude) - x
  • generating train/test data efficiently
  • offline dataset generation
  • constraint based data gen
  • pypy for data gen
  • modify mutation code so no left_application is a hole
  • adding HOLE to parser when no left_application is a hole
  • Compare no left_application is a hole to the full case where everything can be a hole
  • write parser for full case
  • add HOLE to nn
  • dealing with different request types
  • multiple holes (modify mutator)
  • deepcoder recognition model in the loop - half completed
  • simple deepcoder baseline
  • make syntaxrobustfill have same output signature as regular robustfill
  • limit depth of programs
  • offline dataset collection and training
    • generation
    • training
    • with sketch stuff
  • filtering out some dumb programs (ex lambda $0)
  • use actual deepcoder data, write converter
  • deal with issue of different types and IO effectively in a reasonable manner
  • incorporate constraint based stuff from Marc
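A first cut at "filtering out some dumb programs (ex lambda $0)" might be a purely syntactic check for identity and constant programs. A single-argument, de Bruijn-style tuple representation is assumed here for illustration:

```python
def uses_variable(body):
    """True if the body ever references the bound variable $0."""
    if body == "$0":
        return True
    if isinstance(body, tuple):
        return any(uses_variable(c) for c in body[1:])
    return False

def is_dumb(program):
    """Reject degenerate candidates: the literal identity
    (lambda $0), or a constant function that ignores its input."""
    if program == ("lambda", "$0"):
        return True
    if isinstance(program, tuple) and program[0] == "lambda" \
            and not uses_variable(program[1]):
        return True
    return False
```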

Evaluation:

  • parsing w/Holes - sorta
  • beam search
  • figure out correct evaluation scheme
  • parallelization? (dealing with eval speed)
  • good test set
  • validation set
  • test multiple holes training/no_left_application vs other case
  • using speed
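A minimal, domain-agnostic beam search for the evaluation items might look like this. The extension function is a stand-in for the real model's next-token scores:

```python
import heapq

def beam_search(extensions, start, width=3, steps=4):
    """Fixed-length beam search.

    `extensions(seq)` returns candidate (token, logprob) pairs for
    extending `seq`; after `steps` expansions, the best-scoring
    sequence left on the beam is returned.
    """
    beam = [(0.0, start)]
    for _ in range(steps):
        candidates = [
            (score + lp, seq + [tok])
            for score, seq in beam
            for tok, lp in extensions(seq)
        ]
        beam = heapq.nlargest(width, candidates, key=lambda c: c[0])
    return max(beam, key=lambda c: c[0])[1]
```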

Overall:

  • run training and evaluation (on val set) together for multiple experiments to find best model
  • refactor everything (a model Class, perhaps??)

Tweaking:

  • tweak topk - did it with a temperature param, seemed to work well
  • neural network tweaks for correct output format (deal with types and such) - I just fudged it
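The temperature-tweaked top-k mentioned above might look roughly like this; the scores and parameter values are illustrative, not the project's actual settings:

```python
import math
import random

def sample_topk(logits, k=3, temperature=0.7):
    """Sample an index from the top-k candidates with temperature
    scaling. Lower temperature sharpens toward the argmax;
    temperature=1 recovers the plain softmax over the top-k."""
    top = sorted(enumerate(logits), key=lambda x: x[1], reverse=True)[:k]
    scaled = [s / temperature for _, s in top]
    m = max(scaled)
    probs = [math.exp(s - m) for s in scaled]
    total = sum(probs)
    probs = [p / total for p in probs]
    return random.choices([i for i, _ in top], weights=probs)[0]
```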

# TODO for refactoring:

  • make one main (domain agnostic) training script/file - decided against
  • make one main (domain agnostic) evaluation script/file - decided against
  • figure out the correct class structure to make work easier to extend and tweak. - decided against

# TODO for NAPS:

  • read their code
  • understand how to extend when needed for enumeration stuff
  • make validation set?
  • EMAIL THEM FOR ADDITIONAL QUESTIONS! TRY TO KNOW WHAT TO ASK FOR BY EOD FRIDAY. (IDEAL)
  • make choices such as types of holes, etc.
  • think about how to enumerate, etc.

# TODO for TPUs:

# FRIDAY TODO:

  • loading dataset
  • evaluation code, apart from IO concerns

# OCTOBER CLEAN UP

  • switch to hierarchical file structure
  • add EC as submodule or something
  • fix 'alternate' bug in evaluate code
    • eval script
    • loader scripts?
  • possibly find better names for saved things
  • remove all magic values
  • deal with silly sh scripts
  • fix readme for other users
  • run those other tests
  • perhaps redesign results stuff
  • make sure pypy stuff still works
  • make sure saved models work
  • figure out what needs to be abstracted, and abstract

folders to comb through for hierarchical struct:

  • train
  • eval
  • tests
  • data_src
  • models
  • plot
  • utils
  • run dc with smaller train 4 split