Allow cutoff predictions, support empty background pixels and tests #29
false used to be the default; it now has to be set manually since we have custom code in the model checkpoint
This is to print out the filename if anything fails for further inspection
This is to speed up the AL loop. Not a perfect solution: the UI will now recommend that users predict "default" for a lot of the labels. But it is a first step to make sure we can handle large slides with millions of annotations
When testing the code, I prefer keeping all the models on CPU. Before this commit, the models would automatically go to the GPU if available. After this commit, users can pass a parameter to keep models on CPU
Easier for tests
This also makes it easier to run tests in CPU-only environments
exists, skip it. This is to allow sparse pixelmaps that have empty space between bounding boxes/superpixels
<longflag>usecuda</longflag>
<description>Whether or not to use GPU/cuda (true) or cpu (false).</description>
<label>Use CUDA</label>
<default>false</default>
this should be true by default
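A minimal sketch of how a `usecuda` flag could be resolved to a device, whichever default is chosen. The helper name `select_device` and the injected `cuda_available` callable are hypothetical and not taken from this PR; the availability check is passed in so the logic can be exercised in CPU-only test environments.

```python
from typing import Callable


def select_device(use_cuda: bool, cuda_available: Callable[[], bool]) -> str:
    """Return "cuda" only when the user asked for it AND it is available.

    `cuda_available` is injected (hypothetical design) so that tests can
    run without a GPU or a deep learning framework installed.
    """
    if use_cuda and cuda_available():
        return "cuda"
    # Fall back to CPU when CUDA was not requested or is not present.
    return "cpu"
```

With PyTorch, for example, this could be called as `select_device(use_cuda, torch.cuda.is_available)`; whether the project uses PyTorch here is an assumption.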
<longflag>cutoff</longflag>
<label>Number of annotations per slide</label>
<default>500</default>
<description>Number of unannotated superpixels to use per slide for features, training and predictions</description>
training only uses labeled samples
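The cutoff behaviour discussed here (all labeled samples kept, at most `cutoff` unlabeled ones used) might look roughly like the sketch below. The function name `select_for_round`, the `labels` mapping, and the use of random sampling to pick the unlabeled subset are all assumptions for illustration; the PR may select them differently.

```python
import random
from typing import Dict, Hashable, List, Optional, Sequence, Tuple


def select_for_round(
    sample_ids: Sequence[Hashable],
    labels: Dict[Hashable, Optional[str]],
    cutoff: int,
    seed: Optional[int] = None,
) -> Tuple[List[Hashable], List[Hashable]]:
    """Split samples into (labeled, unlabeled-to-use).

    Every labeled sample is kept. At most `cutoff` unlabeled samples are
    returned; random sampling is an assumption of this sketch.
    """
    if cutoff < 0:
        raise ValueError("cutoff must be non-negative")
    labeled = [s for s in sample_ids if labels.get(s) is not None]
    unlabeled = [s for s in sample_ids if labels.get(s) is None]
    if len(unlabeled) > cutoff:
        unlabeled = random.Random(seed).sample(unlabeled, cutoff)
    return labeled, unlabeled
```

Training would then use only the first list, while feature extraction and prediction would use both.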
@andsild I find that having multiple PRs accelerates review -- we can get simple things merged in and then the complex things look simpler. I often will do a
I get that 😄 Good practice to do anyway. I've opened up
which is just this PR divided into multiple others
Hi. Let me know if you want me to try and break down the PR into smaller PRs.
Otherwise, this PR is a first step to work with AML in DSA.
Changes are:
1. Allow `cutoff` predictions to be made, so that you only do predictions on n unlabeled features per round/epoch. All labeled samples will always be included. This is to speed up training in cases where you have many slides with many annotations (in our AML case, several million). The benefit is speed; the downside may be that feature files no longer contain all the data if a user wants to download them.
2. Support empty background pixels: `histomics-label` (to index at 1), `large_image` (for drawing superpixels where the indexing starts at 1) and `histomicstk` (to generate superpixels with starting value 1). I wanted to start simple by allowing an empty, unlabeled background pixel of size 1, which makes it more or less invisible in the UI.

I have tested in DSA with my own AML slides and also using the default superpixels from the UI. The tests should also help to establish a baseline of what works and what doesn't (`test_full_predict` starts from scratch with a slide of MNIST numbers and gets an accuracy around 80%).

For the UI, see also DigitalSlideArchive/histomics-label#207
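
To illustrate the empty-background idea, here is a rough sketch of computing bounding boxes from a sparse pixelmap. It assumes the value 0 marks empty background and that superpixels are indexed from 1; the function name `bounding_boxes` and the nested-list pixelmap representation are hypothetical and chosen only to keep the example self-contained.

```python
from typing import Dict, List, Tuple

BACKGROUND = 0  # assumption: 0 marks empty space, superpixels start at 1

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive


def bounding_boxes(pixelmap: List[List[int]]) -> Dict[int, Box]:
    """Return one inclusive bounding box per superpixel index.

    Background pixels are skipped, so the map may contain empty space
    between superpixels without producing a box for it.
    """
    boxes: Dict[int, Box] = {}
    for y, row in enumerate(pixelmap):
        for x, value in enumerate(row):
            if value == BACKGROUND:
                continue
            if value in boxes:
                x0, y0, x1, y1 = boxes[value]
                boxes[value] = (min(x0, x), min(y0, y), max(x1, x), max(y1, y))
            else:
                boxes[value] = (x, y, x, y)
    return boxes
```

Superpixel indices that never appear in the map simply have no entry in the result, which matches the "skip it" behaviour for missing bounding boxes.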