
2020: Diego & Kunal


Students

  • Diego -- active learning user interface for labeling & validating candidate calls
  • Kunal -- machine learning models for orca call classification

Mentor team

  • Valentina Staneva (University of Washington, eScience Institute) -- machine learning & data visualization
  • Jesse Lopez (Axiom Data Science) -- computational data science & machine learning
  • Val Veirs (Beam Reach) -- Orcasound Lab hydrophone host, machine learning & noise analysis
  • Scott Veirs (Beam Reach) -- Orcasound coordinator, marine bioacoustics

For more info, see the Orcasound Hacker Hall of Fame.

Advisors

  • Abhishek Singh (GSoC 2019 alum at ESIP; final-year Computer Science & Engineering student at NIT Durgapur, India)
  • Dan Olsen (North Gulf Oceanic Society) -- killer whale bioacoustics
  • Hannah Meyers (University of Alaska) -- marine biology
  • Paul Cretu (Freelance software dev) -- lead Orcasound/orcasite dev for v1 UI
  • Shima Abadi (University of Washington, Mechanical Engineering) -- acoustical oceanography & machine learning

Handy links

Meeting procedures

  1. Report progress on goals from last week
  2. Discuss any blocking issues or strategic decisions (e.g. upcoming scheduled events, code reviews, etc.)
  3. Set new goals for next week

Meeting synopses

7/3/20 Friday GSoC call (10-11 Pacific; Kunal, Diego, Jesse, Abhishek, Val, Scott, Hannah)

Kunal update

  • Working on the model and scripts
  • Confirmed Jesse's question: a basic model is ready, along with scripts to run it from the command line
  • Scott asked whether the first unlabeled data have been chosen. Kunal will work with Scott to prioritize an event from Scott's spreadsheet of labeling candidates; we decided an S3 bucket like the Acoustic Sandbox would be a good place to store the unlabeled and labeled audio data, at least initially (see the sketch after this list)
  • Jesse asked Kunal to polish up the scripts, then share them for feedback.
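
Uploading to such a bucket could be as simple as the boto3 sketch below; the bucket name, key layout, and local paths are illustrative assumptions, not the agreed-upon organization:

```python
import boto3  # assumes AWS credentials are already configured locally

# Illustrative bucket/prefix names only; the real Acoustic Sandbox
# bucket and key layout would be decided with Scott.
s3 = boto3.client("s3")
s3.upload_file(
    "labeled/clip_001.wav",           # local audio file
    "acoustic-sandbox",               # hypothetical bucket name
    "gsoc2020/labeled/clip_001.wav",  # object key within the bucket
)
```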

Diego update

  • Implemented a performance metric as a function of the active learning training period (showed a new plot, but hasn't pushed it to GitHub yet)
  • Published blog post
  • Gave current UI tour to Hannah+
    • Hannah mentioned that the main classification task for Alaska is orca vs. not
    • Subsequent classifications that would be useful include orca vs. humpback vs. boat, and then within orca: resident vs. general transient vs. AT1 (unique-sounding) calls

General notes & discussion

  • Thanks to all for completing the first GSoC evaluations on time (due today at 11 Pacific)
  • Oliver of Meridian plans to join next Friday

6/26/20 Friday GSoC call (10-11:15 Pacific)

Kunal's update

  • Working on Valentina's guidance:
    • ROC (0.83, 0.2)
    • Precision-recall plot
  • Jesse: prepare to automate the active learning
    • Use argparse, or newer libraries like Click (more efficient than argparse) or Typer (requires newer Python versions), to create a command-line interface; see the sketch after this list
  • Blocking:
    • Error/exception when running with more than 1 batch in TensorBoard
    • Abhishek will help troubleshoot via DM
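
A minimal sketch of such a CLI using Click, assuming a saved Keras model and preprocessed spectrograms stored as .npy files; the command name, flags, and paths are hypothetical, not Kunal's actual script:

```python
from pathlib import Path

import click
import numpy as np
from tensorflow.keras.models import load_model


@click.command()
@click.option("--model-path", type=click.Path(exists=True), required=True,
              help="Path to the saved Keras model (e.g. an HDF5 file).")
@click.option("--spec-dir", type=click.Path(exists=True), required=True,
              help="Directory of preprocessed spectrograms saved as .npy files.")
@click.option("--threshold", default=0.5, show_default=True,
              help="Score at or above which a clip is labeled as a call.")
def classify(model_path, spec_dir, threshold):
    """Run the trained classifier over preprocessed clips and print scores."""
    model = load_model(model_path)
    for npy in sorted(Path(spec_dir).glob("*.npy")):
        spec = np.load(npy)[np.newaxis, ...]  # add a batch dimension
        score = float(model.predict(spec, verbose=0)[0][0])
        label = "call" if score >= threshold else "no call"
        click.echo(f"{npy.name}\t{score:.3f}\t{label}")


if __name__ == "__main__":
    classify()
```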

Diego's update

  • Documenting API (orcagsoc/tree/feature/statistics/api)
  • Added a date field in order to plot the number of sounds validated
    • Idea: track the speed of each labeler (a future feature when/if gamification is used to motivate citizen scientists?)
    • Idea: track the evolution of model performance along with the number of sounds validated (possibly on the same time-series graph?)

General discussion

  • Kunal: looking ahead, after a few rounds of active learning, could we use a much larger non-validated set of predictions to train in subsequent rounds?
  • Jesse: that is done in practice, but it is preferable to validate at least some of the predictions
  • Valentina: Ming used an algorithm to get predictions of beluga signals, then fed the training data to a deep learning model; other examples of good practice may be found in the click detection literature.
  • Valentina: plot idea for machine learning scientists: the spread or distribution of prediction probabilities or scores across samples (e.g. lots of 0s and 1s with nothing in the middle) to show over-fitting vs. confidence; see the sketch after this list
  • Ideas for other (domain expert or data owner) user options:
    • Scott: Maybe specify what portion of your data set you want to validate during each active learning iteration?
      • Jesse: Maybe good to indicate confidence for each prediction, or whether a threshold is met or not
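
A minimal sketch of Valentina's plot idea; the Beta-distributed placeholder scores are an assumption standing in for real per-sample model probabilities:

```python
import matplotlib.pyplot as plt
import numpy as np

# Placeholder scores: a Beta(0.3, 0.3) draw mimics the "lots of 0s and 1s"
# pattern; in practice these would be the model's per-sample probabilities.
scores = np.random.default_rng(0).beta(0.3, 0.3, size=1000)

plt.hist(scores, bins=50, range=(0, 1))
plt.xlabel("Predicted probability of a call")
plt.ylabel("Number of samples")
plt.title("Distribution of prediction scores")
plt.show()
```

A U-shaped histogram (mass near 0 and 1) indicates confident or over-fit predictions, while mass near 0.5 flags uncertain samples that are good candidates for validation.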

Scott's updates & questions:

  • When/if to utilize Dan's and Hannah's data this summer, and how to iteratively improve Abhishek's model and training/test data?
  • Do either of you need more feedback from me, e.g. user feature specification and priority in the Trello board?
  • How did the Pod.Cast team choose the format of the TSV files and the organization of the tarballs?
  • How different are the Pod.Cast label format and metadata from other training data sets in bioacoustics generally?
  • DFO meeting #1 synopsis
    • Oliver would like to join call on 2nd Fri in July
    • DFO wants differentiation between ecotypes (SRKWs and Bigg’s)
  • Are there any/many Bigg's signals in the OrcaCNN data set?
    • Abhishek: maybe, but if so very few
    • Jesse/Abhishek: general stats/format of OrcaCNN data/labels?
      • About 2000 KW labels (Abhishek generated samples; Dan provided small test set)
      • Humpback train/test data from Monterey Bay (GPL-like usage, so not fully open)

6/19/20 Friday GSoC call (10-11:30 Pacific)

Diego report

  • Pytest implemented; tabling click tests for later (via Praful)
  • Enabled extension on the backend
  • Tested on the Edge browser (fixed a bug); now works on Firefox & Chrome
  • Added a code snippet to handle the expertise tag
  • Deployed the backend on Heroku and the front end to GitHub Pages
  • GitHub admin needs to publish
  • Using Postman and pgAdmin

Diego goals

  • Valentina: add documentation showing the Heroku setup and how to deploy the Flask app on Linux and in Docker
  • Jesse: document the API, including endpoints
  • Charting libraries (8 default charts; will work with dummy data first)
    • Valentina: look at TensorBoard charts (are ML measures useful for expert users, i.e. scientists like Hannah, or could they be simplified for a general audience?)
    • Grids to analyze/verify confusion matrix results (e.g. true vs. false positives)
    • Valentina: plot model performance over time (choose one score to track during internal validation, e.g. for each epoch); see the sketch after this list
  • Table JavaScript testing for 2 weeks. Scott's suggestion: ask Praful for the Thursday hack group invite & timing (to jumpstart JS testing next week and/or the following week)
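
A minimal sketch of that per-epoch plot, assuming Keras training via model.fit and that validation accuracy is among the tracked metrics (the metric name is an assumption):

```python
import matplotlib.pyplot as plt


def plot_score_per_epoch(history, metric="val_accuracy"):
    """Plot one validation score per epoch from a Keras History object."""
    scores = history.history[metric]  # assumes this metric was tracked
    epochs = range(1, len(scores) + 1)
    plt.plot(epochs, scores, marker="o")
    plt.xlabel("Epoch")
    plt.ylabel(metric)
    plt.title("Model performance during internal validation")
    plt.show()


# Usage (after training): plot_score_per_epoch(history, "val_accuracy")
```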

Kunal report

  • Working on documentation
  • Trying ResNet, VGG16, and Inception models
  • Has not used WHOI data, only Pod.Cast rounds 2 & 3
  • Discussed pre-training on WHOI data vs. other labeled orca data

Scott: in anticipation of experiments with different combinations of orca training data, add to the orcadata wiki the sizes of related data sets (with links to them)?

  • OrcaCNN (Alaskan residents)
  • OrcaSPOT (NRKWs with data from Orchive)
  • OBI Lime Kiln data (SRKWs)

Kunal goals

  • Valentina: start looking into and documenting formats for importing/exporting models and performance comparisons (open-source formats? HDF5?)
  • Valentina: plot model performance over time (choose one score to track during internal validation, e.g. for each epoch)
  • Jesse: create a callback for checkpoints, but also an accuracy threshold (stop training if accuracy > 0.95); see the sketch after this list
  • Valentina: do a little more tuning, but the main reason for over-fitting is that we need more data…
  • Jesse: ~70% accuracy is a good place to start; then try to improve through the active learning process
  • Valentina: do you have more negatives that you haven’t used for training? If so, does the model suggest that some are “interesting” -- possibly ones that are near your decision boundary?
  • Kunal: all Orcasound negatives have been used in training, but Ketos background sounds (from NRKWs) might be a possibility
  • Scott: let me know if Google Cloud services would help with Colab logistics. This week Beam Reach (my social purpose corp) was granted k credits that need to be used within the next year...
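
A hedged sketch of Jesse's checkpoint/threshold suggestion in Keras, saving checkpoints in HDF5 per Valentina's format question; it assumes validation accuracy is tracked, and the file and metric names are illustrative:

```python
from tensorflow.keras.callbacks import Callback, ModelCheckpoint


class StopAtAccuracy(Callback):
    """Stop training once the monitored accuracy exceeds a threshold."""

    def __init__(self, threshold=0.95, metric="val_accuracy"):
        super().__init__()
        self.threshold = threshold
        self.metric = metric

    def on_epoch_end(self, epoch, logs=None):
        score = (logs or {}).get(self.metric)
        if score is not None and score > self.threshold:
            self.model.stop_training = True


callbacks = [
    # Save the best weights so far in HDF5 format after each epoch.
    ModelCheckpoint("model_best.h5", monitor="val_accuracy", save_best_only=True),
    StopAtAccuracy(threshold=0.95),
]
# model.fit(x_train, y_train, validation_data=..., callbacks=callbacks)
```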

6/12/20 Friday GSoC call (10-11:15 Pacific)

Mentor thoughts on process for weekly Friday meetings?

  • Jesse: report on progress and blocking issues
  • Scott: include goals for next week
  • Valentina: also schedule (code) review events

Kunal report:

Scott chat links:

How to visualize the model performance?

  • ROC curves
  • Confusion matrices
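
Both can be produced with scikit-learn; a minimal sketch with placeholder labels and scores standing in for real validation output:

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import RocCurveDisplay, confusion_matrix

# Placeholder labels/scores standing in for real validation output.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=200)
y_score = np.clip(0.3 * y_true + 0.7 * rng.random(200), 0, 1)

# ROC curve directly from the scores.
RocCurveDisplay.from_predictions(y_true, y_score)
plt.show()

# The confusion matrix requires hard labels, so threshold the scores first.
y_pred = (y_score >= 0.5).astype(int)
print(confusion_matrix(y_true, y_pred))
```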

Diego Q for Kunal: what is the difference in performance if mp3 is used instead of WAV?

Scott's thought: two experiment ideas to seek an answer --

  1. Stream both HLS and FLAC when SRKWs are next calling, and
  2. Go back to the WAV files used in training (e.g. Pod.Cast rounds), convert the WAV samples to mp3, then re-run the model (see the sketch after this list)... Ask Val for ideas, too...
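
One way experiment 2 could start, assuming pydub (which wraps ffmpeg) and illustrative directory names:

```python
from pathlib import Path

from pydub import AudioSegment  # requires ffmpeg on the system path

# Convert each Pod.Cast WAV sample to mp3 so the model can be retrained
# on lossy audio; directory names are illustrative.
wav_dir = Path("podcast_round2_wav")
mp3_dir = Path("podcast_round2_mp3")
mp3_dir.mkdir(exist_ok=True)

for wav in sorted(wav_dir.glob("*.wav")):
    clip = AudioSegment.from_wav(wav)
    clip.export(mp3_dir / (wav.stem + ".mp3"), format="mp3", bitrate="192k")
```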

Kunal goals:

  • VGG may be best, but also trying ResNet and a 2D-convolution model
  • Jesse: look at how to make the code reusable (e.g. Orca); Kunal will convert from Colab notebooks to Python scripts…
  • Valentina: Add markdown cells to document code (including organizing packages), and even images
  • Abhishek: For next notebook, add subsections in notebook and a top-level README

Diego report:

  • Added Bigg’s KWs to classification UI
  • Added option to indicate experience level of labeler
  • Table of labels, including mp3 filename, label, and user experience level

Diego goals:

  • Test the GUI with Kunal’s processed data (e.g. put it in an S3 bucket)
  • Jesse: include tests (for Flask you can use libraries to mock a POST and ensure something is returned) and embed them in continuous integration; see the sketch after this list
  • Abhishek: did you have a chance to look at the JS library?
  • Valentina: look into each cloud environment’s app service…
  • Diego: Heroku is easier (GitHub integration vs. ssh into an Ubuntu instance), but more expensive
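
A minimal sketch of Jesse's suggestion using Flask's built-in test client with pytest; the app import, /labels endpoint, and payload shape are assumptions, not the actual orcagsoc API:

```python
import pytest

# Assumes the Flask app object is importable from app.py; the endpoint
# and payload below are illustrative, not the actual orcagsoc API.
from app import app


@pytest.fixture
def client():
    app.config["TESTING"] = True
    with app.test_client() as client:
        yield client


def test_post_label_returns_ok(client):
    payload = {"filename": "clip_001.mp3", "label": "orca", "expertise": "expert"}
    response = client.post("/labels", json=payload)
    assert response.status_code == 200
    assert response.get_json() is not None
```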

6/5/20 Friday GSoC call (#1)

Diego updates:

  • UI branch with J, K, L and non-orca categories (bird, ship…)
  • SV: send goals for the “expert user (SRKW, orca)” to Diego
  • Error testing Pod.Cast
    • Goal: will compare with Valentina
    • SV: share Akash/Prakruti emails?

Kunal updates:

  • Will share notebook with the Ketos error
  • Goal: figure out why the error occurs in Ketos (share with Jesse to document for Fabio/Oliver)

Abhishek: keep documenting in the orcagsoc repo README!

Val: experimenting with edge computing (with Fabio!)

5/20/20 Weekly Wednesday meet-up

Kunal, Val, and Scott discussed Kunal's initial call-modeling efforts (training with the Pod.Cast round 3 set) and Val's latest pre-processing approaches.

Scott's list of insights from the discussion: new open-source bioacoustic labeling tools should provide guidance about the decisions made by domain experts (e.g. when validating predictions in a tool like Pod.Cast) and move towards standardization of annotation metadata:

  1. Time bounds (fixed duration or variable procedure, start-bound time vs. signal start time, how much background noise to include before/after...) and resolution
  2. Frequency bounds and resolution
  3. Whether to exclude calls that overlap with clicks, whistles, or snaps
  4. What signal-to-noise ratio is sufficient to qualify as a call vs. a faint call vs. a possible call
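
One hypothetical way those decisions could be captured as standardized annotation metadata, sketched as a Python dataclass (all field names are illustrative):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CallAnnotation:
    """One possible schema capturing the decisions listed above (illustrative)."""

    start_s: float                        # start bound in seconds (bound vs. signal onset!)
    end_s: float                          # end bound in seconds
    low_hz: Optional[float] = None        # lower frequency bound, if annotated
    high_hz: Optional[float] = None       # upper frequency bound, if annotated
    label: str = "call"                   # e.g. "call", "faint call", "possible call"
    excludes_clicks: bool = False         # whether overlapping clicks/whistles/snaps were excluded
    annotator_expertise: str = "citizen"  # e.g. "citizen" vs. "expert"
```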

Kunal ended with a good question about what to do next to improve his model performance...