Doreen J. Siria, Roger Sanou, Joshua Mitton, Emmanuel P. Mwanga, Abdoulaye Niang, Issiaka Sare, Paul C.D. Johnson, Geraldine Foster, Adrien M.G. Belem, Klaas Wynne, Roderick Murray-Smith, Heather M. Ferguson, Mario González-Jiménez, Simon A. Babayan, Abdoulaye Diabaté, Fredros O. Okumu, and Francesco Baldini
The malaria parasite, which is transmitted by several Anopheles mosquito species, requires more time to reach its human-transmissible stage than the average lifespan of mosquito vectors. Monitoring the species-specific age structure of mosquito populations is critical to evaluating the impact of vector control interventions on malaria risk. We developed a rapid, cost-effective surveillance method based on deep learning of mid-infrared spectra of mosquito cuticle that simultaneously identifies the species and age class of three main malaria vectors in natural populations. Using spectra from over 40,000 ecologically and genetically diverse An. gambiae, An. arabiensis, and An. coluzzii females, we developed a deep transfer learning model that learned and predicted the age of new wild populations in Tanzania and Burkina Faso with minimal sampling effort. Additionally, the model was able to detect the impact of simulated control interventions on mosquito populations, measured as a shift in their age structures. In the future, we anticipate our method can be applied to other arthropod vector-borne diseases.
- Code for converting spectra
- Code for UMAP Clustering
- Code for the convolutional neural network
- Code for power analyses
- Code and instructions for calculating age structure proportions based on gonotrophic cycles
Developed on:
- Operating systems: macOS 10-12; Windows 7; Linux Ubuntu
- Hardware: CPU Intel Core i7, 16-64 GB RAM
- Specialised hardware for deep learning: GPU - TITAN Xp 12GB
- Requires Python 3.x (we recommend Anaconda) and R version 4 or above, although older versions are likely to work too. The packages used are listed in the methods section of the accompanying paper and, where relevant, in the README files within each of the folders listed under resources
- Download this repository to your computer
- Download the data from Enlighten (link coming soon)
- Place the data in the same directory as the scripts
Scripts for processing MIRS data, training models, and reproducing the analyses are provided as Jupyter notebooks. Before running the notebooks, update the file paths that point to the source data so that they match your directory structure.
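One way to keep the data location in a single place is to define it once at the top of a notebook and check that it exists before any processing starts. The following is a minimal sketch, not code from this repository: the helper `find_data_files`, the variable `DATA_DIR`, and the glob pattern are hypothetical names chosen for illustration, and the sketch assumes the data were placed in the same directory as the scripts.

```python
from pathlib import Path


def find_data_files(data_dir, pattern="*"):
    """Return the files in data_dir that match pattern, sorted by path.

    Hypothetical helper: raises FileNotFoundError early if the folder
    does not exist, instead of failing later inside the analysis.
    """
    data_dir = Path(data_dir).expanduser().resolve()
    if not data_dir.is_dir():
        raise FileNotFoundError(f"Data directory not found: {data_dir}")
    return sorted(p for p in data_dir.glob(pattern) if p.is_file())


# Assumption: the downloaded data sit next to the scripts, so the current
# working directory is used. Change this one line to match your own layout.
DATA_DIR = Path.cwd()

files = find_data_files(DATA_DIR)
print(f"{len(files)} files found in {DATA_DIR}")
```

Pointing every path in a notebook at `DATA_DIR` means only that one assignment needs editing when the data are moved.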
This work is licensed under a Creative Commons Attribution 4.0 International License.