This project will facilitate the development of strategic partnerships and resources around skills and capacity building in data science in biomedical research.
This project offers an introduction to data science and AI for senior researchers, group leaders, late-stage PhD students and postdocs, and mid- to late-career biomedical scientists. Materials developed through this project will enable a foundational understanding of AI and data science in the context of biosciences. Furthermore, researchers will receive training in managing, supervising and facilitating open and reproducible research for the wider biology community. Funded by the AI for Science and Government Research programme, this project ran from October 2021 to March 2022.
This project is a follow-up to The Crick-Turing Biomedical Data Science Awards (BDSAs) (Phase 1 project period: 01/10/2019 – 28/02/2021), carried out under the Turing and Crick partnership.
- Introduction to Data Science and AI for senior researchers: https://carpentries-incubator.github.io/data-science-ai-senior-researchers/
- Managing Open and Reproducible Computational Projects: https://carpentries-incubator.github.io/managing-computational-projects/
Researchers from outside this project were invited to review and enhance these materials by integrating real-world examples from their work. Additionally, professional illustrators (Scriberia) worked with researchers in this project to develop illustrations to accompany the written content.
All materials, including the illustrations (see illustrations-from-review-sprint), are shared under the CC-BY 4.0 License for reuse, remixing, sharing and distribution with appropriate citation.
Proposal Lead
- Dr. Malvika Sharan, Senior Researcher - Tools, Practices and Systems
Development Team
- Dr. Lydia France, Research Data Scientist
- Dr. Malvika Sharan, Senior Researcher - Tools, Practices and Systems
- Dr. Federico Nanni, Senior Research Data Scientist
Reviewers and Editors
- Dr. Julien Colomb, Humboldt-Universität zu Berlin
- Dr. Jo Havemann, Access 2 Perspectives
Contributors from the Turing Research Programmes
- Dr. Alisha Davies, AI for Science and Government Health Theme Lead
- Dr. Kirstie Whitaker, Programme Director - Tools, Practices and Systems
- Prof. Ben MacArthur, Director of AI for Science and Government, Deputy Programme Director for Health and Medical Sciences
Contributors from The Francis Crick Institute
- Prof. James Briscoe, Senior Group Leader - Assistant Research Director
- Rebecca Wilson, Head of Strategic Partnerships
- James Fleming, Chief Information Officer
Thanks to these researchers for sharing feedback on earlier drafts of our training materials and examples to include in them!
- Victor Tybulewicz: Group Leader, The Francis Crick Institute, Lab page
- Radoslav Enchev: Group Leader, The Francis Crick Institute, Lab page
- Francesca Ciccarelli: Group Leader, The Francis Crick Institute, Lab page
- Florencia Iacaruso: Group Leader, The Francis Crick Institute, Lab page
- Evangeline Corcoran: Postdoctoral Researcher, The Alan Turing Institute, Personal page
- Jim Maas: Postdoctoral Scientist - Computational Biologist, John Innes Centre, Personal page
More members from both the Turing and Crick represent this partnership, contribute to project meetings and help coordinate this project.
Please see the project proposal for details.
Please create an issue to share references or ideas related to the development of this project.
Please see the Project Charter for details.
- Draft a proposal collaboratively to define the scope of this project
- Set up the repository to develop this project openly
- Define the scope and stakeholders for the project (help develop a project charter)
- Identify potential contributors to this project at the Turing
- Identify potential contributors to this project at the Crick
- Define the common vision, mission and target audience
- Host meetings with all stakeholders to discuss
  - The initial plans, project charter and goals
  - Agree on the best way to collaborate and communicate
  - Monthly updates and feedback on the development (align expectations)
- Identify potential contributors from the wider research community
  - Other institutes, projects and people with a vested interest
- Define curriculum by selecting topics for content development (build concept map)
- Select open source references for reuse (see issues for reference materials)
- Design training curriculum (concept map, data, reusable materials) using Carpentries Development Handbook
- Set up the Carpentries template for material development (see community lessons)
- Define episodes (modules) and adapt training materials for biological datasets <-- REG member
- Select biological datasets - potentially provided by the Crick through 1:1 interviews
- Seek feedback from all stakeholders and invite contributions <-- Review and illustration sprint
- Release the draft and invite the community to test the materials
- Deliver a pilot training
Training materials for two masterclasses will be developed and shared from this project.
- Introduction to Data Science and AI for senior researchers: This masterclass will touch on concepts related to algorithm selection, statistical approaches and the potential added value of machine learning and deep learning.
- Managing and supervising computational projects: This masterclass will provide an understanding of open source tools, version control, literate programming, Markdown, GitHub, metadata and other collaborative approaches.
By inviting feedback from mid- to late-career researchers at the Turing, the Crick and the wider research community, these masterclasses will build a shared understanding of good-practice principles for integrating reproducible computational approaches from data science into biological research.
The training materials will be developed openly from the start under The Carpentries Incubator GitHub organisation.
The two masterclasses are developed in separate GitHub repositories:
- Masterclass 1: Introduction to Data Science and AI for senior researchers: https://github.com/carpentries-incubator/data-science-ai-senior-researchers
- Masterclass 2: Managing Open and Reproducible Computational Projects: https://github.com/carpentries-incubator/managing-computational-projects
Although developed under the subtitles Masterclass 1 and Masterclass 2, both sets of materials will be standalone and modular so that they can be used independently of each other.
Please create an issue to add any milestones or goals that are currently missing from the roadmap, or to suggest new features.
This project is maintained by Malvika Sharan. For any organisation-related queries or concerns, you can directly reach out to her by emailing msharan@turing.ac.uk.
This work is licensed under the MIT license (code) and Creative Commons Attribution 4.0 International license (for documentation). You are free to share and adapt the material for any purpose, even commercially, as long as you provide attribution (give appropriate credit, provide a link to the license, and indicate if changes were made) in any reasonable manner, but not in any way that suggests the licensor endorses you or your use, and with no additional restrictions.