
GSoC 2023 Projects

Xing Wang edited this page Jan 25, 2023 · 25 revisions

Getting started with AiiDA

AiiDA is a Python framework for managing computational science workflows, with roots in computational materials science. It helps researchers manage large numbers of simulations (10k, 100k, 1M, ...) and complex workflows involving multiple executables. At the same time, it records the provenance of the entire simulation pipeline with the aim of making it fully reproducible.

AiiDA is used in research projects at universities, research institutes and companies (see SciPy 2020 talk, publications, and testimonials).

To be considered as a GSoC student, you are asked to make a small pull request to aiida-core, for example a simple bug fix or a documentation improvement. See e.g. GitHub issues by label.

Why work on AiiDA?

  • Help accelerate the transition to open (computational) science
  • Help fix the reproducibility crisis. Computational science is a good place to start.
  • Work with a team of computational scientists (mostly physics backgrounds) who are passionate about both science and coding.
    We have an active Slack workspace & biweekly developer meetings.

A background in materials science is not needed, but a basic interest in materials science topics will make things easier for you.

Project 1 - Easy Computer/Code setup with AiiDA code registry

Level: intermediate

Setting up computers and codes is tedious, both for beginners and for experienced AiiDA users. The interactive mode prompts for every option in turn, which lets the user set each one carefully, but it also forces the user through options that are not needed and is time-consuming when a new setup shares most of its options with an existing computer or code. AiiDA also provides a non-interactive mode that sets up the computer or code from a YAML config file, which lowers the burden the next time a similar setup is needed. However, the YAML file does not make clear which options are mandatory, and the default values that apply to omitted options cannot be found without checking the command's help message or even the source code. In addition, computer setup is a two-stage process. The user first runs verdi computer setup to define the attributes that are common to the computer and stored in the database, and then runs verdi computer configure to set the information that is specific to the user or that needs to be modified after the node has been stored in the database.

The computer or code can be set up from a YAML file, and we provide the aiida-code-registry repository to store and share YAML files for public computers and codes. Note that the interactive setup command also accepts the URL of a remote YAML file, so the files can be used without downloading or cloning the aiida-code-registry repository.
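As a rough illustration of the template idea, the sketch below renders a computer config from a template with defaults, so that only the mandatory values have to be supplied. It uses Python's standard-library string.Template as a stand-in for Jinja to stay dependency-free. The template keys, the default values, and the helper name render_computer_config are assumptions made for this example and should be checked against the verdi computer setup help output.

```python
from string import Template

# Hypothetical template: the keys resemble common computer setup options,
# but they are assumptions for illustration, not a verified schema.
COMPUTER_TEMPLATE = Template(
    "label: $label\n"
    "hostname: $hostname\n"
    "transport: $transport\n"
    "scheduler: $scheduler\n"
    "mpiprocs_per_machine: $mpiprocs_per_machine\n"
)

# Assumed defaults, so that shared options need not be repeated per setup.
DEFAULTS = {
    "transport": "core.ssh",
    "scheduler": "core.slurm",
    "mpiprocs_per_machine": 1,
}


def render_computer_config(**user_values):
    """Merge user values over the defaults and fill in the template.

    Template.substitute raises KeyError when a placeholder has no value,
    which makes the mandatory options (label, hostname) explicit.
    """
    values = {**DEFAULTS, **user_values}
    return COMPUTER_TEMPLATE.substitute(values)


config_text = render_computer_config(
    label="my-cluster", hostname="cluster.example.org"
)
print(config_text)
```

With this approach a missing mandatory value fails early with a clear error, instead of surfacing later during the setup command.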

Expected outcomes

  • Set up the computer/code from the template config file.
  • A new aiida-code-registry repository and corresponding registry page for users to upload/change and fetch the computer/code config files.

Skills

We expect you to be familiar with Python programming and to have experience with the Jinja templating engine. Experience with web development (REST API, HTML, etc.) is a plus.

Project 2 - Training a model to generate QueryBuilder queries from natural language

Level: advanced

Expected outcomes

At the end of the project, we expect to have a lightweight tool that runs locally and generates a QueryBuilder query from a sentence entered by the user. It should be a Python tool that can be installed as an optional extra, integrates with AiiDA, and can be called through a verdi command.

Skills

We expect you to be familiar with object-oriented programming in Python. You need to have experience in natural language processing and know how to train a model from scratch.

Project 3 - Ranking system for AiiDA plugin registry

Level: intermediate

Expected outcomes

We expect a new AiiDA plugin registry in which plugin information is not pulled from each plugin by the registry but pushed by each plugin through its own GitHub Actions workflow. Plugins are listed not in alphabetical order but by usefulness, measured for example by the number of GitHub stars or by the number of other tools that use the plugin.
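One possible way to order plugins by usefulness is sketched below. The PluginInfo fields, the weight given to dependents, and the plugin names are all hypothetical and chosen only for illustration. The actual metric is left open by the project description.

```python
from dataclasses import dataclass


@dataclass
class PluginInfo:
    """Hypothetical record that a plugin could push to the registry."""
    name: str
    stars: int        # number of GitHub stars
    dependents: int   # number of other tools that use the plugin


def usefulness(plugin, dependents_weight=5.0):
    """Combine both signals into one score (the weight is an assumption)."""
    return plugin.stars + dependents_weight * plugin.dependents


def rank_plugins(plugins):
    """Sort by descending score, falling back to the name for ties."""
    return sorted(plugins, key=lambda p: (-usefulness(p), p.name))


# Made-up example entries, not real registry data.
plugins = [
    PluginInfo("plugin-a", stars=10, dependents=0),
    PluginInfo("plugin-b", stars=3, dependents=4),
    PluginInfo("plugin-c", stars=10, dependents=0),
]
ranked = [p.name for p in rank_plugins(plugins)]
print(ranked)
```

Using the name as a tie-breaker keeps the order stable between page builds when two plugins have the same score.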

Skills

We expect you to be familiar with programming in Python and to know how GitHub Actions work. Some familiarity with web development, such as HTML, will be helpful.

Project 4 - Explore the AiiDA node graph in the browser

Level: intermediate

Expected outcomes

AiiDA automatically stores entities in its database and links them, forming a directed graph. This directed graph automatically tracks the provenance of all data produced by calculations or returned by workflows. This project plans to provide a more intuitive, interactive tool for browsing AiiDA graphs in the browser.
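To make the idea of provenance as a directed graph concrete, the sketch below finds everything a given node depends on by walking the edges backwards. It uses plain Python data structures rather than the AiiDA API, and the node names and edge list are hypothetical.

```python
from collections import deque

# Hypothetical provenance graph: each edge points from an input
# to the entity that was produced from it.
edges = [
    ("structure", "calculation"),
    ("parameters", "calculation"),
    ("calculation", "energy"),
    ("energy", "post_processing"),
    ("post_processing", "report"),
]


def build_reverse_adjacency(edges):
    """Map every node to the list of nodes with an edge pointing at it."""
    incoming = {}
    for source, target in edges:
        incoming.setdefault(target, []).append(source)
    return incoming


def ancestors(node, edges):
    """Return all nodes that `node` depends on, via breadth-first search."""
    incoming = build_reverse_adjacency(edges)
    seen = set()
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for parent in incoming.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


provenance = ancestors("energy", edges)
print(sorted(provenance))
```

A browser-based explorer would run the same kind of traversal on demand, for example to expand the inputs of a node the user clicks on.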

Skills

Python, REST API, HTML, Javascript, React or Vue.

Mentorship

The mentors for GSoC 2023 are

Please use the GSoC 2023 discussion thread to say hi and ask any questions you may have.