feat(openchallenges): pull data from OC Data google sheet #2959

Merged (15 commits) on Jan 15, 2025

@vpchung vpchung commented Jan 9, 2025

Jira ticket: https://sagebionetworks.jira.com/browse/CHALLENGE-584

Changelog

  • copy useful functions from update_db_csv.py
  • update lambda_handler to pull data from the OC Data Google Sheet
  • general updates to the project, e.g. project.json and Dockerfile
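For reviewers who want a mental model of the lambda_handler change, here is a minimal sketch of a handler that returns the same response shape as the Docker preview in this description. It is not the code in this PR: `fetch_rows`, `_fetch_rows_stub`, and the 500 error branch are assumptions made so the sketch runs offline without Google credentials.

```python
import json


def _fetch_rows_stub():
    """Hypothetical stand-in for the Google Sheet read (not the PR's code)."""
    return [{"id": 1, "slug": "synapse"}, {"id": 3, "slug": "cami"}]


def lambda_handler(event, context, fetch_rows=_fetch_rows_stub):
    """Pull rows, then return a response with a JSON-encoded body.

    `fetch_rows` is an assumed injection point, not part of the real handler.
    """
    try:
        rows = fetch_rows()
        print(f"Fetched {len(rows)} rows")
        status_code = 200
        message = "Data successfully pulled from OC Data google sheet."
    except Exception as err:  # assumed error handling; the real code may differ
        status_code = 500
        message = f"Something went wrong: {err}"
    return {"statusCode": status_code, "body": json.dumps({"message": message})}
```

Calling `lambda_handler({}, None)` exercises the success path with the stub data; passing a `fetch_rows` that raises exercises the assumed error path.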

Preview (testing locally)

Docker

$ nx run openchallenges-data-lambda:invoke --event events/event.json
... [truncated]

{"statusCode": 200, "body": "{\"message\": \"Data successfully pulled from OC Data google sheet.\"}"}
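Note that `body` in the response above is itself a JSON-encoded string, so it has to be decoded a second time to reach the message. A small sketch of that, where `parse_invoke_output` is a hypothetical helper name and not something in this project:

```python
import json


def parse_invoke_output(raw):
    """Return (status_code, message) from a response whose body is a JSON string."""
    response = json.loads(raw)            # outer envelope
    body = json.loads(response["body"])   # body is a JSON string, decode again
    return response["statusCode"], body["message"]


# Response copied from the invoke output in this description.
raw = r'{"statusCode": 200, "body": "{\"message\": \"Data successfully pulled from OC Data google sheet.\"}"}'
status, message = parse_invoke_output(raw)
print(status, message)
```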

Results in container logs

$ docker logs -f openchallenges-data-lambda
14 Jan 2025 22:12:14,553 [INFO] (rapid) exec '/var/runtime/bootstrap' (cwd=/var/task, handler=)
START RequestId: 0ce1bf16-5bfb-49db-9e54-eefc796c8969 Version: $LATEST
14 Jan 2025 22:12:22,710 [INFO] (rapid) INIT START(type: on-demand, phase: init)
14 Jan 2025 22:12:22,710 [INFO] (rapid) The extension's directory "/opt/extensions" does not exist, assuming no extensions to be loaded.
14 Jan 2025 22:12:22,710 [INFO] (rapid) Starting runtime without AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN , Expected?: false
14 Jan 2025 22:12:23,681 [INFO] (rapid) INIT RTDONE(status: success)
14 Jan 2025 22:12:23,681 [INFO] (rapid) INIT REPORT(durationMs: 971.299000)
14 Jan 2025 22:12:23,682 [INFO] (rapid) INVOKE START(requestId: c177ee09-5bdf-4f8d-ac98-8abbd1c2522a)
14 Jan 2025 22:12:23,685 [INFO] (rapid) INVOKE RTDONE(status: success, produced bytes: 0, duration: 3.190000ms)
END RequestId: c177ee09-5bdf-4f8d-ac98-8abbd1c2522a
REPORT RequestId: c177ee09-5bdf-4f8d-ac98-8abbd1c2522a  Init Duration: 0.04 ms  Duration: 974.85 ms     Billed Duration: 975 ms Memory Size: 3008 MB    Max Memory Used: 3008 MB
START RequestId: 2c84c9cc-df1f-4b5a-be5e-4be9c94bf168 Version: $LATEST
14 Jan 2025 22:12:35,682 [INFO] (rapid) INVOKE START(requestId: 46c6f7e0-6d5c-4ddf-ab27-1a1b5961f33c)
  id             slug  ...           created_at           updated_at
1  1          synapse  ...  2023-08-09 23:01:32  2023-11-21 01:07:14
3  3             cami  ...  2023-08-09 23:01:32  2023-10-19 21:50:25
5  5  grand-challenge  ...  2023-08-09 23:01:32  2023-10-19 21:50:26
6  6    precision-fda  ...  2023-08-09 23:01:32  2023-11-02 18:45:58
8  8           kaggle  ...  2023-08-09 23:01:32  2023-10-19 21:50:28

[5 rows x 7 columns]
... [truncated]

Python

Project setup:

After updating .env with real credentials (shared over LastPass)...

$ cd apps/openchallenges/data-lambda/
$ sh install.sh
$ export $(grep -v '^#' .env | xargs)
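The `export` one-liner above drops comment lines from .env and exports the remaining KEY=VALUE pairs; values containing spaces may not survive the shell's word splitting. If that becomes a problem, a Python equivalent could look like the sketch below. `load_env_file` is a hypothetical helper, not part of this PR, and it assumes one plain KEY=VALUE pair per line.

```python
import os


def load_env_file(path=".env"):
    """Load KEY=VALUE pairs into os.environ, skipping blanks and # comments."""
    loaded = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Strip optional surrounding quotes; inner spaces are preserved.
            loaded[key.strip()] = value.strip().strip("'\"")
    os.environ.update(loaded)
    return loaded
```

Calling `load_env_file()` at the top of a local test script would stand in for the `export` step; the Docker path is unaffected.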

Run script with poetry:

$ poetry run python openchallenges_data_lambda/app.py 
  id             slug             name                avatar_url                           website_url           created_at           updated_at
1  1          synapse          Synapse          logo/synapse.png                  https://synapse.org/  2023-08-09 23:01:32  2023-11-21 01:07:14
3  3             cami             CAMI             logo/cami.png      https://data.cami-challenge.org/  2023-08-09 23:01:32  2023-10-19 21:50:25
5  5  grand-challenge  Grand Challenge  logo/grand-challenge.png          https://grand-challenge.org/  2023-08-09 23:01:32  2023-10-19 21:50:26
6  6    precision-fda     precisionFDA     logo/precisionfda.png  https://precision.fda.gov/challenges  2023-08-09 23:01:32  2023-11-02 18:45:58
8  8           kaggle           Kaggle           logo/kaggle.png               https://www.kaggle.com/  2023-08-09 23:01:32  2023-10-19 21:50:28
   id  challenge_id  organization_id              role
0   1             1               75           sponsor
1   2             2               28  data_contributor
2   3             2               45  data_contributor
3   4             2              151  data_contributor
4   5             2               52           sponsor
   id  challenge_id   category
0   1            55  benchmark
1   2           169  benchmark
2   3           155  benchmark
3   4           278  hackathon
4   5           264  hackathon
   id                                               name  login      avatar_url                                       website_url                                        description  challenge_count           created_at           updated_at acronym
0   1  Dialogue on Reverse Engineering Assessment and...  dream  logo/dream.png                       https://dreamchallenges.org  DREAM Challenges use crowd-sourcing to solve c...               74  2023-08-04 07:33:09  2024-06-10 16:09:41   DREAM
2   3  Critical Assessment of protein Function Annota...   cafa   logo/cafa.png       https://www.biofunctionprediction.org/cafa/  The Critical Assessment of protein Function An...                2  2023-06-23 00:00:00  2023-10-20 18:39:12    CAFA
3   4       Critical Assessment of Genome Interpretation   cagi   logo/cagi.png  https://genomeinterpretation.org/challenges.html  The Critical Assessment of Genome Interpretati...               26  2023-06-23 00:00:00  2023-11-18 03:49:35    CAGI
6   7   Critical Assessment of Metagenome Interpretation   cami   logo/cami.png                  https://data.cami-challenge.org/  CAMI, the initiative for the “Critical Assessm...                2  2023-06-23 00:00:00  2023-07-26 20:13:21    CAMI
8   9  Critical Assessment of protein Structure Predi...   casp   logo/casp.png                     https://predictioncenter.org/  Our goal is to help advance the methods of ide...                3  2023-06-23 00:00:00  2023-10-20 18:39:35    CASP
   id  challenge_id  edam_id
0   1             1       18
1   2             2     1056
2   3             3     1481
3   4             4     1056
4   5             4     1057
   id                                          slug                                          name                                           headline                                        description  ...  start_date    end_date operation_id           created_at           updated_at
0   1      network-topology-and-parameter-inference      Network Topology and Parameter Inference  Optimize methods to estimate biology model par...  Participants are asked to develop and/or apply...  ...  2012-06-01  2012-10-01           \N  2023-11-15 22:40:15  2024-03-04 18:31:19
1   2                       breast-cancer-prognosis                       Breast Cancer Prognosis  Predict breast cancer survival from clinical a...  The goal of the breast cancer prognosis Challe...  ...  2012-07-12  2012-10-15           \N  2023-11-14 20:36:32  2024-02-19 18:17:47
2   3          phil-bowen-als-prediction-prize4life          Phil Bowen ALS Prediction Prize4Life  Seeking treatment to halt ALS's fatal loss of ...  Amyotrophic Lateral Sclerosis (ALS), or Lou Ge...  ...  2012-06-01  2012-10-01           \N  2023-11-01 22:09:02  2024-05-21 20:47:21
3   4  drug-sensitivity-and-drug-synergy-prediction  Drug Sensitivity and Drug Synergy Prediction    Predicting drug sensitivity in human cell lines  Development of new cancer therapeutics current...  ...  2012-06-01  2012-10-01         2813  2023-11-01 22:08:36  2024-03-04 18:31:14
4   5                niehs-ncats-unc-toxicogenetics                NIEHS-NCATS-UNC Toxicogenetics  Predicting cytotoxicity from genomic and chemi...  This challenge is designed to build predictive...  ...  2013-06-10  2013-09-15           \N  2023-11-01 22:08:45  2023-11-01 22:06:01

[5 rows x 15 columns]
            incentives  challenge_id           created_at
1          publication             1  2023-11-15 22:40:15
2          publication             2  2023-11-14 20:36:32
3  speaking_engagement             2  2023-11-14 20:36:32
4             monetary             3  2023-11-01 22:09:02
5          publication             5  2023-11-01 22:08:45
  submission_types  challenge_id           created_at
1  prediction_file             1  2023-11-15 22:40:15
2  prediction_file             2  2023-11-14 20:36:32
3  prediction_file             3  2023-11-01 22:09:02
4  prediction_file             4  2023-11-01 22:08:36
5  prediction_file             5  2023-11-01 22:08:45

This reverts commit a5ddae3.
@vpchung vpchung marked this pull request as ready for review January 15, 2025 23:03
@tschaffter tschaffter left a comment

LGTM!

@tschaffter tschaffter merged commit c02e5df into Sage-Bionetworks:main Jan 15, 2025
16 checks passed
@vpchung vpchung deleted the CHALLENGE-384 branch January 15, 2025 23:49