Sam Tomioka edited this page Feb 1, 2019 · 8 revisions

Welcome to the sdtm_mapper wiki!

Project Considerations

  • Open Source
  • Remote Repository: See Git Intro.
    • GitHub
    • CodeCommit
  • Repository: move to PhUSE repo.
  • Data Storage: TBD – depends on company’s policy. Maybe use S3?
    • GitHub - storage limitations
    • S3 - free for up to 12 months
    • Google Drive
    • Box
    • ???
  • Programming Language and Versions:
    • R
    • Python 3
    • SAS
  • Programming Environment:
    • Local
    • GCP
    • Azure ML SDK (Python only? Availability of ML frameworks?)
    • AWS SageMaker (TensorFlow, MXNet, PyTorch, Gluon, Keras, etc. available; R can be added; scikit-learn recently added)
    • IBM Watson ML (Just GUI???)
  • Final outputs:
    • Specs
    • SAS code
    • Datasets
    • ???
  • Final product:
    • GUI app
    • R library
    • Python package
    • command line
    • production ready model
    • whitepaper - discuss the methodology for data pre-processing and algorithms, and the model performance
  • Name of the project:
  • Present the project at PhUSE 2019?

Project Details

  • Gather Training Data/Test Data
  • Conventions for class variable
  • Model training, validation, and testing
  • Develop data product

Potential Activities

Ideas
