Data Science

Table of Contents

Doc

Open data

Data storage

New tech

  • IPFS - The Distributed Web

Markup Language

Languages

Workflow/Pipelines tools

DSL

Language-dependent

  • toil - A scalable, efficient, cross-platform, and easy-to-use workflow engine in pure Python.
  • Ruffus - A computation pipeline library for Python. It is open source, powerful, user-friendly, and widely used in science and bioinformatics.

Dataset

  • awesome-public-datasets - An awesome list of (large-scale) public datasets on the Internet. (Ongoing collection)

Tools

Data structure

Algorithm

Statistics

p-value

Big Data and Cloud

Books

Course

Misc