Sample reports

noblem edited this page Mar 14, 2017 · 2 revisions

gdc_report

While loadfiles facilitate consumption by computers and analysis pipelines, it is hard to get a broad picture of the downloaded data just by reading or grepping through such a file. During TCGA, the Broad GDAC generated HTML Nozzle reports, which trade the rigor of loadfiles for readability. These reports count the available data files for each sample set and describe duplicates, redactions, and blacklisted samples.

Usage

gdc_report.py [-c CONFIG [CONFIG ...]] [OPTIONS] [datestamp]

options:
  datestamp             Use GDC data for a specific date. If omitted, the
                        latest available data will be used.

  -h, --help            show this help message and exit
  --verbose             set verbosity level [None]
  --version             show program's version number and exit
  -l LOG_DIR, --log-dir LOG_DIR
                        Folder to store logfiles
  -c CONFIG [CONFIG ...], --config CONFIG [CONFIG ...]
                        One or more configuration files
  -g program [program ...], --programs program [program ...]
                        Process data ONLY from these GDC programs
  -p project [project ...], --projects project [project ...]
                        Process data ONLY from these GDC projects
  --cases case_id [case_id ...]
                        Process data ONLY from these GDC cases

Once gdc_report.py finishes, the reports folder contains the Nozzle reports for all available sample sets.
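
To make the option semantics in the usage text concrete, here is a minimal argparse sketch that mirrors it. This is a hypothetical reconstruction, not the actual gdc_report.py source: the function name build_parser, the file name tcga.cfg, and the datestamp format 2017_03_14 are assumptions for illustration, and --verbose and --version are omitted.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the usage text; the real tool may differ."""
    parser = argparse.ArgumentParser(prog="gdc_report.py")
    # Optional positional: when omitted, the latest available data is used.
    parser.add_argument("datestamp", nargs="?", default=None,
                        help="Use GDC data for a specific date")
    parser.add_argument("-l", "--log-dir",
                        help="Folder to store logfiles")
    parser.add_argument("-c", "--config", nargs="+",
                        help="One or more configuration files")
    # The three filters below each accept one or more values.
    parser.add_argument("-g", "--programs", nargs="+", metavar="program",
                        help="Process data ONLY from these GDC programs")
    parser.add_argument("-p", "--projects", nargs="+", metavar="project",
                        help="Process data ONLY from these GDC projects")
    parser.add_argument("--cases", nargs="+", metavar="case_id",
                        help="Process data ONLY from these GDC cases")
    return parser


if __name__ == "__main__":
    # Example values are assumptions, not taken from the tool's documentation.
    args = build_parser().parse_args(
        ["2017_03_14", "-c", "tcga.cfg", "-p", "TCGA-ACC", "TCGA-BRCA"]
    )
    print(args.datestamp, args.config, args.projects)
```

In this sketch the datestamp is placed before the options. Every flag that takes a list (-c, -g, -p, --cases) consumes all following bare words, so a datestamp written directly after one of them would be read as another list item.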