rishidev edited this page Oct 22, 2018 · 14 revisions

Welcome to the homepage for the Large Scale Genomics Work Stream, part of the Global Alliance for Genomics and Health. Led by Oliver Hofmann and Thomas Keane, this Work Stream creates standardized methods for accessing large-scale genomic data (reads, variants, and expression data) through file-based, API-based, cloud-based, and distributed access.

To understand the role of Work Streams in GA4GH, please visit https://www.ga4gh.org/howwework.

This Work Stream meets quarterly at a high level, focusing mainly on reporting sub-group developments to Driver Projects. The GA4GH strategic roadmap details the standards this Work Stream plans to develop. Minutes from the meetings are available here.

Task Teams

Most of the work of the Large Scale Genomics Work Stream is done in sub-groups, which usually meet every four weeks. All meetings are minuted, and links to the minutes are available for all to view.

File Formats

This team develops and maintains standard file formats for the following:

  • sequencing reads (SAM/BAM/CRAM)
  • variants (VCF/BCF)

GitHub home - Meeting Minutes

A separate team looks at encrypted versions of these formats. Meeting Minutes

htsget

A standardised, non-file-based API for securely streaming the file formats listed above.

GitHub home - Meeting Minutes

RNASeq

Developing scalable ways of storing and transmitting expression information derived from RNASeq data.

GitHub home - Meeting Minutes

refget API

A framework to retrieve ‘reference sequences’ by a unique checksum, so that users can fetch the same sequence without ambiguity from different databases and servers.

[GitHub home](https://github.com/samtools/hts-specs) - Meeting Minutes
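
To illustrate the checksum idea behind refget, here is a minimal Python sketch. It assumes the identifier is the MD5 digest of the sequence after removing whitespace and converting to uppercase, and that a server exposes sequences under a `/sequence/{checksum}` path; both details should be checked against the refget specification. The function names and the base URL in the usage note are hypothetical and used only for illustration.

```python
import hashlib


def md5_checksum(sequence: str) -> str:
    """Return the hex MD5 digest of a sequence.

    Assumption: the sequence is normalised by stripping all whitespace
    (including line breaks) and uppercasing before it is hashed.
    """
    normalized = "".join(sequence.split()).upper()
    return hashlib.md5(normalized.encode("ascii")).hexdigest()


def sequence_url(base_url: str, checksum: str) -> str:
    """Build a retrieval URL for a checksum.

    Assumption: the server exposes sequences at <base_url>/sequence/<checksum>.
    """
    return f"{base_url.rstrip('/')}/sequence/{checksum}"
```

Because the identifier is derived from the sequence content alone, `md5_checksum("acgt")` and `md5_checksum("AC\nGT")` give the same value, so the same lookup key works on any server that holds that sequence. A call such as `sequence_url("https://refget.example.org", md5_checksum("ACGT"))` (a placeholder host) would then produce the address to request.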
