ISA-Tab annotation for data from SARS-CoV-2 infected host cell proteomics reveal potential therapy targets

ISA-tools/PXD017710

This is part of an effort to (re-)annotate

https://dx.doi.org/10.21203/rs.3.rs-17218/v1

Proteomics data

Available from PRIDE at https://www.ebi.ac.uk/pride/archive/projects/PXD017710 and MassIVE/CCMS Maestro+MSstats reanalysis of MSV000085096 / PXD017710

ISA-Tab

The formatting and reannotation are based on information extracted from:

  • the original publication
  • the supplementary tables available from the publisher's site
  • the 'filtered-results.csv' helper file as supplied to @sneumann during the HUPO-PSI

Viewing the ISA-tab formatted and reannotated PXD017710 with ISATab-Viewer

To view the ISA-Tab formatted and reannotated PXD017710 locally, run the following from the repository root:

python -m http.server 8000

Then point your browser to http://0.0.0.0:8000/isaviewer-demo.html
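The same local preview can be scripted. The following is a minimal sketch, not part of the repository: the helper names `viewer_url` and `serve` are hypothetical, and `localhost` is used in the URL on the assumption that some browsers refuse to open `0.0.0.0` directly.

```python
import functools
import http.server
import socketserver
import webbrowser


def viewer_url(host="localhost", port=8000, page="isaviewer-demo.html"):
    """Build the address of the ISATab-Viewer demo page."""
    return f"http://{host}:{port}/{page}"


def serve(directory=".", port=8000):
    """Serve `directory` over HTTP and open the viewer page in a browser.

    Blocks until interrupted (Ctrl+C), like `python -m http.server`.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    with socketserver.TCPServer(("", port), handler) as httpd:
        # The socket is already listening here, so the browser request
        # is simply queued until serve_forever() starts handling it.
        webbrowser.open(viewer_url(port=port))
        httpd.serve_forever()
```

Calling `serve()` from the repository root should be equivalent to the `python -m http.server 8000` command above, followed by opening the page by hand.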

Tasks performed:

  • initial structure of the study design in ISA format:

  • linkage of Proteome and Translatome data (supplementary material) to ISA assay tables (via Derived Data File)

  • processing the Proteome and Translatome data (supplementary material) with the Python pandas library to generate the following CSV files:

    • proteome_intensities_long_table_ggplot2.txt
    • proteome_diffanal_ratio_pvalue_long_table_ggplot2.txt
    • translatome_intensities_long_table_ggplot2.txt
    • translatome_diffanal_ratio_pvalue_long_table_ggplot2

    The files are long tables, each corresponding to a melt of the Excel file originally generated by the users, and can be loaded directly into the R ggplot2 library for graphical representation. The statistically relevant elements have been annotated with the STATO ontology, and the tables comply with the Frictionless.io Data Package specification. The Jupyter notebook for the transformation is available.
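To illustrate what "melt" means here, the sketch below reshapes a small wide table (one row per protein, one column per sample) into a long table (one row per protein and sample). It is not the notebook's code: the notebook uses pandas, where the equivalent call would be `pandas.melt(df, id_vars=..., var_name=..., value_name=...)`, whereas this version uses only the standard library so that it runs without dependencies. The column names and values are hypothetical.

```python
import csv
import io

# Hypothetical wide table: one row per protein, one column per sample.
WIDE_TSV = (
    "protein_id\tcontrol_2h_rep1\tvirus_2h_rep1\n"
    "PROT_A\t10.5\t12.1\n"
    "PROT_B\t8.3\t7.9\n"
)


def melt(rows, id_var, var_name="sample", value_name="intensity"):
    """Turn wide rows into long rows: one output row per (id, column) pair."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_var:
                continue  # the identifier is repeated on every long row
            long_rows.append(
                {id_var: row[id_var], var_name: column, value_name: float(value)}
            )
    return long_rows


wide_rows = list(csv.DictReader(io.StringIO(WIDE_TSV), delimiter="\t"))
long_rows = melt(wide_rows, id_var="protein_id")
for long_row in long_rows:
    print(long_row)
```

A long table of this shape maps directly onto ggplot2 aesthetics, since each variable (protein, sample, value) sits in its own column.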

  • conversion of raw data to mzML format

install docker:

		>brew update
		>brew install docker

sign in to docker

		>docker start
		>docker login

pull docker container for ProteoWizard:

>docker pull chambm/pwiz-skyline-i-agree-to-the-vendor-licenses

⚠️ Be sure to sign up and log in to https://hub.docker.com/ in order to be able to reach https://hub.docker.com/r/chambm/pwiz-skyline-i-agree-to-the-vendor-licenses

run the pwiz tool from the container over the raw data:

 docker run -it --rm -e WINEDEBUG=-all -v /Users/philippe/Downloads/PXD017710/raw/:/data chambm/pwiz-skyline-i-agree-to-the-vendor-licenses wine msconvert /data/*.raw --mzML
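
The conversion command above hard-codes one user's download directory. As a hedged sketch, the same invocation can be assembled in Python with the raw-data directory as a parameter. The function names `msconvert_command` and `convert` are hypothetical. The `-it` flags are dropped because no terminal is attached when Docker is launched from a script, and the sketch assumes that msconvert expands the `*.raw` file mask itself, since no shell is involved.

```python
import shlex
import subprocess
from pathlib import Path

IMAGE = "chambm/pwiz-skyline-i-agree-to-the-vendor-licenses"


def msconvert_command(raw_dir):
    """Build the docker argument list that converts all .raw files to mzML."""
    raw_dir = Path(raw_dir).expanduser().resolve()
    return [
        "docker", "run", "--rm",
        "-e", "WINEDEBUG=-all",       # silence Wine debug output
        "-v", f"{raw_dir}:/data",     # mount the raw files at /data
        IMAGE,
        "wine", "msconvert", "/data/*.raw", "--mzML",
    ]


def convert(raw_dir, dry_run=True):
    """Print the command (dry run) or execute it with Docker."""
    cmd = msconvert_command(raw_dir)
    if dry_run:
        print(shlex.join(cmd))
    else:
        subprocess.run(cmd, check=True)  # raises if docker exits non-zero
    return cmd
```

Calling `convert("~/Downloads/PXD017710/raw")` prints the command for inspection; passing `dry_run=False` would run it, provided Docker is installed and the image has been pulled.
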
  • ontology markup for:
    • declaration of independent variables as ISA Study Factors: {biological agent, dose, timepoint, replicate} -> OBI
    • Taxonomic information (host cells and virus) -> NCBITaxonomy
    • Cell line: CaCo-2 cells -> Cell Line Ontology
    • Disease: Colon Cancer -> Human Phenotype Ontology
    • MS specific aspect (TMT reagent, instrument ... ) -> PSI-MS
    • Statistical Tests -> STATO

unresolved curatorial issues:

  1. ambiguities related to Tandem Mass Tag labeling protocol

  2. SARS-CoV-2 isolate: no clear NCBI Taxonomy anchoring and unclear origin -> the markup is made to the parent class (as of 06.04.2020)

validation with ISA API:

The default ISA configuration from https://isa-tools.org/format/configurations/index.html was used for validation.

Code snippet showing how to invoke the Python ISA validator from the isatools API:

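import os

from isatools import isatab

# Open the investigation file and hand the file object to the validator;
# the context manager closes the file once validation is done.
with open(os.path.join('PXD017710', 'i_PXD017710.txt')) as investigation_file:
    my_json_report = isatab.validate(investigation_file)

print(my_json_report)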
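
Printing the whole report can be verbose. The sketch below condenses a report into a few counts. It assumes the report is a dictionary with `errors`, `warnings` and `validation_finished` keys, which should be checked against the installed isatools version. The helper `summarise_report` and the example report are hypothetical and are not output from validating this repository.

```python
def summarise_report(report):
    """Condense a validation report into counts and a pass/fail flag."""
    errors = report.get("errors", [])
    warnings = report.get("warnings", [])
    finished = bool(report.get("validation_finished", False))
    return {
        "finished": finished,
        "n_errors": len(errors),
        "n_warnings": len(warnings),
        # Treat the ISA-Tab as valid only if validation ran to completion
        # and reported no errors; warnings do not fail the check.
        "is_valid": finished and not errors,
    }


# Hypothetical report, for illustration only.
example_report = {
    "errors": [],
    "warnings": [{"message": "example warning", "code": 0}],
    "validation_finished": True,
}

summary = summarise_report(example_report)
print(summary)
```

With a real run, `summarise_report(my_json_report)` would be called on the dictionary returned by `isatab.validate`.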

FAIRification Objectives, Inputs and Outputs

| Actions.Objectives.Tasks | Input | Output |
| --- | --- | --- |
| validation | Investigation Study Assay (ISA) | report |
| formatting | tab delimited file | Investigation Study Assay (ISA) |
| formatting | Waters MS format | mzML |
| text annotation | CLO | annotated text |
| text annotation | OBI | annotated text |
| text annotation | NCBI taxonomy | annotated text |
| text annotation | HP | annotated text |
| text annotation | MS | annotated text |

Table of Data Standards

| Data Formats | Terminologies | Models |
| --- | --- | --- |
| Investigation Study Assay (ISA) | CLO | |
| mzML | OBI | |
| | NCBI taxonomy | |
| | HP | |
| | MS | |
| | STATO | |

Authors:

| Name | Affiliation | ORCID | CRediT role |
| --- | --- | --- | --- |
| Steffen Neumann | IPB-Halle | 0000-0002-7899-7192 | Writing - First Draft |
| Philippe Rocca-Serra | Data Readiness Group, Department of Engineering Science, University of Oxford | 0000-0001-9853-5668 | Writing - Review |

License:
