Data Loading: What You Need To Change

pieterlukasse edited this page Mar 23, 2016 · 9 revisions

This page contains an overview to help you transition to the new file formats. As this should be a one-time operation for all users, this page is only temporary.

  1. Update your genes table; see issues #799 and #805 for instructions.
    • Reason: in the past, cBioPortal accidentally imported the wrong column as the HUGO symbol. If the table is not updated, this causes many errors during the validation process.
  2. Check the following table for your data types:
DataType What you have to do
Cancer Study (optional) Add add_global_case_list to the meta file
Cancer Type Create the meta file
Discrete Copy Number Data Update meta file:
  • change stable_id to gistic, cna, cna_rae or cna_consensus
  • add data_filename
Remark: the cross-cancer histogram no longer uses the profile name to check whether copy number data is GISTIC or RAE; this is now based on the stable_id.
Copy Number Data Update meta file:
  • if datatype is LOG-VALUE change it to LOG2-VALUE
  • if datatype is CONTINUOUS, change stable_id to linear_CNA
  • add data_filename
Segmented Data Update meta file:
  • change genetic_alteration_type to COPY_NUMBER_ALTERATION
  • change datatype to SEG
  • remove: stable_id, show_profile_in_analysis_tab, profile_name, profile_description
  • add: description, data_filename
Expression Data Update meta file:
  • check your stable_id against the table
  • add data_filename
Mutation Data Update meta file:
  • change stable_id to mutations
  • add data_filename
Fusion Data (TODO) Update meta file:
  • add data_filename
Methylation Data Update meta file:
  • change stable_id to methylation_hm27 or methylation_hm450
  • add data_filename
RPPA Data Update meta file:
  • change genetic_alteration_type to PROTEIN_LEVEL
  • change datatype to LOG2-VALUE or Z-SCORE
  • change stable_id to rppa or rppa_Zscores
  • add data_filename
Clinical Data
  • Create two separate meta files, one for samples and one for patients
  • Create two separate data files, one for samples and one for patients
    • remove the row describing whether an attribute is a SAMPLE or a PATIENT attribute
For full instructions, check the file formats
Case Lists No changes needed
Timeline Data Update meta file(s):
  • remove: stable_id, show_profile_in_analysis_tab, profile_name, profile_description
  • add: data_filename
Gistic Data Create the meta file
MutSig Data Create the meta file
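
Most rows in the table above come down to the same kind of edit: set some fields in a meta file, remove others, and add data_filename. The following is a minimal Python sketch of that edit, assuming meta files consist of simple "key: value" lines. The helper names (parse_meta, update_meta), the study identifier, the old field values, and the data file name are all hypothetical and only serve as illustration; the changes applied are the ones listed in the Segmented Data row.

```python
def parse_meta(text):
    """Parse 'key: value' lines into a dict (insertion order is kept)."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def update_meta(text, set_fields=None, remove=()):
    """Return new meta file text with fields removed and set/added."""
    fields = parse_meta(text)
    for key in remove:
        fields.pop(key, None)  # ignore fields that are already absent
    # Existing keys keep their position; new keys are appended at the end.
    fields.update(set_fields or {})
    return "".join(f"{key}: {value}\n" for key, value in fields.items())


# Hypothetical old-style segmented data meta file (values are made up).
old_seg_meta = """cancer_study_identifier: example_study
genetic_alteration_type: OLD_TYPE
datatype: OLD_DATATYPE
stable_id: example_study_segments
show_profile_in_analysis_tab: false
profile_name: Segment data
profile_description: Segment data for the example study
"""

# Apply the changes from the Segmented Data row of the table.
new_seg_meta = update_meta(
    old_seg_meta,
    set_fields={
        "genetic_alteration_type": "COPY_NUMBER_ALTERATION",
        "datatype": "SEG",
        "description": "Segment data for the example study",
        "data_filename": "data_segments.seg",  # hypothetical file name
    },
    remove=[
        "stable_id",
        "show_profile_in_analysis_tab",
        "profile_name",
        "profile_description",
    ],
)
print(new_seg_meta)
```

The same helper could be reused for the other rows by changing the arguments, for example set_fields={"stable_id": "mutations", "data_filename": ...} for mutation data. It does not validate the result; check the output against the file formats documentation before loading.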