Commit

update ReadMe
Edward Wang authored and Edward Wang committed Nov 20, 2024
1 parent ec2d92d commit 07b750d
Showing 33 changed files with 156 additions and 22,245 deletions.
32 changes: 2 additions & 30 deletions README.md
@@ -8,37 +8,9 @@ HgvsGo was specifically developed for clinical use, making it well-suited for me

## How to Use HgvsGo

### Step 1: Download the Repository and Build HgvsGo

```
git clone https://github.com/SoloEdward/HgvsGo.git
cd ./HgvsGo/src/
mkdir build
cd build/
cmake ..
make
cd ../../
```

### Step 2: Download and Prepare the Human Genome

```
wget https://ftp.ncbi.nlm.nih.gov/refseq/H_sapiens/annotation/GRCh37_latest/refseq_identifiers/GRCh37_latest_genomic.fna.gz
gunzip GRCh37_latest_genomic.fna.gz
python parse_genome.py
```

### Step 3: Download RNA Sequences for All Transcripts

```
wget https://ftp.ncbi.nlm.nih.gov/refseq/H_sapiens/annotation/GRCh37_latest/refseq_identifiers/GRCh37_latest_rna.fna.gz
gunzip GRCh37_latest_rna.fna.gz
```

### Step 4: Run the Program

```
./src/build/HgvsGo ./GRCh37_latest_rna.fna.gz ./human.genome.fa ./refseq.select.hg19.parsed.txt demo.input.txt demo.output.txt
apptainer run HgvsGo.hg19.sif demo.input.txt demo.output.txt
apptainer run HgvsGo.hg38.sif demo.hg38.input.txt demo.hg38.output.txt
```

## Input Format
21 changes: 21 additions & 0 deletions demo.hg38.input.txt
@@ -0,0 +1,21 @@
chrom pos ref alt
7 55154121 C T
7 55155858 G A
7 55155858 G C
7 55163790 G C
7 55170317 G T
7 55174790 ATCTCCGAAAGCCAACAAGGAAATC A
7 55181377 A T
7 55181379 G A
7 55181379 G T
7 55191821 C T
7 55191821 CT AG
7 55191822 T A
7 55191822 TG GT
7 55191823 G T
7 55191858 A G
7 55192790 G T
7 55192858 A T
7 55198790 C T
7 55200325 T C
7 55200325 T G
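
For readers who want to generate or validate an input file like the one above, here is a minimal Python sketch of a reader for the four-column layout (`chrom pos ref alt`). It assumes the columns are separated by whitespace, which is how they appear here; the names `Variant` and `read_variants` are illustrative and are not part of HgvsGo.

```python
from typing import Iterable, List, NamedTuple


class Variant(NamedTuple):
    """One input row: chromosome, 1-based position, reference and alternate alleles."""
    chrom: str
    pos: int
    ref: str
    alt: str


def read_variants(lines: Iterable[str]) -> List[Variant]:
    """Parse whitespace-delimited variant rows, skipping the header and blank lines."""
    variants = []
    for line in lines:
        fields = line.split()
        if not fields or fields[0] == "chrom":
            continue  # blank line or the "chrom pos ref alt" header
        if len(fields) != 4:
            raise ValueError(f"expected 4 columns, got {len(fields)}: {line!r}")
        chrom, pos, ref, alt = fields
        variants.append(Variant(chrom, int(pos), ref, alt))
    return variants


# Three rows copied from demo.hg38.input.txt
sample = """chrom pos ref alt
7 55154121 C T
7 55191821 CT AG
7 55174790 ATCTCCGAAAGCCAACAAGGAAATC A
"""

variants = read_variants(sample.splitlines())
for v in variants:
    # Classify by allele length only; this is a rough label, not an HGVS type.
    kind = "SNV" if len(v.ref) == 1 and len(v.alt) == 1 else "multi-base"
    print(v.chrom, v.pos, kind)
```

Keeping `chrom` as a string rather than an integer is deliberate, so that values such as `X` or `chr7` would also pass through unchanged.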
133 changes: 133 additions & 0 deletions demo.hg38.output.txt
@@ -0,0 +1,133 @@
chrom pos ref alt transcript_id gene exon_id hgvs_c hgvs_p
7 55154121 C T NM_005228.5 EGFR 7 c.858C>T p.Ser286=
7 55154121 C T NM_001346899.2 EGFR 6 c.723C>T p.Ser241=
7 55154121 C T NM_001346941.2 EGFR 2 c.89-1709C>T NA
7 55154121 C T NM_001346898.2 EGFR 7 c.858C>T p.Ser286=
7 55154121 C T NM_001346897.2 EGFR 6 c.723C>T p.Ser241=
7 55154121 C T NM_201284.2 EGFR 7 c.858C>T p.Ser286=
7 55154121 C T NM_201282.2 EGFR 7 c.858C>T p.Ser286=
7 55154121 C T NM_201283.2 EGFR 7 c.858C>T p.Ser286=
7 55154121 C T NM_001346900.2 EGFR 7 c.699C>T p.Ser233=
7 55155858 G A NM_005228.5 EGFR 8 c.918G>A p.Ser306=
7 55155858 G A NM_001346899.2 EGFR 7 c.783G>A p.Ser261=
7 55155858 G A NM_001346941.2 EGFR 2 c.117G>A p.Ser39=
7 55155858 G A NM_001346898.2 EGFR 8 c.918G>A p.Ser306=
7 55155858 G A NM_001346897.2 EGFR 7 c.783G>A p.Ser261=
7 55155858 G A NM_201284.2 EGFR 8 c.918G>A p.Ser306=
7 55155858 G A NM_201282.2 EGFR 8 c.918G>A p.Ser306=
7 55155858 G A NM_201283.2 EGFR 8 c.918G>A p.Ser306=
7 55155858 G A NM_001346900.2 EGFR 8 c.759G>A p.Ser253=
7 55155858 G C NM_005228.5 EGFR 8 c.918G>C p.Ser306=
7 55155858 G C NM_001346899.2 EGFR 7 c.783G>C p.Ser261=
7 55155858 G C NM_001346941.2 EGFR 2 c.117G>C p.Ser39=
7 55155858 G C NM_001346898.2 EGFR 8 c.918G>C p.Ser306=
7 55155858 G C NM_001346897.2 EGFR 7 c.783G>C p.Ser261=
7 55155858 G C NM_201284.2 EGFR 8 c.918G>C p.Ser306=
7 55155858 G C NM_201282.2 EGFR 8 c.918G>C p.Ser306=
7 55155858 G C NM_201283.2 EGFR 8 c.918G>C p.Ser306=
7 55155858 G C NM_001346900.2 EGFR 8 c.759G>C p.Ser253=
7 55163790 G C NM_005228.5 EGFR 14 c.1689G>C p.Leu563=
7 55163790 G C NM_001346899.2 EGFR 13 c.1554G>C p.Leu518=
7 55163790 G C NM_001346941.2 EGFR 8 c.888G>C p.Leu296=
7 55163790 G C NM_001346898.2 EGFR 14 c.1689G>C p.Leu563=
7 55163790 G C NM_001346897.2 EGFR 13 c.1554G>C p.Leu518=
7 55163790 G C NM_201284.2 EGFR 14 c.1689G>C p.Leu563=
7 55163790 G C NM_201282.2 EGFR 14 c.1689G>C p.Leu563=
7 55163790 G C NM_001346900.2 EGFR 14 c.1530G>C p.Leu510=
7 55170317 G T NM_005228.5 EGFR 16 c.1881-858G>T NA
7 55170317 G T NM_001346899.2 EGFR 15 c.1746-858G>T NA
7 55170317 G T NM_001346941.2 EGFR 10 c.1080-858G>T NA
7 55170317 G T NM_001346898.2 EGFR 16 c.1881-858G>T NA
7 55170317 G T NM_001346897.2 EGFR 15 c.1746-858G>T NA
7 55170317 G T NM_201284.2 EGFR 16 c.1891G>T p.Glu631Ter
7 55170317 G T NM_001346900.2 EGFR 16 c.1722-858G>T NA
7 55174790 ATCTCCGAAAGCCAACAAGGAAATC A NM_005228.5 EGFR 19 c.2254_2277del p.Ser752_Ile759del
7 55174790 ATCTCCGAAAGCCAACAAGGAAATC A NM_001346899.2 EGFR 18 c.2119_2142del p.Ser707_Ile714del
7 55174790 ATCTCCGAAAGCCAACAAGGAAATC A NM_001346941.2 EGFR 13 c.1453_1476del p.Ser485_Ile492del
7 55174790 ATCTCCGAAAGCCAACAAGGAAATC A NM_001346898.2 EGFR 19 c.2254_2277del p.Ser752_Ile759del
7 55174790 ATCTCCGAAAGCCAACAAGGAAATC A NM_001346897.2 EGFR 18 c.2119_2142del p.Ser707_Ile714del
7 55174790 ATCTCCGAAAGCCAACAAGGAAATC A NM_001346900.2 EGFR 19 c.2095_2118del p.Ser699_Ile706del
7 55181377 A T NM_005228.5 EGFR 20 c.2368A>T p.Thr790Ser
7 55181377 A T NM_001346899.2 EGFR 19 c.2233A>T p.Thr745Ser
7 55181377 A T NM_001346941.2 EGFR 14 c.1567A>T p.Thr523Ser
7 55181377 A T NM_001346898.2 EGFR 20 c.2368A>T p.Thr790Ser
7 55181377 A T NM_001346897.2 EGFR 19 c.2233A>T p.Thr745Ser
7 55181377 A T NM_001346900.2 EGFR 20 c.2209A>T p.Thr737Ser
7 55181379 G A NM_005228.5 EGFR 20 c.2370G>A p.Thr790=
7 55181379 G A NM_001346899.2 EGFR 19 c.2235G>A p.Thr745=
7 55181379 G A NM_001346941.2 EGFR 14 c.1569G>A p.Thr523=
7 55181379 G A NM_001346898.2 EGFR 20 c.2370G>A p.Thr790=
7 55181379 G A NM_001346897.2 EGFR 19 c.2235G>A p.Thr745=
7 55181379 G A NM_001346900.2 EGFR 20 c.2211G>A p.Thr737=
7 55181379 G T NM_005228.5 EGFR 20 c.2370G>T p.Thr790=
7 55181379 G T NM_001346899.2 EGFR 19 c.2235G>T p.Thr745=
7 55181379 G T NM_001346941.2 EGFR 14 c.1569G>T p.Thr523=
7 55181379 G T NM_001346898.2 EGFR 20 c.2370G>T p.Thr790=
7 55181379 G T NM_001346897.2 EGFR 19 c.2235G>T p.Thr745=
7 55181379 G T NM_001346900.2 EGFR 20 c.2211G>T p.Thr737=
7 55191821 C T NM_005228.5 EGFR 21 c.2572C>T p.Leu858=
7 55191821 C T NM_001346899.2 EGFR 20 c.2437C>T p.Leu813=
7 55191821 C T NM_001346941.2 EGFR 15 c.1771C>T p.Leu591=
7 55191821 C T NM_001346898.2 EGFR 21 c.2572C>T p.Leu858=
7 55191821 C T NM_001346897.2 EGFR 20 c.2437C>T p.Leu813=
7 55191821 C T NM_001346900.2 EGFR 21 c.2413C>T p.Leu805=
7 55191821 CT AG NM_005228.5 EGFR 21 c.2572_2573inv p.Leu858Arg
7 55191821 CT AG NM_001346899.2 EGFR 20 c.2437_2438inv p.Leu813Arg
7 55191821 CT AG NM_001346941.2 EGFR 15 c.1771_1772inv p.Leu591Arg
7 55191821 CT AG NM_001346898.2 EGFR 21 c.2572_2573inv p.Leu858Arg
7 55191821 CT AG NM_001346897.2 EGFR 20 c.2437_2438inv p.Leu813Arg
7 55191821 CT AG NM_001346900.2 EGFR 21 c.2413_2414inv p.Leu805Arg
7 55191822 T A NM_005228.5 EGFR 21 c.2573T>A p.Leu858Gln
7 55191822 T A NM_001346899.2 EGFR 20 c.2438T>A p.Leu813Gln
7 55191822 T A NM_001346941.2 EGFR 15 c.1772T>A p.Leu591Gln
7 55191822 T A NM_001346898.2 EGFR 21 c.2573T>A p.Leu858Gln
7 55191822 T A NM_001346897.2 EGFR 20 c.2438T>A p.Leu813Gln
7 55191822 T A NM_001346900.2 EGFR 21 c.2414T>A p.Leu805Gln
7 55191822 TG GT NM_005228.5 EGFR 21 c.2573_2574delinsGT p.Leu858Arg
7 55191822 TG GT NM_001346899.2 EGFR 20 c.2438_2439delinsGT p.Leu813Arg
7 55191822 TG GT NM_001346941.2 EGFR 15 c.1772_1773delinsGT p.Leu591Arg
7 55191822 TG GT NM_001346898.2 EGFR 21 c.2573_2574delinsGT p.Leu858Arg
7 55191822 TG GT NM_001346897.2 EGFR 20 c.2438_2439delinsGT p.Leu813Arg
7 55191822 TG GT NM_001346900.2 EGFR 21 c.2414_2415delinsGT p.Leu805Arg
7 55191823 G T NM_005228.5 EGFR 21 c.2574G>T p.Leu858=
7 55191823 G T NM_001346899.2 EGFR 20 c.2439G>T p.Leu813=
7 55191823 G T NM_001346941.2 EGFR 15 c.1773G>T p.Leu591=
7 55191823 G T NM_001346898.2 EGFR 21 c.2574G>T p.Leu858=
7 55191823 G T NM_001346897.2 EGFR 20 c.2439G>T p.Leu813=
7 55191823 G T NM_001346900.2 EGFR 21 c.2415G>T p.Leu805=
7 55191858 A G NM_005228.5 EGFR 21 c.2609A>G p.His870Arg
7 55191858 A G NM_001346899.2 EGFR 20 c.2474A>G p.His825Arg
7 55191858 A G NM_001346941.2 EGFR 15 c.1808A>G p.His603Arg
7 55191858 A G NM_001346898.2 EGFR 21 c.2609A>G p.His870Arg
7 55191858 A G NM_001346897.2 EGFR 20 c.2474A>G p.His825Arg
7 55191858 A G NM_001346900.2 EGFR 21 c.2450A>G p.His817Arg
7 55192790 G T NM_005228.5 EGFR 22 c.2650G>T p.Glu884Ter
7 55192790 G T NM_001346899.2 EGFR 21 c.2515G>T p.Glu839Ter
7 55192790 G T NM_001346941.2 EGFR 16 c.1849G>T p.Glu617Ter
7 55192790 G T NM_001346898.2 EGFR 22 c.2650G>T p.Glu884Ter
7 55192790 G T NM_001346897.2 EGFR 21 c.2515G>T p.Glu839Ter
7 55192790 G T NM_001346900.2 EGFR 22 c.2491G>T p.Glu831Ter
7 55192858 A T NM_005228.5 EGFR 22 c.2701+17A>T NA
7 55192858 A T NM_001346899.2 EGFR 21 c.2566+17A>T NA
7 55192858 A T NM_001346941.2 EGFR 16 c.1900+17A>T NA
7 55192858 A T NM_001346898.2 EGFR 22 c.2701+17A>T NA
7 55192858 A T NM_001346897.2 EGFR 21 c.2566+17A>T NA
7 55192858 A T NM_001346900.2 EGFR 22 c.2542+17A>T NA
7 55198790 C T NM_005228.5 EGFR 23 c.2775C>T p.Ser925=
7 55198790 C T NM_001346899.2 EGFR 22 c.2640C>T p.Ser880=
7 55198790 C T NM_001346941.2 EGFR 17 c.1974C>T p.Ser658=
7 55198790 C T NM_001346898.2 EGFR 23 c.2775C>T p.Ser925=
7 55198790 C T NM_001346897.2 EGFR 22 c.2640C>T p.Ser880=
7 55198790 C T NM_001346900.2 EGFR 23 c.2616C>T p.Ser872=
7 55200325 T C NM_005228.5 EGFR 24 c.2858T>C p.Ile953Thr
7 55200325 T C NM_001346899.2 EGFR 23 c.2723T>C p.Ile908Thr
7 55200325 T C NM_001346941.2 EGFR 18 c.2057T>C p.Ile686Thr
7 55200325 T C NM_001346898.2 EGFR 24 c.2858T>C p.Ile953Thr
7 55200325 T C NM_001346897.2 EGFR 23 c.2723T>C p.Ile908Thr
7 55200325 T C NM_001346900.2 EGFR 24 c.2699T>C p.Ile900Thr
7 55200325 T G NM_005228.5 EGFR 24 c.2858T>G p.Ile953Arg
7 55200325 T G NM_001346899.2 EGFR 23 c.2723T>G p.Ile908Arg
7 55200325 T G NM_001346941.2 EGFR 18 c.2057T>G p.Ile686Arg
7 55200325 T G NM_001346898.2 EGFR 24 c.2858T>G p.Ile953Arg
7 55200325 T G NM_001346897.2 EGFR 23 c.2723T>G p.Ile908Arg
7 55200325 T G NM_001346900.2 EGFR 24 c.2699T>G p.Ile900Arg
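
Because the output lists one row per variant and transcript, a common follow-up is to keep only the rows for a single transcript. The sketch below shows one way to do that in Python, assuming whitespace-delimited columns in the order given by the header line above; `read_annotations` and `for_transcript` are illustrative names, not part of HgvsGo.

```python
from typing import Dict, Iterable, List

# Column order taken from the header line of demo.hg38.output.txt
COLUMNS = [
    "chrom", "pos", "ref", "alt",
    "transcript_id", "gene", "exon_id", "hgvs_c", "hgvs_p",
]


def read_annotations(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Parse output rows into dicts keyed by column name, skipping the header."""
    rows = []
    for line in lines:
        fields = line.split()
        if not fields or fields[0] == "chrom":
            continue  # blank line or header
        if len(fields) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} columns: {line!r}")
        rows.append(dict(zip(COLUMNS, fields)))
    return rows


def for_transcript(rows: List[Dict[str, str]], transcript_id: str) -> List[Dict[str, str]]:
    """Keep only the rows annotated against the given transcript."""
    return [row for row in rows if row["transcript_id"] == transcript_id]


# Three rows copied from demo.hg38.output.txt
sample = """chrom pos ref alt transcript_id gene exon_id hgvs_c hgvs_p
7 55191822 T A NM_005228.5 EGFR 21 c.2573T>A p.Leu858Gln
7 55191822 T A NM_001346899.2 EGFR 20 c.2438T>A p.Leu813Gln
7 55192858 A T NM_005228.5 EGFR 22 c.2701+17A>T NA
"""

rows = read_annotations(sample.splitlines())
selected = for_transcript(rows, "NM_005228.5")
for row in selected:
    # hgvs_p is the literal string "NA" when no protein-level change is reported
    print(row["hgvs_c"], row["hgvs_p"])
```

All values are kept as strings so that `NA` in `hgvs_p` and intronic notations such as `c.2701+17A>T` are preserved exactly as written.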
27 changes: 0 additions & 27 deletions parse_genome.py

This file was deleted.
