Home
Genome-structure-evolution-analysis edited this page Nov 14, 2023
·
3 revisions
The AKRUP process for inferring ancestral karyotypes consists of several subroutines. To run one, the user edits its configuration file and passes it to the corresponding subroutine flag, for example AKRUP -rb run_blast.conf
Below, we describe the AKRUP software in detail.
AKRUP -h
AKRUP --help
Short option | Long option | Function |
---|---|---|
-h | --help | Show this help message and exit |
-e | --example | Displays the configured example |
-rb | --runblast | Search for potential homologous gene pairs |
-rc | --runcolinearscan | Infer genomic collinearity information |
-rk | --runks | Calculate Ka/Ks for homologous gene pairs |
-d | --dotplot | Show homologous gene dotplot |
-bd | --blockdotplot | Show synteny block dotplot |
-lk | --loadblock | Load collinearity information |
-eb | --eventblock | Obtain event-related syntenic region |
-kf | --ksfigure | Draw Ks distribution |
-ec | --event-correspondence | Extract event-related syntenic region |
-cd | --csrdotplot | Show continuous syntenic regions dotplot |
-ed | --eventdotplot | Show event-related syntenic region dotplot |
-iak | --inferranckaryotype | Infer ancestral karyotypes |
-akf | --anckaryotypefig | Draw karyotype figure |
-ags | --ancgenomeseqs | Extract ancestral genome sequences |
-td | --trajectorydotplot | Show ancestral karyotype trajectory |
# Get subroutine config file
AKRUP -e rb > run_blast.conf
# Run the subroutine
AKRUP -rb run_blast.conf
# run_blast.conf:
[blast]
num_thread = num/auto
evalue = 1e-5
outfmt = 6
max_target_seqs = 10
querypep = query pep file
subjectpep = subject pep file
outblast = save blast file (spec_spec.blast)
Parameters | Standards and instructions |
---|---|
num_thread | Type: int/auto | Default: auto Set the number of threads; auto: Automatic setting. |
evalue | Type: float | Default: 1e-5 Evalue value in blast result. |
outfmt | Type: int | Default: 6 outformat, 6 = Tabular. |
max_target_seqs | Type: int | Default: 10 Maximum number of aligned sequences to keep. |
querypep | Type: file | Default: none Query Sequence pep file. |
subjectpep | Type: file | Default: none Subject Sequence pep file. |
outblast | Type: file | Default: none blast results file |
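As a hedged illustration, the blast configuration above could also be written programmatically instead of edited by hand. The sketch below uses Python's standard `configparser`; the helper `write_blast_conf` and the file names `query.pep`, `subject.pep`, and `query_subject.blast` are hypothetical placeholders and are not part of AKRUP. It assumes AKRUP accepts a standard INI-style file with the keys listed in the table.

```python
import configparser
import os
import tempfile


def write_blast_conf(path, querypep, subjectpep, outblast, num_thread="auto"):
    """Write a [blast] section using the keys and defaults from the table above."""
    conf = configparser.ConfigParser(interpolation=None)
    conf["blast"] = {
        "num_thread": str(num_thread),
        "evalue": "1e-5",
        "outfmt": "6",
        "max_target_seqs": "10",
        "querypep": querypep,        # hypothetical input file name
        "subjectpep": subjectpep,    # hypothetical input file name
        "outblast": outblast,        # hypothetical output file name
    }
    with open(path, "w") as handle:
        conf.write(handle)
    return path


# Write the file into a temporary directory and read it back as a check.
conf_path = os.path.join(tempfile.mkdtemp(), "run_blast.conf")
write_blast_conf(conf_path, "query.pep", "subject.pep", "query_subject.blast")

check = configparser.ConfigParser(interpolation=None)
check.read(conf_path)
print(dict(check["blast"]))
```

The resulting file would then be passed to the subroutine in the usual way: AKRUP -rb run_blast.conf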
# Get subroutine config file
AKRUP -e rc > run_ColinearScan.conf
# Run the subroutine
AKRUP -rc run_ColinearScan.conf
# run_ColinearScan.conf:
[colinearscan]
num_thread = num/auto
lens_file1 = lens1 file
lens_file2 = lens2 file
gff_file1 = gff1 file
gff_file2 = gff2 file
blast_file = blast file
save_block_file = save block file (spec_spec.block.rr.txt)
Parameters | Standards and instructions |
---|---|
num_thread | Type: int/auto | Default: auto Set the number of threads; auto: Automatic setting. |
lens_file1 | Type: file| Default: - lens file. |
lens_file2 | Type: file| Default: - lens file. |
gff_file1 | Type: file| Default: - gff file. |
gff_file2 | Type: file| Default: - gff file. |
blast_file | Type: file | Default: *.blast Result of running blast. |
save_block_file | Type: file | Default: *.block.rr.txt Colinearscan result file. |
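Because this subroutine depends on files produced earlier (the lens and gff files and the blast result), it can help to confirm that every input exists before launching it. The following is a minimal sketch under that assumption; `missing_inputs` is a hypothetical helper, not an AKRUP function, and the demonstration uses temporary placeholder files.

```python
import configparser
import os
import tempfile

# Input keys of the [colinearscan] section, as listed in the table above.
FILE_KEYS = ["lens_file1", "lens_file2", "gff_file1", "gff_file2", "blast_file"]


def missing_inputs(conf_path, section="colinearscan", keys=FILE_KEYS):
    """Return the input keys whose files do not exist on disk."""
    conf = configparser.ConfigParser(interpolation=None)
    conf.read(conf_path)
    return [key for key in keys if not os.path.isfile(conf[section][key])]


# Demonstration: create every input except the blast file.
workdir = tempfile.mkdtemp()
values = {key: os.path.join(workdir, key + ".txt") for key in FILE_KEYS[:-1]}
for path in values.values():
    open(path, "w").close()
values["blast_file"] = os.path.join(workdir, "absent.blast")
values["num_thread"] = "auto"
values["save_block_file"] = os.path.join(workdir, "out.block.rr.txt")

conf = configparser.ConfigParser(interpolation=None)
conf["colinearscan"] = values
conf_path = os.path.join(workdir, "run_ColinearScan.conf")
with open(conf_path, "w") as handle:
    conf.write(handle)

print(missing_inputs(conf_path))
```

Here only `blast_file` is reported, which would indicate that the blast subroutine (-rb) still has to be run.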
# Get subroutine config file
AKRUP -e rk > run_ks.conf
# Run the subroutine
AKRUP -rk run_ks.conf
# run_ks.conf:
[ks]
species_cds1 = cds1 file
species_cds2 = cds2 file
block_file = block file
save_ks_file = save ks file (spec_spec.ks.txt)
Parameters | Standards and instructions |
---|---|
species_cds1 | Type: file | Default: *.cds cds file. |
species_cds2 | Type: file | Default: *.cds cds file. |
block_file | Type: file | Default: *.block.rr.txt Result of running Colinearscan. |
save_ks_file | Type: file | Default: *.ks.txt Ks calculation result. |
# Get subroutine config file
AKRUP -e d > blast_dotplot.conf
# Run the subroutine
AKRUP -d blast_dotplot.conf
# blast_dotplot.conf:
[dotplot]
genome1_name = genome1 name
genome2_name = genome2 name
lens_file1 = lens1 file
lens_file2 = lens2 file
gff_file1 = gff1 file
gff_file2 = gff2 file
blast_file = blast file
multiple = 1
score = 100
evalue = 1e-5
repnum = 20
hitnum = 5
dpi = 300
savefile = savefile (*.png pdf svg)
Parameters | Standards and instructions |
---|---|
genome1_name | Type: str | Default: - Latin name of species. |
genome2_name | Type: str | Default: - Latin name of species. |
lens_file1 | Type: file| Default: - lens file. |
lens_file2 | Type: file| Default: - lens file. |
gff_file1 | Type: file| Default: - gff file. |
gff_file2 | Type: file| Default: - gff file. |
blast_file | Type: file | Default: none Result of running blast. |
multiple | Type: int| Default: 1 The number of best homologous genes, indicated by red dots. |
score | Type: int| Default: 100 Score value in blast result. |
evalue | Type: float| Default: 1e-5 Evalue in blast result. |
repnum | Type: int| Default: 20 The maximum number of homologous genes to be plotted. |
hitnum | Type: int| Default: 5 The number of second-best homologous genes, indicated by blue dots. |
dpi | Type: int| Default: 300 The number of pixels per inch of the image. |
savefile | Type: [*.png, *.pdf, *.svg]| Default: *.png Save pictures support png, pdf, svg formats. |
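To make the interplay of `score`, `evalue`, `repnum`, `multiple`, and `hitnum` more concrete, here is a rough sketch of how hits in BLAST tabular format (outfmt 6, where column 11 is the E-value and column 12 the bit score) might be filtered and ranked for plotting. This is an interpretation of the table above, not AKRUP's actual code: the function `classify_hits`, the sample rows, and the "gray" label for remaining hits are all assumptions.

```python
from collections import defaultdict

# Hypothetical hits in BLAST tabular format (outfmt 6, 12 columns).
ROWS = [
    ("g1", "h1", "90.0", "100", "5", "0", "1", "100", "1", "100", "1e-50", "300"),
    ("g1", "h2", "80.0", "100", "10", "0", "1", "100", "1", "100", "1e-30", "200"),
    ("g1", "h3", "70.0", "100", "20", "0", "1", "100", "1", "100", "1e-20", "150"),
    ("g1", "h4", "40.0", "50", "30", "0", "1", "50", "1", "50", "1e-3", "50"),
    ("g2", "h5", "95.0", "100", "2", "0", "1", "100", "1", "100", "1e-60", "350"),
]
SAMPLE = ["\t".join(row) for row in ROWS]


def classify_hits(lines, score=100, evalue=1e-5, repnum=20, multiple=1, hitnum=5):
    """Filter hits by score and E-value, then label them by rank per query."""
    hits = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip malformed rows
        hit_evalue, bitscore = float(fields[10]), float(fields[11])
        if bitscore >= score and hit_evalue <= evalue:
            hits[fields[0]].append((bitscore, fields[1]))
    labels = {}
    for query, scored in hits.items():
        scored.sort(key=lambda item: -item[0])  # best bit score first
        for rank, (_, subject) in enumerate(scored[:repnum]):
            if rank < multiple:
                labels[(query, subject)] = "red"    # best hits
            elif rank < multiple + hitnum:
                labels[(query, subject)] = "blue"   # second-best hits
            else:
                labels[(query, subject)] = "gray"   # assumed: remaining hits
    return labels


labels = classify_hits(SAMPLE, hitnum=1)
for pair, color in sorted(labels.items()):
    print(pair, color)
```

With `hitnum=1`, the weak hit `g1`/`h4` is dropped by the score and E-value filters, and the three remaining hits of `g1` receive one label each.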
# Get subroutine config file
AKRUP -e bd > block_dotplot.conf
# Run the subroutine
AKRUP -bd block_dotplot.conf
# block_dotplot.conf:
[blockdotplot]
genome1_name = genome1 name
genome2_name = genome2 name
lens_file1 = lens1 file
lens_file2 = lens2 file
gff_file1 = gff1 file
gff_file2 = gff2 file
blast_file = blast file
block_file = block file
multiple = 1
block_num = 8
score = 100
evalue = 1e-5
repnum = 20
hitnum = 5
dpi = 300
savefile = savefile (*.png pdf svg)
Parameters | Standards and instructions |
---|---|
genome1_name | Type: str | Default: - Latin name of species. |
genome2_name | Type: str | Default: - Latin name of species. |
lens_file1 | Type: file| Default: - lens file. |
lens_file2 | Type: file| Default: - lens file. |
gff_file1 | Type: file| Default: - gff file. |
gff_file2 | Type: file| Default: - gff file. |
blast_file | Type: file | Default: none Result of running blast. |
block_file | Type: file | Default: none Result of running Colinearscan. |
multiple | Type: int| Default: 1 The number of best homologous genes, indicated by red dots. |
block_num | Type: int| Default: 8 Show the minimum length of a synteny block. |
score | Type: int| Default: 100 Score value in blast result. |
evalue | Type: float| Default: 1e-5 Evalue in blast result. |
repnum | Type: int| Default: 20 Maximum number of homologous genes kept per gene; hits beyond this number are removed. |
hitnum | Type: int| Default: 5 The number of second-best homologous genes, indicated by blue dots. |
dpi | Type: int| Default: 300 The number of pixels per inch of the image. |
savefile | Type: [*.png, *.pdf, *.svg]| Default: *.png Save pictures support png, pdf, svg formats. |
# Get subroutine config file
AKRUP -e lk > Loadblock.conf
# Run the subroutine
AKRUP -lk Loadblock.conf
# Loadblock.conf:
[loadblock]
lens_file1 = lens1 file
lens_file2 = lens2 file
gff_file1 = gff1 file
gff_file2 = gff2 file
blast_file = blast file
block_file = block file
ks_file = ks file
score = 100
evalue = 1e-5
repnum = 20
save_file = *.block.information.csv
Parameters | Standards and instructions |
---|---|
lens_file1 | Type: file| Default: - lens file. |
lens_file2 | Type: file| Default: - lens file. |
gff_file1 | Type: file| Default: - gff file. |
gff_file2 | Type: file| Default: - gff file. |
blast_file | Type: file | Default: none Result of running blast. |
block_file | Type: file | Default: none Result of running Colinearscan. |
ks_file | Type: file | Default: none Result of Ks calculation. |
score | Type: int| Default: 100 Score value in blast result. |
evalue | Type: float| Default: 1e-5 Evalue in blast result. |
repnum | Type: int| Default: 20 Maximum number of homologous genes kept per gene; hits beyond this number are removed. |
save_file | Type: file| Default: *.block.information.csv result file. |
# Get subroutine config file
AKRUP -e eb > event_block.conf
# Run the subroutine
AKRUP -eb event_block.conf
# event_block.conf:
[eventblock]
name = spec1name_spec2name
lens_file1 = lens1 file
lens_file2 = lens2 file
block_info = block info file (*.block.information.csv)
hocv_depth = 1
pkcolor = orange
pk_hocv = 0.8
pk_block_num = 30
range_k = 0.15
block_num = 5
hocv = -1
dpi = 300
save_file = save file (*.EventRelate_block.information.csv)
# Result files automatically generated by subroutines
*.ks_distribute.txt Event-related ks peak parameters: a,u,sigma;
*Ks-event.middle.dotplot.png Event-related region dotplot for checking if the parameters are appropriate;
Parameters | Standards and instructions |
---|---|
name | Type: str| Default: spec1name_spec2name Short for combination of two species. |
lens_file1 | Type: file| Default: - lens file. |
lens_file2 | Type: file| Default: - lens file. |
block_info | Type: file| Default: - result of Subroutine loadblock (-lk). |
hocv_depth | Type: int[1-8]| Default: 1 Depth of best homologous gene pairs. |
pkcolor | Type: color | Default: Orange The color of the event related peak (Ks distribution). |
pk_hocv | Type: float[-1-1] | Default: 0.8 Minimum ratio of best homologous gene pairs for blocks used in event-related peak fitting. |
pk_block_num | Type: int | Default: 30 Minimum block length for event-related peak fitting. |
range_k | Type: float| Default: 0.3 Get the allowed Ks error range for the event-related region. |
block_num | Type: int| Default: 5 Show the minimum length of a synteny block. |
hocv | Type: float| Default: -1 Evaluate the ratio of the best homologous gene pairs in synteny block, with a range of -1, 1. |
dpi | Type: int| Default: 300 The number of pixels per inch of the image. |
save_file | Type: file| Default: *.EventRelate_block.information.csv result file. |
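The peak parameters `a,u,sigma` written to *.ks_distribute.txt are not defined further on this page. The sketch below assumes they describe a Gaussian curve (amplitude `a`, mean `u`, standard deviation `sigma`) and that `range_k` is a tolerance around the peak mean; both are assumptions for illustration, and the numeric values are made up.

```python
import math


def gaussian(ks, a, u, sigma):
    """Height of an assumed Gaussian Ks peak at a given Ks value."""
    return a * math.exp(-((ks - u) ** 2) / (2 * sigma ** 2))


def within_peak(block_ks, u, range_k=0.15):
    """True if a block's Ks lies within range_k of the peak mean (assumed rule)."""
    return abs(block_ks - u) <= range_k


# Hypothetical peak parameters and block Ks values.
a, u, sigma = 1.2, 0.9, 0.1
block_ks_values = [0.5, 0.8, 1.0, 1.3]

peak_height = gaussian(u, a, u, sigma)
selected = [ks for ks in block_ks_values if within_peak(ks, u)]
print(peak_height, selected)
```

Under these assumptions, widening `range_k` keeps more blocks as event-related, while narrowing it keeps only blocks close to the peak.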
# Get subroutine config file
AKRUP -e kf > ksdistribute.conf
# Run the subroutine
AKRUP -kf ksdistribute.conf
# ksdistribute.conf:
[ksdistribute]
ks_file = ks_input.csv
area = 0,2.2
width = 6
height = 5
scale = 2
alpha = 0.6
dpi = 300
save_fig = save.ksdistribute.png # png pdf svg
Parameters | Standards and instructions |
---|---|
ks_file | Type: file | Default: *.csv Output result of Subroutine eventblock (-eb). |
area | Type: float,float | Default: 0,2.2 Range of Ks values to show (min,max). |
width | Type: int | Default: 6 Width of the saved figure. |
height | Type: int | Default: 5 Height of the saved figure. |
scale | Type: int | Default: 2 Proportion of event-related peaks. |
alpha | Type: float | Default: 0.6 Transparency of the saved figure. |
dpi | Type: int| Default: 300 The number of pixels per inch of the image. |
save_fig | Type: [*.png, *.pdf, *.svg]| Default: *.png Save pictures support png, pdf, svg formats. |
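As a small illustration of the `area` setting, the sketch below parses a value such as `0,2.2` into a numeric range and counts Ks values that fall inside it. The helpers `parse_area` and `ks_histogram`, the bin count, and the sample values are hypothetical and only show how such a range restricts the data that ends up in the plot.

```python
def parse_area(text):
    """Turn an area string such as '0,2.2' into a (low, high) tuple."""
    low, high = (float(part) for part in text.split(","))
    return low, high


def ks_histogram(values, area="0,2.2", bins=11):
    """Count Ks values per bin, ignoring values outside the area range."""
    low, high = parse_area(area)
    width = (high - low) / bins
    counts = [0] * bins
    for value in values:
        if low <= value < high:
            # min() guards against floating-point rounding at the upper edge.
            index = min(int((value - low) / width), bins - 1)
            counts[index] += 1
    return counts


# Hypothetical Ks values; 2.5 lies outside the default area and is skipped.
counts = ks_histogram([0.1, 0.3, 0.35, 0.9, 2.5])
print(counts)
```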
# Get subroutine config file
AKRUP -e ec > Polyploidy_CSR.conf
# Run the subroutine
AKRUP -ec Polyploidy_CSR.conf
# Polyploidy_CSR.conf:
[Polyploidy_CSR]
corr_file = *.top.correspondence.txt
blockinfo = *.EventRelate_block.information.csv
save_file = *.Polyploidy-block.information.csv
Parameters | Standards and instructions |
---|---|
corr_file | Type: file | Default: - CSR location file. |
blockinfo | Type: file | Default: - Output result of Subroutine eventblock (-eb). |
save_file | Type: file | Default: Polyploidy-block.information.csv result file. |
# Get subroutine config file
AKRUP -e cd > CSR_dotplot.conf
# Run the subroutine
AKRUP -cd CSR_dotplot.conf
# CSR_dotplot.conf:
[csrdotplot]
genome1_name = genome1 name
genome2_name = genome2 name
lens_file1 = lens1 file
lens_file2 = lens2 file
block_info = block info file (*.Polyploidy-block.information.csv)
top_ancestor_file = top.color.pos (A.*.top.color.pos.txt)
left_ancestor_file = left.color.pos (A.*.left.color.pos.txt)
block_num = 8
hocv_depth = 1
hocv = -1
dpi = 300
savefile = savefile (*.png pdf svg)
Parameters | Standards and instructions |
---|---|
genome1_name | Type: str | Default: - Latin name of species. |
genome2_name | Type: str | Default: - Latin name of species. |
lens_file1 | Type: file| Default: - lens file. |
lens_file2 | Type: file| Default: - lens file. |
block_info | Type: file| Default: *.csv Output result of Subroutine event-correspondence (-ec). |
top_ancestor_file | Type: file| Default: - ancestor location file. |
left_ancestor_file | Type: file | Default: none ancestor location file. |
block_num | Type: int| Default: 8 Show the minimum length of a synteny block. |
hocv_depth | Type: int[1-8]| Default: 1 Depth of best homologous gene pairs. |
hocv | Type: float| Default: -1 Evaluate the ratio of the best homologous gene pairs in synteny block, with a range of -1, 1. |
dpi | Type: int| Default: 300 The number of pixels per inch of the image. |
savefile | Type: [*.png, *.pdf, *.svg]| Default: *.png Save pictures support png, pdf, svg formats. |
# Get subroutine config file
AKRUP -e ed > event_dotplot.conf
# Run the subroutine
AKRUP -ed event_dotplot.conf
# event_dotplot.conf:
[eventdotplot]
genome1_name = genome1 name
genome2_name = genome2 name
lens_file1 = lens1 file
lens_file2 = lens2 file
block_info = EventRelate_block (*.EventRelate_block.information.csv)
range_k = 0.3 (error range)
peaks = a,u,sigma
pkcolor = orange
peakflag = True
block_num = 5
hocv_depth = 1
hocv = -1
dpi = 300
savefile = savefile (*.png pdf svg)
Parameters | Standards and instructions |
---|---|
genome1_name | Type: str | Default: - Latin name of species. |
genome2_name | Type: str | Default: - Latin name of species. |
lens_file1 | Type: file| Default: - lens file. |
lens_file2 | Type: file| Default: - lens file. |
block_info | Type: file| Default: *.csv Output result of Subroutine eventblock (-eb) / event-correspondence (-ec). |
peaks | Type: [a,u,sigma] | Default: - Output result of Subroutine eventblock (-eb): *.ks_distribute.txt. |
range_k | Type: float| Default: 0.3 Get the allowed Ks error range for the event-related region. |
pkcolor | Type: color | Default: Orange The color of the area related to the event. |
peakflag | Type: bool | Default: True Plot peaks in the dotplot. |
block_num | Type: int| Default: 8 Show the minimum length of a synteny block. |
hocv_depth | Type: int[1-8]| Default: 1 Depth of best homologous gene pairs. |
hocv | Type: float| Default: -1 Evaluate the ratio of the best homologous gene pairs in synteny block, with a range of -1, 1. |
dpi | Type: int| Default: 300 The number of pixels per inch of the image. |
savefile | Type: [*.png, *.pdf, *.svg]| Default: *.png Save pictures support png, pdf, svg formats. |
# Get subroutine config file
AKRUP -e iak > infer_anckaryotype.conf
# Run the subroutine
AKRUP -iak infer_anckaryotype.conf
# infer_anckaryotype.conf:
[ancestral]
species = refspec_spec1_spec2
bk_files = refspec_spec1:refspec_spec1.Polyploidy-block.information.csv,refspec_spec2:refspec_spec2.Polyploidy-block.information.csv
len_files = refspec:refspec.lens,spec1:spec1.lens,spec2:spec2.lens
wgds = species1:wgdnums,species2:wgdnums (All WGD events experienced by each species: 2 for a duplication, 3 for a triplication, and so on, joined by "_". For example, species1:2_2 means that species1 underwent two duplications after diverging from species3)
              +--WGD2--species1
     +--WGD1--|
     |        +--------species2
-----|
     +-----------------species3
latin_name = refspec:refspec genome name,spec1:spec1 genome name,spec2:spec2 genome name
intergenomicratio = orthologous synteny ratio (default: 1)
hocv_depths = refspec_spec1:num,refspec_spec2:num
select_ref_ancestor_spec = Species names (construction of ancestral genomes based on sequences of selected species)
block_num = 5
hocv = -1 (range: -1< hocv <1)
# This parameter is required if the two species share WGD
common_wgd = False
Conserved_spec = spec
# Inferring karyotype before WGD
infer_wgd_flag = False
infer_name = refspec_spec1,refspec_spec2
# Required both when inferring karyotypes with a shared WGD and when inferring the karyotype before a WGD
recentwgdchr = spec1:spec1_recent_wgd_chr.txt,spec2:spec2_recent_wgd_chr.txt
save_path = . # Default: current directory
# Result files automatically generated by subroutines
A.AKRUP-ags-select_*.Construct_ancestral_genomes.conf.txt Profiles required to construct ancestral genomes;
ancestor_color_order Ancestor color order file;
A.*.left.color.pos.txt Ancestor location file (left: For example, Os species in Os_Bdi);
A.*.top.color.pos.txt Ancestor location file (top: For example, Bdi species in Os_Bdi);
*.pdf/*.png Figure showing the inferred ancestral karyotype;
A-*-ancestral_chromosome_conf.txt Inferred ancestral karyotype information;
A-*-chromosome_CAR.txt Inferred conservative ancestral regions (CARs);
A-*-ancestral_CSR-color_conf.txt Colors corresponding to CSRs in the ancestral karyotype;
# Files generated when inferring the WGD-before ancestral karyotype of a species
A-*-chromosome_CAR.WGD-before.txt Inferred conservative ancestral regions (CARs) prior to WGD;
A-(*)-chromosome_ancestral_color_pos.(*)-WGD-before.txt Ancestor location file prior to WGD;
Parameters | Standards and instructions |
---|---|
species | Type: str | Default: - Abbreviations for outgroup and two investigated species (refspec_spec1_spec2). |
bk_files | Type: file | Default: - Output result of Subroutine event-correspondence (-ec). |
len_files | Type: file | Default: - Lens files for outgroup and inferred two investigated species. |
wgds | Type: str | Default: - WGD events experienced by the investigated species: species1:wgdnums,species2:wgdnums (All WGD events experienced by each species: 2 for a duplication, 3 for a triplication, and so on, joined by "_". For example, species1:2_2 means that species1 underwent two duplications after diverging from species3). |
latin_name | Type: str | Default: - Latin names of the outgroup and the two investigated species. |
intergenomicratio | Type: int | Default: 1 Inferred orthologous ratios of the two investigated species. |
hocv_depths | Type: str | Default: - Depth of best homologous gene pairs for each species pair (refspec_spec1:num,refspec_spec2:num; num in 1-8). |
hocv | Type: float| Default: -1 Evaluate the ratio of the best homologous gene pairs of collinearity block, with a range of -1, 1. |
select_ref_ancestor_spec | Type: str| Default: - An abbreviation for relatively conserved species and outgroup (or reference genome) used to construct the ancestral genome. |
block_num | Type: int| Default: 8 Show the minimum length of a synteny block. |
common_wgd | Type: bool | Default: False WGD events shared by the two investigated genomes within the evolutionary node of ancestral karyotype inference. |
Conserved_spec | Type: str | Default: - Abbreviation for the relatively conserved species of the two investigated species (Required if common_wgd is True). |
infer_wgd_flag | Type: bool | Default: False Inferring ancestral karyotypes prior to WGD. |
infer_name | Type: str | Default: - Species names for which pre-WGD ancestral karyotypes need to be inferred (Required if infer_wgd_flag is True). |
recentwgdchr | Type: file | Default: - Subsidiary files required for inferring ancestral karyotypes prior to WGD events. |
save_path | Type: dir | Default: . Path to save the results. |
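Several values in this configuration use a compact `key:value,key:value` notation, and `wgds` additionally joins event levels with "_". The sketch below shows one way to read that notation; `parse_pairs` and `parse_wgds` are hypothetical helpers, not AKRUP functions. The copy-number product is plain arithmetic (two duplications give 2 x 2 = 4 copies relative to the outgroup) and is included only as a sanity check, not as something AKRUP is documented to compute.

```python
from math import prod


def parse_pairs(text):
    """Split 'key:value,key:value' into a dict."""
    return dict(item.split(":", 1) for item in text.split(",") if item)


def parse_wgds(text):
    """Map each species to the list of its WGD levels (2 = doubled, 3 = tripled)."""
    return {name: [int(level) for level in events.split("_")]
            for name, events in parse_pairs(text).items()}


# Values in the style of the configuration above.
wgds = parse_wgds("species1:2_2,species2:2")
expected_copies = {name: prod(levels) for name, levels in wgds.items()}
bk_files = parse_pairs(
    "refspec_spec1:refspec_spec1.Polyploidy-block.information.csv,"
    "refspec_spec2:refspec_spec2.Polyploidy-block.information.csv"
)
print(wgds, expected_copies, sorted(bk_files))
```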
# Get subroutine config file
AKRUP -e akf > ancestral_plotfig.conf
# Run the subroutine
AKRUP -akf ancestral_plotfig.conf
# ancestral_plotfig.conf:
[ancestralfig]
name = species/ancestor node name
anc_reverse = True
frame_flag = True
ancestor_file = ancestor location file
savefile = savefile (*.png pdf svg)
Parameters | Standards and instructions |
---|---|
name | Type: str | Default: - species/ancestor node name. |
anc_reverse | Type: bool | Default: True Ancestral karyotype drawing reverses direction. |
frame_flag | Type: bool| Default: True Draw black box. |
ancestor_file | Type: file| Default: - Ancestor location file. |
savefile | Type: [*.png, *.pdf, *.svg]| Default: *.png Save pictures support png, pdf, svg formats. |
# Get subroutine config file
AKRUP -e ags > ancestralseq.conf
# Run the subroutine
AKRUP -ags ancestralseq.conf
# ancestralseq.conf:
[ancestralseq]
mark = seq name
gff = gff file
ancestor_color_order = ancestor_color_order
ancestor_conf = ancestor conf
pep_file = pep file
cds_file = cds file
ancestor_pep = ancestor pep file
ancestor_cds = ancestor cds file
ancestor_gff = ancestor gff file
ancestor_lenstxt = ancestor lenstxt
ancestor_lens = ancestor lens
ancestor_file = ancestor file
Parameters | Standards and instructions |
---|---|
mark | Type: str | Default: - Ancestral genome name set. |
gff | Type: file | Default: - Gff file for the relatively conserved species of the two investigated species. |
ancestor_color_order | Type: file | Default: - Ancestor color order, AKRUP -iak result file |
ancestor_conf | Type: file | Default: - Ancestor location file. |
pep_file | Type: file | Default: - Pep file for the relatively conserved species of the two investigated species. |
cds_file | Type: file | Default: - Cds file for the relatively conserved species of the two investigated species. |
ancestor_pep | Type: file | Default: - Pep file of the extracted ancestral genome. |
ancestor_cds | Type: file | Default: - Cds file of the extracted ancestral genome. |
ancestor_gff | Type: file | Default: - Gff file of the extracted ancestral genome. |
ancestor_lenstxt | Type: file | Default: - Lens file of the extracted ancestral karyotype (chr\tchr length\tgene num). |
ancestor_lens | Type: file | Default: - Lens file of the extracted ancestral karyotype (chr\tgene num). |
ancestor_file | Type: file | Default: - Updated ancestor location file. |
# Get subroutine config file
AKRUP -e td > dotplot_trajectory.conf
# Run the subroutine
AKRUP -td dotplot_trajectory.conf
# dotplot_trajectory.conf:
[karyotype]
genome1_name = genome1 name
genome2_name = genome2 name
lens_file1 = lens1 file
lens_file2 = lens2 file
gff_file1 = gff1 file
gff_file2 = gff2 file
blast_file = blast file
top_redefining_ancestor_file = redefining color file (*.ancestor_genome_conf.txt)
top_trajectory_ancestor_file = trajectory color (*.ancestor_trajectory_conf.txt)
left_ancestor_file = ancestor color file (*.ancestor_genome_conf.txt)
multiple = 1
score = 100
evalue = 1e-5
repnum = 20
hitnum = 5
process_name = WGD name/num fusion
dpi = 800
savefile = savefile (*.png pdf svg)
Parameters | Standards and instructions |
---|---|
genome1_name | Type: str | Default: - Latin name of species. |
genome2_name | Type: str | Default: - Latin name of species. |
lens_file1 | Type: file| Default: - lens file. |
lens_file2 | Type: file| Default: - lens file. |
gff_file1 | Type: file| Default: - gff file. |
gff_file2 | Type: file| Default: - gff file. |
blast_file | Type: file | Default: - Result of running blast. |
top_redefining_ancestor_file | Type: file | Default: - Redefine the ancestor color in the inner node ancestor location file. |
top_trajectory_ancestor_file | Type: file | Default: - Inner node ancestor location file. |
left_ancestor_file | Type: file | Default: - Outer node ancestor location file. |
multiple | Type: int| Default: 1 The number of best homologous genes, indicated by red dots. |
score | Type: int| Default: 100 Score value in blast result. |
evalue | Type: float| Default: 1e-5 Evalue in blast result. |
repnum | Type: int| Default: 20 Maximum number of homologous genes kept per gene; hits beyond this number are removed. |
hitnum | Type: int| Default: 5 The number of second-best homologous genes, indicated by blue dots. |
process_name | Type: str| Default: - WGD name/num fusion. |
dpi | Type: int| Default: 300 The number of pixels per inch of the image. |
savefile | Type: [*.png, *.pdf, *.svg]| Default: *.png Save pictures support png, pdf, svg formats. |