Genome-structure-evolution-analysis edited this page Nov 14, 2023 · 3 revisions

AKRUP

The AKRUP workflow for inferring ancestral karyotypes consists of several subroutines. To run one, edit its configuration file and pass that file to the subroutine's flag, for example AKRUP -rb run_blast.conf. The sections below describe each subroutine in detail.

Help

AKRUP -h

AKRUP --help

AKRUP Arguments:


Parameters Functions
-h --help Show this help message and exit
-e --example Displays the configured example
-rb --runblast Search for potential homologous gene pairs
-rc --runcolinearscan Infer genomic collinearity information
-rk --runks Calculate Ka/Ks for homologous gene pairs
-d --dotplot Show homologous gene dotplot
-bd --blockdotplot Show synteny block dotplot
-lk --loadblock Load collinearity information
-eb --eventblock Obtain event-related syntenic region
-kf --ksfigure Draw Ks distribution
-ec --event-correspondence Extract event-related syntenic region
-cd --csrdotplot Show continuous syntenic regions dotplot
-ed --eventdotplot Show event-related syntenic region dotplot
-iak --inferranckaryotype Infer ancestral karyotypes
-akf --anckaryotypefig Draw karyotypes figure
-ags --ancgenomeseqs Extract ancestral genome sequences
-td --trajectorydotplot Show ancestral karyotype trajectory
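
The flags above are meant to be run one after another. The following Python sketch only builds the command lines for the core steps; it does not run them. The step order is an inference from the inputs each subroutine lists on this page (for example, -lk reads the outputs of -rb, -rc and -rk), not an official pipeline definition, and `build_commands` is a hypothetical helper, not part of AKRUP.

```python
# Core steps in the order suggested by each subroutine's listed inputs.
# The .conf file names are the ones used in the examples on this page.
PIPELINE = [
    ("-rb", "run_blast.conf"),
    ("-rc", "run_ColinearScan.conf"),
    ("-rk", "run_ks.conf"),
    ("-lk", "Loadblock.conf"),
    ("-eb", "event_block.conf"),
    ("-ec", "Polyploidy_CSR.conf"),
    ("-iak", "infer_anckaryotype.conf"),
]


def build_commands(steps=PIPELINE):
    """Return one argument list per step, e.g. ['AKRUP', '-rb', 'run_blast.conf']."""
    return [["AKRUP", flag, conf] for flag, conf in steps]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before being run by hand.
    for argv in build_commands():
        print(" ".join(argv))
```

Each argument list could be handed to `subprocess.run(argv, check=True)` once the corresponding configuration file has been filled in.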

Subroutines

runblast

# Get subroutine config file
AKRUP -e rb > run_blast.conf

# Run the subroutine
AKRUP -rb run_blast.conf

# run_blast.conf:
[blast]
num_thread = num/auto
evalue = 1e-5
outfmt = 6
max_target_seqs = 10
querypep = query pep file
subjectpep = subject pep file
outblast = save blast file (spec_spec.blast)
Parameters Standards and instructions
num_thread Type: int/auto | Default: auto
Set the number of threads; auto: Automatic setting.
evalue Type: float | Default: 1e-5
E-value threshold for BLAST hits.
outfmt Type: int | Default: 6
Output format; 6 = tabular.
max_target_seqs Type: int | Default: 10
Maximum number of aligned sequences to keep.
querypep Type: file | Default: none
Query Sequence pep file.
subjectpep Type: file | Default: none
Subject Sequence pep file.
outblast Type: file | Default: none
blast results file
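
As an alternative to editing run_blast.conf by hand, the file can be generated with a script. The sketch below uses Python's standard `configparser` module and the keys and defaults listed above. The helper name `make_blast_conf` and the file names `Os.pep`, `Bdi.pep` and `Os_Bdi.blast` are hypothetical placeholders, and the sketch assumes AKRUP reads the file as a standard INI file.

```python
import configparser
import io


def make_blast_conf(querypep, subjectpep, outblast, num_thread="auto"):
    """Return the text of a [blast] config using the defaults listed above."""
    cfg = configparser.ConfigParser()
    cfg.optionxform = str  # keep option names exactly as written
    cfg["blast"] = {
        "num_thread": str(num_thread),
        "evalue": "1e-5",
        "outfmt": "6",
        "max_target_seqs": "10",
        "querypep": querypep,
        "subjectpep": subjectpep,
        "outblast": outblast,
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical input and output file names.
    text = make_blast_conf("Os.pep", "Bdi.pep", "Os_Bdi.blast")
    with open("run_blast.conf", "w") as handle:
        handle.write(text)
```

The same pattern should work for the other subroutines by changing the section name and keys.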

runcolinearscan

# Get subroutine config file
AKRUP -e rc > run_ColinearScan.conf

# Run the subroutine
AKRUP -rc run_ColinearScan.conf

# run_ColinearScan.conf:
[colinearscan]
num_thread = num/auto
lens_file1 = lens1 file
lens_file2 = lens2 file
gff_file1 = gff1 file
gff_file2 = gff2 file
blast_file = blast file
save_block_file = save block file (spec_spec.block.rr.txt)
Parameters Standards and instructions
num_thread Type: int/auto | Default: auto
Set the number of threads; auto: Automatic setting.
lens_file1 Type: file| Default: -
lens file.
lens_file2 Type: file| Default: -
lens file.
gff_file1 Type: file| Default: -
gff file.
gff_file2 Type: file| Default: -
gff file.
blast_file Type: file | Default: *.blast
Result of running blast.
save_block_file Type: file | Default: *.csv
Colinearscan result file.

runks

# Get subroutine config file
AKRUP -e rk > run_ks.conf

# Run the subroutine
AKRUP -rk run_ks.conf

# run_ks.conf:
[ks]
species_cds1 = cds1 file
species_cds2 = cds2 file
block_file = block file
save_ks_file = save ks file  (spec_spec.ks.txt)
Parameters Standards and instructions
species_cds1 Type: file | Default: *.cds
cds file.
species_cds2 Type: file | Default: *.cds
cds file.
block_file Type: file | Default: *.block.rr.txt
Result of running Colinearscan.
save_ks_file Type: file | Default: *.ks.txt
Ks calculation result.

dotplot

# Get subroutine config file
AKRUP -e d > blast_dotplot.conf

# Run the subroutine
AKRUP -d blast_dotplot.conf

# blast_dotplot.conf:
[dotplot]
genome1_name = genome1 name
genome2_name = genome2 name
lens_file1 = lens1 file
lens_file2 = lens2 file
gff_file1 = gff1 file
gff_file2 = gff2 file
blast_file = blast file
multiple = 1
score = 100
evalue = 1e-5
repnum = 20
hitnum = 5
dpi = 300
savefile = savefile (*.png pdf svg)
Parameters Standards and instructions
genome1_name Type: str | Default: -
Latin name of species.
genome2_name Type: str | Default: -
Latin name of species.
lens_file1 Type: file| Default: -
lens file.
lens_file2 Type: file| Default: -
lens file.
gff_file1 Type: file| Default: -
gff file.
gff_file2 Type: file| Default: -
gff file.
blast_file Type: file | Default: none
Result of running blast.
multiple Type: int| Default: 1
The number of best homologous genes, indicated by red dots.
score Type: int| Default: 100
Score value in blast result.
evalue Type: float| Default: 1e-5
Evalue in blast result.
repnum Type: int| Default: 20
The maximum number of homologous genes to be plotted.
hitnum Type: int| Default: 5
The number of second-best homologous genes, indicated by blue dots.
dpi Type: int| Default: 300
The number of pixels per inch of the image.
savefile Type: [*.png, *.pdf, *.svg]| Default: *.png
Output image file; png, pdf, and svg formats are supported.
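
To show how score, evalue, multiple, hitnum and repnum interact, here is a minimal Python sketch that labels BLAST hits by rank. It relies on the standard 12-column BLAST tabular layout (outfmt 6), where column 11 is the E-value and column 12 the bit score. Ranking hits by bit score, the exact threshold comparisons, and the label "other" for hits beyond the red and blue ranks are assumptions made for illustration; AKRUP's own implementation may differ. `classify_hits` is a hypothetical helper.

```python
from collections import defaultdict


def classify_hits(lines, multiple=1, hitnum=5, repnum=20, score=100, evalue=1e-5):
    """Label (query, subject) pairs as 'red', 'blue' or 'other' by hit rank."""
    best = defaultdict(dict)  # query -> {subject: best bit score}
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 12:
            continue  # skip blank or malformed lines
        query, subject = cols[0], cols[1]
        ev, bits = float(cols[10]), float(cols[11])
        if ev > evalue or bits < score:
            continue  # assumed filter: drop weak hits
        if bits > best[query].get(subject, float("-inf")):
            best[query][subject] = bits

    labels = {}
    for query, subjects in best.items():
        # Keep at most repnum subjects per query, strongest first.
        ranked = sorted(subjects, key=subjects.get, reverse=True)[:repnum]
        for rank, subject in enumerate(ranked):
            if rank < multiple:
                labels[(query, subject)] = "red"
            elif rank < multiple + hitnum:
                labels[(query, subject)] = "blue"
            else:
                labels[(query, subject)] = "other"
    return labels
```

With the defaults, each query gene contributes one red dot, up to five blue dots, and at most 20 dots in total.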

blockdotplot

# Get subroutine config file
AKRUP -e bd > block_dotplot.conf

# Run the subroutine
AKRUP -bd block_dotplot.conf

# block_dotplot.conf:
[blockdotplot]
genome1_name = genome1 name
genome2_name = genome2 name
lens_file1 = lens1 file
lens_file2 = lens2 file
gff_file1 = gff1 file
gff_file2 = gff2 file
blast_file = blast file
block_file = block file
multiple = 1
block_num = 8
score = 100
evalue = 1e-5
repnum = 20
hitnum = 5
dpi = 300
savefile = savefile (*.png pdf svg)
Parameters Standards and instructions
genome1_name Type: str | Default: -
Latin name of species.
genome2_name Type: str | Default: -
Latin name of species.
lens_file1 Type: file| Default: -
lens file.
lens_file2 Type: file| Default: -
lens file.
gff_file1 Type: file| Default: -
gff file.
gff_file2 Type: file| Default: -
gff file.
blast_file Type: file | Default: none
Result of running blast.
block_file Type: file | Default: none
Result of running Colinearscan.
multiple Type: int| Default: 1
The number of best homologous genes, indicated by red dots.
block_num Type: int| Default: 8
Show the minimum length of a synteny block.
score Type: int| Default: 100
Score value in blast result.
evalue Type: float| Default: 1e-5
Evalue in blast result.
repnum Type: int| Default: 20
The maximum number of homologous genes kept per gene; additional hits are discarded.
hitnum Type: int| Default: 5
The number of second-best homologous genes, indicated by blue dots.
dpi Type: int| Default: 300
The number of pixels per inch of the image.
savefile Type: [*.png, *.pdf, *.svg]| Default: *.png
Output image file; png, pdf, and svg formats are supported.

loadblock

# Get subroutine config file
AKRUP -e lk > Loadblock.conf

# Run the subroutine
AKRUP -lk Loadblock.conf

# Loadblock.conf:
[loadblock]
lens_file1 = lens1 file
lens_file2 = lens2 file
gff_file1 = gff1 file
gff_file2 = gff2 file
blast_file = blast file
block_file = block file
ks_file = ks file
score = 100
evalue = 1e-5
repnum = 20
save_file = *.block.information.csv
Parameters Standards and instructions
lens_file1 Type: file| Default: -
lens file.
lens_file2 Type: file| Default: -
lens file.
gff_file1 Type: file| Default: -
gff file.
gff_file2 Type: file| Default: -
gff file.
blast_file Type: file | Default: none
Result of running blast.
block_file Type: file | Default: none
Result of running Colinearscan.
ks_file Type: file | Default: none
Result of Ks calculation.
score Type: int| Default: 100
Score value in blast result.
evalue Type: float| Default: 1e-5
Evalue in blast result.
repnum Type: int| Default: 20
The maximum number of homologous genes kept per gene; additional hits are discarded.
save_file Type: file| Default: *.block.information.csv
result file.

eventblock

# Get subroutine config file
AKRUP -e eb > event_block.conf

# Run the subroutine
AKRUP -eb event_block.conf

# event_block.conf:
[eventblock]
name = spec1name_spec2name
lens_file1 = lens1 file
lens_file2 = lens2 file
block_info = block info file (*.block.information.csv)
hocv_depth = 1
pkcolor = orange
pk_hocv = 0.8
pk_block_num = 30
range_k = 0.15
block_num = 5
hocv = -1
dpi = 300
save_file = save file (*.EventRelate_block.information.csv)
# Result files automatically generated by subroutines
*.ks_distribute.txt    Event-related ks peak parameters: a,u,sigma;
*Ks-event.middle.dotplot.png    Event-related region dotplot for checking if the parameters are appropriate;
Parameters Standards and instructions
name Type: str| Default: spec1name_spec2name
Short for combination of two species.
lens_file1 Type: file| Default: -
lens file.
lens_file2 Type: file| Default: -
lens file.
block_info Type: file| Default: -
result of Subroutine loadblock (-lk).
hocv_depth Type: int[1-8]| Default: 1
Depth of best homologous gene pairs.
pkcolor Type: color | Default: Orange
The color of the event related peak (Ks distribution).
pk_hocv Type: float[-1-1] | Default: 0.8
Minimum event-related peak fit best homologous gene pair ratio.
pk_block_num Type: int | Default: 30
Minimum block length for event-related fitting.
range_k Type: float| Default: 0.3
Get the allowed Ks error range for the event-related region.
block_num Type: int| Default: 5
Show the minimum length of a synteny block.
hocv Type: float| Default: -1
Ratio of best homologous gene pairs in a synteny block, ranging from -1 to 1.
dpi Type: int| Default: 300
The number of pixels per inch of the image.
save_file Type: file| Default: *.EventRelate_block.information.csv
result file.
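
The peak parameters a, u, sigma written to *.ks_distribute.txt and the range_k setting can be pictured with the short Python sketch below. It assumes, without confirmation from this page, that a, u and sigma are the amplitude, mean and standard deviation of a Gaussian fitted to the Ks distribution, and that range_k is a symmetric window around u. Both function names are hypothetical.

```python
import math


def gaussian(x, a, u, sigma):
    """Height of the assumed Gaussian Ks peak at x."""
    return a * math.exp(-((x - u) ** 2) / (2 * sigma ** 2))


def in_event_window(ks, u, range_k=0.15):
    """True if a block's Ks value lies within u +/- range_k (assumed rule)."""
    return abs(ks - u) <= range_k


if __name__ == "__main__":
    # Hypothetical peak parameters, for illustration only.
    a, u, sigma = 2.0, 0.5, 0.1
    for ks in (0.4, 0.5, 0.6, 0.7):
        print(ks, round(gaussian(ks, a, u, sigma), 3), in_event_window(ks, u))
```

Under this reading, a wider range_k admits more blocks into the event-related set, which is why the *Ks-event.middle.dotplot.png output is useful for checking the chosen value.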

ksfigure

# Get subroutine config file
AKRUP -e kf > ksdistribute.conf

# Run the subroutine
AKRUP -kf ksdistribute.conf

# ksdistribute.conf:
[ksdistribute]
ks_file = ks_input.csv
area = 0,2.2
width = 6
height = 5
scale = 2
alpha = 0.6
dpi = 300
save_fig = save.ksdistribute.png  # png pdf svg
Parameters Standards and instructions
ks_file Type: file | Default: *.csv
Output result of Subroutine eventblock (-eb).
area Type: int | Default: 1
Show the range of ks.
width Type: int | Default: 6
Save the width of the graph.
height Type: int | Default: 5
Save the height of the graph.
scale Type: int | Default: 2
Proportion of event-related peaks.
alpha Type: float | Default: 0.6
Saving Graphic Transparency.
dpi Type: int| Default: 300
The number of pixels per inch of the image.
save_fig Type: [*.png, *.pdf, *.svg]| Default: *.png
Output image file; png, pdf, and svg formats are supported.

event-correspondence

# Get subroutine config file
AKRUP -e ec > Polyploidy_CSR.conf

# Run the subroutine
AKRUP -ec Polyploidy_CSR.conf

# Polyploidy_CSR.conf:
[Polyploidy_CSR]
corr_file = *.top.correspondence.txt
blockinfo = *.EventRelate_block.information.csv
save_file = *.Polyploidy-block.information.csv
Parameters Standards and instructions
corr_file Type: file | Default: -
CSR location file.
blockinfo Type: file | Default: -
Output result of Subroutine eventblock (-eb).
save_file Type: file | Default: Polyploidy-block.information.csv
result file.

csrdotplot

# Get subroutine config file
AKRUP -e cd > CSR_dotplot.conf

# Run the subroutine
AKRUP -cd CSR_dotplot.conf

# CSR_dotplot.conf:
[csrdotplot]
genome1_name = genome1 name
genome2_name = genome2 name
lens_file1 = lens1 file
lens_file2 = lens2 file
block_info = block info file (*.Polyploidy-block.information.csv)
top_ancestor_file = top.color.pos (A.*.top.color.pos.txt)
left_ancestor_file = left.color.pos (A.*.left.color.pos.txt)
block_num = 8
hocv_depth = 1
hocv = -1
dpi = 300
savefile = savefile (*.png pdf svg)
Parameters Standards and instructions
genome1_name Type: str | Default: -
Latin name of species.
genome2_name Type: str | Default: -
Latin name of species.
lens_file1 Type: file| Default: -
lens file.
lens_file2 Type: file| Default: -
lens file.
block_info Type: file| Default: *.csv
Output result of Subroutine event-correspondence (-ec).
top_ancestor_file Type: file| Default: -
ancestor location file.
left_ancestor_file Type: file | Default: none
ancestor location file.
block_num Type: int| Default: 8
Show the minimum length of a synteny block.
hocv_depth Type: int[1-8]| Default: 1
Depth of best homologous gene pairs.
hocv Type: float| Default: -1
Ratio of best homologous gene pairs in a synteny block, ranging from -1 to 1.
dpi Type: int| Default: 300
The number of pixels per inch of the image.
savefile Type: [*.png, *.pdf, *.svg]| Default: *.png
Output image file; png, pdf, and svg formats are supported.

eventdotplot

# Get subroutine config file
AKRUP -e ed > event_dotplot.conf

# Run the subroutine
AKRUP -ed event_dotplot.conf

# event_dotplot.conf:
[eventdotplot]
genome1_name = genome1 name
genome2_name = genome2 name
lens_file1 = lens1 file
lens_file2 = lens2 file
block_info = EventRelate_block (*.EventRelate_block.information.csv)
range_k = 0.3 (error range)
peaks = a,u,sigma
pkcolor = orange
peakflag = True
block_num = 5
hocv_depth = 1
hocv = -1
dpi = 300
savefile = savefile (*.png pdf svg)
Parameters Standards and instructions
genome1_name Type: str | Default: -
Latin name of species.
genome2_name Type: str | Default: -
Latin name of species.
lens_file1 Type: file| Default: -
lens file.
lens_file2 Type: file| Default: -
lens file.
block_info Type: file| Default: *.csv
Output result of Subroutine eventblock (-eb) / event-correspondence (-ec).
peaks Type: [a,u,sigma] | Default: -
Output result of Subroutine eventblock (-eb): *.ks_distribute.txt.
range_k Type: float| Default: 0.3
Get the allowed Ks error range for the event-related region.
pkcolor Type: color | Default: Orange
The color of the area related to the event.
peakflag Type: bool | Default: True
Plot peaks in the dotplot.
block_num Type: int| Default: 8
Show the minimum length of a synteny block.
hocv_depth Type: int[1-8]| Default: 1
Depth of best homologous gene pairs.
hocv Type: float| Default: -1
Ratio of best homologous gene pairs in a synteny block, ranging from -1 to 1.
dpi Type: int| Default: 300
The number of pixels per inch of the image.
savefile Type: [*.png, *.pdf, *.svg]| Default: *.png
Output image file; png, pdf, and svg formats are supported.

inferranckaryotype

# Get subroutine config file
AKRUP -e iak > infer_anckaryotype.conf

# Run the subroutine
AKRUP -iak infer_anckaryotype.conf

# infer_anckaryotype.conf:
[ancestral]
species = refspec_spec1_spec2
bk_files = refspec_spec1:refspec_spec1.Polyploidy-block.information.csv,refspec_spec2:refspec_spec2.Polyploidy-block.information.csv
len_files = refspec:refspec.lens,spec1:spec1.lens,spec2:spec2.lens

wgds = species1:wgdnums,species2:wgdnums (All WGD events experienced by each species, joined by "_": 2 for a duplication, 3 for a triplication, and so on. For example, species1:2_2 means species1 underwent two WGDs after diverging from species3)

              +--WGD2--species1
     +--WGD1--|
     |        +--------species2
-----|
     +-----------------species3
 
latin_name = refspec:refspec genome name,spec1:spec1 genome name,spec2:spec2 genome name
intergenomicratio = orthologous synteny ratio (default: 1)
hocv_depths = refspec_spec1:num,refspec_spec2:num
select_ref_ancestor_spec = Species names (construction of ancestral genomes based on sequences of selected species)
block_num = 5
hocv = -1 (range: -1< hocv <1)

# This parameter is required if the two species share WGD
common_wgd = False
Conserved_spec = spec

# Inferring karyotype before WGD
infer_wgd_flag = False
infer_name = refspec_spec1,refspec_spec2

# Required both when inferring karyotypes with a shared WGD and when inferring the karyotype before a WGD
recentwgdchr = spec1:spec1_recent_wgd_chr.txt,spec2:spec2_recent_wgd_chr.txt

save_path = .  # Default: current directory
# Result files automatically generated by subroutines
A.AKRUP-ags-select_*.Construct_ancestral_genomes.conf.txt    Profiles required to construct ancestral genomes;
ancestor_color_order    Ancestor color order file;
A.*.left.color.pos.txt    Ancestor location file (left: For example, Os species in Os_Bdi);
A.*.top.color.pos.txt    Ancestor location file (top: For example, Bdi species in Os_Bdi);
*.pdf/*.png   Figure of the inferred ancestral karyotype;
A-*-ancestral_chromosome_conf.txt    Inferred ancestral karyotype information;
A-*-chromosome_CAR.txt    Inferred conservative ancestral regions (CARs);
A-*-ancestral_CSR-color_conf.txt    Colors corresponding to CSRs in the ancestral karyotype;

# Files generated when inferring the WGD-before ancestral karyotype of a species
A-*-chromosome_CAR.WGD-before.txt    Inferred conservative ancestral regions (CARs) prior to WGD;
A-(*)-chromosome_ancestral_color_pos.(*)-WGD-before.txt    Ancestor location file prior to WGD;
Parameters Standards and instructions
species Type: str | Default: -
Abbreviations for outgroup and two investigated species (refspec_spec1_spec2).
bk_files Type: file | Default: -
Output result of Subroutine event-correspondence (-ec).
len_files Type: file | Default: -
Lens files for outgroup and inferred two investigated species.
wgds Type: str | Default: -
WGD events experienced by the investigated species: species1:wgdnums,species2:wgdnums (All WGD events experienced by each species, joined by "_": 2 for a duplication, 3 for a triplication, and so on. For example, species1:2_2 means species1 underwent two WGDs after diverging from species3).
latin_name Type: str | Default: -
Latin name of outgroup and inferred two investigated species.
intergenomicratio Type: int | Default: 1
Inferred orthologous ratios of the two investigated species.
hocv_depth Type: int[1-8]| Default: 1
Depth of best homologous gene pairs.
hocv Type: float| Default: -1
Ratio of best homologous gene pairs in a collinearity block, ranging from -1 to 1.
select_ref_ancestor_spec Type: str| Default: -
An abbreviation for relatively conserved species and outgroup (or reference genome) used to construct the ancestral genome.
block_num Type: int| Default: 8
Show the minimum length of a synteny block.
common_wgd Type: bool | Default: False
WGD events shared by the two investigated genomes within the evolutionary node of ancestral karyotype inference.
Conserved_spec Type: str | Default: -
Abbreviation for the relatively conserved species of the two investigated species (Required if common_wgd is True).
infer_wgd_flag Type: bool | Default: False
Inferring ancestral karyotypes prior to WGD.
infer_name Type: str | Default: -
Species names for which pre-WGD ancestral karyotypes need to be inferred (Required if infer_wgd_flag is True).
recentwgdchr Type: file | Default: -
Subsidiary files required for inferring ancestral karyotypes prior to WGD events.
save_path Type: dir | Default: .
Path to save the results.
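
Several values in this configuration (bk_files, len_files, wgds, latin_name) share the same key:value,key:value shape. The Python sketch below parses that shape and expands a wgds entry such as species1:2_2 into its list of events. The function names are hypothetical, and `nominal_copies` only multiplies the event levels to give the expected number of copies per ancestral region; it is an illustration, not how AKRUP itself interprets the value.

```python
import math


def parse_mapping(text):
    """Turn 'a:x,b:y' into {'a': 'x', 'b': 'y'}."""
    result = {}
    for item in text.split(","):
        key, _, value = item.strip().partition(":")
        result[key] = value
    return result


def parse_wgds(text):
    """Turn 'species1:2_2,species2:2' into {'species1': [2, 2], 'species2': [2]}."""
    return {
        species: [int(level) for level in levels.split("_")]
        for species, levels in parse_mapping(text).items()
    }


def nominal_copies(levels):
    """Two duplications (2_2) give 4 nominal copies; 2_3 gives 6."""
    return math.prod(levels)


if __name__ == "__main__":
    wgds = parse_wgds("species1:2_2,species2:2")
    for species, levels in wgds.items():
        print(species, levels, nominal_copies(levels))
```

For the tree drawn above, species1 passes through WGD1 and WGD2 and is written 2_2, while species2 passes through WGD1 only and is written 2.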

anckaryotypefig

# Get subroutine config file
AKRUP -e akf > ancestral_plotfig.conf

# Run the subroutine
AKRUP -akf ancestral_plotfig.conf

# ancestral_plotfig.conf:
[ancestralfig]
name = species/ancestor node name
anc_reverse = True
frame_flag = True
ancestor_file = ancestor location file
savefile = savefile (*.png pdf svg)
Parameters Standards and instructions
name Type: str | Default: -
species/ancestor node name.
anc_reverse Type: bool | Default: True
Ancestral karyotype drawing reverses direction.
frame_flag Type: bool| Default: True
Draw black box.
ancestor_file Type: file| Default: -
Ancestor location file.
savefile Type: [*.png, *.pdf, *.svg]| Default: *.png
Output image file; png, pdf, and svg formats are supported.

ancgenomeseqs

# Get subroutine config file
AKRUP -e ags > ancestralseq.conf

# Run the subroutine
AKRUP -ags ancestralseq.conf

# ancestralseq.conf:
[ancestralseq]
mark = seq name
gff = gff file
ancestor_color_order = ancestor_color_order
ancestor_conf = ancestor conf
pep_file = pep file
cds_file = cds file
ancestor_pep =  ancestor pep file
ancestor_cds = ancestor cds file
ancestor_gff =  ancestor gff file
ancestor_lenstxt = ancestor lenstxt
ancestor_lens =  ancestor lens
ancestor_file = ancestor file
Parameters Standards and instructions
mark Type: str | Default: -
Ancestral genome name set.
gff Type: file | Default: -
Gff file for the relatively conserved species of the two investigated species.
ancestor_color_order Type: file | Default: -
Ancestor color order, AKRUP -iak result file
ancestor_conf Type: file | Default: -
Ancestor location file.
pep_file Type: file | Default: -
Pep file for the relatively conserved species of the two investigated species.
cds_file Type: file | Default: -
Cds file for the relatively conserved species of the two investigated species.
ancestor_pep Type: file | Default: -
Pep file of the extracted ancestral genome.
ancestor_cds Type: file | Default: -
Cds file of the extracted ancestral genome.
ancestor_gff Type: file | Default: -
Gff file of the extracted ancestral genome.
ancestor_lenstxt Type: file | Default: -
Lens file of the extracted ancestral karyotype (chr\tchr length\tgene num).
ancestor_lens Type: file | Default: -
Lens file of the extracted ancestral karyotype (chr\tgene num).
ancestor_file Type: file | Default: -
Updated ancestral location file.
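
The two lens outputs differ only in their columns: chr, chr length, gene num for ancestor_lenstxt and chr, gene num for ancestor_lens. A minimal Python sketch for reading either layout is shown below. It assumes whitespace or tab separated integer columns with no header line, and `read_lens` is a hypothetical helper.

```python
def read_lens(lines):
    """Map each chromosome name to a tuple of its integer columns."""
    lens = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        lens[fields[0]] = tuple(int(value) for value in fields[1:])
    return lens


if __name__ == "__main__":
    # Hypothetical three-column content: chr, chr length, gene num.
    example = ["A1\t5000\t120\n", "A2\t4000\t95\n"]
    for chrom, values in read_lens(example).items():
        print(chrom, values)
```

Passing an open file handle instead of a list works the same way, since the function only iterates over lines.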

trajectorydotplot

# Get subroutine config file
AKRUP -e td > dotplot_trajectory.conf

# Run the subroutine
AKRUP -td dotplot_trajectory.conf

# dotplot_trajectory.conf:
[karyotype]
genome1_name = genome1 name
genome2_name = genome2 name
lens_file1 = lens1 file
lens_file2 = lens2 file
gff_file1 = gff1 file
gff_file2 = gff2 file
blast_file = blast file
top_redefining_ancestor_file = redefining color file (*.ancestor_genome_conf.txt)
top_trajectory_ancestor_file = trajectory color (*.ancestor_trajectory_conf.txt)
left_ancestor_file = ancestor color file (*.ancestor_genome_conf.txt)
multiple = 1
score = 100
evalue = 1e-5
repnum = 20
hitnum = 5
process_name = WGD name/num fusion
dpi = 800
savefile = savefile (*.png pdf svg)
Parameters Standards and instructions
genome1_name Type: str | Default: -
Latin name of species.
genome2_name Type: str | Default: -
Latin name of species.
lens_file1 Type: file| Default: -
lens file.
lens_file2 Type: file| Default: -
lens file.
gff_file1 Type: file| Default: -
gff file.
gff_file2 Type: file| Default: -
gff file.
blast_file Type: file | Default: -
Result of running blast.
top_redefining_ancestor_file Type: file | Default: -
Redefine the ancestor color in the inner node ancestor location file.
top_trajectory_ancestor_file Type: file | Default: -
Inner node ancestor location file.
left_ancestor_file Type: file | Default: -
Outer node ancestor location file.
multiple Type: int| Default: 1
The number of best homologous genes, indicated by red dots.
score Type: int| Default: 100
Score value in blast result.
evalue Type: float| Default: 1e-5
Evalue in blast result.
repnum Type: int| Default: 20
The maximum number of homologous genes kept per gene; additional hits are discarded.
hitnum Type: int| Default: 5
The number of second-best homologous genes, indicated by blue dots.
process_name Type: str| Default: -
WGD name/num fusion.
dpi Type: int| Default: 300
The number of pixels per inch of the image.
savefile Type: [*.png, *.pdf, *.svg]| Default: *.png
Output image file; png, pdf, and svg formats are supported.