Basic Options

jbelyeu edited this page Jun 3, 2021 · 1 revision

Samplot has many command-line parameters. This page describes the essential parameters in detail, along with a few optional parameters that are often recommended.

Essential parameters:

  -b BAMS [BAMS ...], --bams BAMS [BAMS ...]
                        Space-delimited list of BAM/CRAM file names
  -o OUTPUT_FILE, --output_file OUTPUT_FILE
                        Output file name
  -c CHROM, --chrom CHROM
                        Chromosome
  -s START, --start START
                        Start position of region/variant
  -e END, --end END     End position of region/variant
  -r REFERENCE, --reference REFERENCE
                        Reference file for CRAM, required if CRAM files used
  • -b is a list of BAM or CRAM filenames with spaces between them (e.g. -b bam1.bam bam2.bam).
  • -o is the name for the new image file that will be created. Allowed types are .png, .jpg, .jpeg, .pdf, and .svg. If you re-use a plot name, be aware that samplot plot will overwrite the older file.
  • -c is the chromosome of the variant.
  • -s is the chromosomal coordinate of the start of the variant.
  • -e is the chromosomal coordinate of the end of the variant.
  • -r is the name of a reference file. This is only required if at least one input file is a CRAM, as CRAM files require a reference for reading.
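
The rules above (allowed output types, and -r being required when any input is a CRAM) can be checked before Samplot is launched. The following is a minimal sketch, not part of Samplot itself: `build_essential_args` and `ALLOWED_IMAGE_SUFFIXES` are hypothetical names, the file names and coordinates are placeholders, and detecting CRAM input by file extension is an assumption of this example.

```python
from pathlib import Path

# Output types listed on this page for -o.
ALLOWED_IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".pdf", ".svg"}


def build_essential_args(bams, output_file, chrom, start, end, reference=None):
    """Return an argv list for `samplot plot` using only the essential flags.

    Hypothetical helper: it only assembles and sanity-checks arguments,
    it does not run Samplot.
    """
    if not bams:
        raise ValueError("-b needs at least one BAM/CRAM file")
    if Path(output_file).suffix.lower() not in ALLOWED_IMAGE_SUFFIXES:
        raise ValueError(f"unsupported output type: {output_file}")

    # Assumption of this sketch: CRAM inputs are recognized by their extension.
    has_cram = any(str(b).lower().endswith(".cram") for b in bams)
    if has_cram and reference is None:
        raise ValueError("-r is required when any input file is a CRAM")

    args = ["samplot", "plot", "-b", *[str(b) for b in bams]]
    args += ["-o", str(output_file), "-c", str(chrom)]
    args += ["-s", str(start), "-e", str(end)]
    if reference is not None:
        args += ["-r", str(reference)]
    return args


if __name__ == "__main__":
    # Placeholder inputs; prints the command line without executing it.
    print(" ".join(build_essential_args(
        ["bam1.bam", "bam2.bam"], "variant.png", "chr1", 1000, 5000)))
```

The returned list can be handed to `subprocess.run(args, check=True)` once Samplot is installed. Because -o overwrites an existing file of the same name, a caller may also want to check `Path(output_file).exists()` first.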

Commonly useful optional parameters:

  -h, --help            show this help message and exit
  -n TITLES [TITLES ...], --titles TITLES [TITLES ...]
                        Space-delimited list of plot titles. Use quote marks
                        to include spaces (i.e. "plot 1" "plot 2")
  -d MAX_DEPTH, --max_depth MAX_DEPTH
                        Max number of normal pairs to plot
  -t SV_TYPE, --sv_type SV_TYPE
                        SV type. If omitted, plot is created without variant bar
  -T TRANSCRIPT_FILE, --transcript_file TRANSCRIPT_FILE
                        GFF of transcripts
  -A ANNOTATION_FILES [ANNOTATION_FILES ...], --annotation_files ANNOTATION_FILES [ANNOTATION_FILES ...]
                        Space-delimited list of bed.gz tabixed files of
                        annotations (such as repeats, mappability, etc.)
  --zoom ZOOM           Only show +- zoom amount around breakpoints
  • -h will show help and end (will not create image).
  • -n is a space-delimited list of plot titles. If not included, each plot title will be the name of the alignment file for that plot (here "plot" refers to a single-sample track, several of which may appear in a given image).
  • -d is an integer number of "normal", i.e. concordant, reads to show in the plot. This is not applied to long-read data. Concordant reads frequently add little information to the plot, because the coverage track already summarizes depth. This option allows those reads to be downsampled.
  • -t is the type of SV to be shown in the plot. If this argument is omitted, no variant bar will be plotted at the top of the image.
  • -T is a transcript file in GFF or GTF format. Transcripts are plotted along the bottom of the image, showing any overlap of the structural variant with genic regions.
  • -A is a space-delimited list of annotation files. These may include repeat annotations or other information useful for analysis of structural variant calls.
  • --zoom is an integer number of bases around the breakpoints of the variant to include in the plot. This is especially useful for very large variants and is highly recommended when plotting variants at least 1 megabase in length. The plot will show the positions set with the -s and -e parameters +/- --zoom, with a dotted line connecting the ends of the variant bar at the top to show that part of the interval between those ends has been removed from the plot. If the zoom distance is greater than the window size, a warning message will print and zoom will be ignored. An example of a zoomed image is shown below, with a zoom distance of 1000bp.

zoom_example
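
To make the optional flags and the zoom arithmetic concrete, here is a small sketch under stated assumptions. `optional_args` and `zoomed_intervals` are hypothetical helpers, not Samplot functions; "DEL" is used only as an example SV type; and clamping the left edge at 0 is a choice made in this sketch. The case where zoom exceeds the window size is not modeled, since this page does not define the window size.

```python
def optional_args(titles=None, max_depth=None, sv_type=None,
                  transcript_file=None, annotation_files=None, zoom=None):
    """Return only the optional `samplot plot` flags that were set.

    Hypothetical helper: flags left as None are omitted, so Samplot's
    own defaults apply (e.g. titles fall back to alignment file names).
    """
    args = []
    if titles:
        # In an argv list, titles containing spaces need no extra quoting.
        args += ["-n", *titles]
    if max_depth is not None:
        args += ["-d", str(max_depth)]
    if sv_type:
        args += ["-t", sv_type]
    if transcript_file:
        args += ["-T", str(transcript_file)]
    if annotation_files:
        args += ["-A", *[str(a) for a in annotation_files]]
    if zoom is not None:
        args += ["--zoom", str(zoom)]
    return args


def zoomed_intervals(start, end, zoom):
    """Return the two regions kept around the breakpoints: start +/- zoom
    and end +/- zoom. Clamping at 0 is this sketch's own choice."""
    left = (max(0, start - zoom), start + zoom)
    right = (max(0, end - zoom), end + zoom)
    return left, right


if __name__ == "__main__":
    # Placeholder values: a 2 Mb variant plotted with a 1000 bp zoom.
    print(optional_args(titles=["plot 1", "plot 2"], sv_type="DEL", zoom=1000))
    print(zoomed_intervals(10_000_000, 12_000_000, 1000))
```

For the 2 Mb placeholder variant, only 2000 bp around each breakpoint would be drawn, which is why --zoom is recommended for very large variants.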