Support bedGraph output in only-depth #46

Open
ghuls opened this issue Dec 1, 2021 · 1 comment
Labels
enhancement New feature or request

Comments

ghuls commented Dec 1, 2021

Support bedGraph output in only-depth:

This is basically perbase only-depth -z output without the header and with chromosomes sorted lexicographically.

This would make it relatively easy to create bigWig files. (Naively sorting the whole output afterwards takes quite a while compared with reading the reads by chromosome in lexicographical order.)

perbase only-depth -z ${bam} | tail -n +2 | LC_COLLATE=C sort -k 1,1 -k 2,2n -k 3,3n > ${bedgraph}

bedGraphToBigWig ${bedgraph} ${chrom_sizes} ${bigwig}

bedGraph format:
http://genome.ucsc.edu/goldenPath/help/bedgraph.html
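
As an illustration of why a full sort may be more work than needed, here is a minimal Python sketch of the post-processing step. It is not part of perbase. It assumes the input is tab-separated with a single header line and the chromosome name in the first column, and that rows within each chromosome already arrive in ascending position order (which should hold for a coordinate-sorted BAM, but is an assumption here). Under those assumptions only the chromosome blocks need reordering. The function name `to_bedgraph` is hypothetical, and the sketch buffers all rows in memory.

```python
from collections import defaultdict


def to_bedgraph(lines):
    """Drop the header line and emit rows grouped by chromosome.

    Chromosomes are emitted in code-point order, which matches
    LC_COLLATE=C ordering for ASCII names. Rows inside a chromosome
    keep their input order (assumed to be ascending by position).
    """
    rows = iter(lines)
    next(rows, None)  # skip the single header line (assumption)

    by_chrom = defaultdict(list)
    for line in rows:
        if not line.strip():
            continue  # ignore blank lines
        chrom = line.split("\t", 1)[0]
        by_chrom[chrom].append(line)

    for chrom in sorted(by_chrom):
        yield from by_chrom[chrom]
```

It could be used as a filter between the two commands above, for example by writing `sys.stdout.writelines(to_bedgraph(sys.stdin))` in a small script that replaces the `tail | sort` step.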

sstadick added the enhancement label Dec 3, 2021
Owner

sstadick commented Dec 3, 2021

I'll look into making this more natively supported. There is also the --bed-format option which, with -z, should make the output more BED-like. The output order is currently dictated by the BAM header: if the input BAM is sorted lexicographically, then the output of perbase will be too (not ideal, I'm just noting it).

Options to make this work would be:

  • Add a "lexicographical sort" flag that sorts the BAM header before processing. This could be fine, but I could see it still resulting in slightly different sort orders than what is needed downstream.
  • Add support for outputting in the same order as an input BED file or sequence dictionary. This probably provides the most flexibility and explicitness.
  • Add an optional input that is just a list of chromosome names giving the order in which things should be output. Meh.
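
To make the second and third options concrete, here is a rough Python sketch of ordering rows by an explicit list of names. It is only an illustration of the idea, not how perbase (which is written in Rust) does or would implement it. The function names `names_from_chrom_sizes` and `reorder_by_names` are hypothetical. It assumes header-free, tab-separated rows with the chromosome in the first column, and a chrom.sizes-style file (name, tab, size) as the source of the order. Raising an error on an unlisted chromosome is a design choice of this sketch; silently dropping or appending such rows would be alternatives.

```python
def names_from_chrom_sizes(lines):
    """Return chromosome names in file order from chrom.sizes-style lines."""
    return [line.split("\t", 1)[0].strip() for line in lines if line.strip()]


def reorder_by_names(rows, order):
    """Yield rows grouped by chromosome, following the explicit `order`.

    Rows within a chromosome keep their input order. A row whose
    chromosome is not in `order` raises ValueError (sketch-level choice).
    """
    buckets = {name: [] for name in order}
    for row in rows:
        chrom = row.split("\t", 1)[0]
        if chrom not in buckets:
            raise ValueError(f"chromosome not in requested order: {chrom!r}")
        buckets[chrom].append(row)

    for name in order:
        yield from buckets[name]
```

With this shape, an input BED file or sequence dictionary would just be another way of producing the `order` list, so one reordering step could serve both options.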
