Replace GNU parallel with tee #29

Closed
wants to merge 6 commits into from

@njspix njspix commented Sep 15, 2022

I recently hit a bug with a new version of GNU parallel, and it was frustrating how hard it was to find a changelog for parallel. I believe the 'tee' command should work just as well, and it's a lot easier to read.

(edit) OK, maybe it's not as easy to read anymore... :-D

Use named pipes for the first tee process so that all recipients of the first tee command finish before the second tee command starts
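The named-pipe approach in that commit message can be sketched roughly as below. This is a minimal illustration, not the code from the PR: the consumers (`wc -l`, `sort -u`), the FIFO names, and the sample input are all hypothetical stand-ins for the real QC steps. The likely motivation is that bash does not wait for `>(...)` process substitutions to finish when the command using them returns, whereas background jobs reading from FIFOs have PIDs that `wait` can block on.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch directory for the FIFOs and outputs (removed on exit).
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

mkfifo "$workdir/count.fifo" "$workdir/uniq.fifo"

# Hypothetical consumers standing in for the real QC steps.
# Each runs in the background and blocks until tee opens its FIFO.
wc -l < "$workdir/count.fifo" | tr -d ' ' > "$workdir/count.txt" &
pid_count=$!
sort -u < "$workdir/uniq.fifo" > "$workdir/uniq.txt" &
pid_uniq=$!

# First tee: one input stream, copied to both consumers.
printf 'chr1\nchr2\nchr1\n' | tee "$workdir/count.fifo" > "$workdir/uniq.fifo"

# Block until every consumer of the first tee has exited, so their
# output files are complete before anything downstream starts.
wait "$pid_count"
wait "$pid_uniq"

# A later stage (a second tee, in the PR) can now read finished files.
echo "lines: $(cat "$workdir/count.txt")"
echo "unique: $(paste -sd, "$workdir/uniq.txt")"
```

The two explicit `wait` calls are the point of the design: without them, a second stage could start while the first stage's consumers are still writing.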

njspix commented Sep 19, 2022

Tests on sample bam files:

tee, single cell

  • real 2m52.947s
  • user 2m30.063s
  • sys 0m28.862s

parallel, single cell

  • real 2m51.425s
  • user 5m41.463s
  • sys 0m32.889s

tee, bulk

  • real 61m50.013s
  • user 95m12.041s
  • sys 26m16.641s

parallel, bulk

  • real 61m30.633s
  • user 100m6.832s
  • sys 26m19.267s

The files produced by the two versions are identical by md5sum.


njspix commented Sep 20, 2022

Updated with more/hopefully better comments clarifying how the tee code works


njspix commented Sep 27, 2022

Just gave this a test run on a medium-sized BAM. Results below:

Finished BISCUIT QC at Tue Sep 27 10:24:41 EDT 2022
real	17m29.583s
user	47m22.398s
sys	5m16.096s

Finished BISCUIT QC at Tue Sep 27 10:42:04 EDT 2022
real	17m23.558s
user	46m36.619s
sys	5m15.435s

00ef368e28bc23c9d8e011e882cbe108  parallel_CpHRetentionByReadPos.txt
00ef368e28bc23c9d8e011e882cbe108  tee_CpHRetentionByReadPos.txt
05d330834eee785d10cb08bc92450d08  parallel_covdist_all_cpg_topgc_table.txt
05d330834eee785d10cb08bc92450d08  tee_covdist_all_cpg_topgc_table.txt
0d01f101ae0558f120612441847497d1  parallel_cv_table.txt
0d01f101ae0558f120612441847497d1  tee_cv_table.txt
21f36255266994f59e2065ecdc46ee35  parallel_CpGRetentionByReadPos.txt
21f36255266994f59e2065ecdc46ee35  tee_CpGRetentionByReadPos.txt
347e8979a45698944a8b84004327cefd  parallel_covdist_q40_base_topgc_table.txt
347e8979a45698944a8b84004327cefd  tee_covdist_q40_base_topgc_table.txt
421094667ca7675b69817709e26f332c  parallel_isize_table.txt
421094667ca7675b69817709e26f332c  tee_isize_table.txt
4d05292b54d2b30eefe36fbea4758478  parallel_totalReadConversionRate.txt
4d05292b54d2b30eefe36fbea4758478  tee_totalReadConversionRate.txt
4dcfe62f7a1b117d61e34f21eec79669  parallel_mapq_table.txt
4dcfe62f7a1b117d61e34f21eec79669  tee_mapq_table.txt
50d439ecdbce24e1d4293aeedb66c3ba  parallel_covdist_all_base_table.txt
50d439ecdbce24e1d4293aeedb66c3ba  tee_covdist_all_base_table.txt
89880ce1c6b67699f7b106d92f1e0d42  parallel_covdist_q40_base_table.txt
89880ce1c6b67699f7b106d92f1e0d42  tee_covdist_q40_base_table.txt
8cd0fb99738cd394f9159775e6042ab1  parallel_covdist_all_base_botgc_table.txt
8cd0fb99738cd394f9159775e6042ab1  tee_covdist_all_base_botgc_table.txt
a7cf43fc5f0f8043d288a7503146ccef  parallel_dup_report.txt
a7cf43fc5f0f8043d288a7503146ccef  tee_dup_report.txt
aaceddf2f79fd55ca71d99ceb6dc78d7  parallel_covdist_q40_cpg_topgc_table.txt
aaceddf2f79fd55ca71d99ceb6dc78d7  tee_covdist_q40_cpg_topgc_table.txt
ac36d069cd44425c2fb26a464ae81472  parallel_covdist_q40_base_botgc_table.txt
ac36d069cd44425c2fb26a464ae81472  tee_covdist_q40_base_botgc_table.txt
b3ab4899c23d4b6db83f0fddc513ab09  parallel_covdist_all_cpg_table.txt
b3ab4899c23d4b6db83f0fddc513ab09  tee_covdist_all_cpg_table.txt
b5cde1b72db50677574903a5174f4fab  parallel_covdist_q40_cpg_botgc_table.txt
b5cde1b72db50677574903a5174f4fab  tee_covdist_q40_cpg_botgc_table.txt
be38874b9b518e08e89502bfa071dafd  parallel_strand_table.txt
be38874b9b518e08e89502bfa071dafd  tee_strand_table.txt
d9e8ba48fb2af5fff06dd25092f2eee2  parallel_covdist_all_base_topgc_table.txt
d9e8ba48fb2af5fff06dd25092f2eee2  tee_covdist_all_base_topgc_table.txt
f2f473fdb9e52ee0e590ad9a47c2afb1  parallel_covdist_q40_cpg_table.txt
f2f473fdb9e52ee0e590ad9a47c2afb1  tee_covdist_q40_cpg_table.txt
f4126f00d2ea0ef8c5b8eb65d06d4aad  parallel_covdist_all_cpg_botgc_table.txt
f4126f00d2ea0ef8c5b8eb65d06d4aad  tee_covdist_all_cpg_botgc_table.txt
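
A pairwise check like the listing above could be scripted roughly as follows. This is a sketch under assumptions, not the procedure actually used here: the function name `compare_outputs` is hypothetical, and it assumes both sets of outputs sit in one directory with the `parallel_` and `tee_` prefixes seen in the file names. It relies on GNU coreutils `md5sum`.

```shell
#!/usr/bin/env bash

# compare_outputs DIR
# For every DIR/parallel_NAME.txt, compare its MD5 digest with DIR/tee_NAME.txt.
# Prints one line per pair and returns non-zero if any pair differs or is missing.
compare_outputs() {
    local dir=$1 status=0 p t name
    for p in "$dir"/parallel_*.txt; do
        [ -e "$p" ] || continue          # glob matched nothing
        name=${p##*/parallel_}           # strip directory and prefix
        t="$dir/tee_$name"
        # Hash from stdin so the digest line does not include the file name.
        if [ -f "$t" ] && [ "$(md5sum < "$p")" = "$(md5sum < "$t")" ]; then
            echo "OK       $name"
        else
            echo "MISMATCH $name"
            status=1
        fi
    done
    return "$status"
}
```

Calling `compare_outputs path/to/qc_dir` prints one status line per file pair, and the exit status can gate a test script.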

@jamorrison

This has been resolved with PR #31 and superseded by PR #33.

@jamorrison jamorrison closed this Jan 30, 2023