Column 1 of result for group 3421 is type 'double' but expecting type 'integer' #99

Closed
huizhen-yan opened this issue Jul 24, 2023 · 7 comments

@huizhen-yan

Hi,
I ran DAS_Tool and got the following error.
time DAS_Tool -i z1.Contig2bin.tsv,p1.Contig2bin.tsv,p2.Contig2bin.tsv -l z1,p1,p2 -c contigs.fa -t 56 --write_bins -o all_bins_dastool
DAS Tool 1.1.6

Analyzing assembly
Predicting genes
Annotating single copy genes using diamond
Dereplicating, aggregating, and scoring bins
Error in `[.data.table`(bin_tab_contig, , .(binSize = calc_bins_size(contig_id, :
Column 1 of result for group 3421 is type 'double' but expecting type 'integer'. Column types must be consistent for each group.
Calls: cherry_pick -> score_bins -> %>% -> setkey -> [ -> [.data.table
In addition: Warning message:
In calc_N50(contig_id, contig_length) :
integer overflow in 'cumsum'; use 'cumsum(as.numeric(.))'
Execution halted

real 158m57.298s
user 5466m32.675s
sys 10m37.638s

Here are the input files.

head z1.Contig2bin.tsv 
k127_5405675 z1.bin.1000
k127_7207940 z1.bin.1000
k127_7208440 z1.bin.1000
k127_4507492 z1.bin.1000
k127_12615522 z1.bin.1000
k127_4508035 z1.bin.1000
k127_10814506 z1.bin.1000
k127_3606553 z1.bin.1000
k127_10815671 z1.bin.1000

head p1.Contig2bin.tsv
k127_21624164 p1.bin.1000
k127_11266923 p1.bin.1000
k127_14421405 p1.bin.1000
k127_19376857 p1.bin.1000
k127_13167 p1.bin.1000
k127_9923194 p1.bin.1000
k127_8577692 p1.bin.1000
k127_5878300 p1.bin.1000
k127_20739313 p1.bin.1000

head contigs.fa
>k127_13963658 flag=0 multi=51.4235 len=1029
TGAAGTCGACGTTGAACAGAAAGCCGAGGCCGGCCGCAAAATCGCCGGCATACGCCGCGTAGCCGATCATGACCAGCATCATCGCAAACAGCGAGGGCATCAGCATCTTCACGGCCTTCTCGATGCCATCCTGCAGGCCACGGCCGACGATGGAGAGCGCGATGGCGATGAACACCGTGTGCCAGAGGGTCATCGTCACCGGGTCCGCCAGCAACCCGTCGAACTGCCCCGCCACCTCGAGCGGACCGGCGCCGCTGAAGCCGCCCGCCGCCTTGCCGATGTAGCTCAGCGTCCAGCCGGCGATGACGCTGTAGTAGGTCGCGATCAGGAAGCCGACGATTGTCCCCATCCAGCCGACGATGCGCCAGGCCCTGGAGCGGCCGGCACTCGCGGCGAGCGTCGACATGGCCACCGGCGGGCTGCTCGCGCCACGACGGCCGATGAGTAGTTCCGCGATGAGGATCGGGATGGCGACGAAGACCACGCAGGCGAGGTAGACCAGCACGAAGGCGCCGCCGCCGCTGACGCCGGCAACGAACGGGAACTTCCAGATATTACCGAGGCCGACCGCCGCGCCGACCGCGGCGAGGATGAACGTGAAACCCGAAGACCAGTTCTGTGTGCTGCCTGTGCCTGCCATTAGTTGCTCGCTTGTTGGTGGTTATCCAGTACGCGGCCGGGATTCAGGATATTGCTGGGGTCGAGCGCGGACTTCAACGCGCGCATGAGCCCGATCTCCGCGGCCGTGCGGCTGTGCGGCAGCCACTTGAGTTTTTCCGTGCCGATACCGTGCTCCGCCGAAACCGAGCCGCCGATATCAGTGAGCGGCCCGTACACGCACTCATCGCTGGTCTCGTGATGGTCGCCCTCGGCGTTCGGCGCGACGAAGAAATGCAGGTTGCCGTCGGCAACGTGACCTATCGTATAGCACTCACCGCGCGGCCAGCGCCCCCTGACATGGGTTTTCACCGCCTCGACGTAAGCGGCCATGCTGCGAATCGGCAGGCTGACATCGTACAAATAGACCGG

How can I fix this?

@vinisalazar

I'm having the same issue with versions 1.1.4 and 1.1.6.

@dportik

dportik commented Nov 29, 2023

Bumping this issue as I am having a similar problem:

DAS Tool 1.1.6 
Analyzing assembly 
Warning message:
In calc_N50(contigTab[, contig_id], contigTab[, contig_length]) :
  integer overflow in 'cumsum'; use 'cumsum(as.numeric(.))'
Predicting genes 
Annotating single copy genes using diamond 
Dereplicating, aggregating, and scoring bins 
Error in `[.data.table`(bin_tab_contig, , .(binSize = calc_bins_size(contig_id,  : 
  Column 1 of result for group 2 is type 'double' but expecting type 'integer'. Column types must be consistent for each group.
Calls: cherry_pick -> score_bins -> %>% -> setkey -> [ -> [.data.table
In addition: Warning message:
In calc_N50(contig_id, contig_length) :
  integer overflow in 'cumsum'; use 'cumsum(as.numeric(.))'
Execution halted

@vinisalazar

@cmks I understand that this time of the year is quite busy, but would it be possible to give some attention to this issue for the next version of DAS Tool? Please let us know if there's any way to help.

Many thanks
Vini

@cmks
Owner

cmks commented Dec 18, 2023

Hi all, thanks for reporting this bug. I've just pushed a fix but I can't test it as I'm not able to replicate the issue. Can you please re-run your data using the new version and tell me if it is working?
You can either install the pre-release: DAS Tool 1.1.7-b.1
or check out this branch: issue_99

@vinisalazar

Thank you @cmks, not sure if this is of any help, but I seem to only get this issue with fairly large datasets. It hasn't happened with smaller datasets.
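
The `integer overflow in 'cumsum'` warning in the logs above, together with the error showing up only on large datasets, suggests that summed contig lengths may be exceeding R's 32-bit integer limit of 2,147,483,647. That is an inference from the messages, not something confirmed in this thread. A rough way to check an assembly is sketched below; the function names `total_bases` and `check_overflow_risk` are made up for illustration and are not part of DAS Tool.

```shell
# Assumption: the input is a plain (uncompressed) FASTA file.
INT_MAX=2147483647   # largest value an R integer can hold

# Print the total number of bases in a FASTA file.
total_bases() {
  # Skip header lines (starting with ">") and add up the length of each sequence line.
  awk '!/^>/ { n += length($0) } END { printf "%.0f\n", n }' "$1"
}

# Report whether the total number of bases exceeds the 32-bit integer limit.
check_overflow_risk() {
  local total
  total=$(total_bases "$1")
  if [ "$total" -gt "$INT_MAX" ]; then
    echo "total=$total exceeds $INT_MAX"
  else
    echo "total=$total fits in a 32-bit integer"
  fi
}
```

For example, `check_overflow_risk contigs.fa` prints the total number of bases and whether it is above the limit.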

@vinisalazar

Hi @cmks, coming in to report that the fix seems to have worked. We are no longer getting that error and DAS Tool finished successfully. Thank you for your responsiveness!

BTW, our only struggle was installing from source. We eventually figured it out, but it took some time to find and replace the DAS_Tool.R file in the R library directory (we had a conda-based installation).

Best,
Vini
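
For anyone else with a conda-based installation, one way to locate the file mentioned above is to search the environment directory. This is a minimal sketch, assuming the environment is active so that `$CONDA_PREFIX` is set; `find_das_tool_script` is a hypothetical helper name, and where the file lives may differ between installations.

```shell
# Hypothetical helper: list every file named DAS_Tool.R under a directory tree.
# Uses the first argument if given, otherwise $CONDA_PREFIX, otherwise the current directory.
find_das_tool_script() {
  local root="${1:-${CONDA_PREFIX:-.}}"
  find "$root" -type f -name 'DAS_Tool.R' 2>/dev/null
}
```

Running `find_das_tool_script` with no arguments searches the active environment. Backing up the original file before replacing it makes the change easy to revert.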

@cmks
Owner

cmks commented Jan 8, 2024

Great, thanks for your feedback! The fix is now part of version 1.1.7 in case you want to update your conda-based installation.
