update archaeal market set
pchaumeil committed May 11, 2022
1 parent 4927bc4 commit ab3a01d
Showing 4 changed files with 8 additions and 5 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -19,7 +19,7 @@ Please post questions and issues related to GTDB-Tk on the Issues section of the
## New Features

GTDB-Tk v2.1.0 includes the following new features:
- - GTDB-TK now uses a **divide-and-conquer** approach where the bacterial reference tree is split into multiple **class**-level subtrees. This reduces the memory requirements of GTDB-Tk from **320 GB** of RAM when using the full GTDB R07-RS207 reference tree to approximately **50 GB**. A manuscript describing this approach is in preparation. If you wish to continue using the full GTDB reference tree use the `--full-tree` flag.
+ - GTDB-TK now uses a **divide-and-conquer** approach where the bacterial reference tree is split into multiple **class**-level subtrees. This reduces the memory requirements of GTDB-Tk from **320 GB** of RAM when using the full GTDB R07-RS207 reference tree to approximately **55 GB**. A manuscript describing this approach is in preparation. If you wish to continue using the full GTDB reference tree use the `--full-tree` flag.
This is the main change from v2.0.0. The split tree approach has been modified from order-level trees to class-level trees to resolve specific classification issues (See [#383](https://github.com/Ecogenomics/GTDBTk/issues/383)).
- Genomes that cannot be assigned to a domain (e.g. genomes with no bacterial or archaeal markers or genomes with no genes called by Prodigal) are now reported in the `gtdbtk.bac120.summary.tsv` as 'Unclassified'
- Genomes filtered out during the alignment step are now reported in the `gtdbtk.bac120.summary.tsv` or `gtdbtk.ar53.summary.tsv` as 'Unclassified Bacteria/Archaea'
2 changes: 1 addition & 1 deletion docs/src/changelog.rst
@@ -8,7 +8,7 @@ Change log

Major changes:

- * GTDB-TK now uses a **divide-and-conquer** approach where the bacterial reference tree is split into multiple **class**-level subtrees. This reduces the memory requirements of GTDB-Tk from **320 GB** of RAM when using the full GTDB R07-RS207 reference tree to approximately **50 GB**. A manuscript describing this approach is in preparation. If you wish to continue using the full GTDB reference tree use the `--full-tree` flag. This is the main change from v2.0.0. The split tree approach has been modified from order-level trees to class-level trees to resolve specific classification issues (see `#383 <https://github.com/Ecogenomics/GTDBTk/issues/383>`_).
+ * GTDB-TK now uses a **divide-and-conquer** approach where the bacterial reference tree is split into multiple **class**-level subtrees. This reduces the memory requirements of GTDB-Tk from **320 GB** of RAM when using the full GTDB R07-RS207 reference tree to approximately **55 GB**. A manuscript describing this approach is in preparation. If you wish to continue using the full GTDB reference tree use the `--full-tree` flag. This is the main change from v2.0.0. The split tree approach has been modified from order-level trees to class-level trees to resolve specific classification issues (see `#383 <https://github.com/Ecogenomics/GTDBTk/issues/383>`_).
* Genomes that cannot be assigned to a domain (e.g. genomes with no bacterial or archaeal markers or genomes with no genes called by Prodigal) are now reported in the `gtdbtk.bac120.summary.tsv` as 'Unclassified'
* Genomes filtered out during the alignment step are now reported in the `gtdbtk.bac120.summary.tsv` or `gtdbtk.ar53.summary.tsv` as 'Unclassified Bacteria/Archaea'
* `--write_single_copy_genes` flag in now available in the `classify_wf` and `de_novo_wf` workflows.
7 changes: 5 additions & 2 deletions docs/src/installing/index.rst
@@ -38,7 +38,7 @@ Hardware requirements
- ~65 GB
- ~1 hour / 1,000 genomes @ 64 CPUs
* - Bacteria
- - ~320 GB ( 20GB for divide-and-conquer)
+ - ~320 GB ( 55GB for divide-and-conquer)
- ~65 GB
- ~1 hour / 1,000 genomes @ 64 CPUs

@@ -137,9 +137,12 @@ Note that different versions of the GTDB release data may not run on all version
* - GTDB Release
- Minimum version
- Maximum version
+ * - R207_v2
+ - 2.1.0
+ - Current
* - R207
- 2.0.0
- - Current
+ - 2.0.0
* - R202
- 1.5.0
- 1.7.0
2 changes: 1 addition & 1 deletion gtdbtk/classify.py
@@ -256,7 +256,7 @@ def place_genomes(self,
if levelopt is None or levelopt == 'high':
self.logger.info(f'pplacer version: {pplacer.version}')
# #DEBUG line
- run_pplacer = False
+ run_pplacer = True
if run_pplacer:
pplacer.run(self.pplacer_cpus, 'wag', pplacer_ref_pkg, pplacer_json_out,
user_msa_file, pplacer_out, pplacer_mmap_file)
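
Note on the compatibility table changed in docs/src/installing/index.rst: the rows visible in that hunk can be read as a small lookup. The sketch below is only an illustration of that reading and is not part of GTDB-Tk. The names `COMPATIBILITY`, `parse_version` and `is_compatible` are hypothetical, "Current" is modelled as "no upper bound", and the R207 row is assumed to read 2.0.0 / 2.0.0 after this commit, which is inferred from the file's addition and deletion counts. Rows outside the visible hunk are not included.

```python
from typing import Dict, Optional, Tuple

Version = Tuple[int, int, int]

# Hypothetical lookup built from the rows visible in the hunk:
# release -> (minimum version, maximum version).
# None stands for "Current", i.e. no upper bound (an assumption of this sketch).
COMPATIBILITY: Dict[str, Tuple[Version, Optional[Version]]] = {
    "R207_v2": ((2, 1, 0), None),
    "R207": ((2, 0, 0), (2, 0, 0)),
    "R202": ((1, 5, 0), (1, 7, 0)),
}


def parse_version(text: str) -> Version:
    """Turn a 'major.minor.patch' string into a comparable tuple of ints."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def is_compatible(release: str, gtdbtk_version: str) -> bool:
    """Return True if the version lies within the release's listed range."""
    minimum, maximum = COMPATIBILITY[release]
    version = parse_version(gtdbtk_version)
    if version < minimum:
        return False
    # An open upper bound ("Current") accepts any version at or above the minimum.
    return maximum is None or version <= maximum
```

Under this reading, `is_compatible("R207", "2.1.0")` is false while `is_compatible("R207_v2", "2.1.0")` is true, which matches the intent of capping R207 at 2.0.0 and introducing R207_v2 for 2.1.0 onward.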