Skip to content

Tutorial

Alexey Kozlov edited this page Aug 7, 2019 · 27 revisions

IMPORTANT NOTE: This tutorial describes the functionality of RAxML-NG v. 0.8.0 and later!

Intro

RAxML-NG replaces standard RAxML as well as the corresponding supercomputer version ExaML. So RAxML-NG is one single code base that scales from the laptop to the supercomputer. RAxML-NG does not (yet) support all options of standard RAxML, only the most important and frequently used ones. Some options are now implemented as stand-alone tools, e.g. phylogenetic placement (EPA).

This tutorial is partially based on RAxML-NG practical taught by Alexandros Stamatakis at COME 2018. It will cover most common use cases, for information on advanced usage see next section.

Before you start:

Check that you have RAxML-NG version 0.8.0 BETA or later:

$ raxml-ng -v

RAxML-NG v. 0.8.0 BETA released on 11.01.2019 by The Exelixis Lab.

Getting help

If you run RAxML-NG executable without parameters, it will show quick usage help:

$ raxml-ng

RAxML-NG v. 0.8.0 BETA released on 11.01.2019 by The Exelixis Lab.
Authors: Alexey Kozlov, Alexandros Stamatakis, Diego Darriba, Tomas Flouri, Benoit Morel.
Latest version: https://github.com/amkozlov/raxml-ng
Questions/problems/suggestions? Please visit: https://groups.google.com/forum/#!forum/raxml

WARNING: This is a BETA release, please use at your own risk!

Usage: raxml-ng [OPTIONS]

Commands (mutually exclusive):
  --help                                     display help information
  --version                                  display version information
  --evaluate                                 evaluate the likelihood of a tree (with model+brlen optimization)
  --search                                   ML tree search.
  --bootstrap                                bootstrapping
  --all                                      all-in-one (ML search + bootstrapping).
  --support                                  compute bipartition support for a given reference tree (e.g., best ML tree)
                                             and a set of replicate trees (e.g., from a bootstrap analysis)
  --bsconverge                               test for bootstrapping convergence using autoMRE criterion
  --bsmsa                                    generate bootstrap replicate MSAs
  --terrace                                  check whether a tree lies on a phylogenetic terrace 
  --check                                    check alignment correctness and remove empty columns/rows
  --parse                                    parse alignment, compress patterns and create binary MSA file
  --start                                    generate parsimony/random starting trees and exit
  --rfdist                                   compute pair-wise Robinson-Foulds (RF) distances between trees

[...]

More comprehensive documentation is available in GitHub wiki. Further information and benchmarks can be found in biorxiv preprint and in Chapter 4 of Alexey's PhD thesis.

If you cannot find an answer to your question in the above sources, or if you think you found a bug, please contact us via RAxML google group.

  • Please use search function before posting, since many questions have been answered before.
  • Please use google group and not personal e-mail for asking questions about RAxML-NG. This will save everybody's time: you might get help sooner from other Exelixis lab members or your fellow users. And whoever might encounter the same problem in the future will benefit from the answer.

Preparing the alignment

Before we get started, let's first check that the MSA can actually be read and doesn't contain sites with only undetermined characters or sequences with undetermined characters or duplicate taxon names, etc. etc.

raxml-ng --check --msa prim.phy --model GTR+G --prefix T1

Doing this check before getting started is super-important as more than 50% of all failed RAxML runs are due to tree or MSA format errors!

We will always also use --prefix to avoid over-writing previous output files.

For large alignments, we also recommend using --parse command after (or instead of) --check:

raxml-ng --parse --msa prim.phy --model GTR+G --prefix T2

In addition to MSA sanity check, this command will perform two useful operations:

  1. Compress alignment patterns and store MSA in the binary format (RAxML Binary Alignment, RBA):
NOTE: Binary MSA file created: T2.raxml.rba

Since pattern compression could take quite some time for large MSAs, loading RBA file is (much) faster compared to FASTA or PHYLIP.

  1. Estimate memory requirements and optimal number of CPUs/threads (see Parallelization section for details)
* Estimated memory requirements                : 2 MB
* Recommended number of threads / MPI processes: 2

Tree inference

Now let's infer a tree under GTR+GAMMA with default parameters. We will use 2 threads as suggested above, and set a fixed random seed to ensure reproducibility:

 raxml-ng --msa prim.phy --model GTR+G --prefix T3 --threads 2 --seed 2 

This command will perform 20 tree searches using 10 random and 10 parsimony-based starting trees, and pick the best-scoring topology:

Analysis options:
  run mode: ML tree search
  start tree(s): random (10) + parsimony (10)

This default settings is a reasonable choice for many practical cases. However, computational resources permitting, we might want to increase the number of starting trees to explore the tree space more thoroughly:

 raxml-ng --msa prim.phy --model GTR+G --prefix T4 --threads 2 --seed 2 --tree pars{25},rand{25}

Conversely, we can perform a quick-and-dirty search from a single random starting tree using the --search1 command:

 raxml-ng --search1 --msa prim.phy --model GTR+G --prefix T5 --threads 2 --seed 2 

Let's compare the results of all three runs:

grep "Final LogLikelihood:" T{3,4,5}.raxml.log
T3.raxml.log:Final LogLikelihood: -5708.923977
T4.raxml.log:Final LogLikelihood: -5708.923977
T5.raxml.log:Final LogLikelihood: -5708.979717

This looks quite good: likelihood surface seems to have a clear peak, which we can find regardless of the search parameters. However, T5 has a slightly worse likelihood, does it mean it has a distinct topology? We can check this by using the --rfdist command to compute the topological Robinson-Foulds (RF) distance between all trees we have inferred:

$ cat T{3,4}.raxml.mlTrees T5.raxml.bestTree > mltrees
$ raxml-ng --rfdist --tree mltrees --prefix RF
[...]
Loaded 71 trees with 12 taxa.

Average absolute RF distance in this tree set: 0.000000
Average relative RF distance in this tree set: 0.000000
Number of unique topologies in this tree set: 1

No, in fact all 71 resulting topologies (one per starting tree) are identical, so we can be pretty sure that we found the globally optimal ML tree.

However, not all datasets are so well-behaved:

$ raxml-ng --msa fusob.phy --model GTR+G --prefix T6 --seed 2 --threads 2
$ grep "ML tree search #" T6.raxml.log
[00:00:03] ML tree search #1, logLikelihood: -9974.668088
[00:00:07] ML tree search #2, logLikelihood: -9974.666644
[00:00:11] ML tree search #3, logLikelihood: -9974.669417
[00:00:15] ML tree search #4, logLikelihood: -9974.664855
[00:00:19] ML tree search #5, logLikelihood: -9974.663779
[00:00:22] ML tree search #6, logLikelihood: -9974.666906
[00:00:26] ML tree search #7, logLikelihood: -9974.668155
[00:00:30] ML tree search #8, logLikelihood: -9974.664340
[00:00:33] ML tree search #9, logLikelihood: -9974.666937
[00:00:37] ML tree search #10, logLikelihood: -9974.666388
[00:00:40] ML tree search #11, logLikelihood: -9980.601114
[00:00:43] ML tree search #12, logLikelihood: -9974.675123
[00:00:46] ML tree search #13, logLikelihood: -9980.602470
[00:00:49] ML tree search #14, logLikelihood: -9974.671637
[00:00:52] ML tree search #15, logLikelihood: -9980.602668
[00:00:54] ML tree search #16, logLikelihood: -9980.601182
[00:00:57] ML tree search #17, logLikelihood: -9974.672801
[00:01:00] ML tree search #18, logLikelihood: -9974.668668
[00:01:03] ML tree search #19, logLikelihood: -9974.669997
[00:01:06] ML tree search #20, logLikelihood: -9980.607281

Here, we see why it is so important to use multiple starting trees: some searches converged to a local optimum with a much lower likelihood (-9980.607281 vs. -9974.669997). Once again, let's check if the resulting trees differ topologically:

$ raxml-ng --rfdist --tree T6.raxml.mlTrees --prefix RF6
[...]
Loaded 20 trees with 38 taxa.

Average absolute RF distance in this tree set: 3.157895
Average relative RF distance in this tree set: 0.045113
Number of unique topologies in this tree set: 2

Pairwise RF distances saved to: <...>/RF6.raxml.rfDistances

So we have 2 distinct topologies in our set of 20 inferred trees, which correspond to two distinct likelihood values we observed in the tree search output. Let's look at the individual pairwise RF distances which are printed to the RF6.raxml.rfDistances file:

$ cat RF6.raxml.rfDistances

0	1	0	0.000000
0	2	0	0.000000
0	3	0	0.000000
0	4	0	0.000000
0	5	0	0.000000
0	6	0	0.000000
0	7	0	0.000000
0	8	0	0.000000
0	9	0	0.000000
0	10	8	0.114286
0	11	0	0.000000
0	12	8	0.114286
0	13	0	0.000000
0	14	8	0.114286
0	15	8	0.114286
0	16	0	0.000000
0	17	0	0.000000
0	18	0	0.000000
0	19	8	0.114286
[...]

As we can see, all 10 searches from the random starting trees (trees 0 to 9) found the best-scoring topology (RF=0, logL=-9974), whereas 5 out of 10 searches from a parsimony starting tree converged to a local optimum (RF = 8, logL = -9980).

Bootstrapping

NOTE: As of v. 0.8.0b, RAxML-NG supports only standard/slow bootstrap algorithm (-b option in RAxML8). It is much slower than rapid bootstrapping implemented in RAxML8 (-x or -f a options), but gives more accurate support values estimates.

Inferring bootstrap trees

RAxML-NG can perform standard non-parametric bootstrap by re-sampling alignment columns and re-inferring tree for each bootstrap replicate MSA:

raxml-ng --bootstrap --msa prim.phy --model GTR+G --prefix T7 --seed 2 --threads 2

By default, RAxML-NG will perform MRE-based bootstopping test after every 50 replicates, and terminate once the diagnostic statistics drops below the cutoff value:

  bootstrap replicates: max: 1000 + bootstopping (autoMRE, cutoff: 0.030000)
[...]
[00:00:15] Bootstrap tree #50, logLikelihood: -5762.777409
[00:00:15] Bootstrapping converged after 50 replicates.

This was fast! Let's manually increase the number of replicates to be on the safe side:

raxml-ng --bootstrap --msa prim.phy --model GTR+G --prefix T8 --seed 2 --threads 2 --bs-trees 200

Bootstrap convergence can also be checked post-hoc using the --bsconverge command, we can also change the cutoff value to make the test more or less stringent:

 raxml-ng --bsconverge --bs-trees T7.raxml.bootstraps --prefix T9 --seed 2 --threads 2 --bs-cutoff 0.01
 # trees        avg WRF       avg WRF in %       # perms: wrf <= 1.00 %     converged?  
      50          7.400              1.644                            0        NO
Bootstopping test did not converge after 50 trees

As we can see, with 1% cutoff 50 replicates are not enough. What about 200?

raxml-ng --bsconverge --bs-trees T8.raxml.bootstraps --prefix T10 --seed 2 --threads 2  --bs-cutoff 0.01
 # trees        avg WRF       avg WRF in %       # perms: wrf <= 1.00 %     converged?  
      50          7.400              1.644                            0        NO
     100         11.702              1.300                          245        NO
     150         13.960              1.034                          457        NO
     200         16.484              0.916                          648        NO
Bootstopping test did not converge after 200 trees

Still no convergence, but weighted RF distance (avg WRF in %) is steadily decreasing as we add more replicates, and now lies below the 1% cutoff for 648 out of 1000 permutations (convergence requirement: >990/1000). This looks promising, and we can expect convergence after few hundreds replicates. Luckily, bootstraps are independent, thus we can reuse 200 bootstrap trees we have already inferred. So let's add 400 more replicate trees. It is however important to specify a distinct random seed for the second run, otherwise first 200 trees will be identical to the first batch!

raxml-ng --bootstrap --msa prim.phy --model GTR+G --prefix T11 --seed 333 --threads 2 --bs-trees 400

Now, we simply concatenate replicate trees from both runs, and re-assess the convergence:

$ cat T8.raxml.bootstraps T11.raxml.bootstraps > allbootstraps
$ raxml-ng --bsconverge --bs-trees allbootstraps --prefix T12 --seed 2 --threads 1 --bs-cutoff 0.01
 # trees        avg WRF       avg WRF in %       # perms: wrf <= 1.00 %     converged?  
      50          7.400              1.644                            0        NO
     100         11.702              1.300                          245        NO
     150         13.960              1.034                          457        NO
     200         16.484              0.916                          648        NO
     250         17.410              0.774                          841        NO
     300         18.900              0.700                          927        NO
     350         20.060              0.637                          942        NO
     400         22.076              0.613                          969        NO
     450         23.856              0.589                          973        NO
     500         26.164              0.581                          985        NO
     550         27.844              0.563                          985        NO
     600         28.462              0.527                          991        YES
Bootstopping test converged after 600 trees

Perfect, now we have convergence even with a more stringent cutoff, although we had to conduct 600 replicate searches instead of just 50. On large datasets, this quickly gets very expensive computationally. Hence in practice, the default cutoff value of 3% should be sufficient in most cases (see Pattengale et al., 2010).

Computing branch support

So now, what do we do with the bootstrap trees? We can either summarize them via some sort of consensus tree (strict, majority, majority rule extended, e.g. using standard RAxML or some other tool) or we can map them onto the best-scoring ML tree on the original MSA. It's debatable what might be the best way of doing it, but there seems to be a trend toward mapping the BS support values onto the best-scoring/best-known ML tree, so let's do that.

We will use the ML tree obtained in the run T3:

raxml-ng --support --tree T3.raxml.bestTree --bs-trees allbootstraps --prefix T13 --threads 2 

Now we can actually look at the tree with supports,

T13.raxml.support

using some tree viewer.

Beware of the branch support visualization issues that can arise with the NEWICK format!!!

Alternatively, we can compute so-called Transfer Bootstrap Expectation support metric (Lemoine et al. 2018), which is supposedly more appropriate for very large trees:

raxml-ng --support --tree T3.raxml.bestTree --bs-trees allbootstraps --prefix T14 --threads 2 --bs-metric tbe

But wait, we could have been lazier:

raxml-ng --all --msa prim.phy --model GTR+G --prefix T15 --seed 2 --threads 2 --bs-metric fbp,tbe

This will do all of the above steps (20 ML inferences on the original MSA, inferring bootstrap replicate trees, and drawing support values on the best-scoring tree) with just one command :-)

$ ls T15.*
T15.raxml.bestModel  T15.raxml.bestTree    T15.raxml.bootstraps  T15.raxml.log  T15.raxml.mlTrees
T15.raxml.rba        T15.raxml.startTree   T15.raxml.supportFBP  T15.raxml.supportTBE

Tree likelihood evaluation

Another standard task is to evaluate trees, i.e., compute the likelihood of a given fixed tree topology by just optimizing model and/or branch length parameters on that fixed tree. This is frequently needed in model and hypothesis testing :-)

The basic option is --evaluate

With --opt-model on/off you can enable/disable model parameter optimization.

With --opt-branches on/off you can enable/disable branch length optimization.

Let's do some small tests that also show how the likelihood improves as we add more and more free parameters to our model. We will use the best-scoring ML tree again:

Let's first evaluate it under the most simple model, Jukes-Cantor (JC):

raxml-ng --evaluate --msa prim.phy --threads 2 --model JC --tree T3.raxml.bestTree --prefix E1

Now, let's add rate heterogeneity to this:

raxml-ng --evaluate --msa prim.phy --threads 2 --model JC+G --tree T3.raxml.bestTree --prefix E2

Now let's take a simple GTR model (without rate heterogeneity):

raxml-ng --evaluate --msa prim.phy --threads 2 --model GTR --tree T3.raxml.bestTree --prefix E3

GTR with the Gamma model of rate heterogeneity, but empirical base frequencies:

raxml-ng --evaluate --msa prim.phy --threads 2 --model GTR+G+FC --tree T3.raxml.bestTree --prefix E4

And now also doing a ML estimate of the base frequencies:

raxml-ng --evaluate --msa prim.phy --threads 2 --model GTR+G+FO --tree T3.raxml.bestTree --prefix E5

Finally, using 4 free rates instead of GAMMA-distributed rates:

raxml-ng --evaluate --msa prim.phy --threads 2 --model GTR+R4+FO --tree T3.raxml.bestTree --prefix E6

Let's check the results:

$  grep logLikelihood E*.raxml.log
E1.raxml.log:[00:00:00] Tree #1, final logLikelihood: -6424.203056 <- JC 
E2.raxml.log:[00:00:00] Tree #1, final logLikelihood: -6272.469063 <- JC+GAMMA
E3.raxml.log:[00:00:00] Tree #1, final logLikelihood: -5934.158958 <- GTR
E4.raxml.log:[00:00:00] Tree #1, final logLikelihood: -5719.408956 <- GTR + GAMMA + empirical base freqs
E5.raxml.log:[00:00:00] Tree #1, final logLikelihood: -5709.002997 <- GTR + GAMMA + estimated base freqs
E6.raxml.log:[00:00:01] Tree #1, final logLikelihood: -5706.008654 <- GTR + FreeRate + estimated base freqs

Not surprisingly, models with more free parameters yield better likelihood scores. However, it does not mean that we should always use the most parameter-rich model. Instead, it is common to use information theoretical criteria such as AIC or BIC to penalize parameter-rich models and thereby avoid overfitting (lower=better):

grep "AIC score" E*.raxml.log
E1.raxml.log:AIC score: 12890.406112 / AICc score: 12891.460907 / BIC score: 12991.209684  <- JC
E2.raxml.log:AIC score: 12588.938126 / AICc score: 12590.094698 / BIC score: 12694.541868  <- JC+G
E3.raxml.log:AIC score: 11926.317917 / AICc score: 11928.322525 / BIC score: 12065.522849  <- GTR
E4.raxml.log:AIC score: 11498.817912 / AICc score: 11500.963241 / BIC score: 11642.823014  <- GTR+G+FC
E5.raxml.log:AIC score: 11478.005995 / AICc score: 11480.151323 / BIC score: 11622.011097  <- GTR+G+FO
E6.raxml.log:AIC score: 11482.017308 / AICc score: 11484.940742 / BIC score: 11650.023260  <- GTR+R4+FO

As we can see, the GTR+G+FO model scores best according to all three criteria evaluated, even though it yields lower likelihood than GTR+R4+FO. This example illustrates the importance of formal model selection. In practice, one should use specialized tools such as ModelTest-NG, IQTree/ModelFinder or PartitionFinder for this task.

Partitioned analysis

So far, we always used single evolutionary model for all alignment sites. This is rather unrealistic biologically, since different genes and/or codon positions typically exhibit distinct substitution patterns. Therefore, it is common to divide alignment sites into subsets or partitions, which can be assigned individual evolutionary models. In the simplest case, we can assign identical models for all partitions, but allow independent model parameter estimates:

$ cat prim.part 
GTR+G+FO, NADH4=1-504
GTR+G+FO, tRNA=505-656
GTR+G+FO, NADH5=657-898

Partition file format is similar to RAxML 8.x and ExaML. Each line defines a partition, and contains evolutionary model specification, partition name and alignment site range(s). An important difference to RAxML/ExaML: rate heterogeneity has to be defined for each partition individually, i.e. we specify GTR+G for every partition with GAMMA model instead of using a global -m GTRGAMMA switch on the command line.

A more sophisticated example where we use different substitution matrices and rate heterogeneity models, and also split first gene by codon position:

$ cat prim2.part 
GTR+G+FO, NADH4=1-504/3,2-504/3
JC+I, tRNA=505-656
GTR+R4+FC, NADH5=657-898
HKY, NADH4p3=3-504/3

Likelihood evaluation with partitioned models

Now let's try to evaluate likelihood on a fixed tree topology like in the previous section, but now using a partitioned model (we will also increase log verbosity to see estimated parameter values):

raxml-ng --evaluate --msa prim.phy --threads 2 --model prim.part --tree T3.raxml.bestTree --prefix P1 -log verbose
Optimized model parameters:

   Partition 0: NADH4
   Speed (ML): 1.045481
   Rate heterogeneity: GAMMA (4 cats, mean),  alpha: 0.320532 (ML),  weights&rates: (0.250000,0.007108) (0.250000,0.120533) (0.250000,0.628725) (0.250000,3.243634) 
   Base frequencies (ML): 0.347608 0.343620 0.074289 0.234483 
   Substitution rates (ML): 1.110014 16.895228 0.903118 0.001000 11.861976 1.000000 

   Partition 1: tRNA
   Speed (ML): 0.505287
   Rate heterogeneity: GAMMA (4 cats, mean),  alpha: 0.300774 (ML),  weights&rates: (0.250000,0.005358) (0.250000,0.105097) (0.250000,0.597100) (0.250000,3.292444) 
   Base frequencies (ML): 0.362527 0.230093 0.151307 0.256073 
   Substitution rates (ML): 66.393654 308.024274 43.477166 37.411363 671.608883 1.000000 

   Partition 2: NADH5
   Speed (ML): 1.216009
   Rate heterogeneity: GAMMA (4 cats, mean),  alpha: 0.614255 (ML),  weights&rates: (0.250000,0.056061) (0.250000,0.320104) (0.250000,0.888421) (0.250000,2.735414) 
   Base frequencies (ML): 0.360963 0.322304 0.061324 0.255409 
   Substitution rates (ML): 67.157660 1000.000000 56.903929 148.358484 530.324413 1.000000 

As we can see from the output above, even though we assigned GTR+G+FO model to all three partitions, each of them got independent estimates of parameter values (alpha parameter of GAMMA distribution, base frequencies, and GTR substitution rates).

Let's repeat the evaluation under the 2nd partition scheme:

raxml-ng --evaluate --msa prim.phy --threads 2 --model prim2.part --tree T3.raxml.bestTree --prefix P2 -log verbose

and compare the likelihoods of P2 vs. P1 vs. single GTR+G+FO model:

grep logLikelihood {E5,P1,P2}.raxml.log
E5.raxml.log:[00:00:00] Tree #1, final logLikelihood: -5709.002997
P1.raxml.log:[00:00:00] Tree #1, final logLikelihood: -5673.027260
P2.raxml.log:[00:00:00] Tree #1, final logLikelihood: -5673.868809

So P1 has the best likelihood score, closely followed by P2. But both P1 and P2 also introduce more free parameters compared to GTR+G+FO:

grep "Free parameters" {E5,P1,P2}.raxml.log
E5.raxml.log:Free parameters (model + branch lengths): 30
P1.raxml.log:Free parameters (model + branch lengths): 50
P2.raxml.log:Free parameters (model + branch lengths): 52

Hence, we can again use AIC/BIC criteria to assess the trade-off (lower=better):

grep "AIC score" {E5,P1,P2}.raxml.log
E5.raxml.log:AIC score: 11478.005995 / AICc score: 11480.151323 / BIC score: 11622.011097
P1.raxml.log:AIC score: 11446.054521 / AICc score: 11452.075772 / BIC score: 11686.063024
P2.raxml.log:AIC score: 11451.737617 / AICc score: 11458.260694 / BIC score: 11701.346461

The situation is less clear now: AIC and AICc favor P1, whereas GTR+G+FO has better BIC score. Unfortunately, there seems to be no general consensus which information criterion is superior, and so the decision whether to use AIC, AICc or BIC is left to the user.

Branch length linkage

You might have noticed that there is an extra parameter called Speed estimated for each partition:

Optimized model parameters:

   Partition 0: NADH4
   Speed (ML): 1.045481
[..]
   Partition 1: tRNA
   Speed (ML): 0.505287
[..]
   Partition 2: NADH5
   Speed (ML): 1.216009

What does it mean? In partitioned analyses, there are three common ways to estimate branch lengths (sometimes called branch linkage modes):

  • linked: all partitions share a common set of (global) branch lengths. This is the simplest model with fewest parameters (#branches). However, it is often considered too unrealistic, since it is known that genes (or genome regions) evolve at different speeds.
  • unlinked: each partition has its own, independent set of branch lengths. This model allows for the highest flexibility, but it also introduces a huge number of free parameters (#branches * #partitions), which makes it prone to overfitting.
  • scaled (proportional): a global set of branch lengths is estimated like in linked mode, but each partition has an individual scaling factor; per-partition branch lengths are obtained by multiplying the global branch lengths with these individual scalers. This approach is a compromise that allows to model distinct evolutionary rates across partitions while introducing only a moderate number of free parameters (#branches + #partitions).

RAxML-NG supports all three branch linkage modes described above; they can be set using the --brlen option. A recent simulation study by Duchêne et al. shows that scaled branch linkage mode the best fit for most typical datasets, confirming the intuition about its good flexibility/overfitting trade-off. Hence, RAxML-NG uses scaled mode for partitioned analyses by default. Please note, that RAxML 8.x and ExaML use linked branch length mode by default, and this should be considered when comparing likelihoods and resulting topologies with RAxML-NG!

So let's check how linked and unlinked models work for our toy dataset:

raxml-ng --evaluate --msa prim.phy --threads 2 --model prim.part --tree T3.raxml.bestTree --prefix P3 --brlen linked
raxml-ng --evaluate --msa prim.phy --threads 2 --model prim.part --tree T3.raxml.bestTree --prefix P4 --brlen unlinked

As could be expected, more complex models yield better likelihood scores (unlinked > scaled > linked):

grep logLikelihood {P1,P3,P4}.raxml.log
P1.raxml.log:[00:00:00] Tree #1, final logLikelihood: -5673.027260 <- scaled
P3.raxml.log:[00:00:00] Tree #1, final logLikelihood: -5678.429054 <- linked
P4.raxml.log:[00:00:00] Tree #1, final logLikelihood: -5648.348677 <- unlinked

However, the difference is not always large enough to justify additional parameters:

grep "AIC score" {P1,P3,P4}.raxml.log
P1.raxml.log:AIC score: 11446.054521 / AICc score: 11452.075772 / BIC score: 11686.063024  <- scaled
P3.raxml.log:AIC score: 11452.858107 / AICc score: 11458.398743 / BIC score: 11683.266270  <- linked
P4.raxml.log:AIC score: 11476.697354 / AICc score: 11496.994752 / BIC score: 11908.712661  <- unlinked

Once again, we have disagreement between AIC(c) and BIC, which choose scaled und linked branch length models, respectively. However, all three criteria exclude the extremely parameter-rich unlinked model.

Tree search with partitioned models

In the previous subsection, we used partitioned models to re-evaluate likelihood of the ML tree obtained under the GTR+G model. But what if we re-run tree search from scratch under partitioned model, will it change the resulting likelihoods and/or topology?

raxml-ng --msa prim.phy --model prim.part --prefix P5 --threads 2 --seed 2 --brlen scaled
raxml-ng --msa prim.phy --model prim.part --prefix P6 --threads 2 --seed 2 --brlen linked
raxml-ng --msa prim.phy --model prim.part --prefix P7 --threads 2 --seed 2 --brlen unlinked

Checking the new likelihood scores

grep "Final LogLikelihood" {P5,P6,P7}.raxml.log
P5.raxml.log:Final LogLikelihood: -5672.951995
P6.raxml.log:Final LogLikelihood: -5678.301081
P7.raxml.log:Final LogLikelihood: -5648.204296

shows that they are almost identical to the "old" values obtained on the T3 topology. Moreover, all three partitioned runs converged to the same ML tree topology as the unpartitioned T3 run. Of course, this observation will not hold for all datasets. However, there is some evidence that the choice of DNA substitution models has rather limited influence on the resulting tree topology (Hoff et al. 2016). This does not mean that we can ignore model selection altogether. But probably we should not worry too much about subtle details such as AIC(c) vs. BIC conflicts or sample size definition.

What's next?

Please have a look at our Advanced Tutorial!