-
Notifications
You must be signed in to change notification settings - Fork 6
API Reference
pll_set_tip_states()
pll_set_tip_clv()
pll_set_subst_params()
pll_set_frequencies()
pll_set_pattern_weights()
pll_set_category_rates()
pll_fasta_open()
pll_fasta_getnext()
pll_fasta_close()
pll_fasta_rewind()
pll_fasta_getfilesize()
pll_fasta_getfilepos()
pll_utree_parse_newick()
pll_utree_destroy()
pll_utree_show_ascii()
pll_utree_export_newick()
pll_utree_traverse()
pll_utree_query_tipnodes()
pll_utree_query_innernodes()
pll_utree_create_operations()
pll_utree_clone()
pll_utree_reset_template_indices()
pll_utree_check_integrity()
pll_utree_every()
pll_utree_spr()
pll_utree_nni()
pll_utree_rollback()
pll_rtree_parse_newick()
pll_rtree_destroy()
pll_rtree_show_ascii()
pll_rtree_export_newick()
pll_rtree_traverse()
pll_rtree_query_tipnodes()
pll_rtree_query_innernodes()
pll_rtree_create_operations()
pll_utree_t * pll_rtree_unroot()
pll_partition_t * pll_partition_create(unsigned int tips,
unsigned int clv_buffers,
unsigned int states,
unsigned int sites,
unsigned int rate_matrices,
unsigned int prob_matrices,
unsigned int rate_cats,
unsigned int scale_buffers,
unsigned int attributes);
Creates a partition with either tips
character arrays or tips
CLV arrays (depending on attributes
, see Partition Attributes), and, additionally, clv_buffers
CLV vectors, for storing conditional probabilities at inner nodes. The partition structure is constructed for states
number of states (e.g. 4 for nucleotide and 20 for amino-acid data) and sufficient space is allocated to host an alignment of size sites*tips
. The number of rate matrices that can be used is given by rate_matrices
. Additionally, the function allocates space for hosting rate_matrices
arrays of substitution parameters, frequencies, and auxiliary eigen-decomposition arrays (transparent to the user). The parameter prob_matrices
dictates the number of probability matrices for which space will be allocated. This parameter is typically set to the number of branches the tree has (e.g., 2n-3 for unrooted and 2n-2 for rooted, where n is the number of tips/leaves). libpll
will automatically create space for prob_matrices*rate_cats
, where rate_cats
is the number of different rate categories. The array of probability matrices is indexed from 0 to prob_matrices
-1. Each matrix entry consists of sufficient space to accommodate rate_cats
matrices, which are stored consecutively in memory. Note that libpll
will not allocate space for the different substitution matrices specified by rate_matrices
. The user must indicate that to libpll
by multiplying prob_matrices
with the corresponding factor. Finally, scale_buffers
sets the number of scaling buffers to be allocated, and attributes
states the hardware acceleration options to be used (see Partition Attributes). The function returns a pointer to the allocated pll_partition_t
structure.
Note that, rate_matrices
are used to address heterotachy, i.e. transition probability matrices computed from different rate matrices. For more information, see [Updating transition probability matrices](see https://github.com/xflouris/libpll/wiki/Updating-transition-probability-matrices).
void pll_partition_destroy(pll_partition_t * partition);
Deallocates all data associated with the partition pointed by partition
.
int pll_set_tip_states(pll_partition_t * partition,
unsigned int tip_index,
const unsigned int * map,
const char * sequence);
Set the tip CLV (or tip character array) with index tip_index
of instance partition
, according to the character sequence sequence
and the conversion table map
, which translates (maps) characters to states. For an example see Setting CLV vectors at tips from sequences and maps.
void pll_set_tip_clv(pll_partition_t * partition,
unsigned int tip_index,
const double * clv);
Set the tip CLV with index tip_index
of instance partition
, to the contents of the array clv
. For an example see Setting CLV vectors manually. Note, this function cannot be used in conjuction with the PLL_ATTRIB_PATTERN_TIP
(see Partition Attributes).
void pll_set_subst_params(pll_partition_t * partition,
unsigned int params_index,
const double * params);
Set the parameters for substitution model with index params_index
, where params_index
ranges from 0 to rate_matrices
-1, as specified in the pll_partition_create()
call. Array params
should contain exactly (states x states - states) / 2
parameters of type double
. These values correspond to the upper triangle elements (above the main diagonal) of the rate matrix.
void pll_set_frequencies(pll_partition_t * partition,
unsigned int params_index,
const double * frequencies);
Sets the base frequencies for the substitution model with index params_index
, where params_index
ranges from 0 to rate_matrices
-1, as specified in the pll_partition_create()
call. The array of base frequencies (frequencies
) is copied into the instance. The order of bases in the array depends on the encoding used when converting tip sequences to CLV. For example, if the pll_map_nt
map was used with the pll_set_tip_states()
function to describe nucleotide data, then the order is A, C, G, T
. However, this can be arbitrarily set by adjusting the provided map.
void pll_set_pattern_weights(pll_partition_t * partition,
const unsigned int * pattern_weights);
Sets the vector of pattern weights (pattern_weights
) for partition
. The function reads and copies the first partition->sites
elements of pattern_weights
into partition->pattern_weights
.
void pll_set_category_rates(pll_partition_t * partition,
const double * rates);
Sets the rate categories for partition
. The function reads and copies the first partition->rate_cats
elements of array rates
into partition->rates
.
int pll_update_invariant_sites(pll_partition_t * partition);
Updates the invariant sites array partition->invariant
, according to the sequences in the partition. This function is implicitly called by pll_update_invariant_sites_proportion
when the specified proportion of invariant sites is greater than zero, but it must be explicitly called by the client code if the sequences change.
int pll_update_invariant_sites_proportion(pll_partition_t * partition,
unsigned int params_index,
double prop_invar);
Updates the proportion of invariant sites for the rate matrix of partition
with index params_index
. Note that, this call will not implicitly update the transition probability matrices computed from the particular rate matrix, but
must be done explicitly for example with a call to pll_update_prob_matrices()
.
void pll_update_prob_matrices(pll_partition_t * partition,
const unsigned int * params_indices,
const unsigned int * matrix_indices,
const double * branch_lengths,
unsigned int count);
Computes the transition probability matrices specified by the count
indices in matrix_indices
, for all rate categories. A matrix with index matrix_indices[i]
will be computed using the branch length branch_lengths[i]
. To compute the matrix for rate category j, the function uses the rate matrix with index params_indices[j]
. Matrices
are stored in partition->pmatrix[matrix_indices[i]]
. Note that, each such entry holds the matrices for all rate categories, stored consecutively in memory.
void pll_show_pmatrix(pll_partition_t * partition,
unsigned int index,
unsigned int float_precision);
Prints the transition probability matrices for each rate category of partition
associated with index
to standard output. The floating point precision is dictated by float_precision
.
void pll_update_eigen(pll_partition_t * partition,
unsigned int params_index);
Updates the eigenvectors (partition->eigenvecs[params_index]
), inverse eigenvectors (partition->eigenvecs[params_index]
), and eigenvalues (partition->eigenvals[params_index]
) using the substitution parameters (partition->subst_params[params_index]
) and base frequencies (partition->frequencies[params_index]
) specified by params_index
.
void pll_count_invariant_sites(pll_partition_t * partition,
unsigned int * state_inv_count);
Returns the number of invariant sites in the sequence alignment from partition
. The array state_inv_count
must be of size partition->states
and is filled such that entry i contains the count of invariant sites for state i.
void pll_update_partials(pll_partition_t * partition,
const pll_operation_t * operations,
unsigned int count);
Updates the count
conditional probability vectors (CPV) defined by the entries of operations
, in the order they appear in the array. Each operations
entry describes one CPV from partition
. See also pll_operation_t.
double pll_compute_root_loglikelihood(pll_partition_t * partition,
unsigned int clv_index,
int scaler_index,
const unsigned int * freqs_index,
double * persite_lnl);
Evaluates the log-likelihood of a rooted tree, for the vector of conditional probabilities (partials) with index clv_index
, scale buffer with index scaler_index
(or PLL_SCALE_BUFFER_NONE
), and base frequencies arrays with
indices freqs_index
(one per rate category). If persite_lnl
is not NULL
, then it must be large enough to hold partition->sites
double-precision values, and will be filled with the per-site log-likelihoods.
double pll_compute_edge_loglikelihood(pll_partition_t * partition,
unsigned int parent_clv_index,
int parent_scaler_index,
unsigned int child_clv_index,
int child_scaler_index,
unsigned int matrix_index,
const unsigned int * freqs_index,
double * persite_lnl);
Evaluates the log-likelihood of an unrooted tree, by providing the conditional probability vectors (partials) for two nodes that share an edge with indices parent_clv_index
resp. child_clv_index
, scale buffers with indices parent_scaler_index
resp. child_clv_index
(or PLL_SCALE_BUFFER_NONE
), the transition probability matrix with index matrix_index
and base frequencies arrays with indices freqs_index
(one per rate category). If persite_lnl
is not NULL
, then it must be large enough to hold partition->sites
double-precision values, and will be filled with the per-site log-likelihoods.
int pll_update_sumtable(pll_partition_t * partition,
unsigned int parent_clv_index,
unsigned int child_clv_index,
const unsigned int * params_indices,
double *sumtable);
Updates the sumtable
(required for computing the likelihood derivatives) at the specified branch.
-
sumtable
must be allocated beforehand forpartition->sites
xpartition->rates
xpartition->states_padded
xsizeof(double)
. - Partial likelihoods at
parent_clv_index
andchild_clv_index
must be updated.
int pll_compute_likelihood_derivatives(pll_partition_t * partition,
int parent_scaler_index,
int child_scaler_index,
double branch_length,
const unsigned int * params_indices,
const double * sumtable,
double * d_f,
double * dd_f);
Computes the first and second likelihood derivatives.
-
sumtable
must be updated (pll_update_sumtable
) and consistent withparent_scaler_index
andchild_scaler_index
. - Derivatives are returned in
d_f
(first) anddd_f
(second).
int pll_set_asc_bias_type(pll_partition_t * partition,
int asc_bias_type);
Set the ascertainment bias correction type to one of the following:
PLL_ATTRIB_ASC_BIAS_LEWIS
PLL_ATTRIB_ASC_BIAS_FELSENSTEIN
PLL_ATTRIB_ASC_BIAS_STAMATAKIS
Partition must have been created with PLL_ATTRIB_ASC_BIAS_MASK
flag active. Otherwise, this function returns an error.
Set the number of invariant sites for each of the states when ascertainment bias correction is active.
void pll_set_asc_state_weights(pll_partition_t * partition,
const unsigned int * state_weights);
pll_fasta_t * pll_fasta_open(const char * filename,
const unsigned int * map);
int pll_fasta_getnext(pll_fasta_t * fd, char ** head,
long * head_len, char ** seq,
long * seq_len, long * seqno);
void pll_fasta_close(pll_fasta_t * fd);
int pll_fasta_rewind(pll_fasta_t * fd);
long pll_fasta_getfilesize(pll_fasta_t * fd);
long pll_fasta_getfilepos(pll_fasta_t * fd);
pll_utree_t * pll_utree_parse_newick(const char * filename,
unsigned int * tip_count);
void pll_utree_destroy(pll_utree_t * root);
void pll_utree_show_ascii(pll_utree_t * tree, int options);
char * pll_utree_export_newick(pll_utree_t * root);
int pll_utree_traverse(pll_utree_t * root,
int (*cbtrav)(pll_utree_t *),
pll_utree_t ** outbuffer,
unsigned int * trav_size);
unsigned int pll_utree_query_tipnodes(pll_utree_t * root,
pll_utree_t ** node_list);
unsigned int pll_utree_query_innernodes(pll_utree_t * root,
pll_utree_t ** node_list);
void pll_utree_create_operations(pll_utree_t ** trav_buffer,
unsigned int trav_buffer_size,
double * branches,
unsigned int * pmatrix_indices,
pll_operation_t * ops,
unsigned int * matrix_count,
unsigned int * ops_count);
pll_utree_t * pll_utree_clone(pll_utree_t * root);
void pll_utree_reset_template_indices(pll_utree_t * node,
unsigned int tip_count);
int pll_utree_every(pll_utree_t * node,
int (*cb)(pll_utree_t *));
int pll_utree_check_integrity(pll_utree_t * root);
PLL_EXPORT int pll_utree_spr(pll_utree_t * p, pll_utree_t * r, pll_utree_rb_t * rb, double * branch_lengths, unsigned int * matrix_indices);
int pll_utree_spr_safe(pll_utree_t * p,
pll_utree_t * r,
pll_utree_rb_t * rb,
double * branch_lengths,
unsigned int * matrix_indices);
int pll_utree_nni(pll_utree_t * p,
int type,
pll_utree_rb_t * rb);
int pll_utree_rollback(pll_utree_rb_t * rollback,
double * branch_lengths,
unsigned int * matrix_indices);
pll_rtree_t * pll_rtree_parse_newick(const char * filename,
unsigned int * tip_count);
void pll_rtree_destroy(pll_rtree_t * root);
void pll_rtree_show_ascii(pll_rtree_t * tree, int options);
char * pll_rtree_export_newick(pll_rtree_t * root);
int pll_rtree_traverse(pll_rtree_t * root,
int (*cbtrav)(pll_rtree_t *),
pll_rtree_t ** outbuffer,
unsigned int * trav_size);
unsigned int pll_rtree_query_tipnodes(pll_rtree_t * root,
pll_rtree_t ** node_list);
unsigned int pll_rtree_query_innernodes(pll_rtree_t * root,
pll_rtree_t ** node_list);
void pll_rtree_create_operations(pll_rtree_t ** trav_buffer,
unsigned int trav_buffer_size,
double * branches,
unsigned int * pmatrix_indices,
pll_operation_t * ops,
unsigned int * matrix_count,
unsigned int * ops_count);
pll_utree_t * pll_rtree_unroot(pll_rtree_t * root);
int pll_compute_gamma_cats(double alpha,
unsigned int categories,
double * output_rates);
void pll_show_clv(pll_partition_t * partition,
unsigned int clv_index,
int scaler_index,
unsigned int float_precision);
Prints to standard output the conditional probability vector for index clv_index
from partition
, using the scale buffer with index scaler_index
. If no scale buffer was used, then scaler_index
must be passed the value PLL_SCALE_BUFFER_NONE
. The floating precision (number of digits after decimal point) is dictated by float_precision
. The output contains brackets, curly braces and round brackets to separate the values as sites, rate categories and states related, respectively.
int pll_dlist_append(pll_dlist_t ** dlist, void * data);
int pll_dlist_prepend(pll_dlist_t ** dlist, void * data);
int pll_dlist_remove(pll_dlist_t ** dlist, void * data);
int pll_query_utree_tipnodes(pll_utree_t * root,
pll_utree_t ** node_list);
Allocates and returns an array tips
elements long of aliases to the tip labels in the pll_utree_t
structs of the tree
. The caller must free this array. The order of labels comes from a traversal of the tree
typedef struct
{
unsigned int tips;
unsigned int clv_buffers;
unsigned int states;
unsigned int sites;
unsigned int pattern_weight_sum;
unsigned int rate_matrices;
unsigned int prob_matrices;
unsigned int rate_cats;
unsigned int scale_buffers;
unsigned int attributes;
const unsigned int * map;
/* vectorization options */
size_t alignment;
unsigned int states_padded;
double ** clv;
double ** pmatrix;
double * rates;
double * rate_weights;
double ** subst_params;
unsigned int ** scale_buffer;
double ** frequencies;
double * prop_invar;
int * invariant;
unsigned int * pattern_weight;
int * eigen_decomp_valid;
double ** eigenvecs;
double ** inv_eigenvecs;
double ** eigenvals;
/* tip-tip precomputation data */
unsigned int maxstates;
unsigned char ** tipchar;
unsigned char * charmap;
double * ttlookup;
unsigned int * tipmap;
/* ascertainment bias correction */
int asc_bias_alloc;
} pll_partition_t;
Argument | Description |
---|---|
tips |
Number of tip sequences (one CLV per tip will be allocated) |
clv_buffers |
Number of extra CLV arrays to allocate (typically for inner nodes) |
states |
Number of states for the data in this partition will have (typically 4 for nucleotide, 20 for aminoacid) |
sites |
Number of sites that this partition will hold (length of alignment) |
`pattern_weight_sum | Sum of pattern weights (pattern_weights vector) |
rate_matrices |
Number of different substitution matrices that can be used simultaneously |
prob_matrices |
Number of probability matrices to create. Each entry will hold rate_cats matrices. Each matrix has size states x states
|
rate_cats |
Number of rate categories |
scale_buffers |
Number of scale buffers to allocate (not yet implemented) |
attributes |
Hardware acceleration options (not yet implemented) |
map |
TODO |
alignment |
The address of allocated memory for CLVs/matrices will always be a multiple of alignment (libpll inner use) |
states_padded |
Number of states padded according to the vectorization |
clv |
Array of conditional likelihood vectors for tips and inner nodes. The array has exactly tips + clv_buffers CLVs. Each CLV has sites x states x rate_cats elements of size double
|
pmatrix |
Array of exactly prob_matrices probability matrix arrays. Each probability matrix array is of size states x states x rate_cats elements of size double
|
rates |
Array of rate_matrices rate arrays. Each rate array is of size rate_cats elements of size double
|
rate_weights |
Weight of each rate category, summing to 1. For regular discrete gamma rates heterogeneity, the weights are equal |
subst_params |
Array of rate_matrices substitution rate arrays. Each substitution rate array holds exactly (states x states - states) / 2 substitution rates |
scale_buffer |
Array of scale buffers (not yet implemented) |
frequencies |
Array of rate_matrices arrays of base frequencies. Each array contains exactly states elements of size double
|
prop_invar |
Array of rate_matrices proportions of invariant/invariable sites of size double
|
invariant |
Array of size sites where i-th element indicates whether i-th site is invariant. If it is invariant it is set to the index of frequency array that the character of the site is represented by. Otherewise, it is set to -1 |
pattern_weight |
Weight of each pattern |
eigen_decomp_valid |
indicates whether the eigen decomposition is cached and valid. If it is, then eigen_decomp_valid is set to 1, 0 otherwise |
eigenvecs |
Array of rate_matrices linearized arrays of eigen vectors. Entry i contains the eigenvectors (in linearized form) of the i-th instantaneous substitution matrix and is of size states x states
|
inv_eigenvecs |
Array of rate_matrices linearized arrays of inverse eigen vectors. Entry i contains the inverse eigenvectors (in linearized form) of the i-th instantaneous substitution matrix and is of size states x states
|
eigenvals |
Array of rate_matrices arrays of eigen values. Entry i contains the eigenvalues of the i-th instantaneous substitution matrix and is of size states
|
maxstates |
TODO |
tipchars |
TODO |
charmap |
TODO |
ttlookup |
TODO |
tipmap |
TODO |
asc_bias_alloc |
Flag to determine whether the partition has been allocated with support for ascertainment bias correction |
typedef struct
{
unsigned int parent_clv_index;
int parent_scaler_index;
unsigned int child1_clv_index;
unsigned int child1_matrix_index;
int child1_scaler_index;
unsigned int child2_clv_index;
unsigned int child2_matrix_index;
int child2_scaler_index;
} pll_operation_t;
typedef struct pll_dlist
{
struct pll_dlist * next;
struct pll_dlist * prev;
void * data;
} pll_dlist_t;
typedef struct pll_msa_s
{
int count;
int length;
char ** sequence;
char ** label;
} pll_msa_t;
typedef struct
{
FILE * fp;
char line[PLL_LINEALLOC];
unsigned int * chrstatus;
long no;
long filesize;
long lineno;
long stripped_count;
long stripped[256];
} pll_fasta_t;
typedef struct tree_noderec
{
char * label;
double length;
unsigned int node_index;
unsigned int clv_index;
int scaler_index;
unsigned int pmatrix_index;
struct tree_noderec * next;
struct tree_noderec * back;
void * data;
} pll_utree_t;
typedef struct pll_rtree
{
char * label;
double length;
unsigned int node_index;
unsigned int clv_index;
int scaler_index;
unsigned int pmatrix_index;
struct pll_rtree * left;
struct pll_rtree * right;
struct pll_rtree * parent;
void * data;
} pll_rtree_t;
typedef struct pll_utree_rb_s
{
int move_type;
union
{
struct
{
pll_utree_t * p;
pll_utree_t * r;
pll_utree_t * rb;
pll_utree_t * pnb;
pll_utree_t * pnnb;
double r_len;
double pnb_len;
double pnnb_len;
} spr;
struct
{
pll_utree_t * p;
int nni_type;
} nni;
};
} pll_utree_rb_t;