Capsidgraph is a python library for generating and studying the graphs representing the protein subunit interaction networks of viral capsids.
- Python 3.10 or higher
- The
networkx
library is used to handle operations on graphs. - The
matplotlib
andnumpy
libraries are used to create figures. - To use the capsidgraph library, you need to add the capsidgraph folder to your PYTHONPATH environment variable. This can be done by adding the following line to your .bashrc file (replace the path with the path to the capsidgraph folder on your computer):
export PYTHONPATH=$PYTHONPATH:/path/to/capsidgraph
The capsidgraph.analyser
module provides tools to compute information on the bond strength of different capsid graphs. This module requires networkx graphs with weighted nodes and can give insight into the ability of a capsid to resist both the removal of bonds (edges in the graph) and capsomers, i.e. protein units corresponding to the building blocks of the capsid (nodes in the graph).
For a graph that does not have weighted edges, the functions get_fragmentation_probability_random_node_removal
and get_fragmentation_probability_random_edge_removal
approximate the probability random
function, and then check the connectivity of the graph using the networkx function nx.is_connected
.
The function probability_fragment
implements the random edge or node removal process. It takes as argument the graph to fragment and a Dict contaning two entries :
fragmentation
: float between 0 and 1, probability of removalfragmentation_type
: a string equals to "nodes" or "edges", determine wether to remove nodes or edges from the graph.
The functions get_fragmentation_probability_threshold_node
and get_fragmentation_probability_threshold_edge
use a bisection method to approximate the threshold probability of removal for which the probability of fragmentation is 0.5, i.e the percolation threshold.
The parameter error_probability
provides an upper bound for the probability of the predicted value being correct. This routine uses get_fragmentation_probability
to determine whether or not the probability of fragmentation for a given probability of removal is above or below 0.5 with a high enough probability.
To ensure that the probability of the value being incorrect is below error_probability
, we stop a bisection step if the following inequality is true
Minimum and maximum numbers of iterations for each bisection step can be provided as well. If the maximum number of simulations is reached before the probability condition is met, the bisection algorithm stops.
The strength of each bond is stored as the strength
attribute of the corresponding edge in the graph. When removing edges in the graph we take into account this strength by attributing probability weights to each edge. This weight is defined as being proportional to the inverse of the strength.
In order to remove a certain “amount of strength” from the capsid, we define the notion of fragmentation probability by repeating following procedure:
- We first randomly pick an edge from the graph using the build it
choices
function from therandom
library. - We then remove the strength of this bond from the total, thus computing the amount of strength that is left to remove.
- We remove the corresponding bond from the list of remaining bonds.
- We repeat this process as long as there is a bond in the graph that has less strength than the amount we have yet to remove.
- Once we are no longer able to remove edges, we determine whether or not the graph is fragmented using the
networkx
method for both removing edges and checking the connectivity of the graph.
The function strength_edges_fragment
implements the edge removal process, and takes as argument the graph to remove, and a Dict which fragmentation
entry represents the amount of strength to remove from the graph.
The function get_fragmentation_probability_strength_edge_removal
computes the probability of fragmentation given an amount of strength to remove by using a Monte Carlo method: the previously described procedure is repeated to approximate the fragmentation probability.
The number of iterations and the edge removal probability are given as parameters.
The function get_fragmentation_strength_threshold_edge
approximates the strength percolation threshold (i.e., the “strength” to remove in order to obtain a probability of fragmentation of 0.5) by using the same bisection method as get_fragmentation_probability_threshold_node
, using get_fragmentation_probability_strength_edge_removal
for each step.
The “strength of a node” is defined as the “strength” needed to remove it, i.e as the sum of the energies of all edges adjacent to it. The method for removing nodes is similar to the one for removing edges: the probability weight of each node is proportional to the inverse of its strength. Here the strength of each node is stored as a networkx attribute. The main difference being that as a node gets removed, the energies of the neighbouring nodes have to be decreased by the strength of the edges that were linked to them via the removed node. This process can also leave isolated nodes that have an strength of zero, with undefined probability weights. This situation is avoided by removing isolated neighbours as well as the chosen node.
Note that the get_fragmentation_strength_threshold_node
and get_fragmentation_probability_strength_node_removal
functions are, respectively, similar to get_fragmentation_strength_threshold_edge
and get_fragmentation_probability_strength_edge_removal
in terms of parameters.
The function strength_nodes_fragment
implements the node removal process and has the same argments as the strength_edges_fragment
function.
Note that for these functions to work properly, the strength
attribute of the nodes needs to be initialized before passing the graph to the functions. The init_nodes_strength
function sets the nodes attribute as previously described, given a graph where the edge strength attributes are already defined. See the examples for more details.
The capsidgraph.generator
modules provides methods to compute the statistic destribution of "hole sizes" in graph under fragmentation.
Let
Consider the set of connected components of maximal size
let
let
Then the largest hole size of
We say that a graph is hole-fragmented if
If
The get_hole_size
function implements the above-mentioned algorithm to determine the size of the largest hole of a graph.
The get_hole_size_distribution
function estimate the probability distribution of the "Hole size" random variable. The fragment function used to remove edges or nodes from the graph is passed as a parameter.
The capsidgraph.generator
module provides tools for generating the graphs representing the protein subunit interaction networks of viral capsids. This module can generate graphs representing the interaction network of a viral capsid.
A face in the capsid graph is generated by starting with a list of edges forming a tile, as well as the two translation vectors create_icosahedral_face_edges
automates this algorithm. The three parameters required are the edges defining the tile, the translation vectors
For instance, the following code generated a face tiled with triangles.
from capsidgraph.generator import create_icosahedral_face_edges
tile = [
((0,0),(0,1)),
((0,0),(-1,1)),
((0,0),(-1,0)),
((0,0),(0,-1)),
((0,0),(1,-1)),
((0,0),(1,0)),
]
Tx = (1, 0)
Ty = (0,1)
face_edges, axis = create_icosahedral_face_edges(tile,Tx,Ty,1,2)
The tile drawn in Cartesian coordinates can be seen below
Which gives the following tiling
The function returns only the edges inside the defined triangle
which corresponds to the following edges and vertices delimiting the face
faceEdges = [((1, 0), (1, 1)), ((1, 1), (1, 2)), ((1, -1), (1, 0)), ((2, 0), (2, 1)), ((2, -1), (2, 0)), ((1, 0), (0, 1)), ((1, 1), (0, 2)), ((2, 0), (1, 1)), ((2, -1), (1, 0)), ((3, -1), (2, 0)), ((1, 0), (0, 0)), ((1, 1), (0, 1)), ((2, 0), (1, 0)), ((2, 1), (1, 1)), ((3, 0), (2, 0)), ((1, 0), (1, -1)), ((1, 1), (1, 0)), ((1, 2), (1, 1)), ((2, 0), (2, -1)), ((2, 1), (2, 0)), ((0, 1), (1, 0)), ((0, 2), (1, 1)), ((1, 0), (2, -1)), ((1, 1), (2, 0)), ((2, 0), (3, -1)), ((0, 0), (1, 0)), ((0, 1), (1, 1)), ((1, 0), (2, 0)), ((1, 1), (2, 1)), ((2, 0), (3, 0))]
faceVertices = ((0, 0), (1, 2), (3, -1))
Some patterns are predefined in the generator.icosahedral_patterns
module.
Given one triangular face of an icosahedron as described above, the graph of an icosahedral surface lattice is obtained as follows. By copying this face 20 times and merging nodes such that those 20 triangles match at the icosahedral edges, we generate the graph of the corresponding capsid. This construction is done with the help of a dictionary that associates to each node the face(s) associated with it, along with its coordinates inside each face. Initially every node only belongs to one triangular face, and this information is stored in terms of its coordinates inside that face. To "glue" two triangular faces
Once all 20 triangular faces have been correctly assembled into an icosahedral surface lattice, the associated graph is fully assembled. In our example, the graph looks like this (graph visualised with the draw
function of networkx)
This construction is implemented in the function create_icosahedral_capsid_graph
which takes the following parameters as input:
face_edges
: a list of edges corresponding to the edges of a triangular face;triangle_vertices
: the three vertices delimiting the faces (tuple of 3 points);bond_strength
: optional, this list needs to have the same length asface_edges
. This parameter defines the bond strength of every vertex of the capsid graph, wherebond_strength[i]
corresponds to the strength of the bondface_edges[i]
.
Some nanoparticle interaction networks can be constructed through a process similar to the icosahedral generation algorithm, where the icosahedral pattern is replaced by a cubic one. The faces are now squares which vertices sit in 4-fold symetry axis of the lattice.
The create_cubic_face_edges
function is similar to create_icosahedral_face_edges
, with the exeption of the h and k parameters that have been replaced with the coordinates of one vertex of the square face (given that one of them is sitting at the origin this defined the square).
Some patterns are predefined in the generator.cubic_patterns
module.
The following code
from capsidgraph.generator import (
cubic_patterns,
create_cubic_face_edges,
create_cubic_capsid_graph,
)
P = cubic_patterns.AALS_48_PATTERN
[edges, Tx, Ty, face_side_edge] = P
face_edges, face_square_vertices = create_cubic_face_edges(edges, Tx, Ty, face_side_edge)
G = create_cubic_capsid_graph(face_edges, face_square_vertices)
will create a graph G, isomorphic to the following graph :