MurrellGroup/AssigningSecondaryStructure.jl

AssigningSecondaryStructure

Latest Release MIT license Build Status Coverage

AssigningSecondaryStructure provides a way to assign loops, helices, and strands to protein backbones using a simplified version of the DSSP algorithm.

The BioStructures.jl and ProteinSecondaryStructures.jl packages both provide interfaces for more sophisticated secondary structure assignment, but they call the DSSP_jll.jl binary under the hood. This requires writing each structure to a file first, which adds significant overhead.

Installation

The package is registered in the General registry, and can be installed from the REPL with ]add AssigningSecondaryStructure.
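
For scripts or non-interactive environments, the same installation can be done through Julia's standard Pkg API instead of the REPL's package mode. This is a minimal sketch that assumes network access to the General registry:

```julia
using Pkg

# Equivalent to typing `]add AssigningSecondaryStructure` in the REPL.
Pkg.add("AssigningSecondaryStructure")
```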

Usage

The assign_secondary_structure function takes a vector of backbone coordinate arrays, one per chain, each of size (3, 3, L), where L is the number of residues in that chain. The first axis holds the x, y, and z coordinates, the second axis the backbone atoms (N, CA, C), and the third axis the residues.
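
To make the expected layout concrete, here is a minimal sketch that assembles one such array in plain Julia, without reading a structure file. The variable names and the constant placeholder coordinates are assumptions for illustration only; they do not describe a real protein, so this shows the array shape rather than a meaningful input.

```julia
# Hypothetical chain with L = 4 residues. Each matrix is 3 × L:
# rows are x, y, z and columns are residues.
L = 4
N  = fill(0.0, 3, L)   # placeholder backbone nitrogen coordinates
CA = fill(1.0, 3, L)   # placeholder alpha carbon coordinates
C  = fill(2.0, 3, L)   # placeholder carbonyl carbon coordinates

# Give each matrix a singleton second axis, then concatenate along it,
# so the atom order on axis 2 is N, CA, C.
chain = cat(reshape(N, 3, 1, L), reshape(CA, 3, 1, L), reshape(C, 3, 1, L); dims=2)

size(chain)            # (3, 3, 4)

# The function expects a vector with one array per chain.
coords_vector = [chain]
```

With this layout, `chain[:, 2, i]` is the CA position of residue `i`.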

julia> using BioStructures

julia> coords_vector = map(collectchains(read("test/data/1ZAK.pdb", PDBFormat))) do chain
        reshape(coordarray(chain, backboneselector), 3, 4, :)[:, 1:3, :] # get N, CA, C atoms only
    end

julia> using AssigningSecondaryStructure

julia> assign_secondary_structure(coords_vector) # 2 chains
2-element Vector{Vector{Int64}}:
 [1, 1, 1, 1, 3, 3, 3, 3, 3, 3  …  2, 2, 2, 2, 2, 2, 2, 1, 1, 1]
 [1, 1, 1, 1, 3, 3, 3, 3, 3, 3  …  2, 2, 2, 2, 2, 2, 2, 1, 1, 1]

The output is a vector of integer vectors, one per chain, where each integer encodes the secondary structure class of a residue:

  • 1: loop
  • 2: helix
  • 3: strand
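
Integer codes are compact but hard to read at a glance. The sketch below converts one chain's codes into a string of single-character labels and computes the helix fraction. The names `SS_LABELS` and `ss_string`, and the choice of `-`, `H`, and `E` as labels, are hypothetical helpers written for this example and are not presented as part of the package's API.

```julia
# Hypothetical labels indexed by the integer code:
# 1 => '-' (loop), 2 => 'H' (helix), 3 => 'E' (strand).
const SS_LABELS = ('-', 'H', 'E')

# Convert a vector of codes for one chain into a readable string.
ss_string(codes) = join(SS_LABELS[c] for c in codes)

codes = [1, 1, 3, 3, 2, 2, 1]   # made-up assignment for a 7-residue chain
ss_string(codes)                # "--EEHH-"

# Fraction of residues assigned to helix (code 2).
helix_fraction = count(==(2), codes) / length(codes)
```

Applied to the output of assign_secondary_structure, `ss_string.(result)` would give one string per chain.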

Acknowledgements

This package was originally ported from the PyDSSP package, created by Shintaro Minami. The code has since been rewritten to look more like the 1983 paper (Kabsch W and Sander C), and to be more Julian, understandable, and efficient, at the cost of it no longer being differentiable like the PyDSSP version. The time complexity of assigning the secondary structure of a chain with $n$ residues is $O(n \log n)$.