
Commit

Better documentation. Fixed testing of swig bindings. Added examples for swig bindings.
Guillaume Marcais committed Dec 18, 2014
1 parent 339afaf commit 5d3d057
Showing 11 changed files with 90 additions and 10 deletions.
29 changes: 26 additions & 3 deletions README.md
@@ -6,7 +6,7 @@ Overview

Jellyfish is a tool for fast, memory-efficient counting of k-mers in DNA. A k-mer is a substring of length k, and counting the occurrences of all such substrings is a central step in many analyses of DNA sequence. Jellyfish can count k-mers using an order of magnitude less memory and an order of magnitude faster than other k-mer counting packages by using an efficient encoding of a hash table and by exploiting the "compare-and-swap" CPU instruction to increase parallelism.
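
To make the notion of k-mer counting concrete, here is a minimal pure-Python sketch of the operation described above. It is only an illustration: the function name `count_kmers` is made up for this example, and the code does not reflect how Jellyfish itself stores or counts k-mers (no compact hash table, no parallelism).

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every length-k substring of `sequence` (illustrative helper, not Jellyfish's API)."""
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    # Slide a window of width k over the sequence; a sequence shorter than k yields no k-mers.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


counts = count_kmers("ACGTACGT", 4)
for mer, count in sorted(counts.items()):
    print(mer, count)
```

A dictionary like this grows quickly on real genomes, which is the kind of memory cost a dedicated counter is meant to reduce.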

JELLYFISH is a command-line program that reads FASTA and multi-FASTA files containing DNA sequences. It outputs its k-mer counts in an binary format, which can be translated into a human-readable text format using the "jellyfish dump" command. See the documentation below for more details.
JELLYFISH is a command-line program that reads FASTA and multi-FASTA files containing DNA sequences. It outputs its k-mer counts in a binary format, which can be translated into a human-readable text format using the "jellyfish dump" command, or queried for specific k-mers with "jellyfish query". See the UserGuide provided on [Jellyfish's home page][1] for more details.

If you use Jellyfish in your research, please cite:

@@ -15,7 +15,7 @@ If you use Jellyfish in your research, please cite:
Installation
------------

To get packaged tar ball of the source code, see the [home page of Jellyfish at the University of Maryland](http://www.genome.umd.edu/jellyfish.html "University of Maryland website").
To get a packaged tarball of the source code that is easier to compile, see the [home page of Jellyfish at the University of Maryland][1].

To compile from the git tree, you will need autoconf/automake, make, g++ 4.4 or newer and [yaggo](https://github.com/gmarcais/yaggo "Yaggo on github"). Then compile with:

@@ -29,4 +29,27 @@ sudo make install
Extra / Examples
----------------

In the examples directory are potentially useful extra programs to query/manipulates output files from Jellyfish. The examples are not compiled by default. Each subdirectory of examples is independent and is compiled with a simple invocation of 'make'.
The examples directory contains potentially useful extra programs to query or manipulate the output files of Jellyfish, using the shared library of Jellyfish in C++ or with scripting languages. The examples are not compiled by default. Each subdirectory of examples is independent and is compiled with a simple invocation of 'make'.


Binding to script languages
---------------------------

Bindings to Ruby, Python and Perl are provided. These bindings allow reading the output file of Jellyfish directly from a scripting language. Compilation of the bindings is easier from the tarball provided on [Jellyfish's home page][1].

Compilation of the bindings from the git tree requires [SWIG](http://swig.org) version 3, and the development files of the scripting languages. To compile all three bindings, configure with:

```Shell
./configure --enable-swig --enable-ruby-binding --enable-python-binding --enable-perl-binding
```

Note that the headers of older versions of Perl 5 do not compile with recent compilers (g++ > 4.4, clang++) when C++11 mode is enabled. One may have to additionally specify `CXX=g++4.4` to compile the Perl binding.

The bindings can be installed in a location other than the default (which may require root privileges, for example) by passing a path to the `--enable` switches. Then, for Python, Ruby or Perl to find the binding, an environment variable may need to be adjusted (`PYTHONPATH`, `RUBYLIB` and `PERL5LIB` respectively). For example:

```Shell
./configure --prefix=$HOME --enable-swig --enable-python-binding=$HOME/lib/python
export PYTHONPATH=$HOME/lib/python
```
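
After adjusting `PYTHONPATH`, one way to check that Python can actually locate the binding is sketched below. It relies only on the standard library; the helper name `binding_available` is invented for this illustration, and the check only tells whether some module called `jellyfish` is on the search path, not that it is the SWIG binding built here.

```python
import importlib.util
import sys


def binding_available(module_name="jellyfish"):
    """Return True if `module_name` can be located on the current module search path."""
    return importlib.util.find_spec(module_name) is not None


if not binding_available():
    # Print the search path to help spot a missing or misspelled PYTHONPATH entry.
    print("jellyfish module not found; sys.path is:", file=sys.stderr)
    for entry in sys.path:
        print("  " + entry, file=sys.stderr)
```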

[1]: http://www.genome.umd.edu/jellyfish.html "Genome group at University of Maryland"
1 change: 1 addition & 0 deletions examples/swig/README.md
@@ -0,0 +1 @@
Simple examples on how to implement (simplified versions of) 'jellyfish dump' and 'jellyfish query' in Python, Ruby and Perl.
8 changes: 8 additions & 0 deletions examples/swig/dump.pl
@@ -0,0 +1,8 @@
#! /usr/bin/env perl

use jellyfish;

my $mf = jellyfish::ReadMerFile->new($ARGV[0]);
while($mf->next_mer) {
  print($mf->mer, " ", $mf->count, "\n");
}
8 changes: 8 additions & 0 deletions examples/swig/dump.py
@@ -0,0 +1,8 @@
#! /usr/bin/env python

import jellyfish
import sys

mf = jellyfish.ReadMerFile(sys.argv[1])
for mer, count in mf:
    print("%s %d" % (mer, count))
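
The dump example above prints one `mer count` pair per line. As a rough sketch of what can be done with that text, the snippet below turns such lines into a count histogram (how many distinct k-mers occur once, twice, and so on), in the spirit of `jellyfish histo`. The function name `histogram` and the sample data are made up for illustration, and this is not how Jellyfish computes its histogram.

```python
from collections import Counter


def histogram(lines):
    """Map each count value to the number of distinct k-mers having that count."""
    histo = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        histo[int(fields[1])] += 1
    return histo


# Hypothetical dump output: one "mer count" pair per line.
sample = ["ACGT 2", "CGTA 1", "GTAC 1", "TACG 1"]
for count, n_mers in sorted(histogram(sample).items()):
    print(count, n_mers)
```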
9 changes: 9 additions & 0 deletions examples/swig/dump.rb
@@ -0,0 +1,9 @@
#! /usr/bin/env ruby

require 'jellyfish'

mf = Jellyfish::ReadMerFile.new(ARGV[0])
mf.each { |mer, count|
  print(mer, " ", count, "\n")
}

8 changes: 8 additions & 0 deletions examples/swig/query.pl
@@ -0,0 +1,8 @@
#! /usr/bin/env perl

use jellyfish;

my $qf = jellyfish::QueryMerFile->new(shift @ARGV);
for my $s (@ARGV) {
  print($s, " ", $qf->get(jellyfish::MerDNA->new($s)), "\n");
}
9 changes: 9 additions & 0 deletions examples/swig/query.py
@@ -0,0 +1,9 @@
#! /usr/bin/env python

import jellyfish
import sys

qf = jellyfish.QueryMerFile(sys.argv[1])
for str in sys.argv[2:]:
    print("%s %d" % (str, qf[jellyfish.MerDNA(str)]))

9 changes: 9 additions & 0 deletions examples/swig/query.rb
@@ -0,0 +1,9 @@
#! /usr/bin/env ruby

require 'jellyfish'

qf = Jellyfish::QueryMerFile.new(ARGV[0])
ARGV[1..-1].each { |s|
  print(s, " ", qf[Jellyfish::MerDNA.new(s)], "\n")
}

5 changes: 3 additions & 2 deletions tests/swig_perl.sh
@@ -6,8 +6,9 @@ cd tests

LOADPATH="$BUILDDIR/swig/perl5"
K=$($PERL -e 'print(int(rand(16)) + 6)')
$JF count -m $K -s 10M -t $nCPUs -C -o ${pref}.jf
$JF dump ${pref}.jf > ${pref}.dump
I=$($PERL -e 'print(int(rand(5)))')
$JF count -m $K -s 10M -t $nCPUs -C -o ${pref}.jf seq1m_$I.fa
$JF dump -c ${pref}.jf > ${pref}.dump
$JF histo ${pref}.jf > ${pref}.histo

for i in test_mer_file.t test_hash_counter.t; do
5 changes: 3 additions & 2 deletions tests/swig_python.sh
@@ -6,8 +6,9 @@ cd tests

export PYTHONPATH="$BUILDDIR/swig/python/.libs:$BUILDDIR/swig/python${PYTHONPATH+:$PYTHONPATH}"
K=$($PYTHON -c 'import random; print(random.randint(6, 20))')
$JF count -m $K -s 10M -t $nCPUs -C -o ${pref}.jf
$JF dump ${pref}.jf > ${pref}.dump
I=$($PYTHON -c 'import random; print(random.randint(0, 4))')
$JF count -m $K -s 10M -t $nCPUs -C -o ${pref}.jf seq1m_$I.fa
$JF dump -c ${pref}.jf > ${pref}.dump
$JF histo ${pref}.jf > ${pref}.histo

for i in test_mer_file.py test_hash_counter.py; do
9 changes: 6 additions & 3 deletions tests/swig_ruby.sh
@@ -5,11 +5,14 @@ cd tests
[ -z "$ENABLE_RUBY_BINDING" ] && exit 77

LOADPATH="$BUILDDIR/swig/ruby/.libs"
K=$($RUBY -e 'print(rand(15) + 1)')
$JF count -m $K -s 10M -t $nCPUs -C -o ${pref}.jf
$JF dump ${pref}.jf > ${pref}.dump
K=$($RUBY -e 'print(rand(15) + 6)')
I=$($RUBY -e 'print(rand(5))')
$JF count -m $K -s 10M -t $nCPUs -C -o ${pref}.jf seq1m_$I.fa
$JF dump -c ${pref}.jf > ${pref}.dump
$JF histo ${pref}.jf > ${pref}.histo



for i in test_mer_file.rb test_hash_counter.rb; do
  echo Test $i
  $RUBY "-I$LOADPATH" "$SRCDIR/swig/ruby/$i" .
