[MRG] MinHash class refactoring #1128

Merged Aug 5, 2020, 59 commits (changes shown from 58 commits)

Commits
883732c
move sourmash._minhash to sourmash.minhash
ctb Jul 25, 2020
80f9bef
deprecate max_hash throughout
ctb Jul 25, 2020
1584283
change MinHash.add(...) to MinHash.add_kmer(...)
ctb Jul 25, 2020
72b3ab9
remove update and is_molecule_type from MinHash
ctb Jul 25, 2020
c3567ed
remove subtract_mins
ctb Jul 25, 2020
19a4365
rename downsample_n to downsample_num
ctb Jul 25, 2020
01e8a45
switch to hashes property instead of using get_mins()
ctb Jul 25, 2020
de589ea
replace get_mins(...) with hashes thruout
ctb Jul 25, 2020
c3b4416
change deprecated 'compare' usage to 'similarity' in test_jaccard
ctb Jul 26, 2020
7a7bba9
elminate most of the deprecation warnings in test__minhash by switchi…
ctb Jul 26, 2020
f97bf39
fix remaining tests in test__minhash
ctb Jul 26, 2020
1bd9df8
fix compat message
ctb Jul 26, 2020
fe1384f
Merge branch 'master' of github.com:dib-lab/sourmash into remove_comp…
ctb Jul 26, 2020
b1b5dd4
Merge branch 'master' of github.com:dib-lab/sourmash into minhash_cla…
ctb Jul 27, 2020
7743a95
restore removed functions, sigh :)
ctb Jul 27, 2020
70edc78
minor upd
ctb Jul 27, 2020
eb6b971
add deprecations
ctb Jul 28, 2020
ccfe8b8
Merge branch 'master' into remove_compare_from_tests
ctb Jul 28, 2020
031f91a
Merge branch 'remove_compare_from_tests' of github.com:dib-lab/sourma…
ctb Jul 28, 2020
624b005
Merge branch 'master' of github.com:dib-lab/sourmash into minhash_cla…
ctb Jul 28, 2020
464dcca
use a wrapper object for .hashes and make it read-only
ctb Jul 28, 2020
23171d9
refactor to use downsample(num/scaled=
ctb Jul 28, 2020
02239d9
refactor to use downsample(scaled=...)
ctb Jul 29, 2020
07cb474
return two deleted tests
ctb Jul 29, 2020
9d178c9
fixed test that was masked by another test
ctb Jul 29, 2020
3b2b35b
add explicit check for length of kmer in add_kmer
ctb Jul 29, 2020
f6faf89
fix ordering in hash retrieval
ctb Jul 29, 2020
aa5441f
fix more tests for py2 <khaaaaaaaaaan>
ctb Jul 29, 2020
64f99e3
add 'flatten' method to MinHash
ctb Jul 30, 2020
d6222b6
add test for MinHash.flatten
ctb Jul 30, 2020
372f4ec
add tests for add and add_kmer
ctb Jul 30, 2020
4ef2505
remove nonsense test
ctb Jul 30, 2020
7b77b77
test the (now deprecated) get_mins function
ctb Jul 30, 2020
1cb391c
test (deprecated) get_hashes
ctb Jul 30, 2020
899ec4c
add tests for downsample and is_molecule_type
ctb Jul 30, 2020
6b33685
test moltype properties more explicitly
ctb Jul 30, 2020
09068a6
fix py27
ctb Jul 30, 2020
2f6909c
move translate_codon to module level
ctb Aug 1, 2020
8d3c083
put a stub in place of _minhash with a FutureWarning
ctb Aug 2, 2020
9a8cf64
adjust import req
ctb Aug 2, 2020
8d38fe3
Merge branch 'master' of github.com:dib-lab/sourmash into minhash_cla…
ctb Aug 3, 2020
f8c9c00
remove __future__ imports
ctb Aug 4, 2020
5d86020
remove sys.version checks for py 2
ctb Aug 4, 2020
274be2e
remove requirement for enum34
ctb Aug 4, 2020
cace054
remove __reduce__ from MinHash class (#1144)
ctb Aug 4, 2020
3d6961c
remove __reduce__ again, here :)
ctb Aug 4, 2020
f10d632
avoid the DeprecationWarning
ctb Aug 5, 2020
57679d2
update docs: only python 3.7 and 3.8
ctb Aug 5, 2020
d00e77d
remove 2.7 from travis
ctb Aug 5, 2020
fef2c64
remove _compat from signature.py
ctb Aug 5, 2020
5745db1
remove _compat from exceptions.py
ctb Aug 5, 2020
1fe7668
remove _compat from index and sbt_storage
ctb Aug 5, 2020
26e4c0d
remove _compat from nodegraph
ctb Aug 5, 2020
c5f1c43
remove _compat completely
ctb Aug 5, 2020
69252ab
Merge branch 'remove_py2' into minhash_class_munge
ctb Aug 5, 2020
ac6e2fc
make signature -> sig in CLI using py3 'aliases'
ctb Aug 5, 2020
310b267
put back assert that didn't work in py2
ctb Aug 5, 2020
a1fbf9f
Merge branch 'remove_py2' into minhash_class_munge
ctb Aug 5, 2020
91de874
Update sourmash/minhash.py
ctb Aug 5, 2020
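
Several of the commits above replace method-style accessors such as `get_mins()` with a read-only `hashes` property while keeping the old name available behind a deprecation warning. A minimal sketch of that pattern, using only the standard library, might look like the following. The `Sketch` class, its `add_hash` method, and the warning text are hypothetical stand-ins for illustration; this is not sourmash's `MinHash` code, which (judging from `setup.py`) relies on the `deprecation` package and, per the commit list, its own wrapper object.

```python
import types
import warnings


class Sketch:
    """Hypothetical MinHash-like container (illustration only)."""

    def __init__(self):
        self._hashes = {}  # hash value -> abundance

    def add_hash(self, value):
        """Record one occurrence of a hash value."""
        self._hashes[value] = self._hashes.get(value, 0) + 1

    @property
    def hashes(self):
        """Read-only view of the stored hashes; writes raise TypeError."""
        return types.MappingProxyType(self._hashes)

    def get_mins(self):
        """Deprecated accessor kept for backward compatibility."""
        warnings.warn(
            "get_mins() is deprecated; use the .hashes property instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return sorted(self._hashes)


sketch = Sketch()
sketch.add_hash(42)
sketch.add_hash(7)
print(sorted(sketch.hashes))  # new style: read the property
```

Callers that still use `get_mins()` keep working but see a `DeprecationWarning`, which gives downstream code time to migrate before the old name is removed.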
2 changes: 0 additions & 2 deletions .travis.yml
@@ -110,8 +110,6 @@ jobs:
       python: 3.7
       env:
         - TOXENV=docs
-    - <<: *test
-      python: 2.7

     - &wheel
       stage: build wheel and send to github releases
2 changes: 1 addition & 1 deletion README.md
@@ -54,7 +54,7 @@ A quickstart tutorial [is available](https://sourmash.readthedocs.io/en/latest/t

 ### Requirements

-sourmash runs under both Python 2.7.x and Python 3.5+. The base
+sourmash runs under Python 3.7 and later. The base
 requirements are screed, cffi, numpy, matplotlib, and scipy. Conda
 (see below) will install everything necessary, and is our recommended
 installation method.
6 changes: 1 addition & 5 deletions benchmarks/benchmarks.py
@@ -1,11 +1,7 @@
-from __future__ import unicode_literals
 import random


-try:
-    from sourmash._minhash import MinHash
-except:
-    from sourmash.minhash import MinHash
+from sourmash.minhash import MinHash


 def load_sequences():
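
The fallback import removed above tried both `sourmash._minhash` and `sourmash.minhash`. The commit list also mentions a stub left at the old path that raises a `FutureWarning`. The page does not show how that stub is written, so the following is only one possible approach, using a module-level `__getattr__` (available since Python 3.7). The helper `make_forwarding_stub` and the `pkg._minhash` / `pkg.minhash` names are hypothetical.

```python
import types
import warnings


def make_forwarding_stub(old_name, new_module):
    """Return a placeholder module that forwards lookups to new_module.

    Every forwarded attribute access emits a FutureWarning first.
    """
    stub = types.ModuleType(old_name)

    def __getattr__(attr):
        warnings.warn(
            f"{old_name} has moved to {new_module.__name__}; "
            "please update your imports",
            FutureWarning,
            stacklevel=2,
        )
        return getattr(new_module, attr)

    # A function named __getattr__ in a module's namespace is called
    # when normal attribute lookup on the module fails (PEP 562).
    stub.__getattr__ = __getattr__
    return stub


# Hypothetical modules standing in for the new and old import paths.
new_mod = types.ModuleType("pkg.minhash")
new_mod.MinHash = type("MinHash", (), {})
old_mod = make_forwarding_stub("pkg._minhash", new_mod)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    cls = old_mod.MinHash  # forwarded, with a warning
print(cls is new_mod.MinHash, caught[0].category.__name__)
```

A stub like this lets old imports keep resolving for a transition period while pointing users at the new location.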
2 changes: 1 addition & 1 deletion doc/conf.py
@@ -301,4 +301,4 @@
 # If true, do not generate a @detailmenu in the "Top" node's menu.
 #texinfo_no_detailmenu = False

-autodoc_mock_imports = ["sourmash._minhash"]
+autodoc_mock_imports = ["sourmash.minhash"]
4 changes: 2 additions & 2 deletions doc/developer.md
@@ -7,7 +7,7 @@ You can get the latest development master branch with:
 ```
 git clone https://github.com/dib-lab/sourmash.git
 ```
-sourmash runs under both Python 2.7.x and Python 3.5+. The base
+sourmash runs under Python 3.7 and later. The base
 requirements are screed and cffi, together with a Rust environment (for the
 extension code). We suggest using `rustup` to install the Rust environment:

@@ -34,7 +34,7 @@ run the Rust tests.

 ### If you're having trouble installing or using the development environment

-If you are getting an error that contains `ImportError: cannot import name 'to_bytes' from 'sourmash._minhash'`, then it's likely you need to update Rust and clean up your environment. Some installation issues can be solved by simply removing the intermediate build files with:
+If you are getting an error that contains `ImportError: cannot import name 'to_bytes' from 'sourmash.minhash'`, then it's likely you need to update Rust and clean up your environment. Some installation issues can be solved by simply removing the intermediate build files with:

 ```
 make clean
5 changes: 2 additions & 3 deletions doc/requirements.md
@@ -1,6 +1,5 @@
 # Computational requirements

-
 sourmash has no particular memory requirements; it will need to hold
 the largest single sequence you have in memory, but the individual
 signatures are quite small and we do no special buffer allocation.
@@ -11,8 +10,8 @@ in a second or so on a rather slow 2016 Mac laptop.

 MinHash sketches and signatures are quite small on disk.

-sourmash should run with little modification on Linux and Mac OS X,
-under Python 2.7.11 and Python 3.5. Please see [the development repository README][0]
+sourmash should run with no modification on Linux and Mac OS X,
+under Python 3.7 and later. Please see [the development repository README][0]
 for
 information on source code, tests, and continuous integration.
 [0]:https://github.com/dib-lab/sourmash/blob/master/README.md
7 changes: 2 additions & 5 deletions setup.py
@@ -1,4 +1,3 @@
-from __future__ import print_function
 import os
 from setuptools import setup, find_packages
 import sys
@@ -37,9 +36,8 @@ def build_native(spec):
 "Operating System :: POSIX :: Linux",
 "Operating System :: MacOS :: MacOS X",
 "Programming Language :: Rust",
-"Programming Language :: Python :: 2.7",
-"Programming Language :: Python :: 3.5",
-"Programming Language :: Python :: 3.6",
+"Programming Language :: Python :: 3.7",
+"Programming Language :: Python :: 3.8",
 "Topic :: Scientific/Engineering :: Bio-Informatics",
 ]

@@ -63,7 +61,6 @@ def build_native(spec):
 ]
 },
 "install_requires": ['screed>=0.9', 'cffi>=1.14.0', 'numpy',
-'enum34; python_version < "3.4"',
 'matplotlib', 'scipy', 'deprecation>=2.0.6'],
 "setup_requires": [
 "setuptools>=38.6.0",
3 changes: 1 addition & 2 deletions sourmash/__init__.py
@@ -2,7 +2,6 @@
 """
 An implementation of a MinHash bottom sketch, applied to k-mers in DNA.
 """
-from __future__ import print_function
 import re
 import math
 import os
@@ -25,7 +24,7 @@
 "use the PyPI ones."
 )

-from ._minhash import MinHash, get_minhash_default_seed, get_minhash_max_hash
+from .minhash import MinHash, get_minhash_default_seed, get_minhash_max_hash

 DEFAULT_SEED = get_minhash_default_seed()
 MAX_HASH = get_minhash_max_hash()
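
`MAX_HASH` above is the top of the hash range, and the commit list notes that `max_hash` is being deprecated throughout; in sourmash the corresponding user-facing parameter is `scaled`. The two are related: a `scaled` value of s keeps roughly 1/s of the hash space. The sketch below shows that arithmetic under the assumption that hashes are 64-bit, so the maximum is `2**64 - 1`. The function names `max_hash_for_scaled` and `scaled_for_max_hash` and the exact rounding are illustrative assumptions, not a copy of sourmash's code.

```python
HASH_SPACE_MAX = 2**64 - 1  # assumed: 64-bit hash values


def max_hash_for_scaled(scaled):
    """Largest hash value retained for a given scaled factor.

    scaled == 0 means "no scaled cutoff" and maps to 0 here.
    """
    if scaled < 0:
        raise ValueError("scaled must be non-negative")
    if scaled == 0:
        return 0
    # Integer round-to-nearest of HASH_SPACE_MAX / scaled.
    return (HASH_SPACE_MAX + scaled // 2) // scaled


def scaled_for_max_hash(max_hash):
    """Approximate inverse: recover the scaled factor from a cutoff."""
    if max_hash == 0:
        return 0
    return round(HASH_SPACE_MAX / max_hash)


cutoff = max_hash_for_scaled(1000)
print(cutoff, scaled_for_max_hash(cutoff))
```

Exposing `scaled` rather than the raw cutoff keeps user-facing values small and readable while the cutoff itself stays an internal detail.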
1 change: 0 additions & 1 deletion sourmash/__main__.py
@@ -1,4 +1,3 @@
-from __future__ import print_function
 import sourmash


27 changes: 0 additions & 27 deletions sourmash/_compat.py

This file was deleted.
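
Deleting the `_compat` shim is part of dropping Python 2, which also lets the CLI rely on Python 3-only features; the commit list mentions exposing `signature` as `sig` through argparse `aliases`. A minimal sketch of that mechanism follows. The program name `tool`, the `action` argument, and the `handler` default are hypothetical and not taken from sourmash's CLI.

```python
import argparse


def build_parser():
    """Build a parser whose 'signature' subcommand also answers to 'sig'."""
    parser = argparse.ArgumentParser(prog="tool")
    subparsers = parser.add_subparsers(dest="cmd")

    # aliases= is accepted by add_parser() on Python 3 only.
    sig = subparsers.add_parser("signature", aliases=["sig"])
    sig.add_argument("action")
    # Both spellings share this default, so dispatch code can ignore
    # which name the user actually typed.
    sig.set_defaults(handler="signature")
    return parser


long_form = build_parser().parse_args(["signature", "describe"])
short_form = build_parser().parse_args(["sig", "describe"])
print(long_form.handler == short_form.handler)
```

Because the alias is registered on the same subparser, the two spellings share arguments and help text without any duplicated setup.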
