Version 1.0 #639

Merged 64 commits on Mar 6, 2023
Commits
237f43d
DEV: Add lightning as dependency in pyproject.toml
NickleDave Nov 24, 2022
27ea7df
Add TweetyNet as a LightningModule subclass
NickleDave Nov 25, 2022
26a173c
Remove engine sub-package
NickleDave Nov 25, 2022
29592a5
Add TweetyNet as vak.models entry point
NickleDave Nov 25, 2022
c721a57
Rewrite TeenyTweetyNet as LightningModule also
NickleDave Nov 25, 2022
0b60ce5
Add labelmap parameter to models.from_config_model_map function
NickleDave Nov 25, 2022
9244f95
Remove tweetynet from test dependencies in pyproject.toml
NickleDave Nov 25, 2022
0ed420b
Fix how vak/config/models loads model config sections
NickleDave Nov 25, 2022
3ad04f9
Add src/vak/trainer.py with get_trainer function
NickleDave Nov 26, 2022
c9c1ec7
Rewrite core/train to use vak.trainer.get_trainer
NickleDave Nov 25, 2022
f6d9b96
Import net and model classes in vak/models/__init__.py
NickleDave Nov 26, 2022
ea45ffc
Rewrite core/eval.py to use lightning trainer
NickleDave Nov 26, 2022
f531055
Rewrite core/predict.py to use lightning
NickleDave Nov 27, 2022
748a776
TST: Fix test asserts that check for log files
NickleDave Dec 2, 2022
60b026f
DEV: Fix session in noxfile.py: test_data_download_generated_all
NickleDave Dec 4, 2022
9d002d1
Fix tests/scripts/fix_prep_csv_paths.py
NickleDave Dec 4, 2022
5a7fde8
DEV: Update generated test data URLs in noxfile.py
NickleDave Dec 4, 2022
580a6ce
DOC: Update CHANGELOG after merging #598 [skip ci]
NickleDave Dec 5, 2022
234ae97
Remove forward method from TweetyNet LightningModule
NickleDave Dec 5, 2022
4b19ed7
Add vak/nets/
NickleDave Dec 5, 2022
8a8613a
Add src/vak/models/base.py
NickleDave Dec 5, 2022
8d7f47a
Add src/vak/models/definition.py
NickleDave Dec 5, 2022
c6cd9c0
Add src/vak/models/decorator.py
NickleDave Dec 5, 2022
753daa5
Add vak/models/windowed_frame_classification_model.py
NickleDave Dec 7, 2022
2a29cd6
Rewrite models using `model` decorator
NickleDave Dec 7, 2022
5191e2a
Fix entry points in pyproject.toml
NickleDave Jan 21, 2023
3d5a19a
Add/fix/remove imports in models/__init__.py
NickleDave Dec 7, 2022
1241038
Depend on pytorch_lightning, not lightning
NickleDave Dec 12, 2022
6855ba6
Use model from model_config_map in core/eval.py
NickleDave Dec 12, 2022
2f29f9b
Use model from models_map in core/predict.py
NickleDave Dec 12, 2022
bbbc845
Use model.load_state_dict_from_path in core/train
NickleDave Dec 18, 2022
bb3cc7a
Add tests/test_nets/
NickleDave Dec 26, 2022
03b11b2
Add tests/test_models/test_teenytweetynet + test_tweetynet
NickleDave Dec 26, 2022
609f27c
Add tests/test_models/conftest.py
NickleDave Jan 21, 2023
86ce674
Add tests/test_models/test_base.py
NickleDave Dec 26, 2022
10c5707
Add tests/test_models/test_decorator.py
NickleDave Jan 17, 2023
32a897a
Add tests/test_models/test_definition.py
NickleDave Jan 18, 2023
c6ae12c
WIP: Add doc/reference/models.md
NickleDave Jan 9, 2023
11db71e
Add tests/test_models/test_windowed_frame_classification_model.py
NickleDave Jan 21, 2023
279b906
CI: Test if using 'ubuntu-20.04' fixes inscrutable errors
NickleDave Jan 22, 2023
ae66038
Update CHANGELOG after merging #605 [skip ci]
NickleDave Jan 22, 2023
bf06658
BUG: Fix WindowedFrameClassificationModel attribute
NickleDave Feb 11, 2023
66436e9
CLN: Remove unused import from src/vak/models/teenytweetynet.py
NickleDave Feb 12, 2023
89c0ba1
CLN: Remove entry points for models, metrics. Fixes #601
NickleDave Feb 11, 2023
5317545
CI: Run nox session as verbose in ci-linux.yml
NickleDave Feb 12, 2023
bd195c1
TST: Fix test failures in tests/test_models for CI
NickleDave Feb 12, 2023
f735c9c
DOC: Update CHANGELOG after merging #621 [skip ci]
NickleDave Feb 12, 2023
1bf4b6a
CLN: Remove unused multiple model functionality, fix #538
NickleDave Feb 12, 2023
8a4b1b4
DEV: Fix filenames of generated test data tars to be '1.x'
NickleDave Feb 13, 2023
5b46b24
DEV/CI: Update GENERATED_TEST_DATA_CI_URL in noxfile.py
NickleDave Feb 13, 2023
4747735
TST: Fix script tests/scripts/fix_prep_csv_paths.py
NickleDave Feb 13, 2023
9139c27
DOC: Update CHANGELOG after merging #625 [skip ci]
NickleDave Feb 13, 2023
40f312e
BUG: Fix post_tfm in version 1.0
NickleDave Feb 13, 2023
b4f21d9
DOC: Update CHANGELOG after merging #626 [skip ci]
NickleDave Feb 13, 2023
43ef613
BUG: Fix how `model` decorator sets subclass' module
NickleDave Feb 14, 2023
5950aa5
DEV: Specify python=3.10 in two sessions in noxfile.py
NickleDave Feb 14, 2023
b12966f
CLN: Remove unused import from models/decorator.py [skip ci]
NickleDave Feb 14, 2023
48ce9be
CLN: Remove vak/engine/model.py, no longer used
NickleDave Feb 18, 2023
8e36d04
DOC: Update CHANGELOG after merging #627 [skip ci]
NickleDave Feb 18, 2023
1724d9e
BUG: Fix default for post_tfm_kwargs in config classes
NickleDave Feb 21, 2023
d74bb1f
ENH: Log training time in core.train, fix #2
NickleDave Feb 23, 2023
9ea478d
DOC: Update CHANGELOG after merging #628 [skip ci]
NickleDave Feb 24, 2023
a96d1a9
ENH: Rename config option `csv_path` -> `dataset_path`, fix #549
NickleDave Feb 24, 2023
a913a60
DOC: Update CHANGELOG after merging #632 [skip ci]
NickleDave Feb 25, 2023
4 changes: 2 additions & 2 deletions .github/workflows/ci-linux.yml
@@ -11,7 +11,7 @@ jobs:
fail-fast: false
matrix:
python-version: [3.8, 3.9, "3.10"]
runs-on: ubuntu-latest
runs-on: ubuntu-20.04
steps:
- uses: actions/checkout@v2
- uses: actions/setup-python@v2
@@ -24,6 +24,6 @@ jobs:
run: |
nox -s test-data-download-source
nox -s test-data-download-generated-ci
nox -s coverage -- running-on-ci
nox -s coverage --verbose -- running-on-ci
- name: upload code coverage
uses: codecov/codecov-action@v3
56 changes: 52 additions & 4 deletions doc/CHANGELOG.md
@@ -4,6 +4,54 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## Unreleased (1.0.0)
### Added
- Use `lightning` framework as back end, replaces
`vak.engine.Model`
[#598](https://github.com/NickleDave/vak/pull/598).
Fixes [#597](https://github.com/NickleDave/vak/issues/597).
See discussion in [#359](https://github.com/NickleDave/vak/issues/359).
- Make it easier to make an instance of a model
[#605](https://github.com/NickleDave/vak/pull/605).
Fixes [#362](https://github.com/NickleDave/vak/issues/362).
- Add ways to define models and families of models
[#605](https://github.com/NickleDave/vak/pull/605).
Fixes [#406](https://github.com/NickleDave/vak/issues/406),
[#536](https://github.com/NickleDave/vak/issues/536), and
[#603](https://github.com/NickleDave/vak/issues/603).
- Add built-in TweetyNet model
[#605](https://github.com/NickleDave/vak/pull/605).
Fixes [#596](https://github.com/NickleDave/vak/issues/596).
- Add logging of training time
[#628](https://github.com/NickleDave/vak/pull/628).
Fixes [#2](https://github.com/NickleDave/vak/issues/2).

### Changed
- Rename config file option `csv_path` to `dataset_path`,
since it is more specific and allows for the possibility
that a dataset is not always a csv file
[#632](https://github.com/NickleDave/vak/pull/632).
Fixes [#549](https://github.com/NickleDave/vak/issues/549).
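
Configs written for 0.x can be updated mechanically. What follows is a minimal sketch, assuming the config has already been parsed into a nested dict and that the affected tables are named `TRAIN`, `EVAL`, `PREDICT` and `LEARNCURVE`. The helper `rename_csv_path` is hypothetical and is not part of vak.

```python
import copy

# Assumption: these are the config tables that carried a `csv_path` option.
SECTIONS = ("TRAIN", "EVAL", "PREDICT", "LEARNCURVE")


def rename_csv_path(config: dict) -> dict:
    """Return a copy of `config` with `csv_path` renamed to `dataset_path`.

    Hypothetical helper for illustration only; the input dict is not modified.
    """
    migrated = copy.deepcopy(config)
    for section in SECTIONS:
        table = migrated.get(section)
        if not isinstance(table, dict) or "csv_path" not in table:
            continue
        if "dataset_path" in table:
            # Refuse to guess which of the two values should win.
            raise ValueError(
                f"[{section}] has both 'csv_path' and 'dataset_path'"
            )
        table["dataset_path"] = table.pop("csv_path")
    return migrated


old_config = {"EVAL": {"csv_path": "prep.csv", "batch_size": 4}}
new_config = rename_csv_path(old_config)
print(new_config)
```

The copy keeps the original dict intact, so the old and new configs can be compared before anything is written back to disk.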

### Removed
- Remove entry points since they are not being used
outside the project but require maintenance and testing
[#621](https://github.com/NickleDave/vak/pull/621).
Fixes [#601](https://github.com/NickleDave/vak/issues/601).
- Remove unused/incomplete functionality for training multiple models
[#625](https://github.com/NickleDave/vak/pull/625).
Fixes [#538](https://github.com/NickleDave/vak/issues/538).
- Remove `engine` with `Model` class
[#627](https://github.com/NickleDave/vak/pull/627).
No longer used after switching to Lightning as backend in
[#598](https://github.com/NickleDave/vak/pull/598).

### Fixed
- Fix functionality to evaluate model with and without
post-processing transform that was added in
[#621](https://github.com/NickleDave/vak/pull/621).
Fixed in [#626](https://github.com/NickleDave/vak/pull/626).

## 0.8.1 -- 2023-03-02
### Fixed
- Fix transform that converts labeled timebins to segments
@@ -95,8 +143,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Refactor and speed up logic for determining whether a
dataset with sequence annotations has unlabeled segments
that should be assigned a "background" label
[#559](https://github.com/NickleDave/vak/pull/559).
Fixes [#243](https://github.com/NickleDave/vak/issues/243).
[#559](https://github.com/NickleDave/vak/pull/559).
Fixes [#243](https://github.com/NickleDave/vak/issues/243).
- Adds a new sub-sub-package, `datasets.seq`
with a `validators` module, which is where the
re-written `has_unlabeled` function now lives.
@@ -110,8 +158,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
so that the purpose of the functions is clearer,
and add clearer error messages with links to documentation
about file naming conventions
[#566](https://github.com/NickleDave/vak/pull/566).
Fixes [#525](https://github.com/NickleDave/vak/issues/525).
[#566](https://github.com/NickleDave/vak/pull/566).
Fixes [#525](https://github.com/NickleDave/vak/issues/525).
- Revise "autoannotate" tutorial to use .wav audio and .csv
annotation files from new release of Bengalese Finch Song
Repository, and to suggest that Windows users unpack
36 changes: 36 additions & 0 deletions doc/reference/models.md
@@ -0,0 +1,36 @@
(reference-models)=

# Declaring models in vak

This section of the reference explains the design
of the abstractions in vak for representing
deep learning and neural network models,
and the rationale behind that design.

Goals for the design include:
- make it easy to test a particular model
that was developed for a specified task,
- make it easy to instantiate
and work with a model interactively,
e.g. by feeding in a single input
and then visualizing the output
to directly inspect performance,
- rely on a "backend"
that allows us to achieve these goals
and at the same time
provides more low-level, fine-grained control
when needed.

Since that last goal enables the first two,
we discuss how we achieved it first.
We have chosen to rely on the Lightning framework.

## Declaring a model

To make it easy to declare a model we provide the following abstractions:
- A model definition
- Classes that represent a family of models, all developed for a specific task
- A base model class that knows how to make an instance of a model given a definition
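
The three abstractions listed above can be sketched in plain Python. This is only an illustration under stated assumptions: the names `Model`, `FrameClassificationModel`, `model`, `from_config` and the toy `Threshold` network are hypothetical stand-ins chosen for this sketch. They are not taken from vak's code, and the real classes build on Lightning rather than on bare Python objects.

```python
# Illustrative sketch only; every name here is an assumption, not vak's API.
REQUIRED_ATTRIBUTES = ("network", "loss", "metrics")


class Model:
    """Base class: knows how to build an instance from a definition."""

    definition = None  # filled in by the `model` decorator below

    def __init__(self, network, loss, metrics):
        self.network = network
        self.loss = loss
        self.metrics = metrics

    @classmethod
    def from_config(cls, config=None):
        config = config or {}
        d = cls.definition
        network = d.network(**config.get("network", {}))
        loss = d.loss(**config.get("loss", {}))
        metrics = {name: metric() for name, metric in d.metrics.items()}
        return cls(network, loss, metrics)


class FrameClassificationModel(Model):
    """A family: behavior shared by all models developed for one task."""

    def predict(self, frames):
        return self.network(frames)


def model(family):
    """Turn a definition (a plain class) into a subclass of `family`."""

    def decorator(definition):
        missing = [a for a in REQUIRED_ATTRIBUTES if not hasattr(definition, a)]
        if missing:
            raise TypeError(f"definition is missing attributes: {missing}")
        attrs = {"definition": definition, "__module__": definition.__module__}
        return type(definition.__name__, (family,), attrs)

    return decorator


class Threshold:
    """Toy 'network': labels each value by comparing it to a cutoff."""

    def __init__(self, cutoff=0.5):
        self.cutoff = cutoff

    def __call__(self, frames):
        return [int(value > self.cutoff) for value in frames]


class ZeroOneLoss:
    """Toy loss: fraction of predictions that differ from the targets."""

    def __call__(self, y_pred, y_true):
        return sum(p != t for p, t in zip(y_pred, y_true)) / len(y_true)


@model(family=FrameClassificationModel)
class ThresholdModel:
    network = Threshold
    loss = ZeroOneLoss
    metrics = {"error": ZeroOneLoss}


instance = ThresholdModel.from_config({"network": {"cutoff": 0.3}})
print(instance.predict([0.1, 0.4, 0.9]))  # prints [0, 1, 1]
```

The definition stays a small declarative class, while the family carries the task-specific behavior. That separation is what makes it possible to build an instance interactively and feed it a single input.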

## Instantiating a model

13 changes: 7 additions & 6 deletions noxfile.py
@@ -26,7 +26,7 @@ def build(session: nox.Session) -> None:
session.run("flit", "build")


@nox.session
@nox.session(python="3.10.7")
def dev(session: nox.Session) -> None:
"""
Sets up a python development environment for the project.
@@ -119,7 +119,7 @@ def test_data_download_source(session) -> None:
TEST_DATA_GENERATE_SCRIPT = './tests/scripts/generate_data_for_tests.py'


@nox.session(name='test-data-generate')
@nox.session(name='test-data-generate', python="3.10")
def test_data_generate(session) -> None:
"""
Produced 'generated' test data, by running TEST_DATA_GENERATE_SCRIPT on 'source' test data.
@@ -151,10 +151,10 @@ def make_tarfile(name: str, to_add: list):

PREP_CI = sorted(pathlib.Path(PREP_DIR).glob('*/*/teenytweetynet'))
RESULTS_CI = sorted(pathlib.Path(RESULTS_DIR).glob('*/*/teenytweetynet'))
GENERATED_TEST_DATA_CI_TAR = f'{GENERATED_TEST_DATA_DIR}generated_test_data-version-0.x.ci.tar.gz'
GENERATED_TEST_DATA_CI_TAR = f'{GENERATED_TEST_DATA_DIR}generated_test_data-version-1.x.ci.tar.gz'
GENERATED_TEST_DATA_CI_DIRS = [CONFIGS_DIR] + PREP_CI + RESULTS_CI

GENERATED_TEST_DATA_ALL_TAR = f'{GENERATED_TEST_DATA_DIR}generated_test_data-version-0.x.tar.gz'
GENERATED_TEST_DATA_ALL_TAR = f'{GENERATED_TEST_DATA_DIR}generated_test_data-version-1.x.tar.gz'
GENERATED_TEST_DATA_ALL_DIRS = [CONFIGS_DIR, PREP_DIR, RESULTS_DIR]


@@ -176,7 +176,7 @@ def test_data_tar_generated_ci(session) -> None:
make_tarfile(GENERATED_TEST_DATA_CI_TAR, GENERATED_TEST_DATA_CI_DIRS)


GENERATED_TEST_DATA_ALL_URL = 'https://osf.io/532cs/download'
GENERATED_TEST_DATA_ALL_URL = 'https://osf.io/uvgjt/download'


@nox.session(name='test-data-download-generated-all')
@@ -191,12 +191,13 @@ def test_data_download_generated_all(session) -> None:
with tarfile.open(GENERATED_TEST_DATA_ALL_TAR, "r:gz") as tf:
tf.extractall(path='.')
session.log('Fixing paths in .csv files')
session.install("pandas")
session.run(
"python", "./tests/scripts/fix_prep_csv_paths.py"
)
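
The session above unpacks a `.tar.gz` of generated test data into the working directory. The round trip can be sketched with the standard library alone. This is a sketch under assumptions: `make_relative_tarfile` and `extract_tarfile` are hypothetical helpers written for this example (the body of the project's own `make_tarfile` is not shown in this diff), and the file names are made up.

```python
import pathlib
import tarfile
import tempfile


def make_relative_tarfile(name, to_add, root):
    """Write each path in `to_add` into a gzipped tar, stored relative to `root`."""
    with tarfile.open(str(name), "w:gz") as tf:
        for path in to_add:
            path = pathlib.Path(path)
            # Directories are added recursively by TarFile.add.
            tf.add(str(path), arcname=str(path.relative_to(root)))


def extract_tarfile(name, dest):
    """Unpack a gzipped tar into `dest`, mirroring the `r:gz` call in the session."""
    with tarfile.open(str(name), "r:gz") as tf:
        tf.extractall(path=str(dest))


with tempfile.TemporaryDirectory() as tmp:
    tmp = pathlib.Path(tmp)
    generated = tmp / "generated"
    generated.mkdir()
    (generated / "prep.csv").write_text("audio_path,annot_path\n")

    tar_path = tmp / "generated_test_data.tar.gz"
    make_relative_tarfile(tar_path, [generated], root=tmp)

    dest = tmp / "unpacked"
    dest.mkdir()
    extract_tarfile(tar_path, dest)
    restored = (dest / "generated" / "prep.csv").read_text()

print(restored)
```

Storing members relative to a root keeps absolute paths from the machine that built the archive out of the tar. Paths recorded inside the `.csv` files are a separate matter and still need a fix-up step after extraction.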


GENERATED_TEST_DATA_CI_URL = 'https://osf.io/g79sx/download'
GENERATED_TEST_DATA_CI_URL = 'https://osf.io/un2zs/download'


@nox.session(name='test-data-download-generated-ci')
12 changes: 2 additions & 10 deletions pyproject.toml
@@ -28,6 +28,7 @@ dependencies = [
"dask >=2.10.1",
"evfuncs >=0.3.4",
"joblib >=0.14.1",
"pytorch-lightning >=1.8.4.post0",
"matplotlib >=3.3.3",
"numpy >=1.18.1",
"scipy >=1.4.1",
@@ -49,7 +50,6 @@ dev = [
test = [
"pytest >=6.2.1",
"pytest-cov >=2.11.1",
"tweetynet >=0.7.0",
]
doc = [
"furo >=2022.1.2",
@@ -68,18 +68,10 @@ Documentation = "https://vak.readthedocs.io"
[project.scripts]
vak = 'vak.__main__:main'

[project.entry-points."vak.models"]
TeenyTweetyNetModel = 'vak.models.teenytweetynet:TeenyTweetyNetModel'

[project.entry-points."vak.metrics"]
Accuracy = 'vak.metrics.Accuracy'
Levenshtein = 'vak.metrics.Levenshtein'
SegmentErrorRate = 'vak.metrics.SegmentErrorRate'

[tool.flit.sdist]
exclude = [
"tests/data_for_tests"
]

[tool.pytest.ini_options]
filterwarnings = ["ignore:::.*torch.utils.tensorboard",]
filterwarnings = ["ignore:::.*torch.utils.tensorboard",]
10 changes: 3 additions & 7 deletions src/vak/__init__.py
@@ -20,15 +20,14 @@
curvefit,
datasets,
device,
engine,
entry_points,
files,
io,
labeled_timebins,
labels,
logging,
metrics,
models,
nets,
nn,
paths,
plot,
@@ -37,13 +36,12 @@
tensorboard,
timebins,
timenow,
trainer,
transforms,
typing,
validators,
)

from .engine.model import Model


__all__ = [
"__main__",
@@ -55,15 +53,12 @@
"csv",
"datasets",
"device",
"engine",
"entry_points",
"files",
"io",
"labeled_timebins",
"labels",
"logging",
"metrics",
"Model",
"models",
"nn",
"paths",
@@ -73,6 +68,7 @@
"tensorboard",
"timebins",
"timenow",
"trainer",
"transforms",
"typing",
"validators",
12 changes: 7 additions & 5 deletions src/vak/cli/eval.py
@@ -43,18 +43,20 @@ def eval(toml_path):

logger.info("Logging results to {}".format(cfg.eval.output_dir))

model_config_map = config.models.map_from_path(toml_path, cfg.eval.models)
model_name = cfg.eval.model
model_config = config.model.config_from_toml_path(toml_path, model_name)

if cfg.eval.csv_path is None:
if cfg.eval.dataset_path is None:
raise ValueError(
"No value is specified for 'csv_path' in this .toml config file."
"No value is specified for 'dataset_path' in this .toml config file."
f"To generate a .csv file that represents the dataset, "
f"please run the following command:\n'vak prep {toml_path}'"
)

core.eval(
cfg.eval.csv_path,
model_config_map,
model_name=model_name,
model_config=model_config,
dataset_path=cfg.eval.dataset_path,
checkpoint_path=cfg.eval.checkpoint_path,
labelmap_path=cfg.eval.labelmap_path,
output_dir=cfg.eval.output_dir,
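
The change above replaces a map of several model configs with a single model name and its config, and keeps the early check that a dataset path is set. The pattern can be sketched as follows, assuming the TOML file has already been parsed into a dict. The helpers `get_model_config` and `require_dataset_path` and the example table contents are hypothetical, not the functions in `vak.config`.

```python
from typing import Optional


def get_model_config(config_toml: dict, model_name: str) -> dict:
    """Return the table named after the model, e.g. [TweetyNet]."""
    try:
        return config_toml[model_name]
    except KeyError as e:
        raise ValueError(
            f"No table named '{model_name}' found in config file."
        ) from e


def require_dataset_path(dataset_path: Optional[str], toml_path: str) -> str:
    """Fail early, with a hint, when no dataset has been prepared yet."""
    if dataset_path is None:
        raise ValueError(
            "No value is specified for 'dataset_path' in this .toml config file. "
            "To generate a .csv file that represents the dataset, "
            f"please run the following command:\n'vak prep {toml_path}'"
        )
    return dataset_path


# Hypothetical parsed config: one model named in [EVAL], one table for it.
config_toml = {
    "EVAL": {"model": "TweetyNet", "dataset_path": "prep.csv"},
    "TweetyNet": {"optimizer": {"lr": 0.001}},
}

model_name = config_toml["EVAL"]["model"]
model_config = get_model_config(config_toml, model_name)
dataset_path = require_dataset_path(
    config_toml["EVAL"].get("dataset_path"), "config.toml"
)
print(model_name, model_config, dataset_path)
```

Passing `model_name` and `model_config` as separate keyword arguments keeps each call site explicit about which single model it operates on.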
12 changes: 7 additions & 5 deletions src/vak/cli/learncurve.py
@@ -50,20 +50,22 @@ def learning_curve(toml_path):
log_version(logger)
logger.info("Logging results to {}".format(results_path))

model_config_map = config.models.map_from_path(toml_path, cfg.learncurve.models)
model_name = cfg.learncurve.model
model_config = config.model.config_from_toml_path(toml_path, model_name)

if cfg.learncurve.csv_path is None:
if cfg.learncurve.dataset_path is None:
raise ValueError(
"No value is specified for 'csv_path' in this .toml config file."
"No value is specified for 'dataset_path' in this .toml config file."
f"To generate a .csv file that represents the dataset, "
f"please run the following command:\n'vak prep {toml_path}'"
)

core.learning_curve(
model_config_map,
model_name=model_name,
model_config=model_config,
train_set_durs=cfg.learncurve.train_set_durs,
num_replicates=cfg.learncurve.num_replicates,
csv_path=cfg.learncurve.csv_path,
dataset_path=cfg.learncurve.dataset_path,
labelset=cfg.prep.labelset,
window_size=cfg.dataloader.window_size,
batch_size=cfg.learncurve.batch_size,
12 changes: 7 additions & 5 deletions src/vak/cli/predict.py
@@ -38,20 +38,22 @@ def predict(toml_path):
log_version(logger)
logger.info("Logging results to {}".format(cfg.prep.output_dir))

model_config_map = config.models.map_from_path(toml_path, cfg.predict.models)
model_name = cfg.predict.model
model_config = config.model.config_from_toml_path(toml_path, model_name)

if cfg.predict.csv_path is None:
if cfg.predict.dataset_path is None:
raise ValueError(
"No value is specified for 'csv_path' in this .toml config file."
"No value is specified for 'dataset_path' in this .toml config file."
f"To generate a .csv file that represents the dataset, "
f"please run the following command:\n'vak prep {toml_path}'"
)

core.predict(
csv_path=cfg.predict.csv_path,
model_name=model_name,
model_config=model_config,
dataset_path=cfg.predict.dataset_path,
checkpoint_path=cfg.predict.checkpoint_path,
labelmap_path=cfg.predict.labelmap_path,
model_config_map=model_config_map,
window_size=cfg.dataloader.window_size,
num_workers=cfg.predict.num_workers,
spect_key=cfg.spect_params.spect_key,