test: Make changes to reflect minor differences in behavior
victorlin committed Jul 30, 2022
1 parent a928406 commit 0385409
Showing 5 changed files with 14 additions and 5 deletions.
@@ -34,6 +34,10 @@ Pandas engine
SQLite engine
-------------

The output is slightly different because the SQLite engine reports only one row
per strain, whereas the pandas engine reports the same strain as both excluded
and then re-included.

$ ${AUGUR} filter --engine sqlite \
> --sequence-index filter/data/sequence_index.tsv \
> --metadata filter/data/metadata.tsv \
@@ -45,10 +49,10 @@ SQLite engine
> --output-log "$TMP/filtered_log.tsv"
4 strains were dropped during filtering
\t1 had no metadata (esc)
\t2 of these were filtered out by the query: "country != 'Colombia'" (esc)
\t1 had no sequence data (esc)
\t3 of these were filtered out by the query: "country != 'Colombia'" (esc)
\t1 strains were added back because they were in filter/data/include.txt (esc)
9 strains passed all filters

$ diff -u <(sort -k 1,1 filter/data/filtered_log.tsv) <(sort -k 1,1 "$TMP/filtered_log.tsv")
$ diff -u <(sort -k 1,1 filter/data/filtered_log_sqlite_engine.tsv) <(sort -k 1,1 "$TMP/filtered_log.tsv")
$ rm -f "$TMP/filtered_strains.txt"
2 changes: 1 addition & 1 deletion tests/functional/filter/cram/filter-min-max-date-output.t
@@ -27,6 +27,6 @@ SQLite engine
> --max-date 2016-02-01 \
> --output-metadata "$TMP/filtered_metadata.tsv"
8 strains were dropped during filtering
\t1 of these were dropped because they were earlier than 2015.0 or missing a date (esc)
\t7 of these were dropped because they were later than 2016.09 or missing a date (esc)
\t1 of these were dropped because they were earlier than 2015.0 or missing a date (esc)
4 strains passed all filters
@@ -34,7 +34,7 @@ SQLite engine
> --output-metadata "$TMP/filtered_metadata.tsv"
Sampling at 10 per group.
2 strains were dropped during filtering
\t1 were dropped during grouping due to ambiguous year information (esc)
\t1 were dropped during grouping due to ambiguous month information (esc)
\t1 were dropped during grouping due to ambiguous year information (esc)
\t0 of these were dropped because of subsampling criteria, using seed 314159 (esc)
10 strains passed all filters
@@ -36,7 +36,7 @@ SQLite engine
WARNING: Asked to provide at most 3 sequences, but there are 8 groups.
Sampling probabilistically at 0.3633 sequences per group, meaning it is possible to have more than the requested maximum of 3 sequences after filtering.
10 strains were dropped during filtering
\t1 were dropped during grouping due to ambiguous year information (esc)
\t1 were dropped during grouping due to ambiguous month information (esc)
\t1 were dropped during grouping due to ambiguous year information (esc)
\t8 of these were dropped because of subsampling criteria, using seed 314159 (esc)
2 strains passed all filters
5 changes: 5 additions & 0 deletions tests/functional/filter/data/filtered_log_sqlite_engine.tsv
@@ -0,0 +1,5 @@
strain filter kwargs
HND/2016/HU_ME59 filter_by_sequence_index []
Colombia/2016/ZC204Se filter_by_query "[[""query"", ""country != 'Colombia'""]]"
COL/FLR_00024/2015 filter_by_query "[[""query"", ""country != 'Colombia'""]]"
COL/FLR_00008/2015 force_include_strains "[[""include_file"", ""filter/data/include.txt""]]"
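
The change in the query count can be illustrated with a small Python sketch. It is not Augur's implementation: the event list, the helper names `log_every_event` and `log_one_row_per_strain`, and the "last event wins" rule are all assumptions made for illustration, and the exclusion event for `COL/FLR_00008/2015` is inferred from the test comment rather than shown in this diff. The sketch only shows why keeping one row per strain would lower the number of strains counted against the query filter.

```python
from collections import Counter

# Hypothetical event stream (strain, filter name). The first three entries and
# the last one mirror rows of filtered_log_sqlite_engine.tsv; the fourth is an
# assumed exclusion that precedes the forced re-inclusion of the same strain.
events = [
    ("HND/2016/HU_ME59", "filter_by_sequence_index"),
    ("Colombia/2016/ZC204Se", "filter_by_query"),
    ("COL/FLR_00024/2015", "filter_by_query"),
    ("COL/FLR_00008/2015", "filter_by_query"),        # assumed exclusion event
    ("COL/FLR_00008/2015", "force_include_strains"),  # later re-inclusion
]


def log_every_event(events):
    """Report each event as its own row (a strain may appear more than once)."""
    return list(events)


def log_one_row_per_strain(events):
    """Report a single row per strain; a later event overwrites an earlier one."""
    latest = {}
    for strain, filter_name in events:
        latest[strain] = filter_name
    return list(latest.items())


per_event = Counter(name for _, name in log_every_event(events))
per_strain = Counter(name for _, name in log_one_row_per_strain(events))

# Number of rows attributed to the query filter under each reporting style.
print(per_event["filter_by_query"], per_strain["filter_by_query"])
```

Under these assumptions the per-event log attributes three rows to the query filter and the per-strain log attributes two, which matches the two counts that appear in the SQLite engine hunk above.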
