Redesigned preprocessing #215

Merged
merged 80 commits into from
Dec 18, 2023
Changes from 58 commits
57e11ee
first test
Taepper Aug 29, 2023
b09a631
tmp
Taepper Sep 4, 2023
8c354f6
tmp2
Taepper Sep 5, 2023
9e20e74
wip
Taepper Sep 8, 2023
56da99c
WIP
Taepper Sep 11, 2023
23e676b
WIP
Taepper Sep 12, 2023
9fbe59c
WIP
Taepper Sep 14, 2023
7aa27da
draft for ndjson preprocessing, that should also be ready for further…
Taepper Sep 18, 2023
6fb0f87
fix
Taepper Sep 18, 2023
04625d6
fix concurrency issue
Taepper Sep 18, 2023
b62675c
revert extensive logging
Taepper Sep 18, 2023
0b775b8
integration fixes
Taepper Sep 18, 2023
1e58a52
fix memory leaks
Taepper Sep 19, 2023
9d00745
refactoring
Taepper Sep 19, 2023
1f49096
supply primary_key
Taepper Sep 19, 2023
43960cf
polishing
Taepper Sep 19, 2023
3472280
filename robustness
Taepper Sep 20, 2023
9c4688f
Reworking preprocessing without csv reader
Taepper Sep 22, 2023
e9899f5
wip
Taepper Oct 2, 2023
c4ea65b
WIP
Taepper Oct 18, 2023
a2cd1ec
Running end to end
Taepper Oct 23, 2023
86aca65
Running end to end and few tests
Taepper Oct 23, 2023
9ee8e28
Missing data types
Taepper Oct 23, 2023
7def798
Apply partitioning to sequence tables
Taepper Oct 23, 2023
3d43f1d
Update Dockerfile
Taepper Oct 23, 2023
93dee6c
Alpine version less specific
Taepper Oct 23, 2023
4e244e7
Remove pango partitioning logic
Taepper Oct 23, 2023
707832c
Adhoc fix of zstd table reader test
Taepper Oct 23, 2023
fab6b7f
update test to reflext new preprocessing_config
Taepper Oct 23, 2023
a7193c3
correctly deal with all Null values when building from duckdb
Taepper Oct 23, 2023
b130082
disable database test because of missing backwards compatibility
Taepper Oct 23, 2023
90169fc
Disable metadata validator until backwards compatibility reestablished
Taepper Oct 23, 2023
2c2354e
fix: specifying apk versions
fengelniederhammer Oct 23, 2023
d8f9614
wip
Taepper Oct 23, 2023
bce7fe2
fix includes
Taepper Oct 23, 2023
279bcba
various logging and error handling
Taepper Oct 23, 2023
9864d9f
various error handling and do not die when duckdb inferred SQL values…
Taepper Oct 23, 2023
43d33a3
More error handling and logging
Taepper Oct 23, 2023
36617bb
support insertions
Taepper Oct 23, 2023
66f2a65
statically linked duckdb
Taepper Oct 23, 2023
e43faca
add ordering of tables by parameter predicate
Taepper Oct 24, 2023
dedaccb
fix bug of not resetting row when fetching new chunk
Taepper Oct 24, 2023
b244d8a
fix duckdb sort order, sort sequences and make one test deterministic
Taepper Oct 24, 2023
f93b1e3
reintroduce unit tests, make 2 more tests deterministic and a bugfix …
Taepper Oct 24, 2023
13c8ab6
update endToEnd info test numbers
Taepper Oct 24, 2023
44fc0ea
improved error messages
Taepper Oct 24, 2023
ebb67f3
catch all block for logging around preprocessing
Taepper Oct 24, 2023
3bcad8b
introduce limit of 10000 for FastaAligned action
Taepper Oct 24, 2023
724dea0
Display error when loading data
Taepper Oct 24, 2023
ea2bd09
Additional check for float column in FloatBetween
Taepper Oct 24, 2023
7cb7738
Save preprocessing duckdb in output directory
Taepper Oct 24, 2023
d83b03f
More trace logging in detailed db info
Taepper Oct 24, 2023
3d3944c
exit > 0 when an error happens in preprocessing
fengelniederhammer Nov 3, 2023
6b7c9e9
refactor: preprocessing into own directory while encapsulating logic …
Taepper Nov 20, 2023
4782ca7
CI: update versions
Taepper Nov 20, 2023
7772ae3
feat: add more tests, make less flaky and viable with large dataset
Taepper Nov 24, 2023
5e3f7bd
Alphabetical dependency order
Taepper Dec 4, 2023
b256bcf
cleaning up unused files and some code edits
Taepper Dec 4, 2023
fc9a063
Better error return codes
Taepper Dec 6, 2023
60c80c2
Remove refactored code
Taepper Dec 6, 2023
67c9ac5
Remove catch blocks with identical behavior
Taepper Dec 6, 2023
a33ac0d
Remove using directives
Taepper Dec 6, 2023
f72dfd5
Code refactoring
Taepper Dec 6, 2023
fa99325
Remove unused functions
Taepper Dec 6, 2023
a677a29
Split up buildPartitioningTable functino
Taepper Dec 6, 2023
cbfbc2c
More concise typing
Taepper Dec 6, 2023
a1fbba8
Remove outdated TODO item
Taepper Dec 6, 2023
1ef4a76
Fail earlier with malformed database config
Taepper Dec 6, 2023
4926906
Refactor preprocessing
Taepper Dec 6, 2023
2867c4f
Code edits
Taepper Dec 11, 2023
61bfa07
Remove obsolete partitioning, preprocessing_config and MetadataReader…
Taepper Dec 12, 2023
5ad66ec
More centralized db logging and error checking, clearer control-flow …
Taepper Dec 13, 2023
5393742
Add ndjson dataset with identical data for endToEndTests
Taepper Dec 13, 2023
274e14d
Split up database validation step after build
Taepper Dec 13, 2023
15b8f17
Extra imports
Taepper Dec 13, 2023
2fea9b2
MetadataInfo proper constructor
Taepper Dec 15, 2023
7f57c4c
Code edits
Taepper Dec 15, 2023
37a1294
Minor polishing edits
Taepper Dec 15, 2023
fabf265
ci: also execute e2e tests with NDJSON as preprocessing input
fengelniederhammer Dec 18, 2023
619035b
Last edits
Taepper Dec 18, 2023
8 changes: 4 additions & 4 deletions CMakeLists.txt
@@ -25,13 +25,13 @@ add_compile_definitions(SPDLOG_ACTIVE_LEVEL=SPDLOG_LEVEL_TRACE)
# ---------------------------------------------------------------------------

find_package(Boost REQUIRED COMPONENTS system serialization iostreams)
find_package(Poco REQUIRED COMPONENTS Net Util JSON)
find_package(duckdb REQUIRED)
find_package(LibLZMA REQUIRED)
find_package(TBB REQUIRED)
find_package(nlohmann_json REQUIRED)
find_package(Poco REQUIRED COMPONENTS Net Util JSON)
find_package(roaring REQUIRED)
find_package(spdlog REQUIRED)
find_package(vincentlaucsb-csv-parser REQUIRED)
find_package(TBB REQUIRED)
find_package(yaml-cpp REQUIRED)
find_package(zstd REQUIRED)

@@ -84,10 +84,10 @@ target_link_libraries(
TBB::tbb
${roaring_LIBRARIES}
${spdlog_LIBRARIES}
${vincentlaucsb-csv-parser_LIBRARIES}
${yaml-cpp_LIBRARIES}
nlohmann_json::nlohmann_json
zstd::libzstd_static
${duckdb_LIBRARIES}
)

add_executable(siloApi src/silo_api/api.cpp ${SRC_SILO_API})
16 changes: 8 additions & 8 deletions Dockerfile
@@ -1,13 +1,13 @@
FROM alpine:3.17.0 AS dep_builder
FROM alpine:3.18 AS dep_builder

RUN apk update && apk add --no-cache py3-pip \
build-base=0.5-r3 \
cmake=3.24.4-r0 \
linux-headers=5.19.5-r0 \
boost-build=1.79.0-r0 \
libtbb=2021.7.0-r0
cmake=3.26.5-r0 \
linux-headers=6.3-r0 \
boost-build=1.82.0-r0 \
libtbb=2021.9.0-r0

RUN pip install conan==2.0.8
RUN pip install conan==2.0.14

WORKDIR /src
COPY conanfile.py conanprofile.docker ./
@@ -32,14 +32,14 @@ RUN \
&& cp build/siloApi .


FROM alpine:3.17.0 AS server
FROM alpine:3.18 AS server

WORKDIR /app
COPY docker_default_preprocessing_config.yaml ./default_preprocessing_config.yaml
COPY docker_runtime_config.yaml ./runtime_config.yaml
COPY --from=builder /src/siloApi ./

RUN apk update && apk add libtbb=2021.7.0-r0 curl jq
RUN apk update && apk add libtbb=2021.9.0-r0 curl jq

# call /info, extract "sequenceCount" from the JSON and assert that the value is not 0. If any of those fails, "exit 1".
HEALTHCHECK --start-period=20s CMD curl --fail --silent localhost:8081/info | jq .sequenceCount | xargs test 0 -ne || exit 1
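The check described in the comment above (fetch `/info`, read `sequenceCount`, fail when it is missing or 0) can be sketched in Python as follows. This is only an illustration of the logic, not code from this PR: the function name `is_healthy` is hypothetical, and the sample payload reuses the values asserted in `endToEndTests/test/info.test.js`.

```python
import json


def is_healthy(info_body: str) -> bool:
    """Return True when an /info response body reports a non-zero sequenceCount.

    Hypothetical helper mirroring `jq .sequenceCount | xargs test 0 -ne`.
    """
    try:
        info = json.loads(info_body)
    except json.JSONDecodeError:
        # Corresponds to curl/jq failing on a malformed response.
        return False
    if not isinstance(info, dict):
        return False
    count = info.get("sequenceCount")
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(count, int) and not isinstance(count, bool) and count != 0


sample = '{"nBitmapsSize": 3898, "sequenceCount": 100, "totalSize": 26589432}'
healthy = is_healthy(sample)
empty_db = is_healthy('{"sequenceCount": 0}')
```

With the sample payload the check passes; an empty database, a missing key, or an unparsable body all count as unhealthy, which matches the `|| exit 1` fallback.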
4 changes: 2 additions & 2 deletions Dockerfile_linter
@@ -5,7 +5,7 @@ WORKDIR /src
RUN apt update \
&& apt install -y \
cmake=3.22.1-1ubuntu1.22.04.1 \
python3-pip=22.0.2+dfsg-1ubuntu0.3 \
python3-pip=22.0.2+dfsg-1ubuntu0.4 \
software-properties-common=0.99.22.7 \
wget=1.21.2-2ubuntu1 \
gnupg=2.2.27-3ubuntu2.1 \
@@ -14,7 +14,7 @@ RUN apt update \
&& add-apt-repository 'deb http://apt.llvm.org/jammy/ llvm-toolchain-jammy main' \
&& apt install -y clang-tidy

RUN pip install conan==2.0.8
RUN pip install conan==2.0.11

COPY conanfile.py conanprofile.docker ./
RUN mv conanprofile.docker conanprofile
2 changes: 1 addition & 1 deletion build_with_conan.py
@@ -65,7 +65,7 @@ def main(args):
parser.add_argument("--clean", action="store_true", help="Clean build directory before building")
parser.add_argument("--release", action="store_true", help="Trigger RELEASE build")
parser.add_argument("--build_without_clang_tidy", action="store_true", help="Build without clang-tidy")
parser.add_argument("--parallel", type=int, default=1, help="Number of parallel jobs")
parser.add_argument("--parallel", type=int, default=16, help="Number of parallel jobs")

args_parsed = parser.parse_args()
main(args_parsed)
20 changes: 14 additions & 6 deletions conanfile.py
@@ -7,13 +7,14 @@ class SiloRecipe(ConanFile):

requires = [
"boost/1.82.0",
"duckdb/0.8.1",
"poco/1.12.4",
"hwloc/2.9.3",
"onetbb/2021.9.0",
"nlohmann_json/3.11.2",
"gtest/cci.20210126",
"roaring/1.0.0",
"spdlog/1.11.0",
"vincentlaucsb-csv-parser/2.1.3",
"yaml-cpp/0.7.0",
"zstd/1.5.5",
]
@@ -23,6 +24,10 @@ class SiloRecipe(ConanFile):

"zstd/*:shared": False,

"duckdb/*:shared": False,
"duckdb/*:with_json": True,
"duckdb/*:with_parquet": True,

"roaring/*:shared": False,

"gtest/*:no_main": True,
@@ -31,6 +36,8 @@ class SiloRecipe(ConanFile):
"boost/*:zstd": True,
"boost/*:shared": False,

"hwloc/*:shared": False,

"boost/*:without_iostreams": False,
"boost/*:without_serialization": False,
"boost/*:without_system": False,
@@ -88,15 +95,16 @@ class SiloRecipe(ConanFile):
def generate(self):
deps = CMakeDeps(self)
deps.set_property("boost", "cmake_find_mode", "both")
deps.set_property("onetbb", "cmake_find_mode", "both")
deps.set_property("poco", "cmake_find_mode", "both")
deps.set_property("nlohmann_json", "cmake_find_mode", "both")
deps.set_property("duckdb", "cmake_find_mode", "both")
deps.set_property("fmt", "cmake_find_mode", "both")
deps.set_property("gtest", "cmake_find_mode", "both")
deps.set_property("hwloc", "cmake_find_mode", "both")
deps.set_property("nlohmann_json", "cmake_find_mode", "both")
deps.set_property("onetbb", "cmake_find_mode", "both")
deps.set_property("pcre2", "cmake_find_mode", "both")
deps.set_property("poco", "cmake_find_mode", "both")
deps.set_property("roaring", "cmake_find_mode", "both")
deps.set_property("spdlog", "cmake_find_mode", "both")
deps.set_property("fmt", "cmake_find_mode", "both")
deps.set_property("vincentlaucsb-csv-parser", "cmake_find_mode", "both")
deps.set_property("yaml-cpp", "cmake_find_mode", "both")
deps.set_property("zstd", "cmake_find_mode", "both")
deps.generate()
46 changes: 23 additions & 23 deletions endToEndTests/test/info.test.js
@@ -8,7 +8,7 @@ describe('The /info endpoint', () => {
.expect(200)
.expect('Content-Type', 'application/json')
.expect(headerToHaveDataVersion)
.expect({ nBitmapsSize: 3898, sequenceCount: 100, totalSize: 60054981 })
.expect({ nBitmapsSize: 3898, sequenceCount: 100, totalSize: 26589432 })
.end(done);
});

@@ -27,15 +27,15 @@ describe('The /info endpoint', () => {
'bitmapContainerSizeStatistic'
);
expect(returnedInfo.bitmapContainerSizePerGenomeSection.bitmapContainerSizeStatistic).to.deep.equal({
numberOfArrayContainers: 43540,
numberOfArrayContainers: 48524,
numberOfBitsetContainers: 0,
numberOfRunContainers: 83,
numberOfValuesStoredInArrayContainers: 59577,
numberOfRunContainers: 284,
numberOfValuesStoredInArrayContainers: 66620,
numberOfValuesStoredInBitsetContainers: 0,
numberOfValuesStoredInRunContainers: 2354,
totalBitmapSizeArrayContainers: 119154,
numberOfValuesStoredInRunContainers: 2875,
totalBitmapSizeArrayContainers: 133240,
totalBitmapSizeBitsetContainers: 0,
totalBitmapSizeRunContainers: 3170,
totalBitmapSizeRunContainers: 4824,
});

expect(returnedInfo.bitmapContainerSizePerGenomeSection).to.have.property(
@@ -62,22 +62,22 @@ describe('The /info endpoint', () => {

expect(returnedInfo).to.have.property('bitmapSizePerSymbol');
expect(returnedInfo.bitmapSizePerSymbol).to.deep.equal({
'-': 6003470,
'A': 6112653,
'B': 5980600,
'C': 6064589,
'D': 5980600,
'G': 6067672,
'H': 5980600,
'K': 5980630,
'M': 5980620,
'N': 5980600,
'R': 5980620,
'S': 5980600,
'T': 6125253,
'V': 5980600,
'W': 5980600,
'Y': 5980620,
'-': 2661831,
'A': 2775910,
'B': 2631464,
'C': 2725728,
'D': 2631464,
'G': 2728118,
'H': 2631464,
'K': 2631594,
'M': 2631554,
'N': 2631464,
'R': 2631514,
'S': 2631464,
'T': 2791923,
'V': 2631464,
'W': 2631514,
'Y': 2631494,
});
})
.expect(headerToHaveDataVersion)
13 changes: 7 additions & 6 deletions endToEndTests/test/queries/fastaAligned_multiple.json
@@ -3,7 +3,8 @@
"query": {
"action": {
"type": "FastaAligned",
"sequenceName": ["testSecondSequence", "S"]
"sequenceName": ["testSecondSequence", "S"],
"orderByFields": ["gisaid_epi_isl"]
},
"filterExpression": {
"type": "IntBetween",
@@ -18,16 +19,16 @@
"gisaid_epi_isl": "EPI_ISL_1408408",
"testSecondSequence": "ACGT"
},
{
"S": "MFVFLVLLPLVSSQCVNLITRTQ---SYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTPINLGRDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFDEVFNATRFASVYAWNRKRISNCVADYSVLYNFAPFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGNIADYNYKXXXXXXXXXXXXXXNKLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGNKPCNGVAGFNCYFPLRSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT*",
"gisaid_epi_isl": "EPI_ISL_1749899",
"testSecondSequence": "AAGN"
},
{
"S": "MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXSNIXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXLSETKCTLKSFTVEKXXXXTSNFRVQPTESIVRFPNITNLCPFDEVFNATKFASVYAWNRKRIXXXXADYSVLYNLAPFFTFKCYGVSPTKLNDLXXXXXXXDSFVIRGDEVRQIAPGQXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFKGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXAQALNTLVKQLSSKFGAISSVLNDIFSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRXAEIRASANLAATKMSECVLGQSKRVDFCXXXXXLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPRXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT*",
"gisaid_epi_isl": "EPI_ISL_1749892",
"testSecondSequence": "ACGT"
},
{
"S": "MFVFLVLLPLVSSQCVNLITRTQ---SYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTPINLGRDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFDEVFNATRFASVYAWNRKRISNCVADYSVLYNFAPFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGNIADYNYKXXXXXXXXXXXXXXNKLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGNKPCNGVAGFNCYFPLRSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT*",
"gisaid_epi_isl": "EPI_ISL_1749899",
"testSecondSequence": "AAGN"
},
{
"S": "MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAI--SGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLGV-YHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTPINLVRDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPTYGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIDDTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEHVNNSYECDIPIGAGICASYQTQTNSHRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPINFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLNRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKDFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNQNAQALNTLVKQLSSNFGAISSVLNDILARLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTHNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT*",
"gisaid_epi_isl": "EPI_ISL_2016901",
29 changes: 15 additions & 14 deletions endToEndTests/test/queries/nOf_2of3_details.json
@@ -2,7 +2,8 @@
"testCaseName": "N-Of query requesting 2 of 3 mutations with details action",
"query": {
"action": {
"type": "Details"
"type": "Details",
"orderByFields": ["gisaid_epi_isl"]
},
"filterExpression": {
"type": "N-Of",
@@ -28,6 +29,19 @@
}
},
"expectedQueryResult": [
{
"aaInsertions": null,
"age": 50,
"country": "Switzerland",
"date": "2020-11-13",
"division": "Solothurn",
"gisaid_epi_isl": "EPI_ISL_1005148",
"insertions": "25701:CCC",
"pango_lineage": "B.1.221",
"qc_value": 0.92,
"region": "Europe",
"unsorted_date": "2020-12-17"
},
{
"aaInsertions": null,
"age": 50,
@@ -66,19 +80,6 @@
"qc_value": 0.9,
"region": "Europe",
"unsorted_date": "2021-01-22"
},
{
"aaInsertions": null,
"age": 50,
"country": "Switzerland",
"date": "2020-11-13",
"division": "Solothurn",
"gisaid_epi_isl": "EPI_ISL_1005148",
"insertions": "25701:CCC",
"pango_lineage": "B.1.221",
"qc_value": 0.92,
"region": "Europe",
"unsorted_date": "2020-12-17"
}
]
}
11 changes: 6 additions & 5 deletions endToEndTests/test/queries/nOf_2of3_details_selection.json
@@ -3,7 +3,8 @@
"query": {
"action": {
"type": "Details",
"fields": ["age", "pango_lineage"]
"fields": ["age", "pango_lineage"],
"orderByFields": ["age", "pango_lineage"]
},
"filterExpression": {
"type": "N-Of",
@@ -33,17 +34,17 @@
"age": 50,
"pango_lineage": "B.1.1.7"
},
{
"age": 50,
"pango_lineage": "B.1.221"
},
{
"age": 54,
"pango_lineage": "B.1.1.7"
},
{
"age": 58,
"pango_lineage": "B.1.1.7"
},
{
"age": 50,
"pango_lineage": "B.1.221"
}
]
}
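The reordering in the expected result above follows from the new `orderByFields` entry: rows are compared on `age` first and on `pango_lineage` as a tie-breaker, which makes the test output deterministic. A minimal Python sketch of that ordering, using only the four rows visible in this hunk, is shown below. The helper `order_by_fields` is hypothetical and says nothing about how SILO implements ordering internally; in particular it does not handle null values.

```python
def order_by_fields(rows, fields):
    """Sort dict rows lexicographically by the listed fields (illustrative only)."""
    return sorted(rows, key=lambda row: tuple(row[field] for field in fields))


# Rows in the order they appeared in the expected result before this change.
rows_before = [
    {"age": 50, "pango_lineage": "B.1.1.7"},
    {"age": 54, "pango_lineage": "B.1.1.7"},
    {"age": 58, "pango_lineage": "B.1.1.7"},
    {"age": 50, "pango_lineage": "B.1.221"},
]

rows_after = order_by_fields(rows_before, ["age", "pango_lineage"])
```

Here `rows_after` places the `B.1.221` row second, directly after the other row with `age` 50, because the string `"B.1.1.7"` sorts before `"B.1.221"`.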
2 changes: 2 additions & 0 deletions include/silo/common/date.h
@@ -8,6 +8,8 @@ namespace silo::common {

typedef uint32_t Date;

const Date NULL_DATE = 0;

silo::common::Date stringToDate(const std::string& value);

std::optional<std::string> dateToString(silo::common::Date date);
30 changes: 0 additions & 30 deletions include/silo/common/zstd_compressor.h

This file was deleted.
