EES-5152 Add Public API Data Processor tests for ImportMetadata and ImportData
#4971
This PR adds further tests to the Public API Data Processor's ImportMetadata and ImportData activities.
It makes a change to the ProcessorTestData class to support adding sets of expected test data. At present there is only one data set: AbsenceSchool. The intention is that further sets can easily be added if required without writing new tests, as no tests are tied to specific data.
ImportMetadata tests
Two different sets of tests were added here:
Also assert the version has the correct DataSetVersionMetaSummary and correct TotalResults.
ImportData tests
Tests are added to assert the DuckDb data table has the correct row count, the expected columns (base columns Id, GeographicLevel and TimePeriodId, and all of the flexible location, filter and indicator columns), and that the data table contains the correct distinct set of values for time periods, geographic levels, location options and filter options.
WriteDataFiles tests
No further tests were written for the WriteDataFiles activity. This is because we are now testing the contents of the DuckDb metadata and data tables, and the existing ProcessInitialDataSetVersionFunctionTests.WriteDataFilesTests.Success test already checks that the table files which should be exported are present on the filesystem.
Any additional tests to check the data would either duplicate the tests added for the ImportMetadata or ImportData activities, or test the correctness of DuckDb's own export feature.
Further considerations
This PR fixes a problem where the PublicId Sqids were offset by one from the Id/Index that we expect them to be generated from. However, thinking about this has identified another problem: because the table counts are used as the starting index for inserts, any deletions from these tables can result in newer data set versions reusing old PublicIds. This will be addressed by EES-5235.
The test data set is not large enough to fully exercise the batching that's used to generate filter options and location options, so I was unsure whether dedicated tests should be written for this or whether a larger test data set should be added. This has been left for now.
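The PublicId reuse problem described above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the processor's code: the table is a plain list, next_index_from_count and insert are hypothetical helpers, and the raw integer index stands in for the value that would be encoded into a Sqid.

```python
def next_index_from_count(table):
    """Hypothetical allocator: the next index is derived from the current row count."""
    return len(table) + 1


def insert(table, label):
    """Insert a row and return the index it was given."""
    index = next_index_from_count(table)
    table.append({"index": index, "label": label})
    return index


# A first data set version inserts three options, which receive indexes 1, 2 and 3.
options = []
issued = [insert(options, label) for label in ("A", "B", "C")]

# A deletion shrinks the table, so the row count no longer reflects
# the highest index that has been issued.
options[:] = [row for row in options if row["label"] != "B"]

# A later version inserts a new option and is handed an index that was already issued.
reused = insert(options, "D")
collision = reused in issued
```

In this sketch the new option receives index 3, which was already issued to option C. Basing the starting index on the highest index ever issued rather than on the row count is one way to avoid this; how EES-5235 will actually address it is not covered here.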
The tests are not tied to any specific data set, but this means the ImportData tests are not testing the integrity of specific rows. They only test that the columns (excluding indicator columns) contain the expected distinct set of labels and ids. We might want to do additional work to compare expected row values, either for all rows or for a small subset of rows, e.g. the first row.
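To illustrate the gap described above, here is a minimal Python sketch. It is not the project's test code: the rows are made up, the column names simply mirror the base columns mentioned earlier, and distinct_values is a hypothetical helper. It shows that a distinct-values check still passes when values end up on the wrong rows, whereas comparing even the first row catches the problem.

```python
def distinct_values(rows, column):
    """Return the distinct set of values found in one column."""
    return {row[column] for row in rows}


# Made-up expected rows for illustration only.
expected = [
    {"Id": 1, "GeographicLevel": "NAT", "TimePeriodId": 1},
    {"Id": 2, "GeographicLevel": "LA", "TimePeriodId": 2},
]

# Imported rows with the GeographicLevel values swapped between the two rows.
actual = [
    {"Id": 1, "GeographicLevel": "LA", "TimePeriodId": 1},
    {"Id": 2, "GeographicLevel": "NAT", "TimePeriodId": 2},
]

columns = ("GeographicLevel", "TimePeriodId")

# Column-level check: passes, because each column holds the same distinct values.
distinct_check_passes = all(
    distinct_values(actual, column) == distinct_values(expected, column)
    for column in columns
)

# Row-level check on the first row: fails, exposing the swapped values.
first_row_matches = actual[0] == expected[0]
```

Comparing only the first row keeps the expected test data small, while comparing all rows gives the strongest guarantee at the cost of maintaining a full expected data set.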