-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JSON Catalog Reader for Integrations #1288
Conversation
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #1288 +/- ##
==========================================
+ Coverage 52.92% 53.72% +0.79%
==========================================
Files 302 314 +12
Lines 10587 11267 +680
Branches 2777 2941 +164
==========================================
+ Hits 5603 6053 +450
- Misses 4939 5166 +227
- Partials 45 48 +3
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
The existing serialization implementation was not able to recover all data, as mapping information associating file names with the content was lost in appending the file content at the end of the JSON object. In order to work around this, a new serialization method needs to add this association. The most viable solution I see is to add encoded file data right next to the file names within the config, which will make it slightly harder to read if word wrap is enabled, but otherwise solves the issue. Going to clear this method out entirely for now and restore old functionality with TDD. Alternative solution: We can make a loose association by making heavy assumptions about array ordering and with complicated parsing logic, but I'm not sure it's worth it. Better to just go back to the drawing board and make a serialization method that's more flexible. Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
When I noticed that adaptors had to handle version sorting, I started writing a test to verify the versions were being sorted. While verifying the test, I noticed a weird result where 2.0.0 was being sorted after both 1.0.0 and 2.1.0. After moving the comparison function over to integration_reader where it should be and writing tests, I saw the method was falsely saying 2.0.0 and 2.1.0 were equal. Debugging the comparison, I found that the version '1.2.3' was being parsed as [1, NaN, NaN]. Several more minutes of banging my head against my desk, and I realized that the suspect was this line: `a.split('.').map(Number.parseInt)`. Because parseInt takes two arguments (number and radix), map was giving it the part *and the index*. The error went away after fixing the method to use only the part. Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
@Swiddis Thanks for the detailed overview and especially for the self retrospective - a great reflection on the working process and lessons learned during the development and review |
@Swiddis can you also please add a dedicated markdown doc describing in a very broad terms what is the purpose of this dynamic catalog and how one should be able to create such a catalog zip file ? |
I've put this in its own issue #1354 and will address it in a future PR. |
savedObjects?: { | ||
name: string; | ||
version: string; | ||
data: string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to keep savedObjects as strings or save them with type and data schema for more clarity?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Saving them as strings allows for more uniformity in parsing logic and has less validation complexity. We can fall back to OSD core's validation for parsing saved objects since I've switched to their parser, and the schema format would also be tricky to define (we generally just parse it and send it to the backend -- maybe the validation logic for that is exported too?). It might be more clear and play more nicely with autoformatting, but I'm not sure if it's worth the added effort given our use case. 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In manager test.ts I see lint issues not related to changes from this PR, but is there a way to formalize type for the responses (instead of any) as these are from node server and not OpenSearch?
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor changes requested.
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
I've created a separate issue for addressing remaining dormant lints in Integrations #1363, will look more at these later. I also have a task in my queue for formalizing the API types, so I'll address both of these points in future PRs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @Swiddis, thanks for putting this together. In generally this PR LGTM. I just left some minor conceptual questions. Feel free to prioritize the other code change requests first.
*/ | ||
async deepCheck(): Promise<Result<IntegrationConfig>> { | ||
const configResult = await this.getConfig(); | ||
if (!configResult.ok) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious what are the configResult
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type Result<T, E = Error> =
| { ok: true; value: T; error?: undefined }
| { ok: false; error: E; value?: undefined };
Originally introduced to use for validation though it's spread a bit to the rest of the code, this result is for whether the provided config is correct after validation. It either stores the config, or an error describing what's wrong with it.
return { ok: false, error: new Error(`${this.directory} is not a valid integration`) }; | ||
} | ||
|
||
const maybeVersion: string | null = version ? version : await this.getLatestVersion(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reference for this maybeVersion
? I saw that the raw config was trying to handle the case of this version is not valid, but I would just like to know the idea behind this. Is it possible to have a version declared but not valid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just validation. Integrations should always have at least one valid version, but there are configuration error scenarios where this might not be true (where we return an error on the next line).
} | ||
|
||
async findIntegrations(dirname: string = '.'): Promise<Result<string[]>> { | ||
if (dirname !== '.') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that we are currently only support finding integrations in the current directory. Just for my knowledge what is the path of this directory in the project?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Directories are relative to the root where the reader is defined which can vary. This is more relevant with FS readers. JSON readers don't technically support directories, but it's provided by the interface, so we filter out any directory changes. It's part of the issue of poor abstraction that I mentioned in the PR description.
join(filename: string): JsonCatalogDataAdaptor { | ||
// In other adaptors, joining moves from directories to integrations. | ||
// Since for JSON catalogs we use a flat structure, we just filter. | ||
return new JsonCatalogDataAdaptor(this.integrationsList.filter((i) => i.name === filename)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I was wondering do we have to deal with the case of having a huge list of integrationList
? Cuz I saw that we are creating a new instance each time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IntegrationReaders are pointers, so they're relatively light to keep in memory and filter. I don't predict ever seeing more than a thousand or so integrations being handled in one JSON read, so it shouldn't have any significant issues with performance. Can revisit later if better scaling proves useful, but I don't think this will be the bottleneck if we start to run into issues.
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
* Add full statics scan method Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add sample data to serialization method Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Finish integration serialization Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Rename integration reader and add stub json adaptor Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add JsonCatalogDataAdaptor directory type checking Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add integration version filtering to json_data_adaptor Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Remove unviable serialization implementation The existing serialization implementation was not able to recover all data, as mapping information associating file names with the content was lost in appending the file content at the end of the JSON object. In order to work around this, a new serialization method needs to add this association. The most viable solution I see is to add encoded file data right next to the file names within the config, which will make it slightly harder to read if word wrap is enabled, but otherwise solves the issue. Going to clear this method out entirely for now and restore old functionality with TDD. Alternative solution: We can make a loose association by making heavy assumptions about array ordering and with complicated parsing logic, but I'm not sure it's worth it. Better to just go back to the drawing board and make a serialization method that's more flexible. Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add new static serialization Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add component and sample data serialization Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Fix serialization test for new serialization approach Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add new asset serialization Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Fix broken compareVersions comparison and version sorting When I noticed that adaptors had to handle version sorting, I started writing a test to verify the versions were being sorted. While verifying the test, I noticed a weird result where 2.0.0 was being sorted after both 1.0.0 and 2.1.0. After moving the comparison function over to integration_reader where it should be and writing tests, I saw the method was falsely saying 2.0.0 and 2.1.0 were equal. Debugging the comparison, I found that the version '1.2.3' was being parsed as [1, NaN, NaN]. Several more minutes of banging my head against my desk, and I realized that the suspect was this line: `a.split('.').map(Number.parseInt)`. Because parseInt takes two arguments (number and radix), map was giving it the part *and the index*. The error went away after fixing the method to use only the part. Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add ability to read config file from json integration Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add rudimentary integration search to JSONAdaptor Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Create readRawConfig method to avoid data pruning Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add localized config parsing to getAssets Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Convert getSampleData to use localized config Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Move localized reader logic to helper method Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Convert getSchemas to use new data retrieval Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add localized config handling for getStatic Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add basic validation test for JSON reader Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add static validation to integration reader Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add more tests for deepCheck on JSON catalog Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add tests for json adaptor edge cases Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add reserialization tests Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add tests and remove redundant error checks for 90% coverage Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Fix lints in repository.test.ts Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Fix lints for integrations_builder.ts Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Update snapshots Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Revert "Update snapshots" This reverts commit 62b238d. Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Use semver library for comparison Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Switch to upstream NDJson parser Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add inline documentation for IntegrationPart Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Move deepCheck to utils Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add further utility methods to utils Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Refactor repository tests to not rely on result.ok == false Signed-off-by: Simeon Widdis <sawiddis@amazon.com> --------- Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
* Add full statics scan method Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add sample data to serialization method Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Finish integration serialization Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Rename integration reader and add stub json adaptor Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add JsonCatalogDataAdaptor directory type checking Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add integration version filtering to json_data_adaptor Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Remove unviable serialization implementation The existing serialization implementation was not able to recover all data, as mapping information associating file names with the content was lost in appending the file content at the end of the JSON object. In order to work around this, a new serialization method needs to add this association. The most viable solution I see is to add encoded file data right next to the file names within the config, which will make it slightly harder to read if word wrap is enabled, but otherwise solves the issue. Going to clear this method out entirely for now and restore old functionality with TDD. Alternative solution: We can make a loose association by making heavy assumptions about array ordering and with complicated parsing logic, but I'm not sure it's worth it. Better to just go back to the drawing board and make a serialization method that's more flexible. Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add new static serialization Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add component and sample data serialization Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Fix serialization test for new serialization approach Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add new asset serialization Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Fix broken compareVersions comparison and version sorting When I noticed that adaptors had to handle version sorting, I started writing a test to verify the versions were being sorted. While verifying the test, I noticed a weird result where 2.0.0 was being sorted after both 1.0.0 and 2.1.0. After moving the comparison function over to integration_reader where it should be and writing tests, I saw the method was falsely saying 2.0.0 and 2.1.0 were equal. Debugging the comparison, I found that the version '1.2.3' was being parsed as [1, NaN, NaN]. Several more minutes of banging my head against my desk, and I realized that the suspect was this line: `a.split('.').map(Number.parseInt)`. Because parseInt takes two arguments (number and radix), map was giving it the part *and the index*. The error went away after fixing the method to use only the part. Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add ability to read config file from json integration Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add rudimentary integration search to JSONAdaptor Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Create readRawConfig method to avoid data pruning Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add localized config parsing to getAssets Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Convert getSampleData to use localized config Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Move localized reader logic to helper method Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Convert getSchemas to use new data retrieval Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add localized config handling for getStatic Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add basic validation test for JSON reader Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add static validation to integration reader Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add more tests for deepCheck on JSON catalog Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add tests for json adaptor edge cases Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add reserialization tests Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add tests and remove redundant error checks for 90% coverage Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Fix lints in repository.test.ts Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Fix lints for integrations_builder.ts Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Update snapshots Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Revert "Update snapshots" This reverts commit 62b238d. Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Use semver library for comparison Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Switch to upstream NDJson parser Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add inline documentation for IntegrationPart Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Move deepCheck to utils Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Add further utility methods to utils Signed-off-by: Simeon Widdis <sawiddis@amazon.com> * Refactor repository tests to not rely on result.ok == false Signed-off-by: Simeon Widdis <sawiddis@amazon.com> --------- Signed-off-by: Simeon Widdis <sawiddis@amazon.com> (cherry picked from commit 6bd872c) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Description
This PR implements the JSON Catalog Reader described in #1243 as part of the work on dynamic catalogs (#1253). There's a lot of groundwork here mixed in with refactoring from new lint CI, leading to a large diff. The main parts to look at are:
Adding Localized Config handling to
integration_reader.ts
: This is a diff on the previousintegration.ts
that allows working with non-FS-based integration readers, GH diff shows me deleting one file and creating a new one due to lots of additions. The logic is mixed in on top of where file reads previously were to avoid file reading, landing on this solution involved a lot of trial and error with putting file information next to integration content. Future work includes simplifying this flow so that the newfetchDataOrReadFile
method can handle more heavy lifting, but all future adaptors should be able to fit within the behavior offetchDataOrReadFile
.Integration Serialization: This is handled by the
serialize
method inintegration_reader.ts
, and is also a very important addition. This describes the serialization format for integrations that will be used for JSON catalog bundles and storing installed integrations as.kibana
objects. The serialization approach is to add adata
field by every remote resource which can be read byIntegrationReader
. The adaptor fieldisConfigLocalized
dictates whether the reader should use this field or should request resources from the adaptor.Add
json_data_adaptor.ts
: What was supposed to be the core section when starting, but ended up being a relatively small amount of the final logic due to framework reworking needed to accomodate it. This supports some basic integration reading primitives. Originally this was supposed to support everything that the CatalogReader interface handles, but the abstraction ended up being tricky, soIntegrationReader
took most of the damage. Future readers shouldn't be much more complicated than this orfs_data_adaptor
.90% test coverage: Getting here is also a large amount of the line count, many of these tests are for the new functionality but there are also some additional tests on old functionality. Subset of test coverage for these changes here:
Lessons were Learned
This turned out to be a much harder problem than originally anticipated, boiling down primarily to the poor abstraction of
CatalogDataAdaptor
introduced in #947. This was correctly identified as an issue in PR feedback but I didn't understand it at the time, so it ended up being a thorn later on. The current implementation is still something of a hacky workaround to avoid a more significant rewrite. There's still a question on whether to accept the current working solution or keep pushing for something cleaner. To avoid more delays/a bigger PR I think it's worthwhile to merge this as-is and apply the boy scout rule to improve it over time.There also was a fun bug addressed in 2d59505, which took a while to debug and got its own mini-writeup in the commit message. A lesson in managing function parameters.
Issues Resolved
Resolves #1243. Wider context available at #1253.
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.