Pipeline registry #3

afoucret · 2023-04-05T14:22:32Z

Still a WIP.

* Create a utility class, PipelineRegistry, that can be used to manage ingest pipelines.
* Implement a pipeline registry for behavioral analytics, AnalyticsIngestPipelineRegistry.
* Adapt the pipeline to the new event model.

When the credentials fail to verify, the error message does not say which API key failed. This makes it hard for users to fix the issue. This PR adds the API key ID to the error message to help with the situation.

…ngTo100ms (elastic#95018) This change enables allocation trace logging to be able to debug occasional CI test failures.

…version (elastic#92823)" (elastic#95016) This reverts commit 8d60562. Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

This PR avoids an extra de-serialization of role descriptors received in a cross cluster access request, by pushing the validation down to the role building step (where we necessarily de-serialize the received role descriptors). This also has the effect that we return a `400` instead of a `401`. I could wrap the exception so that we return a `403` instead, but I think a `400` makes the most sense, since we received a bad payload. Currently, this failure is _not_ audited. I can add logic to detect it in [`authorize()`](https://github.com/elastic/elasticsearch/blob/b17dfc77b9c48313921aaafa9a9e3da3e2739fd8/x-pack/plugin/security/src/main/java/org/elasticsearch/xpack/security/authz/AuthorizationService.java#L317) and emit an audit event, either in a follow-up or in this PR. I just didn't want that to block review across time zones.

* Update privileges.asciidoc to add descriptions for some privileges
  SF Case - 01341015: requested to document the cluster privileges that are not explained at https://www.elastic.co/guide/en/elasticsearch/reference/master/security-privileges.html. I typed descriptions, but they probably need to be corrected.
  - manage_autoscaling
  - manage_data_frame_transforms
  - manage_enrich
* Update x-pack/docs/en/security/authorization/privileges.asciidoc Co-authored-by: Yang Wang <yang.wang@elastic.co>
* Update x-pack/docs/en/security/authorization/privileges.asciidoc Co-authored-by: Yang Wang <yang.wang@elastic.co>
* Apply review suggestion
---------
Co-authored-by: Yang Wang <yang.wang@elastic.co>
Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>

…5034) We don't use this constructor parameter any more, so this commit removes it.

* Update release notes to include 8.7.0
  Release notes and migration guide from the 8.7.0 release are ported into main, and the 8.8.0 release notes are regenerated. The regenerated 8.8.0 notes will be overwritten anyway, multiple times, by more up-to-date regenerations during the release process.
* Remove coming 8.7.0 line
* Update docs/reference/migration/migrate_8_7.asciidoc Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
* Make same change to 8.8
---------
Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>

In elastic#94325 we introduced another forking step when submitting a publication, so we must extend the timeout in this test (and `DEFAULT_CLUSTER_STATE_UPDATE_DELAY`) by `DEFAULT_DELAY_VARIABILITY`. Closes elastic#94905

A handful of small changes to make the logging output of `CoordinatorTests` even more deterministic, for easier diffing. Relates elastic#94946

…c#95033) These objects take a route to/from their wire representation via an array, which is unnecessary. It's the same bytes on the wire as a list, so we can just use a list instead. Co-authored-by: Ievgen Degtiarenko <ievgen.degtiarenko@gmail.com>

Fix an off-by-one bug when seeking to the first byte in the next page.

…stic#94517) This changes the serialization format for queries - when the index version is >=8.8.0, it serializes the actual transport version used into the stream. For BwC with old query formats, it uses the mapped TransportVersion for the index version. This can be modified later if needed to re-interpret the vint used to store TransportVersion to something else, allowing the format to be further modified if necessary.

If all the reads before the final one happen via `compareAndExchangeRegister` then the final one might find `firstRegisterRead` to be set still, permitting it to fail. This commit treats calls to `compareAndExchangeRegister` as reads too, avoiding this problem. Closes elastic#94664

This test is supposed to trigger a failure by exposing a spurious value for the register, but sometimes it exposes `expectedMax` which is what we expect at the end of the register checks. With this commit we ensure that we don't inadvertently return a correct value. Closes elastic#94410

This change sets the stability of ent-search APIs to beta and their visibility to public. It also removes the feature flag link, since enabling the module is not considered a feature flag and the module is enabled by default.

Note that we use the encoding as follows:
* for values taking [33, 40] bits per value, encode using 40 bits per value
* for values taking [41, 48] bits per value, encode using 48 bits per value
* for values taking [49, 56] bits per value, encode using 56 bits per value
This is an improvement over the encoding used by ForUtils, which does not apply any compression for values taking more than 32 bits per value. Note that 40, 48 and 56 bits per value represent exact multiples of bytes (40 bits per value = 5 bytes, 48 bits per value = 6 bytes and 56 bits per value = 7 bytes). As a result, we always write values using 3, 2 or 1 byte less than the 8 bytes required for a long value. We also apply compression to gauge metrics under the assumption that compressing values taking more than 32 bits per value works well for floating point values, because of the way floating point values are represented (IEEE 754 format).
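The byte-aligned rounding described above can be sketched roughly as follows. This is only an illustration under assumptions: the class name `ByteAlignedEncodingSketch`, its method names, and the little-endian byte order are hypothetical and are not taken from the actual change.

```java
// Illustrative sketch only; not the Elasticsearch implementation.
// The class name, method names and little-endian byte order are assumptions.
final class ByteAlignedEncodingSketch {

    /** Rounds a bit width in [33, 56] up to the next multiple of 8 (40, 48 or 56). */
    static int roundUpBitsPerValue(int bitsPerValue) {
        if (bitsPerValue <= 32 || bitsPerValue > 56) {
            throw new IllegalArgumentException("expected 33..56 bits, got " + bitsPerValue);
        }
        return ((bitsPerValue + 7) / 8) * 8;
    }

    /** Writes only the low-order 5, 6 or 7 bytes of each value, least significant byte first. */
    static byte[] encode(long[] values, int bitsPerValue) {
        int bytesPerValue = roundUpBitsPerValue(bitsPerValue) / 8;
        byte[] out = new byte[values.length * bytesPerValue];
        int pos = 0;
        for (long v : values) {
            for (int b = 0; b < bytesPerValue; b++) {
                out[pos++] = (byte) (v >>> (8 * b));
            }
        }
        return out;
    }

    /** Reverses encode(): reassembles each long from its 5, 6 or 7 stored bytes. */
    static long[] decode(byte[] in, int bitsPerValue) {
        int bytesPerValue = roundUpBitsPerValue(bitsPerValue) / 8;
        long[] values = new long[in.length / bytesPerValue];
        int pos = 0;
        for (int i = 0; i < values.length; i++) {
            long v = 0;
            for (int b = 0; b < bytesPerValue; b++) {
                v |= (in[pos++] & 0xFFL) << (8 * b);
            }
            values[i] = v;
        }
        return values;
    }

    public static void main(String[] args) {
        long[] values = { 1L << 33, (1L << 40) - 1, 123_456_789_012L };
        byte[] encoded = encode(values, 40);
        System.out.println(encoded.length + " bytes instead of " + values.length * Long.BYTES);
        System.out.println(java.util.Arrays.equals(values, decode(encoded, 40)));
    }
}
```

With 40 bits per value, each long occupies 5 bytes rather than 8, which is the 3-byte saving mentioned above.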

* Remove extra step in manual downsampling docs * create -> view

* Enhanced REST tests for geo and cartesian centroid
  Coverage increased to cover cases for:
  * centroid over points
  * centroid over shapes
  * centroid over points with filter
  * centroid over shapes with filter
  * centroid over points with grouping
  * centroid over shapes with grouping
  * centroid over shapes with grouping and filter
  The last one was not done for points because the purpose of that test was primarily to validate the shape rules, where centroids over GEOMETRYCOLLECTION would use only the highest-dimensionality geometries for centroid calculation.
* Enforce single shard
  This reduces the risk of flakiness when aggregating over multiple documents.

…rk (elastic#95048) There's no reason to prefix ops with their size over the network. We can verify the checksum once after reading each op and just stream the ops when writing.

When parsing role descriptors, we ensure that the FieldPermissions (`"field_security":{ "grant":[ ... ], "except":[ ... ] }`) are valid - that is that any patterns compile correctly, and the "except" is a subset of the "grant". However, the previous implementation would not use the FieldPermissionsCache for this, so it would compile (union, intersect & minimize) automatons every time a role was parsed. This was particularly an issue when parsing roles (from the security index) in the GET /_security/role/ endpoint. If there were a large number of roles with field level security the automaton parsing could have significant impact on the performance of this API.

Pushes the chunking of `GET _nodes/stats` down to avoid creating unboundedly large chunks. With this commit we yield one chunk per shard (if `?level=shards`) or index (if `?level=indices`) and per HTTP client and per transport action. Closes elastic#93985

We no longer treat terminate_after as a filtered collector when collecting hits, as was already the case when size is set to 0. This means that we may shortcut the total hit count when terminate_after is used, so the returned total hit count is retrieved from the index statistics and is not early-terminated, even though the actual collection of hits does terminate early. The corresponding test needs to be updated based on the new expectations. Closes elastic#94912

All the length implementations are the same so we can dry this up which might provide a speedup here and there since it gets us down to only two possible implementations of `BytesReference.length()` (releasable and normal ref) which should inline in most places.

This adds a QL utility method that parses an IP address into a BytesRef object.

This extracts a `CIDRUtils#isInRange()` function that takes as arguments an IP given directly as a byte array and a single CIDR string specification. This allows code that already has the IP to check parsed into bytes (for example, stored as a BytesRef) to use it directly and avoid converting it to a string just to call the existing `isInRange()`.
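To illustrate the idea of checking raw address bytes against a CIDR block, here is a minimal standalone sketch. It is not the `CIDRUtils` code from this change: the class name `CidrRangeSketch` and the exact signature are assumptions for illustration, and it relies only on the JDK's `java.net.InetAddress` to parse the network part of the CIDR string.

```java
// Illustrative sketch only; names and signature are assumptions, not the actual CIDRUtils code.
final class CidrRangeSketch {

    /** Returns true if the raw address bytes fall inside the given CIDR block, e.g. "10.0.0.0/8". */
    static boolean isInRange(byte[] addr, String cidr) {
        int slash = cidr.indexOf('/');
        if (slash < 0) {
            throw new IllegalArgumentException("not a CIDR specification: " + cidr);
        }
        byte[] network;
        try {
            // For an IP literal, getByName only parses the text; a non-literal would trigger a lookup.
            network = java.net.InetAddress.getByName(cidr.substring(0, slash)).getAddress();
        } catch (java.net.UnknownHostException e) {
            throw new IllegalArgumentException("invalid address in " + cidr, e);
        }
        int prefixLength = Integer.parseInt(cidr.substring(slash + 1));
        if (prefixLength < 0 || prefixLength > network.length * 8) {
            throw new IllegalArgumentException("invalid prefix length in " + cidr);
        }
        if (addr.length != network.length) {
            return false; // IPv4 vs IPv6 mismatch; this sketch does not map between families
        }
        // Compare the whole bytes covered by the prefix.
        int fullBytes = prefixLength / 8;
        for (int i = 0; i < fullBytes; i++) {
            if (addr[i] != network[i]) {
                return false;
            }
        }
        // Compare the leading bits of the next byte, if the prefix is not byte-aligned.
        int remainingBits = prefixLength % 8;
        if (remainingBits == 0) {
            return true;
        }
        int mask = (0xFF << (8 - remainingBits)) & 0xFF;
        return (addr[fullBytes] & mask) == (network[fullBytes] & mask);
    }

    public static void main(String[] args) {
        byte[] ip = { 10, 1, 2, 3 };
        System.out.println(isInRange(ip, "10.0.0.0/8"));
        System.out.println(isInRange(ip, "192.168.0.0/16"));
    }
}
```

The caller never turns the address back into a string; only the CIDR specification itself is parsed.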

This reverts commit 059bfd4.

…tic#95271) Adds a new include flag definition_status to the GET trained models API. When present, the trained model configuration returned in the response will have the new boolean field fully_defined if the full model definition exists.

…astic#95064) Relates elastic#94534

…ST handlers (elastic#94037) elastic#93607 added the ability to run Elasticsearch in "Serverless" mode, where access to REST endpoints could be restricted so that the full Elasticsearch API is not available (since a lot of it does not make sense in Serverless). By default no endpoints are available, but they can be exposed with `ServerlessScope` annotations. This PR follows up on elastic#93607 by adding PUBLIC and INTERNAL annotations to the rest handlers owned by the Core Infra team. There are several rest endpoints still under discussion. This PR does not label those, so they remain unavailable in Serverless mode.

* It adds the profiling index pattern profiling-* to the fleet server service privileges. * And adds profiling-* to kibana system role privileges. --------- Co-authored-by: Daniel Mitterdorfer <daniel.mitterdorfer@elastic.co>

…uginFuncTest builds distribution from branches via archives extractedAssemble [bwcDistVersion: 8.1.3, bwcProject: bugfix2, expectedAssembleTaskName: extractedAssemble, #3] elastic#119261

ywangd and others added 23 commits April 5, 2023 10:42

Add API key ID to error message for invalid credentials (elastic#94999)

fbd2b1e

When the credentials fail to verify, the error message does not say which API key failed. This makes it hard for users to fix the issue. This PR adds the API key ID to the error message to help with the situation.

Introduce additional logging for testDelayedAllocationChangeWithSetti…

1c2ed56

…ngTo100ms (elastic#95018) This change enables allocation trace logging to be able to debug occasional CI test failures.

Add extension points to LucenePersistedState (elastic#95030)

694de7b

Revert "[DOCS] Migration guide: link to What's new page for the same …

0e1e4ce

…version (elastic#92823)" (elastic#95016) This reverts commit 8d60562. Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

[Transform] Expose authorization failure as transform health issue (e…

8a43667

…lastic#94724)

Remove unused serviceName from FakeThreadPoolMasterService (elastic#9…

ce8d032

…5034) We don't use this constructor parameter any more, so this commit removes it.

Further determinism improvements to CoordinatorTests (elastic#95032)

952b1f4

A handful of small changes to make the logging output of `CoordinatorTests` even more deterministic, for easier diffing. Relates elastic#94946

Fix off-by-one bug in RecyclerBytesStreamOutput (elastic#95036)

77162a7

Fix an off-by-one bug when seeking to the first byte in the next page.

AwaitsFix for elastic#94464

3a90d46

AwaitsFix for elastic#86061

7bd40f1

AwaitsFix for elastic#95047

13e9976

Update rest api spec for ent-search module (elastic#95020)

ea14f15

This change sets the stability of ent-search APIs to beta and their visibility to public. It also removes the feature flag link, since enabling the module is not considered a feature flag and the module is enabled by default.

[DOCS] Adds tip to change point agg docs. (elastic#94981)

b0a275d

95017 fix downsampling step (elastic#95054)

9a673ad

* Remove extra step in manual downsampling docs * create -> view

afoucret mentioned this pull request Apr 5, 2023

[Enterprise Search] Add connectors indices and ent-search pipeline on startup elastic/elasticsearch#94986

Closed

craigtaverner and others added 6 commits April 5, 2023 16:49

[DOCS] Add documentation for cat component templates (elastic#95035)

6e0071c

IngestService log registered processor types on startup (elastic#95023)

8d7072c

Remove allocations and copying from writing translog ops to the netwo…

7a83e0c

…rk (elastic#95048) There's no reason to prefix ops with their size over the network. We can verify the checksum once after reading each op and just stream the ops when writing.

javanna and others added 24 commits April 17, 2023 16:11

Add utility method to parse an IP to BytesRef (elastic#95291)

a3faa87

This adds a QL utility method that parses an IP address into a BytesRef object.

Fix bulk request typo. (elastic#95247)

bf604c9

rollup_user and rollup_admin added (elastic#95289)

6f623ff

Document that DS backing indices can have gaps in the name counter (e…

7b994ba

…lastic#95237)

Extract AtomicRegisterPreVoteCollector and StoreHeartbeatService to p…

da21447

…roduction (elastic#95296)

Add test suite for SingleNodeReconfigurator (elastic#95297)

415e22b

Revert "Remove blocking from RSH#createRetentionLease (elastic#95115)"

e243c22

This reverts commit 059bfd4.

Refactor TransportSearchAction to allow run can_match exclusively (el…

9b73004

…astic#95064) Relates elastic#94534

Remove unused deprecation logger in RestMultiSearchAction (elastic#95290

8d7c5a4

)

Update Search Application API docs to reflect Tech Preview status (el…

8432a5a

…astic#95308)

[Fleet] Support for Profiling symbolization (elastic#95241)

91cd61a

* It adds the profiling index pattern profiling-* to the fleet server service privileges. * And adds profiling-* to kibana system role privileges. --------- Co-authored-by: Daniel Mitterdorfer <daniel.mitterdorfer@elastic.co>

Adding a PipelineRegistry component to utils.

d0901d2

Use pipeline registry to create Analytics ingest events pipeline.

49a7dbb

Adding tests for the AnalyticsIngestPipelineRegistry

0e5d14d

Implementing URL fields into ingest pipeline for behavioral analytics…

cac34a7

… events.

Refactoring: grouping constants in a single place.

7badaca

Update mapping: URL fields are now objects.

35a47d2

Finalizing the events ingest pipeline.

ecc8d4a

Update docs/changelog/95198.yaml

a08a17f

afoucret force-pushed the pipeline-registry branch from a3ba148 to 6233ed5 Compare April 18, 2023 06:34

Fix feedback review.

4d39402

afoucret force-pushed the pipeline-registry branch from 6233ed5 to 4d39402 Compare April 18, 2023 06:36

Fix regression.

e8374bc
