ingest: processor stats #34202

Merged
merged 7 commits into from
Oct 20, 2018

@jakelandis (Contributor) commented Oct 1, 2018

This change introduces per-processor stats. Count, time, current, and
failed are currently supported. Each pipeline will now show all
top-level processors that belong to it. Failure processors are not
displayed; however, the time taken to execute the failure chain is part
of the stats for the top-level processor.

The processor name is the type of the processor, ordered as defined in
the pipeline. If a tag for the processor is found, then the tag is
appended to the type (colon separated).

Pipeline processors will have the pipeline name appended (colon separated)
to the processor type (before the tag if one exists).
If more than one pipeline is used to process the document, then each pipeline
will carry its own stats. The outermost pipeline will also include the
innermost pipeline stats.
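
The naming rules above can be sketched roughly as follows. This is only an illustration: the class and method names (`ProcessorStatNames`, `statName`) are hypothetical and are not taken from this PR.

```java
// Hypothetical helper: illustrates the naming rules described above,
// not the code in this PR.
final class ProcessorStatNames {

    private ProcessorStatNames() {}

    /**
     * Builds the stats key for a top-level processor.
     *
     * @param type         processor type, e.g. "grok" or "pipeline"
     * @param pipelineName referenced pipeline for a pipeline processor, otherwise null
     * @param tag          optional processor tag, or null/empty when no tag is set
     */
    static String statName(String type, String pipelineName, String tag) {
        StringBuilder name = new StringBuilder(type);
        if (pipelineName != null && !pipelineName.isEmpty()) {
            // The pipeline name is appended before the tag.
            name.append(':').append(pipelineName);
        }
        if (tag != null && !tag.isEmpty()) {
            name.append(':').append(tag);
        }
        return name.toString();
    }

    public static void main(String[] args) {
        System.out.println(statName("grok", null, null));
        System.out.println(statName("set", null, "sets the thing to true"));
        System.out.println(statName("pipeline", "mypipeline2", null));
    }
}
```

The three calls in `main` correspond to the `grok`, `set:sets the thing to true`, and `pipeline:mypipeline2` keys shown in the examples further down.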

Conditional processors will only be included in the stats if the condition
evaluates to true.

Best attempts are made to carry forward processor metrics between
cluster state changes. If no changes are made to the pipeline, metrics
will be carried forward. However, if the pipeline changes and the
number of processors or the order of the processors (determined by type)
changes, the processor metrics will be reset to zero.

Closes #33387


An example pipeline:

"test_pipeline": {
  "count": 0,
  "time_in_millis": 0,
  "current": 0,
  "failed": 0,
  "processors": [
    {
      "grok": {
        "count": 0,
        "time_in_millis": 0,
        "current": 0,
        "failed": 0
      }
    },
    {
      "date": {
        "count": 0,
        "time_in_millis": 0,
        "current": 0,
        "failed": 0
      }
    },
    {
      "remove": {
        "count": 0,
        "time_in_millis": 0,
        "current": 0,
        "failed": 0
      }
    }
  ]
}

^^ The "processors": [ is new.

An example with a tag and a pipeline processor:

"mypipeline1": {
  "count": 0,
  "time_in_millis": 0,
  "current": 0,
  "failed": 0,
  "processors": [
    {
      "set:sets the thing to true": {
        "count": 0,
        "time_in_millis": 0,
        "current": 0,
        "failed": 0
      }
    },
    {
      "pipeline:mypipeline2": {
        "count": 0,
        "time_in_millis": 0,
        "current": 0,
        "failed": 0
      }
    }
  ]
}

Rally with the http_logs track (with grok) was run against the code both before and with this change. No performance impact was noticed.

Also, note that this PR does not make any changes to the output of simulate or simulate?verbose (that will be a different PR).

@jakelandis added the :Data Management/Ingest Node, v7.0.0, and v6.5.0 labels Oct 1, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra

@rjernst (Member) left a comment

A few comments

super();
this.ignoreFailure = ignoreFailure;
this.processors = processors;
this.onFailureProcessors = onFailureProcessors;
this.clock = clock;
processorsWithMetrics = new ArrayList<>(processors.size());
Member

Please use consistent style, setting with this.

Contributor Author

fixed

for (Tuple<Processor, IngestMetric> processorWithMetric : processorsWithMetrics) {
Processor processor = processorWithMetric.v1();
IngestMetric metric = processorWithMetric.v2();
long startTimeInMillis = clock.millis();
Member

This is backed by System.currentTimeMillis(). In other areas of the system, we avoid this and use a wrapper on ThreadPool which avoids excessive calls to that method, caching the result across threads. Either we should be using a LongSupplier like we do in other areas, or have a Clock implementation backed by ThreadPool. Additionally, I'm not sure we need absolute time, so probably the other method for relative time would be better.

Contributor Author

@rjernst I removed the clock in favor of a LongSupplier backed by System::nanoTime (2714338). Can you please confirm that this is the correct fix here? I made the same mistake in a different PR that I will need to fix too.

Contributor Author

@rjernst - I also fixed this for other places where I made the same mistake: ff98a82
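
For readers following this thread, here is a minimal sketch of timing a step through an injected relative-time `LongSupplier`, as discussed above. All names (`TimedStep`, `execute`, and the accessors) are hypothetical, and the plain `long` counters are a single-threaded simplification rather than the metric classes used in this PR.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

// Hypothetical names throughout: a sketch of timing with an injected
// relative-time source instead of an absolute wall clock.
final class TimedStep {
    private final LongSupplier relativeTimeNanos; // e.g. System::nanoTime
    private long count;
    private long failed;
    private long totalMillis;

    TimedStep(LongSupplier relativeTimeNanos) {
        this.relativeTimeNanos = relativeTimeNanos;
    }

    /** Runs the step and records the invocation, any failure, and elapsed time. */
    void execute(Runnable step) {
        long start = relativeTimeNanos.getAsLong();
        count++;
        try {
            step.run();
        } catch (RuntimeException e) {
            failed++;
            throw e;
        } finally {
            // Only the difference between two readings is meaningful.
            long elapsedNanos = relativeTimeNanos.getAsLong() - start;
            totalMillis += TimeUnit.NANOSECONDS.toMillis(elapsedNanos);
        }
    }

    long count() { return count; }
    long failed() { return failed; }
    long totalMillis() { return totalMillis; }

    public static void main(String[] args) {
        TimedStep step = new TimedStep(System::nanoTime);
        step.execute(() -> { });
        System.out.println(step.count() + " run(s), " + step.totalMillis() + " ms");
    }
}
```

Passing the time source in through the constructor also lets tests supply a fake supplier and assert on exact elapsed values.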


ConditionalProcessor(String tag, Script script, ScriptService scriptService, Processor processor) {
ConditionalProcessor(String tag, Script script, ScriptService scriptService, Processor processor) {
Member

nit: spacing is off here

Contributor Author

fixed

}
return ingestDocument;
}

Processor getProcessor() {
Member

If this is just going to be used by tests, why not make the member package-private instead of adding a getter?

Contributor Author

//Best attempt to populate new processor metrics using a parallel array of the old metrics. This is not ideal since
//the per processor metrics may get reset when the arrays don't match. However, to get to an ideal model, unique and
//consistent id's per processor and/or semantic equals for each processor will be needed.
if(newPerProcessMetrics.size() == oldPerProcessMetrics.size()) {
Member

nit: space after if

Contributor Author

fixed

List<Tuple<Processor, IngestMetric>> newPerProcessMetrics = new ArrayList<>();
getProcessorMetrics(originalPipeline.getCompoundProcessor(), oldPerProcessMetrics);
getProcessorMetrics(pipeline.getCompoundProcessor(), newPerProcessMetrics);
//Best attempt to populate new processor metrics using a parallel array of the old metrics. This is not ideal since
Member

Why try to transfer metrics at all? Could we just say when a pipeline's configuration is updated, the metrics are reset?

Contributor Author

When one pipeline changes, all pipelines are rebuilt, and without this code we would lose all metrics on any pipeline change. For example, if we have two pipelines and one of them is deleted, we don't want to lose the metrics for the pipeline that had no changes. We don't have easy access to exactly which pipeline changed, so the heuristic is: if the pipeline still exists, and the count of processors and the types of the processors (in order) don't change, then carry forward the metrics.
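
A rough sketch of that heuristic, assuming an all-or-nothing copy. The `MetricCarryForward` and `Entry` types below are hypothetical stand-ins, not the types used in this PR.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types: a sketch of the "same size, same types in order" heuristic.
final class MetricCarryForward {

    /** Minimal stand-in for a processor paired with its counter. */
    static final class Entry {
        final String type;
        long count;

        Entry(String type, long count) {
            this.type = type;
            this.count = count;
        }
    }

    /**
     * Copies old counts into the new entries only when both lists have the
     * same length and the processor types match position by position.
     * Otherwise the new entries keep their initial counts.
     *
     * @return true if the metrics were carried forward
     */
    static boolean carryForward(List<Entry> oldEntries, List<Entry> newEntries) {
        if (oldEntries.size() != newEntries.size()) {
            return false;
        }
        for (int i = 0; i < oldEntries.size(); i++) {
            if (!oldEntries.get(i).type.equals(newEntries.get(i).type)) {
                return false;
            }
        }
        for (int i = 0; i < oldEntries.size(); i++) {
            newEntries.get(i).count = oldEntries.get(i).count;
        }
        return true;
    }

    public static void main(String[] args) {
        List<Entry> oldEntries = new ArrayList<>();
        oldEntries.add(new Entry("grok", 10));
        oldEntries.add(new Entry("date", 10));

        List<Entry> unchanged = new ArrayList<>();
        unchanged.add(new Entry("grok", 0));
        unchanged.add(new Entry("date", 0));
        // Same shape, so the old counts are copied over.
        System.out.println(carryForward(oldEntries, unchanged));

        List<Entry> reordered = new ArrayList<>();
        reordered.add(new Entry("date", 0));
        reordered.add(new Entry("grok", 0));
        // Order changed, so the new entries stay at zero.
        System.out.println(carryForward(oldEntries, reordered));
    }
}
```

Because only types are compared, two different processors of the same type in the same position are indistinguishable here, which is the limitation the code comment above acknowledges.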

pipelineStats.writeTo(out);
List<Tuple<String, Stats>> processorStats = entry.getValue().v2();
out.writeVInt(processorStats.size());
for(Tuple<String, Stats> processorTuple : processorStats){
Member

nit: space after for

Contributor Author

fixed

*/
public Map<String, Stats> getStatsPerPipeline() {
public Map<String, Tuple<IngestStats.Stats, List<Tuple<String, IngestStats.Stats>>>> getStatsPerPipeline() {
Member

I think the type here would be much easier to understand as a dedicated class, rather than a very nested set of Tuple/List/Map

Contributor Author

Fixed in 8067758. IngestStats now accepts totalStats, List, and Map<String, ProcessorStat> (keyed by pipelineId), and a builder to help build from the Metrics representation. I hope this makes the code more readable.
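
To illustrate the general idea of replacing nested Tuple/List/Map types with small dedicated classes, here is a sketch. The shapes below (`IngestStatsSketch`, its nested `Stats` and `ProcessorStat`, and the method names) are assumptions made for illustration and may differ from the classes introduced in the commit.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical shapes, not the classes introduced in this PR.
final class IngestStatsSketch {

    /** The four counters reported per pipeline and per processor. */
    static final class Stats {
        final long count;
        final long timeInMillis;
        final long current;
        final long failed;

        Stats(long count, long timeInMillis, long current, long failed) {
            this.count = count;
            this.timeInMillis = timeInMillis;
            this.current = current;
            this.failed = failed;
        }
    }

    /** A named processor entry, replacing a Tuple of name and stats. */
    static final class ProcessorStat {
        final String name;
        final Stats stats;

        ProcessorStat(String name, Stats stats) {
            this.name = name;
            this.stats = stats;
        }
    }

    // Keyed by pipeline id; insertion order of processors is preserved.
    private final Map<String, List<ProcessorStat>> processorStatsByPipeline = new LinkedHashMap<>();

    void addProcessorStat(String pipelineId, String name, Stats stats) {
        processorStatsByPipeline
            .computeIfAbsent(pipelineId, k -> new ArrayList<>())
            .add(new ProcessorStat(name, stats));
    }

    List<ProcessorStat> processorStatsFor(String pipelineId) {
        return Collections.unmodifiableList(
            processorStatsByPipeline.getOrDefault(pipelineId, Collections.emptyList()));
    }

    public static void main(String[] args) {
        IngestStatsSketch stats = new IngestStatsSketch();
        stats.addProcessorStat("test_pipeline", "grok", new Stats(0, 0, 0, 0));
        stats.addProcessorStat("test_pipeline", "date", new Stats(0, 0, 0, 0));
        for (ProcessorStat p : stats.processorStatsFor("test_pipeline")) {
            System.out.println(p.name + " count=" + p.stats.count);
        }
    }
}
```

Callers then read `p.name` and `p.stats.count` instead of unpacking `v1()` and `v2()` from nested tuples.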

List<Tuple<String, IngestStats.Stats>> deserializedProcessorStats =
deserializedIngestStats.getProcessorStatsForPipeline(pipelineName);
Iterator<Tuple<String, IngestStats.Stats>> it = deserializedProcessorStats.iterator();
for(Tuple<String, IngestStats.Stats> processorTuple : processorStats){
Member

nit: space after for

Contributor Author

fixed

@jakelandis
Contributor Author

@rjernst - All initial comments have been addressed. Mind taking another look?

@rjernst (Member) left a comment

LGTM

@jakelandis jakelandis merged commit 6567729 into elastic:master Oct 20, 2018
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Oct 21, 2018
* master:
  Use trial license in docs tests (elastic#34673)
  Scripting: Convert script fields to use script context (elastic#34164)
  TEST: Mute testDedupByPrimaryTerm
  ingest: processor stats (elastic#34202)
jasontedor added a commit that referenced this pull request Oct 21, 2018
@jasontedor
Member

I reverted this from master in 0577703 due to failures in the mixed-cluster tests.

jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Oct 21, 2018
* master:
  Revert "ingest: processor stats (elastic#34202)"
jakelandis added a commit to jakelandis/elasticsearch that referenced this pull request Oct 22, 2018
@jakelandis
Contributor Author

Re-introduced this change, with a test fix, in #34724

@ruflin
Member

ruflin commented Oct 29, 2018

@yaronp68 FYI

kcm pushed a commit that referenced this pull request Oct 30, 2018
kcm pushed a commit that referenced this pull request Oct 30, 2018