Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow configurable elasticsearch index date format #840

Conversation

pavelnikolov
Copy link

@pavelnikolov pavelnikolov commented May 19, 2018

Which problem is this PR solving?

Short description of the changes

  • This PR allows overriding the Elasticsearch index date format. There is a new flag when using SPAN_STORAGE_TYPE=elasticsearch:
--es.index.date-format string       The date format of the Elasticsearch index names suffix, using Golang date format. https://golang.org/pkg/time/#example_Time_Format (default "2006-01-02")

Signed-off-by: Pavel Nikolov <pavel.nikolov@fairfaxmedia.com.au>
@coveralls
Copy link

coveralls commented May 19, 2018

Coverage Status

Coverage remained the same at 100.0% when pulling 398d971 on pavelnikolov:support-custom-index-date-temaplte into 4402978 on jaegertracing:master.

@yurishkuro
Copy link
Member

overall looks good, but a couple of comments:

  • does this PR require a corresponding change to the cron job script that cleans stale indices?
  • this change does not actually address the issue in Make default time span of ES index configurable #628, which was more about the frequency of index rotation than the format of the index name. In other words, the change would be to add a rotation frequency parameter, expressed as Duration, and then truncate the timestamps to that duration (https://golang.org/pkg/time/#Time.Truncate) before passing it to the date format function.
  • since this PR is already changing the name of the index, we might want to allow changing the rest of the name (prefix) as well, which would help Support archiving traces with ES storage #818

@pavelnikolov
Copy link
Author

pavelnikolov commented May 19, 2018

  1. I have no idea about the cron-job. I'll have to get familiar with its code.
  2. Definitely using time.Truncate gives you ultimate control over the duration of the rolling indices. I was thinking of a very basic configuration:
    • monthly index: 2006-01
    • daily index: 2006-01-02
    • hourly index: 2006-01-02-15
      Unfortunately, we can't, for example, have new index every 4 hours or a weekly index. But I was hoping that the above 3 would cover most cases. Do you insist on having a time.Duration parameter and using time.Truncate? If yes, I believe this needs to be pushed in a separate PR.
  3. I'll add an option to update the prefix as well.

@yurishkuro
Copy link
Member

The cleaner script is here https://github.com/jaegertracing/jaeger/blob/master/plugin/storage/es/esCleaner.py

@yurishkuro yurishkuro requested a review from jpkrohling as a code owner May 29, 2018 23:41
@ghost ghost assigned yurishkuro May 29, 2018
@ghost ghost added the review label May 29, 2018
@@ -76,6 +77,7 @@ type Span struct {
// NewSpanWriter creates a new SpanWriter for use
func NewSpanWriter(
client es.Client,
dateFormat string,
Copy link
Member

@yurishkuro yurishkuro May 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we're breaking the API, let's use a struct instead

type SpanWriterOptions struct {
 	dateFormat string
 	numShards int64
 	numReplicas int64
}

func NewSpanWriter(
 	client es.Client,
 	logger *zap.Logger,
 	metricsFactory metrics.Factory,
 	options SpanWriterOptions) {
}

It will be easier to extend if we want to make index name configurable


dateFormat string

expectedSpanIndexName string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blank lines not needed

@yurishkuro
Copy link
Member

Yes, Spark job will be affected as well. But it would have to be a different PR.

@CpuID
Copy link

CpuID commented May 30, 2018

@pavelnikolov there has been some discussion in #799 re configurable index name prefixes, which is related to the changes being made here (your changes so far focus on date format, but the changes need to be made in the same 3~ places and coordinated). @yurishkuro has suggested his preference would be to make both sets of changes together to save doing the coordinated changes twice.

@pavelnikolov can you take a look over #799 and see if you feel up to making the suggested changes in this PR? I am happy to help/provide input if you need also, was planning to PR separately otherwise :)

@pavelnikolov
Copy link
Author

pavelnikolov commented Jun 11, 2018 via email

@pavolloffay
Copy link
Member

I don't think we should go with this approach - configurable date format option. The biggest downside is the limitation on retention configuration:

  • monthly index: 2006-01
  • daily index: 2006-01-02
  • hourly index: 2006-01-02-15
  • yearly 2006
  • not possible to do weekly or N days/hours retention.

It's also harder to configure and ES cleaner would not look nice either - re date format in python is different.

I think we should go with truncation (span.StartTime.Truncate(d)) with minimal value of 1 day.

The reader would use the es.index.trunc-time to get the list of indexes based on es.max-span-age.

// Returns the array of indices that we need to query, based on query params
func (s *SpanReader) indicesForTimeRange(indexName string, startTime time.Time, endTime time.Time) []string {
	var indices []string
	firstIndex := indexWithDate(indexName, startTime, s.dateFormat)
	currentIndex := indexWithDate(indexName, endTime, s.dateFormat)
	for currentIndex != firstIndex {
		indices = append(indices, currentIndex)
                // use trunc time
		endTime = endTime.Add(-s.truncTime * time.Hour)
		currentIndex = indexWithDate(indexName, endTime, s.dateFormat)
	}
	return append(indices, firstIndex)
}

jaegerIndices := s.indicesForTimeRange(s.serviceIndexPrefix, currentTime.Add(-s.maxSpanAge), currentTime)

@pavolloffay
Copy link
Member

cc @jaegertracing/elasticsearch

@yurishkuro
Copy link
Member

Was there a specific reason someone asked specifically for date format rather than truncation?

@pavolloffay
Copy link
Member

from #628 (comment)

At the moment all time based indices that we have in our ES clusters are named like _YYYY.MM.DD. And we have a single cronjob which deletes indices with data older than some configured number of days. Jaeger indices are the only exception to this naming convention and we have to run/maintain a separate cronjob just for Jaeger. Not a big deal, but it's a little annoying : )

I don't know how @pavelnikolov deletes indices. but IIRC curator should be format agnostic the creation date is retrieved from index properties.

@masteinhauser
Copy link
Member

Was there a specific reason someone asked specifically for date format rather than truncation?

As long as the effect is the same: a span is written to the expected index, the means likely do not matter as long as the span's original timestamp is retained.

IIRC curator should be format agnostic the creation date is retrieved from index properties.

At least in later versions of elasticsearch-curator package, this is correct.

We are actually moving away from jaeger's time-based index naming entirely and choosing to use the Elasticsearch Rollover API as it scales more predictably for us. Example: 25GB per-shard, automatically, which is ideal for our write/read workloads (both human, and automated anomaly detection)
This is also what we use with our FluentD configuration in Kubernetes for per-team centralized logging.

not possible to do weekly or N days/hours retention.

At least in all of our usage, this largely doens't make sense in both our high-usage and low-usage indexes. The Elasticsearch Rollover API is a far better solution, from Elasticsearch's perspective, than sharding on time-based indices.

@yurishkuro
Copy link
Member

Rollover API looks pretty interesting, thanks for pointing it out. Perhaps it can also be used to avoid iterating through indices at query time.

@pavolloffay
Copy link
Member

Perhaps it can also be used to avoid iterating through indices at query time.

I think the reader would have to use a different index - alias https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-aliases.html. Once the write index is rolled over the old index has to be moved to read alias. Perhaps we could use this to manage retention in archive index?

@pavolloffay
Copy link
Member

@pavelnikolov I will close this PR for two reasons:

  • it does not seem as the right approach to configure retention - not flexible enough. Better is to use rollover API or truncate timestamp on configurable duration.
  • it will complicate the index configuration on index cleaner and spark job - they don't use golang time format.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make default time span of ES index configurable
6 participants