Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[extension/filestorage] Mark as Beta #9135

Closed
cpheps opened this issue Apr 11, 2022 · 10 comments · Fixed by #9250
Closed

[extension/filestorage] Mark as Beta #9135

cpheps opened this issue Apr 11, 2022 · 10 comments · Fixed by #9250

Comments

@cpheps
Copy link
Contributor

cpheps commented Apr 11, 2022

Proposal

I would like to propose to update the filestorage extension to be in beta rather than alpha. The extension has gone through minimal changes in the past two months and appears fairly stable. Updating it to beta status will give users more assurance and confidence in using the extension.

@BinaryFissionGames and I ran some benchmarking tests against the collector with and without the filestorage extension enabled and we didn't see much difference in performance than without using it.

Benchmarks:

Here's a breakdown of the benchmarks we ran:

We used the aws logbench tool in our testing.

Logs were sent to an in-house gRPC server that tracks throughput. The server was on a separate machine to not interfere with the collector.

Benchmark parameters:
Log Rate: 15000 logs/sec
Num Files: 10 files
Test duration: 30 seconds (we found running longer doesn't show much difference)

Configs

Without filestorage

receivers:
  filelog:
    include: ["/opt/logs/*.log"]

processors:
  batch:
    send_batch_max_size: 200
    send_batch_size: 200

exporters:
  otlp:
    endpoint: throughput-count:9124
    compression: none
    tls:
      insecure: true

service:
  pipelines:
    logs:
      receivers:
        - filelog
      processors:
        - batch
      exporters:
        - otlp

With filestroage

receivers:
  filelog:
    include: ["/opt/logs/*.log"]

processors:
  batch:
    send_batch_max_size: 200
    send_batch_size: 200

exporters:
  otlp:
    endpoint: throughput-count:9124
    compression: none
    tls:
      insecure: true
    sending_queue:
      persistent_storage_enabled: true

extensions:
  file_storage:
    directory: "/opt/collector/storage"

service:
  extensions: [file_storage]
  pipelines:
    logs:
      receivers:
        - filelog
      processors:
        - batch
      exporters:
        - otlp

Results

Without filestorage

Avg CPU: 88.03%
Avg Memory: 1053.83 MB
Log records Processed: 3,875,800
Avg Throughput: 87K logs/sec

Throughput Graph:
file-batch-otlp-rate

With filestorage

Avg CPU: 132.34%
Avg Memory: 328.05 MB
Log records Processed: 3,758,950
Avg Throughput: 86K logs/sec

Throughput Graph:
file-batch-persistent-otlp-rate

Conclusion

The filestorage extension has a slightly lower throughput, uses more CPU, and less memory than a collector running without it. We feel this falls in line with expectations of sending batched log records to disk. Overall there seemed to be a negligible impact on the collector as far as performance or stability was concerned.

@jsirianni
Copy link
Member

👍 The storage extension is critical for log pipelines.

@djaglowski
Copy link
Member

cc: @pmm-sumo @tigrannajaryan

@djaglowski
Copy link
Member

Checking through other components of the collector, I believe the only status that is strictly defined is "stable", so alpha vs beta as I understand it is a good faith description of maturity.

The component has been exercised quite a lot with minimal changes, so I would subjectively describe it as Beta.

@pmm-sumo
Copy link
Contributor

@cpheps great benchmark and it's great to see persistent buffering out in the wild!
Out of curiosity, did you perhaps use the version with recent performance improvements?

I agree it should be marked as Beta. Also, as two followup questions:

@cpheps
Copy link
Contributor Author

cpheps commented Apr 11, 2022

@pmm-sumo

Out of curiosity, did you perhaps use the version with #9003?

Yes we did test with those changes.

what do you think about moving the implementation to core?

I think keeping it as an extension makes sense. There's a good interface in core that allows any type of storage to implement. I don't think filestorage should be special.

do you think we could mark persistent buffering as Beta as well and include it in the build?

I'm honestly not sure on this. I think per @djaglowski's comment of "stable" vs beta if we feel like it's currently working well enough with minimal changes changing the label to beta makes sense. We haven't extensively benchmarked or tested the persistent buffering exclusively.

@tigrannajaryan
Copy link
Member

Checking through other components of the collector, I believe the only status that is strictly defined is "stable", so alpha vs beta as I understand it is a good faith description of maturity.

We have a few components in the core marked "Beta": https://github.com/open-telemetry/opentelemetry-collector#status so there is some precedent there.

I am in favour of declaring the filestorage as "Beta" now that we have one more confirmation that it works well. Before making the declaration I would advise to go over the filestorage config setting to make sure we are happy with setting names, values, etc so that we avoid making breaking changes (if possible) after declaring Beta.

@pmm-sumo
Copy link
Contributor

pmm-sumo commented Apr 11, 2022

what do you think about moving the implementation to core?

I think keeping it as an extension makes sense. There's a good interface in core that allows any type of storage to implement. I don't think filestorage should be special.

Yeah, I actually extracted the interface out of @djaglowski work :) Though I recall our idea back then was that the filestorage extension could follow some time after. I believe one of the reasons is that we could then add persistent buffering to batchprocessor.

We don't need to do it right away but I think it would be good to get a common agreement on what we're planning here (and if persistent buffering in batchprocessor is still relevant. There's an idea to move it to exporthelper though

do you think we could mark persistent buffering as Beta as well and include it in the build?

I'm honestly not sure on this. I think per @djaglowski's comment of "stable" vs beta if we feel like it's currently working well enough with minimal changes changing the label to beta makes sense. We haven't extensively benchmarked or tested the persistent buffering exclusively.

I have some extra capabilities in mind. Like being able to specify the maximum size of the persistent storage in bytes (which should be relatively easy since we marshall the items and can get the size easily). It would be an additive rather than a breaking change.

Before making the declaration I would advise to go over the filestorage config setting to make sure we are happy with setting names, values, etc so that we avoid making breaking changes (if possible) after declaring Beta.

There's one issue on real-time compaction I'm currently working on (though got slightly delayed last week). This would probably end up being an additive change to the current interface.

@djaglowski
Copy link
Member

Before making the declaration I would advise to go over the filestorage config setting to make sure we are happy with setting names, values, etc so that we avoid making breaking changes (if possible) after declaring Beta.

I've given this another look over and it still looks good to me.

@cpheps
Copy link
Contributor Author

cpheps commented Apr 13, 2022

@tigrannajaryan and @pmm-sumo Just wanted to follow up on next steps for this as it seems like there is a general consensus that changing the extension to beta is an acceptable move. The config has been vetted so I'm wondering what are next steps?

@djaglowski
Copy link
Member

@cpheps, open a PR that updates the component's readme. Final consensus can be confirmed via review process.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants