filter_lua: add support for log metadata handling #9702

Draft: edsiper wants to merge 1 commit into master

edsiper
Member

@edsiper edsiper commented Dec 10, 2024

Introduces a new Lua filter option, enable_metadata (default: false), that lets user-provided Lua scripts manipulate the metadata of a log record:

pipeline:
  inputs:
    - name: dummy
      processors:
        logs:
          - name: lua
            enable_metadata: true
            call: test
            code: |
              function test(tag, timestamp, metadata, body)
                metadata['meta_test'] = 'ok'
                body['body_test'] = 'ok'
                return 2, timestamp, metadata, body
              end
  outputs:
    - name : stdout
      match: '*'

output:

bin/fluent-bit -c ../conf/fluent-bit-lua.yaml
Fluent Bit v3.2.3
* Copyright (C) 2015-2024 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

______ _                  _    ______ _ _           _____  _____ 
|  ___| |                | |   | ___ (_) |         |____ |/ __  \
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __   / /`' / /'
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / /   \ \  / /  
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /.___/ /./ /___
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/ \____(_)_____/


[2024/12/09 22:50:57] [ info] [fluent bit] version=3.2.3, commit=412d3ea818, pid=447239
[2024/12/09 22:50:57] [ info] [storage] ver=1.5.2, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2024/12/09 22:50:57] [ info] [simd    ] disabled
[2024/12/09 22:50:57] [ info] [cmetrics] version=0.9.9
[2024/12/09 22:50:57] [ info] [ctraces ] version=0.5.7
[2024/12/09 22:50:57] [ info] [input:dummy:dummy.0] initializing
[2024/12/09 22:50:57] [ info] [input:dummy:dummy.0] storage_strategy='memory' (memory only)
[2024/12/09 22:50:57] [ info] [sp] stream processor started
[2024/12/09 22:50:57] [ info] [output:stdout:stdout.0] worker #0 started
[0] dummy.0: [[1733806258.333027665, {"meta_test"=>"ok"}], {"body_test"=>"ok", "message"=>"dummy"}]
[0] dummy.0: [[1733806259.332974330, {"meta_test"=>"ok"}], {"body_test"=>"ok", "message"=>"dummy"}]
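
To make the data flow easier to follow, the callback from the configuration above can be exercised in a plain Lua interpreter, outside Fluent Bit. This is only a sketch: the tag, timestamp, and input tables passed in are illustrative stand-ins for what the filter would provide. It also assumes the return code keeps its meaning from the existing Lua filter, where 2 signals that the record was modified and the original timestamp is kept.

```lua
-- Same callback as in the configuration above.
function test(tag, timestamp, metadata, body)
  metadata['meta_test'] = 'ok'
  body['body_test'] = 'ok'
  return 2, timestamp, metadata, body
end

-- Illustrative stand-in for the call made by the filter (assumed values).
local code, ts, meta, rec = test('dummy.0', 1733806258.333, {}, { message = 'dummy' })

-- Print the return code, the new metadata key, and the body keys.
print(code, meta.meta_test, rec.body_test, rec.message)
```

The returned metadata and body tables correspond to the two maps shown in each stdout record above.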

update: Jan 09, 2025

We will add this functionality as opt-in for v4. In the meantime, a new processor is being created, as described in the Dec 24, 2024 update.

update: Dec 24, 2024

I have been thinking about the future of this plugin. While adding metadata support as an extra argument to the Lua callback solves the problem, we need a more flexible solution, since we will also implement support for metrics and traces. A couple of things I have in mind for API v2 (without breaking compatibility with v1):

  • The new API v2 can only be enabled when the plugin runs as a processor rather than as a regular filter.
  • Because it runs as a processor, the plugin will receive a CFL object that gets converted to a Lua table, instead of a msgpack input buffer. This removes msgpack decode/encode from the equation.
  • The return codes of the user's Lua function will be simplified.
  • It "potentially" allows receiving a list of records instead of one record at a time. There are pros and cons here, and we also need to consider the concept of groups that we use to set special OpenTelemetry metadata such as resource/scope attributes.
  • Processing of metrics and traces is out of the scope of this PR. However, I think the safest approach for the user is to expose the C API for metrics/traces manipulation inside Lua as simple Lua functions.
  • Leverage the work done by @tarruda in the other PRs.
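
As a purely hypothetical illustration of the "list of records" idea in the points above, a batch-style callback could look like the sketch below. Nothing here is part of this PR or of any released Fluent Bit API: the function name process_logs, the record layout (timestamp, metadata, body), and the "return the records to keep" convention are all assumptions made for the example.

```lua
-- HYPOTHETICAL v2-style callback: not an existing Fluent Bit API.
-- Receives a list of records and returns the list of records to keep.
local function process_logs(tag, records)
  local kept = {}
  for _, rec in ipairs(records) do
    -- Drop debug records; tag every other record through its metadata.
    if rec.body.level ~= 'debug' then
      rec.metadata.processed = true
      kept[#kept + 1] = rec
    end
  end
  return kept
end

-- Illustrative input (assumed record layout).
local out = process_logs('dummy.0', {
  { timestamp = 1, metadata = {}, body = { level = 'info',  message = 'a' } },
  { timestamp = 2, metadata = {}, body = { level = 'debug', message = 'b' } },
})

print(#out, out[1].body.message, out[1].metadata.processed)
```

In this sketch, dropping a record means leaving it out of the returned list, which is one way numeric return codes could be avoided. How groups would be represented is left open here.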

cc: adding @niedbalski for visibility.

This is work in progress.


Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

@edsiper edsiper added this to the Fluent Bit Next milestone Dec 10, 2024
@edsiper edsiper changed the title filter_lua: add support for metadata handling filter_lua: add support for log metadata handling Dec 10, 2024
@tarruda

tarruda commented Dec 10, 2024

Hi @edsiper !

Not sure if you saw, but a few months ago I created a PR which exposes log metadata to Lua in a backwards-compatible way: #9323. #9323 adds a new processor, so internally it is not compatible with the filter API. In any case, I thought you might find it useful to look at what the public Lua API looks like there, so it could potentially be implemented in a similar manner.

@edsiper
Member Author

edsiper commented Dec 18, 2024

hey @tarruda, thanks for reviewing this :) and thanks for the hint to review the other PRs, I will definitely borrow some of it

The current Lua filter only supports processing the log body and timestamp of each
record. Metadata support for logs was added recently, and this patch extends the filter
with a new function prototype and return values that allow metadata manipulation.

The new boolean option 'enable_metadata' (default: off) enables a new prototype for
the Lua callback, which receives the metadata as an additional Lua table argument, in
the same way the log body is received. The following example uses the new
functionality:

  pipeline:
    inputs:
      - name: dummy
        processors:
          logs:
            - name: lua
              enable_metadata: true
              call: test_v2
              code: |
                function test_v2(tag, timestamp, metadata, body)
                  metadata['meta_test'] = 'ok'
                  body['body_test'] = 'ok'
                  return 2, timestamp, metadata, body
                end
    outputs:
      - name : stdout
        match: '*'

For this type of function, it is mandatory to return the metadata table, either a new
one or an updated version.
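
The two ways of satisfying this requirement can be sketched in plain Lua as follows. The function names and the 'source' and 'keep' keys are made up for illustration, and the direct calls at the end only stand in for the filter invoking the callback.

```lua
-- Option 1: update the metadata table received as an argument and return it.
function update_metadata(tag, timestamp, metadata, body)
  metadata['source'] = tag
  return 2, timestamp, metadata, body
end

-- Option 2: build and return a brand-new metadata table.
-- Keys present in the original metadata are not carried over.
function replace_metadata(tag, timestamp, metadata, body)
  local new_metadata = { source = tag }
  return 2, timestamp, new_metadata, body
end

-- Illustrative calls (assumed inputs); only the third return value is inspected.
local _, _, m1 = update_metadata('dummy.0', 0, { keep = 'yes' }, {})
local _, _, m2 = replace_metadata('dummy.0', 0, { keep = 'yes' }, {})

print(m1.keep, m1.source)
print(m2.keep, m2.source)
```

Returning a new table replaces the metadata wholesale, so any existing key that should survive has to be copied into it explicitly.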

Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>