-
Notifications
You must be signed in to change notification settings - Fork 1.8k
config: multiline: in_tail: filter_multiline: Add configurable buffer limit for multiline interface #10653
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
config: multiline: in_tail: filter_multiline: Add configurable buffer limit for multiline interface #10653
Conversation
578f46a
to
3a8abd9
Compare
WalkthroughAdds a configurable multiline buffer limit and binary-size parser; enforces per-group buffer limits with truncation tracking and metadata; propagates FLB_MULTILINE_TRUNCATED through multiline core and plugins; exposes truncation metrics; introduces a params-based multiline parser API and related tests. Changes
Sequence Diagram(s)sequenceDiagram
participant Config
participant Tail as TailInput
participant ML as MultilineCore
participant Group as StreamGroup
participant Filter
participant Metrics
Config->>ML: init buffer_limit (string -> bytes)
Tail->>ML: flb_ml_append_text/append_object(data)
ML->>Group: flb_ml_group_cat(data, len)
alt appended fully
Group-->>ML: FLB_MULTILINE_OK
ML->>Filter: emit/process event
Filter->>Metrics: inc emitted metric
else truncated or partially appended
Group-->>ML: FLB_MULTILINE_TRUNCATED
ML->>Tail: log warning
ML->>Metrics: inc truncated metric
Note right of ML #aqua: mark group->truncated and attach metadata on flush
end
ML->>Group: flush -> include "multiline_truncated": true if set
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
3a8abd9
to
dec20f1
Compare
/* Return codes */ | ||
#define FLB_MULTILINE_OK 0 | ||
#define FLB_MULTILINE_PROCESSED 1 /* Reserved */ | ||
#define FLB_MULTILINE_TRUNCATED 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing to status code 2 is needed because status code 1 will be collided for FLB_TRUE status.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/flb_config_map.c (1)
662-666
: Bug: list-type range check uses||
(always true) — can mis-handle non-list typesThese conditions should use
&&
to bound the range; with||
they evaluate true for almost all types, risking wrong assignments (e.g., VARIANT) and hard-to-trace bugs.Apply this diff:
- else if (m->type >= FLB_CONFIG_MAP_CLIST || - m->type <= FLB_CONFIG_MAP_SLIST_4) { + else if (m->type >= FLB_CONFIG_MAP_CLIST && + m->type <= FLB_CONFIG_MAP_SLIST_4) { m_list = (struct mk_list **) (base + m->offset); *m_list = m->value.val.list; } @@ - else if (m->type >= FLB_CONFIG_MAP_CLIST || - m->type <= FLB_CONFIG_MAP_SLIST_4) { + else if (m->type >= FLB_CONFIG_MAP_CLIST && + m->type <= FLB_CONFIG_MAP_SLIST_4) { list = parse_string_map_to_list(m, kv->val); @@ - else if (m->type >= FLB_CONFIG_MAP_CLIST || - m->type <= FLB_CONFIG_MAP_SLIST_4) { + else if (m->type >= FLB_CONFIG_MAP_CLIST && + m->type <= FLB_CONFIG_MAP_SLIST_4) { list = parse_string_map_to_list(m, kv->val);Also applies to: 740-742, 814-816
🧹 Nitpick comments (1)
src/flb_config_map.c (1)
662-666
: Optional: centralize list-type check to avoid regressionsCreate a tiny helper (static inline) like
is_list_type(int type)
and replace repeated range checks. This reduces future chances of reintroducing||
.Example helper:
static inline int is_list_type(int type) { return (type >= FLB_CONFIG_MAP_CLIST && type <= FLB_CONFIG_MAP_SLIST_4); }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/flb_config_map.c
(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/flb_config_map.c (1)
src/flb_slist.c (1)
flb_slist_destroy
(327-339)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (29)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
- GitHub Check: pr-compile-centos-7
- GitHub Check: PR - fuzzing test
🔇 Additional comments (1)
src/flb_config_map.c (1)
98-103
: Good catch: free slist entries on error pathCalling flb_slist_destroy(list) before freeing the list head prevents leaks when split functions partially append entries. Looks correct and consistent with destroy_map_val.
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
src/multiline/flb_ml.c (3)
293-303
: Enforce buffer_limit for ENDSWITH/EQ (currently bypassed).These paths concatenate with
flb_sds_cat_safe
, ignoringml->buffer_limit
. This defeats the new safety cap and can grow unbounded.Apply:
@@ /* ENDSWITH concatenation */ - /* Concatenate value */ - if (val_content) { - flb_sds_cat_safe(&stream_group->buf, - val_content->via.str.ptr, - val_content->via.str.size); - } - else { - flb_sds_cat_safe(&stream_group->buf, buf_data, buf_size); - } + /* Concatenate value with limit enforcement */ + { + int rc; + if (val_content) { + rc = flb_ml_group_cat(stream_group, + val_content->via.str.ptr, + val_content->via.str.size); + } + else { + rc = flb_ml_group_cat(stream_group, buf_data, buf_size); + } + if (rc == FLB_MULTILINE_TRUNCATED) { + truncated = FLB_TRUE; + } + } @@ /* EQ concatenation */ - /* Concatenate value */ - if (val_content) { - flb_sds_cat_safe(&stream_group->buf, - val_content->via.str.ptr, - val_content->via.str.size); - } - else { - flb_sds_cat_safe(&stream_group->buf, buf_data, buf_size); - } + /* Concatenate value with limit enforcement */ + { + int rc; + if (val_content) { + rc = flb_ml_group_cat(stream_group, + val_content->via.str.ptr, + val_content->via.str.size); + } + else { + rc = flb_ml_group_cat(stream_group, buf_data, buf_size); + } + if (rc == FLB_MULTILINE_TRUNCATED) { + truncated = FLB_TRUE; + } + }Also applies to: 327-336
641-647
: Initialize ret on stream lookup failure to avoid undefined return.If
mst
is NULL, wegoto exit;
without settingret
. Ensure-1
is returned.Apply:
- if (!mst) { + if (!mst) { flb_error("[multiline] invalid stream_id %" PRIu64 ", could not " "append content to multiline context", stream_id); - goto exit; + ret = -1; + goto exit; } @@ - return ret; + return ret;Also applies to: 662-662
552-579
: Avoid NULL deref on key_content in map pre-parse path.
len = flb_sds_len(parser->key_content);
is executed before checkingparser->key_content
. Compute length only when non-NULL and use a separatekey_len
.Apply:
@@ - /* lookup key_content */ - - len = flb_sds_len(parser->key_content); + /* lookup key_content */ + int key_len = 0; + if (parser->key_content) { + key_len = flb_sds_len(parser->key_content); + } map_size = map->via.map.size; for(i = 0; i < map_size; i++) { @@ - if (key.type == MSGPACK_OBJECT_STR && - parser->key_content && - key.via.str.size == len && - strncmp(key.via.str.ptr, parser->key_content, len) == 0) { + if (parser->key_content && + key.type == MSGPACK_OBJECT_STR && + key.via.str.size == key_len && + strncmp(key.via.str.ptr, parser->key_content, key_len) == 0) {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/multiline/flb_ml.c
(13 hunks)tests/internal/multiline.c
(4 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-04T12:35:22.872Z
Learnt from: shadowshot-x
PR: fluent/fluent-bit#10825
File: plugins/out_s3/s3.c:1339-1344
Timestamp: 2025-09-04T12:35:22.872Z
Learning: In the Fluent Bit S3 plugin, the user prefers to maintain current retry_limit behavior without special handling for FLB_OUT_RETRY_UNLIMITED (-1), as there's no documentation indicating -1 should be used for infinite retries and consistency with current logic is preferred.
Applied to files:
src/multiline/flb_ml.c
🧬 Code graph analysis (2)
tests/internal/multiline.c (4)
src/flb_config.c (2)
flb_config_init
(216-427)flb_config_exit
(429-606)src/multiline/flb_ml.c (3)
flb_ml_create
(869-921)flb_ml_append_text
(665-755)flb_ml_destroy
(982-1007)src/multiline/flb_ml_parser.c (3)
flb_ml_parser_params_default
(32-44)flb_ml_parser_create_params
(47-129)flb_ml_parser_instance_create
(261-312)src/multiline/flb_ml_stream.c (1)
flb_ml_stream_create
(223-276)
src/multiline/flb_ml.c (3)
src/multiline/flb_ml_stream.c (2)
flb_ml_stream_get
(278-292)flb_ml_stream_group_get
(91-135)src/multiline/flb_ml_group.c (1)
flb_ml_group_cat
(89-122)src/flb_utils.c (1)
flb_utils_size_to_binary_bytes
(610-694)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (29)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
- GitHub Check: PR - fuzzing test
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
- GitHub Check: pr-compile-centos-7
🔇 Additional comments (13)
tests/internal/multiline.c (5)
114-115
: Confirm expected ordering/merging in container_mix_output.The added outputs "bbccdd-out\n" and "dd-err\n" change the merge/order semantics. Please confirm this matches the new truncation-aware pipeline behavior and is not a regression in per-stream isolation.
395-397
: Good: safe early-return when no expected-result context.Prevents dereferencing NULL in tests that intentionally pass NULL to the flush callback.
1537-1559
: LGTM: disabled buffer limit path is correct."false" → 0 and asserting
ml->buffer_limit == 0
aligns with unlimited behavior.
1571-1572
: LGTM: tests registered.New tests are added to TEST_LIST.
1463-1535
: Fix test to actually cap concatenated 'log' content; remove unused var.As written, the parser doesn't pre-parse JSON, so you're concatenating raw JSON lines, not the "log" field. The assertion may pass by accident and won't catch regressions. Also,
p
is declared but unused.Update to use the docker parser for JSON pre-parse and lower the limit so concatenated log content exceeds it; skip gracefully if docker parser is unavailable.
Apply:
@@ static void test_buffer_limit_truncation() { int ret; uint64_t stream_id; struct flb_config *config; struct flb_ml *ml; struct flb_ml_parser *mlp; struct flb_ml_parser_ins *mlp_i; - struct flb_parser *p; + struct flb_parser *p; struct flb_time tm; @@ - char *line1 = "{\"log\": \"12345678901234567890\", \"stream\": \"stdout\"}"; - char *line2 = "{\"log\": \"abcdefghijklmnopqrstuvwxyz\", \"stream\": \"stdout\"}"; + char *line1 = "{\"log\": \"12345678901234567890\", \"stream\": \"stdout\"}"; + char *line2 = "{\"log\": \"abcdefghijklmnopqrstuvwxyz\", \"stream\": \"stdout\"}"; @@ config = flb_config_init(); - /* The buffer limit is for the concatenated 'log' content, not the full JSON */ + /* The buffer limit is for the concatenated 'log' content, not the full JSON */ + /* Guard: require single-line docker parser availability for JSON pre-parse */ + p = flb_parser_get("docker", config); + TEST_CHECK(p != NULL); + if (!p) { + flb_config_exit(config); + return; + } @@ - config->multiline_buffer_limit = flb_strdup("80"); + /* 40 < length("12345678901234567890" + "abcdefghijklmnopqrstuvwxyz") = 46 */ + config->multiline_buffer_limit = flb_strdup("40"); @@ - /* --- New params-based initializer --- */ + /* --- New params-based initializer --- */ struct flb_ml_parser_params params = flb_ml_parser_params_default("test-concat"); params.type = FLB_ML_REGEX; params.negate = FLB_FALSE; params.flush_ms = 1000; params.key_content = "log"; - params.parser_ctx = NULL; - params.parser_name = NULL; + params.parser_ctx = NULL; + params.parser_name = "docker"; /* pre-parse JSON and extract key_content */ @@ mlp = flb_ml_parser_create_params(config, ¶ms); TEST_CHECK(mlp != NULL); + /* Link params.parser_name -> parser ctx */ + ret = flb_ml_parsers_init(config); + TEST_CHECK(ret == 0); + if (ret != 0) { + flb_ml_destroy(ml); + flb_config_exit(config); + return; + }src/multiline/flb_ml.c (8)
27-27
: LGTM: needed headers added.
flb_utils.h
for size parsing and<stdint.h>
for fixed-width types.Also applies to: 34-34
218-273
: Good: propagate truncation status from rule processing.Capturing FLB_MULTILINE_TRUNCATED and returning it preserves signal to callers.
Also applies to: 348-351
421-424
: LGTM: return raw status for text path.Allows callers to observe truncation vs OK.
671-693
: LGTM: status propagation and fallback single-line packaging.Correctly retains TRUNCATED and flushes pending on non-match before emitting standalone message with limit enforcement.
Also applies to: 697-722, 724-755
766-786
: LGTM: object path mirrors text semantics and preserves status.Validates map type, propagates TRUNCATED, and performs proper flush-before-standalone behavior.
Also applies to: 789-799, 803-827, 830-856
872-894
: LGTM: buffer_limit config parsing honors 0=false (unlimited) and -1=default.Matches the intended semantics from size parser.
1439-1445
: LGTM: emits multiline_truncated metadata flag.Good observability addition; pairs with group->truncated.
1347-1385
: Guard against NULL key_content during repack to avoid crash.
flb_sds_len(parser_i->key_content)
is called unconditionally; ifkey_content
is unset, this derefs NULL.Apply:
- if (flb_sds_len(group->buf) > 0) { - /* Take the first line keys and repack */ - len = flb_sds_len(parser_i->key_content); + if (flb_sds_len(group->buf) > 0) { + /* Take the first line keys and repack */ + int key_len = parser_i->key_content ? flb_sds_len(parser_i->key_content) : 0; size = map.via.map.size; msgpack_pack_map(&mp_pck, size); @@ - if (k.type == MSGPACK_OBJECT_STR && - parser_i->key_content && - k.via.str.size == len && - strncmp(k.via.str.ptr, parser_i->key_content, len) == 0) { + if (parser_i->key_content && + k.type == MSGPACK_OBJECT_STR && + k.via.str.size == key_len && + strncmp(k.via.str.ptr, parser_i->key_content, key_len) == 0) {
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
src/multiline/flb_ml.c (3)
291-307
: Enforce buffer_limit in ENDSWITH/EQ paths (stop bypassing flb_ml_group_cat).Direct flb_sds_cat_safe writes skip the new buffer limit and won’t set group->truncated. Route concatenations through flb_ml_group_cat and propagate TRUNCATED status.
Apply:
@@ - /* Concatenate value */ - if (val_content) { - flb_sds_cat_safe(&stream_group->buf, - val_content->via.str.ptr, - val_content->via.str.size); - } - else { - flb_sds_cat_safe(&stream_group->buf, buf_data, buf_size); - } + /* Concatenate value honoring buffer limit */ + if (val_content) { + ret = flb_ml_group_cat(stream_group, + val_content->via.str.ptr, + val_content->via.str.size); + } + else { + ret = flb_ml_group_cat(stream_group, buf_data, buf_size); + } + if (ret == FLB_MULTILINE_TRUNCATED) { + truncated = FLB_TRUE; + } @@ - /* Concatenate value */ - if (val_content) { - flb_sds_cat_safe(&stream_group->buf, - val_content->via.str.ptr, - val_content->via.str.size); - } - else { - flb_sds_cat_safe(&stream_group->buf, buf_data, buf_size); - } + /* Concatenate value honoring buffer limit */ + if (val_content) { + ret = flb_ml_group_cat(stream_group, + val_content->via.str.ptr, + val_content->via.str.size); + } + else { + ret = flb_ml_group_cat(stream_group, buf_data, buf_size); + } + if (ret == FLB_MULTILINE_TRUNCATED) { + truncated = FLB_TRUE; + }Also applies to: 327-341
191-199
: Append newline via flb_ml_group_cat to respect buffer_limit.This can overflow the cap by one byte today.
Apply:
- if (stream_group->buf[len - 1] != '\n') { - flb_sds_cat_safe(&stream_group->buf, "\n", 1); - } + if (stream_group->buf[len - 1] != '\n') { + (void) flb_ml_group_cat(stream_group, "\n", 1); + }
552-579
: Avoid NULL deref when key_content is unset in map path.flb_sds_len(parser->key_content) runs without checking key_content.
Apply:
if (parser->ml_parser->parser) { - /* lookup key_content */ - - len = flb_sds_len(parser->key_content); + /* lookup key_content */ + if (parser->key_content == NULL) { + /* No content key to parse; fall back to raw map handling */ + *out_buf = buf; + *out_size = size; + return 0; + } + + len = flb_sds_len(parser->key_content);
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/multiline/flb_ml.c
(13 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/multiline/flb_ml.c (3)
src/multiline/flb_ml_stream.c (2)
flb_ml_stream_get
(278-292)flb_ml_stream_group_get
(91-135)src/multiline/flb_ml_group.c (1)
flb_ml_group_cat
(89-122)src/flb_utils.c (1)
flb_utils_size_to_binary_bytes
(610-694)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (29)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
- GitHub Check: pr-compile-centos-7
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
- GitHub Check: PR - fuzzing test
🔇 Additional comments (6)
src/multiline/flb_ml.c (6)
886-894
: 0 = unlimited semantics implemented correctly.Treating -1 as invalid/unset and 0 as “no limit” is correct.
609-615
: Good: sub-parser failure no longer aborts text processing.Keeps multiline logic robust; lets rules decide on raw text.
688-717
: Good: propagate TRUNCATED and flush-on-no-match in append_text.Status handling and LRU updates look correct; fallback path is clear.
Also applies to: 724-755
793-825
: Good: mirror truncation-aware flow for append_object.Consistent status propagation, LRU update, and fallback flush.
Also applies to: 830-856
1439-1445
: Good: emit multiline_truncated metadata and reset state.Metadata emission and post-flush resets prevent leakage into next records.
Also applies to: 1485-1487
1347-1379
: Guard parser_i->key_content to avoid NULL deref during repack.len = flb_sds_len(parser_i->key_content) can crash when key_content is unset.
Apply:
- if (flb_sds_len(group->buf) > 0) { - /* Take the first line keys and repack */ - len = flb_sds_len(parser_i->key_content); + if (flb_sds_len(group->buf) > 0) { + /* Take the first line keys and repack */ + if (parser_i->key_content) { + len = flb_sds_len(parser_i->key_content); + } + else { + len = 0; /* skip replacement below */ + } @@ - if (k.type == MSGPACK_OBJECT_STR && - parser_i->key_content && + if (k.type == MSGPACK_OBJECT_STR && + parser_i->key_content &&
ml->config = ctx; | ||
limit = flb_utils_size_to_binary_bytes(ml->config->multiline_buffer_limit); | ||
if (limit >= 0) { | ||
ml->buffer_limit = (size_t) limit; | ||
} | ||
else { | ||
ml->buffer_limit = FLB_ML_BUFFER_LIMIT_DEFAULT; | ||
} | ||
ml->last_flush = time_ms_now(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add overflow guard when casting limit to size_t.
Very large values can wrap on 32‑bit or constrained platforms.
Apply:
- limit = flb_utils_size_to_binary_bytes(ml->config->multiline_buffer_limit);
- if (limit >= 0) {
- ml->buffer_limit = (size_t) limit;
- }
- else {
- ml->buffer_limit = FLB_ML_BUFFER_LIMIT_DEFAULT;
- }
+ limit = flb_utils_size_to_binary_bytes(ml->config->multiline_buffer_limit);
+ if (limit == -1) {
+ ml->buffer_limit = FLB_ML_BUFFER_LIMIT_DEFAULT;
+ }
+ else if (limit > 0 && (uint64_t) limit > (uint64_t) SIZE_MAX) {
+ flb_warn("[multiline] buffer limit too large (%" PRId64 "), capping to %zu",
+ limit, (size_t) SIZE_MAX);
+ ml->buffer_limit = SIZE_MAX;
+ }
+ else {
+ ml->buffer_limit = (size_t) limit; /* 0 => unlimited */
+ }
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
ml->config = ctx; | |
limit = flb_utils_size_to_binary_bytes(ml->config->multiline_buffer_limit); | |
if (limit >= 0) { | |
ml->buffer_limit = (size_t) limit; | |
} | |
else { | |
ml->buffer_limit = FLB_ML_BUFFER_LIMIT_DEFAULT; | |
} | |
ml->last_flush = time_ms_now(); | |
ml->config = ctx; | |
limit = flb_utils_size_to_binary_bytes(ml->config->multiline_buffer_limit); | |
if (limit == -1) { | |
ml->buffer_limit = FLB_ML_BUFFER_LIMIT_DEFAULT; | |
} | |
else if (limit > 0 && (uint64_t) limit > (uint64_t) SIZE_MAX) { | |
flb_warn("[multiline] buffer limit too large (%" PRId64 "), capping to %zu", | |
limit, (size_t) SIZE_MAX); | |
ml->buffer_limit = SIZE_MAX; | |
} | |
else { | |
ml->buffer_limit = (size_t) limit; /* 0 => unlimited */ | |
} | |
ml->last_flush = time_ms_now(); |
🤖 Prompt for AI Agents
In src/multiline/flb_ml.c around lines 886 to 894, the code casts a signed long
long limit to size_t without checking for overflow which can wrap on
32-bit/constrained platforms; update it to validate the limit before casting:
after computing limit, check that limit is >= 0 and that (unsigned long
long)limit <= (unsigned long long)SIZE_MAX (or compare against (long
long)SIZE_MAX if appropriate), and only then assign ml->buffer_limit = (size_t)
limit; otherwise set ml->buffer_limit = FLB_ML_BUFFER_LIMIT_DEFAULT; keep the
existing else branch and ensure you include <limits.h> if not already available.
This is corresponding to fluent/fluent-bit#10653. Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
This is corresponding to fluent/fluent-bit#10653. Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
This is corresponding to fluent/fluent-bit#10653. Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
We added an interface for configurable buffer limit for multiline.
Also, we implemented robust processing for multiline concatenations.
Enter
[N/A]
in the box, if an item is not applicable to your change.Testing
Before we can approve your change; please submit the following in a comment:
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
ok-package-test
label to test for all targets (requires maintainer to do).Documentation
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.
Summary by CodeRabbit
New Features
Behavior Changes / Bug Fixes
Tests