Merge branch 'anilm3/v2' into anilm3/object_view
Anilm3 authored Feb 18, 2025
2 parents 5e2b499 + 6b43aec commit d88bc72
Showing 78 changed files with 738 additions and 694 deletions.
27 changes: 27 additions & 0 deletions CHANGELOG.md
@@ -1,4 +1,31 @@
# libddwaf release

## v1.23.0

### New features ([unstable](https://github.com/DataDog/libddwaf/blob/master/README.md#versioning-semantics))

This new version of `libddwaf` introduces the WAF builder, a new mechanism for generating WAF instances through complete or partial configurations. This new mechanism aims to standardise the WAF update process across all WAF users, eliminating the possibility of incomplete or inconsistent implementations. With the introduction of the WAF builder, the `ddwaf_update` function has been deprecated, as the semantics have been drastically changed. More information about the builder can be found [here](https://github.com/DataDog/libddwaf/blob/release/1.23.0/UPGRADING.md#waf-builder).

In addition, diagnostics have now been split into warnings and errors to better differentiate those which can indicate a potential issue from those which may indicate a potential, but expected, incompatibility. More information about the diagnostic changes can be found [here](https://github.com/DataDog/libddwaf/blob/release/1.23.0/UPGRADING.md#warning-and-error-diagnostics).

Finally, a small but consequential change has been introduced to the endpoint fingerprint generation, which makes the `query` parameter of the postprocessor optional, meaning that fingerprints may be generated without it.

Since this release introduces breaking changes, a new section has been added to the [upgrading guide](https://github.com/DataDog/libddwaf/blob/release/1.23.0/UPGRADING.md#upgrading-from-1220-to-1230).

### Release changelog
#### Changes
- WAF Builder: independent configuration manager to generate WAF instances ([#363](https://github.com/DataDog/libddwaf/pull/363))
- Change endpoint fingerprint query parameter to optional ([#365](https://github.com/DataDog/libddwaf/pull/365))
- Split diagnostics into warnings and errors ([#368](https://github.com/DataDog/libddwaf/pull/368))
- Pass object limits at evaluation time rather than parsing ([#370](https://github.com/DataDog/libddwaf/pull/370))

#### Fixes
- Wrap containers in the ruleset within shared pointers to reduce copies ([#366](https://github.com/DataDog/libddwaf/pull/366))

#### Miscellaneous
- Rename parameter to `raw_configuration` ([#367](https://github.com/DataDog/libddwaf/pull/367))
- Generate coverage at multiple log levels ([#364](https://github.com/DataDog/libddwaf/pull/364))

## v1.22.0 ([unstable](https://github.com/DataDog/libddwaf/blob/master/README.md#versioning-semantics))
### New features

128 changes: 128 additions & 0 deletions UPGRADING.md
@@ -1,5 +1,133 @@
# Upgrading libddwaf

## Upgrading from `1.22.0` to `1.23.0`

### WAF Builder
The WAF builder is a new mechanism for generating WAF instances through the use of independent, partial and potentially overlapping configurations, effectively mirroring the process performed by the security libraries when consolidating configurations obtained through remote configuration. The outcome of the builder is equivalent to merging all available configurations into a single one; however, the process is tailored towards continuous generation of instances based on the addition, update and removal of partial or complete configurations, while reusing internal objects as much as possible.

> [!WARNING]
> As a consequence of the introduction of this new interface, the `ddwaf_update` function has been deprecated and removed, as the semantics of the configurations expected by this function are incompatible with those used by the new builder API.

In previous versions of `libddwaf`, configurations provided during `ddwaf_update` were required to be a map containing at least one of the supported top-level keys (e.g. `rules`, `exclusions`, `processors`, etc.) and each of these represented the complete set of primitives of the given type. For example, a configuration containing `rules` was required to contain all rules, meaning that a future configuration update containing `rules` would result in the complete replacement of the old set with the new one. With this model, when generating a single WAF instance with multiple configurations, each of them was required to be non-overlapping.

In this new version, configurations are still required to be a map, containing at least one of the supported top-level keys, however they must also be provided with a "path", which represents a unique identifier for the given configuration and does not need to follow any particular schema; when the configuration is obtained through remote configuration, the path value must be the one obtained through it. In addition, configurations are now assumed to be overlapping, meaning that the top-level key need not represent the complete set of primitives for the given type as they will be treated as though the set is always partial and may be extended through other configurations. For example, two configurations may contribute new rules by providing the `rules` top-level key, trusting that the WAF builder will take care of the merging process.
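
The path-keyed, overlapping model described above can be illustrated with a toy model. This is not `libddwaf` code: `toy_config`, `toy_builder` and their members are hypothetical names, and primitives are reduced to plain ID strings. The sketch only shows that configurations are keyed by path, that re-adding a path replaces that one configuration, and that the result is the union of everything currently loaded.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a parsed configuration: top-level key -> primitive IDs.
using toy_config = std::map<std::string, std::vector<std::string>>;

// Hypothetical stand-in for the builder; not part of libddwaf.
class toy_builder {
public:
    // Adding a path that already exists replaces only that configuration.
    void add_or_update(const std::string &path, toy_config config)
    {
        configs_[path] = std::move(config);
    }

    // Returns false when no configuration was loaded under the given path.
    bool remove(const std::string &path) { return configs_.erase(path) > 0; }

    // Merge every loaded configuration, key by key, into a single view.
    std::map<std::string, std::set<std::string>> build() const
    {
        std::map<std::string, std::set<std::string>> merged;
        for (const auto &entry : configs_) {
            for (const auto &[key, ids] : entry.second) {
                merged[key].insert(ids.begin(), ids.end());
            }
        }
        return merged;
    }

private:
    std::map<std::string, toy_config> configs_;
};
```

Under the previous model, the second configuration providing `rules` would have replaced the first; in this sketch, as in the builder, both contribute to the merged set.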

#### Builder Lifecycle

The lifetime of the WAF builder should be linked to that of the remote configuration client, as its main purpose is to consume configuration additions, updates and removals as they are produced. Generally, a builder will have a consistent state throughout its lifetime, ensuring that memory use is always limited to objects contained within the loaded configurations.

Currently, the builder can optionally be instantiated with a `ddwaf_config`, a structure which allows the user to configure the evaluation limits, the obfuscator regexes and the object free function. The interaction with `ddwaf_config` follows the same principles as in `ddwaf_init`: if provided, it overrides the default values; if `NULL`, the defaults are used instead.

> [!NOTE]
> `ddwaf_config` should not be confused with the configurations obtained through remote configuration, which are provided as a `ddwaf_object`. This structure may be removed in the future in favour of passing obfuscator regexes and evaluation limits through configuration.

The following snippet shows the instantiation of a builder:

```cpp
ddwaf_config config{/* limits, obfuscator regexes and free function */};

// Instantiate a new builder using the previously defined ddwaf_config
ddwaf_builder builder = ddwaf_builder_init(&config);
```
At this stage, configurations may be added, updated and removed and, once ready, a WAF instance can be created as follows:
```cpp
// Build a new WAF instance, handle any potential failure by checking for NULL
ddwaf_handle handle = ddwaf_builder_build_instance(builder);
if (handle == NULL) { /* handle failure */ }
```

The generated WAF instance is then available for use, and it's completely independent of the builder itself, meaning that freeing one should have no impact on the other and vice-versa. The builder can continue being used in the background and once all configuration changes have been performed, a new handle can be instantiated:

```cpp
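// Build a second instance reflecting the latest configuration changes
ddwaf_handle new_handle = ddwaf_builder_build_instance(builder);
if (new_handle == NULL) {
    // handle failure
}
// The previous instance is independent of the builder and can now be released
ddwaf_destroy(handle);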
```
Note that the two WAF instances can coexist if needed, although it's more likely that only one will be required. Finally, at the end of the builder's lifecycle, the memory associated with it must be released as follows:
```cpp
// At the end of application's lifetime, destroy the builder
ddwaf_builder_destroy(builder);
```
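
In C++ callers, the explicit destroy calls above can be tied to scope with a small RAII helper. The sketch below is only one possible approach and makes two assumptions: that the handle type is an opaque pointer (as `ddwaf_builder` and `ddwaf_handle` are assumed to be), and that the destroy function accepts the handle by value. `make_scoped` is a hypothetical name, not part of `libddwaf`.

```cpp
#include <memory>
#include <type_traits>

// Wraps an opaque pointer handle so that its destroy function runs on scope exit.
// Assumes Handle is a raw pointer type and that destroy accepts it by value.
template <typename Handle, typename Destroy>
auto make_scoped(Handle handle, Destroy destroy)
{
    static_assert(std::is_pointer_v<Handle>, "handle must be an opaque pointer");
    using pointee = std::remove_pointer_t<Handle>;
    // std::unique_ptr never invokes the deleter on a null handle.
    return std::unique_ptr<pointee, Destroy>(handle, destroy);
}
```

Under those assumptions, usage might read `auto scoped = make_scoped(ddwaf_builder_init(&config), ddwaf_builder_destroy);`, with `scoped.get()` passed wherever a `ddwaf_builder` is expected.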

#### Adding, updating and removing configurations

> [!CAUTION]
> Builder access and modification is not thread-safe. Users must ensure that the builder is only used from one thread, or synchronize uses from separate threads, e.g. using a mutex.

The process of adding or updating configurations is a relatively simple one. Firstly, configurations must be provided as a `ddwaf_object` of type `map` and a separate `path` representing its unique identifier. For example:
```c
// Generate the configuration as a map containing any of the expected top-level keys, such as `rules`, `exclusions`, etc.
ddwaf_object configuration;
ddwaf_object_map(&configuration);
...

// Use a unique path for this configuration
const char *path = "path/to/configuration/rules-c555e7ee647a3d72c2cb60a32767d586";
uint32_t path_len = (uint32_t)strlen(path);
```
With that in hand, configurations can be added or updated with `ddwaf_builder_add_or_update_config`, letting the builder handle any update-specific logic. This function will also provide any diagnostics derived from the parsing and conversion process, including any errors and warnings as well as details regarding the IDs of the elements loaded, failed or skipped. Note that once the configuration is loaded, the memory associated with it must be freed by the caller:
```c
ddwaf_object diagnostics;
ddwaf_object_invalid(&diagnostics);
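bool result = ddwaf_builder_add_or_update_config(builder, path, path_len, &configuration, &diagnostics);
ddwaf_object_free(&configuration);
if (!result) { /* Failed to load configuration, check diagnostics */ }
// The diagnostics object is also owned by the caller and must be freed once inspected
ddwaf_object_free(&diagnostics);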
```
The addition or update may fail in certain circumstances, as denoted by the returned boolean value. This may happen when invalid arguments are provided, when the configuration could not be parsed or when it doesn't yield any meaningful results, e.g. none of the primitives within are compatible.

In contrast, the removal process only requires the path and is performed through the `ddwaf_builder_remove_config` function, as can be seen in the example below:

```c
bool result = ddwaf_builder_remove_config(builder, path, path_len);
if (!result) { /* Non critical error, possibly indicating a bug in the user's implementation */}
```
The removal process returns a boolean value indicating success or failure, however failure in this instance only indicates that either the arguments provided were invalid or the configuration didn't actually exist.
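
Since none of the builder functions are thread-safe, one option is to funnel every call through a small lock-holding wrapper. The sketch below is an assumption about how a caller might do this rather than something provided by `libddwaf`: `synchronized_builder` and `with_lock` are hypothetical names, and `Builder` would be `ddwaf_builder` in real use.

```cpp
#include <mutex>
#include <utility>

// Hypothetical wrapper that serialises every access to a builder-like object.
template <typename Builder> class synchronized_builder {
public:
    explicit synchronized_builder(Builder builder) : builder_(std::move(builder)) {}

    // Runs fn with exclusive access to the wrapped builder and returns its result.
    template <typename Fn> auto with_lock(Fn &&fn)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        return std::forward<Fn>(fn)(builder_);
    }

private:
    std::mutex mutex_;
    Builder builder_;
};
```

With such a wrapper, a removal could be written as `guarded.with_lock([&](ddwaf_builder b) { return ddwaf_builder_remove_config(b, path, path_len); })`, which keeps additions, removals and instance builds from interleaving across threads.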

### Warning and Error Diagnostics

In this new version, diagnostics have been split into two categories: warnings and errors. Warnings represent diagnostics which typically indicate an incompatibility between the provided configuration and the given version of `libddwaf`. Errors, on the other hand, indicate a more serious failure, one which may mean that the configuration is malformed or incomplete. In addition, a new top-level `error` field has been introduced to account for a potential global configuration parsing error.

With that in mind, the schema of the diagnostics is roughly as follows:
```json
{
  "error": "<string, when present, no other keys should be available>",
  "ruleset_version": "<version string>",
  "(exclusions|rules|processors|rules_override|rules_data|custom_rules|actions|scanners)": {
    "error": "<string, when present, no other keys should be available>",
    "loaded": [ "<ids>" ],
    "failed": [ "<ids>" ],
    "skipped": [ "<ids>" ],
    "errors": {
      "<message>": [ "<ids>" ]
    },
    "warnings": {
      "<message>": [ "<ids>" ]
    }
  }
}
```
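
A consumer that mirrors one section of this schema in memory might summarise it along the following lines. This is a hedged sketch: `section_diagnostics`, `severity` and `summarise` are hypothetical names, and converting a `ddwaf_object` into this structure is left out entirely.

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical in-memory mirror of one diagnostics section (e.g. "rules").
struct section_diagnostics {
    std::optional<std::string> error; // section-level parsing error, if any
    std::vector<std::string> loaded, failed, skipped;
    std::map<std::string, std::vector<std::string>> errors;   // message -> IDs
    std::map<std::string, std::vector<std::string>> warnings; // message -> IDs
};

enum class severity { ok, warning, error };

// Errors take precedence over warnings when summarising a section.
severity summarise(const section_diagnostics &section)
{
    if (section.error.has_value() || !section.errors.empty()) {
        return severity::error;
    }
    if (!section.warnings.empty()) {
        return severity::warning;
    }
    return severity::ok;
}
```

Keeping the two maps separate lets monitoring alert on `severity::error` while treating `severity::warning` as an expected incompatibility.
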
#### Rephrased diagnostics

The following diagnostics have been slightly changed or rephrased; any monitors targeting them may need to be updated:
- `unknown type '<name>'` is now `unknown type: '<name>'`.
- `unknown matcher: <name>` is now `unknown operator: '<name>'` and has been demoted to a `warning`.
- `invalid transformer <name>` is now `unknown transformer: '<name>'` and has been demoted to a `warning`.
- `unknown generator '<name>'` is now `unknown generator: '<name>'` and has been demoted to a `warning`.

The following diagnostics have only been downgraded to warnings:
- `unsupported schema version: <number>.x`
- `unsupported operator version <operator name>@<version>, current <operator name>@<current version>`
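
Monitors that match on the old wording could be migrated with a small translation step. The sketch below encodes only the four rephrasings listed above; `rephrase` and `rephrased_prefixes` are hypothetical names, and messages that were merely downgraded to warnings are returned as "no change".

```cpp
#include <optional>
#include <string>
#include <string_view>

struct prefix_rule {
    std::string_view old_prefix; // wording before 1.23.0
    std::string_view new_prefix; // wording from 1.23.0 onwards
    bool name_was_quoted;        // whether the old message already quoted the name
};

constexpr prefix_rule rephrased_prefixes[] = {
    {"unknown type '", "unknown type: ", true},
    {"unknown matcher: ", "unknown operator: ", false},
    {"invalid transformer ", "unknown transformer: ", false},
    {"unknown generator '", "unknown generator: ", true},
};

// Returns the 1.23.0 wording of an old diagnostic, or nullopt if it was not rephrased.
std::optional<std::string> rephrase(std::string_view old_message)
{
    for (const auto &rule : rephrased_prefixes) {
        if (old_message.substr(0, rule.old_prefix.size()) != rule.old_prefix) {
            continue;
        }
        std::string_view name = old_message.substr(rule.old_prefix.size());
        // Drop the closing quote of the old format so the name is not quoted twice.
        if (rule.name_was_quoted && !name.empty() && name.back() == '\'') {
            name.remove_suffix(1);
        }
        std::string result(rule.new_prefix);
        result += '\'';
        result += name;
        result += '\'';
        return result;
    }
    return std::nullopt;
}
```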

## Upgrading from `1.16.x` to `1.17.x`

### Action semantics
2 changes: 1 addition & 1 deletion fuzzer/cmdi_detector/src/main.cpp
@@ -206,7 +206,7 @@ extern "C" int LLVMFuzzerTestOneInput(const uint8_t *bytes, size_t size)

ddwaf::timer deadline{2s};
condition_cache cache;
- (void)cond.eval(cache, store, {}, {}, deadline);
+ (void)cond.eval(cache, store, {}, {}, {}, deadline);

return 0;
}
2 changes: 1 addition & 1 deletion fuzzer/lfi_detector/src/main.cpp
@@ -125,7 +125,7 @@ extern "C" int LLVMFuzzerTestOneInput(const uint8_t *bytes, size_t size)

ddwaf::timer deadline{2s};
condition_cache cache;
- (void)cond.eval(cache, store, {}, {}, deadline);
+ (void)cond.eval(cache, store, {}, {}, {}, deadline);

return 0;
}
2 changes: 1 addition & 1 deletion fuzzer/shi_detector_array/src/main.cpp
@@ -206,7 +206,7 @@ extern "C" int LLVMFuzzerTestOneInput(const uint8_t *bytes, size_t size)

ddwaf::timer deadline{2s};
condition_cache cache;
- (void)cond.eval(cache, store, {}, {}, deadline);
+ (void)cond.eval(cache, store, {}, {}, {}, deadline);

return 0;
}
2 changes: 1 addition & 1 deletion fuzzer/shi_detector_string/src/main.cpp
@@ -123,7 +123,7 @@ extern "C" int LLVMFuzzerTestOneInput(const uint8_t *bytes, size_t size)

ddwaf::timer deadline{2s};
condition_cache cache;
- (void)cond.eval(cache, store, {}, {}, deadline);
+ (void)cond.eval(cache, store, {}, {}, {}, deadline);

return 0;
}
2 changes: 1 addition & 1 deletion fuzzer/sqli_detector/src/main.cpp
@@ -141,7 +141,7 @@ extern "C" int LLVMFuzzerTestOneInput(const uint8_t *bytes, size_t size)

ddwaf::timer deadline{2s};
condition_cache cache;
- (void)cond.eval(cache, store, {}, {}, deadline);
+ (void)cond.eval(cache, store, {}, {}, {}, deadline);

return 0;
}
2 changes: 1 addition & 1 deletion fuzzer/ssrf_detector/src/main.cpp
@@ -125,7 +125,7 @@ extern "C" int LLVMFuzzerTestOneInput(const uint8_t *bytes, size_t size)

ddwaf::timer deadline{2s};
condition_cache cache;
- (void)cond.eval(cache, store, {}, {}, deadline);
+ (void)cond.eval(cache, store, {}, {}, {}, deadline);

return 0;
}
9 changes: 9 additions & 0 deletions include/ddwaf.h
@@ -378,6 +378,10 @@ ddwaf_builder ddwaf_builder_init(const ddwaf_config *config);
* @param diagnostics Optional ruleset parsing diagnostics. (nullable)
*
* @return Whether the operation succeeded (true) or failed (false).
+ *
+ * @note if any of the arguments are NULL, the diagnostics object will not be initialised.
+ * @note The memory associated with the path, config and diagnostics must be freed by the caller.
+ * @note This function is not thread-safe.
**/
bool ddwaf_builder_add_or_update_config(ddwaf_builder builder, const char *path, uint32_t path_len, ddwaf_object *config, ddwaf_object *diagnostics);

@@ -391,6 +395,9 @@ bool ddwaf_builder_add_or_update_config(ddwaf_builder builder, const char *path,
* @param path_len The length of the string contained within path.
*
* @return Whether the operation succeeded (true) or failed (false).
+ *
+ * @note The memory associated with the path must be freed by the caller.
+ * @note This function is not thread-safe.
**/
bool ddwaf_builder_remove_config(ddwaf_builder builder, const char *path, uint32_t path_len);

@@ -402,6 +409,8 @@ bool ddwaf_builder_remove_config(ddwaf_builder builder, const char *path, uint32
* @param builder Builder to perform the operation on. (nonnull)
*
* @return Handle to the new WAF instance or NULL if there was an error.
+ *
+ * @note This function is not thread-safe.
**/
ddwaf_handle ddwaf_builder_build_instance(ddwaf_builder builder);

1 change: 1 addition & 0 deletions src/builder/ruleset_builder.cpp
@@ -221,6 +221,7 @@ std::shared_ptr<ruleset> ruleset_builder::build(
rs->actions = actions_;
rs->free_fn = free_fn_;
rs->event_obfuscator = event_obfuscator_;
+ rs->limits = limits_;

// An instance is valid if it contains primitives with side-effects, such as
// rules or postprocessors.
2 changes: 1 addition & 1 deletion src/condition/base.hpp
@@ -81,7 +81,7 @@ class base_condition {

virtual eval_result eval(condition_cache &cache, const object_store &store,
const exclusion::object_set_ref &objects_excluded, const matcher_mapper &dynamic_matchers,
- ddwaf::timer &deadline) const = 0;
+ const object_limits &limits, ddwaf::timer &deadline) const = 0;

virtual void get_addresses(std::unordered_map<target_index, std::string> &addresses) const = 0;
};
10 changes: 6 additions & 4 deletions src/condition/cmdi_detector.cpp
@@ -437,13 +437,15 @@ std::string generate_string_resource(const ddwaf_object &root)

} // namespace

- cmdi_detector::cmdi_detector(std::vector<condition_parameter> args, const object_limits &limits)
- : base_impl<cmdi_detector>(std::move(args), limits)
+ cmdi_detector::cmdi_detector(std::vector<condition_parameter> args)
+ : base_impl<cmdi_detector>(std::move(args))
{}

+ // NOLINTNEXTLINE(readability-convert-member-functions-to-static)
eval_result cmdi_detector::eval_impl(const unary_argument<const ddwaf_object *> &resource,
const variadic_argument<const ddwaf_object *> &params, condition_cache &cache,
- const exclusion::object_set_ref &objects_excluded, ddwaf::timer &deadline) const
+ const exclusion::object_set_ref &objects_excluded, const object_limits &limits,
+ ddwaf::timer &deadline) const
{
if (resource.value->type != DDWAF_OBJ_ARRAY || resource.value->nbEntries == 0) {
return {};
@@ -452,7 +454,7 @@ eval_result cmdi_detector::eval_impl(const unary_argument<const ddwaf_object *>
std::vector<shell_token> resource_tokens;
for (const auto &param : params) {
auto res = cmdi_impl(
- *resource.value, resource_tokens, *param.value, objects_excluded, limits_, deadline);
+ *resource.value, resource_tokens, *param.value, objects_excluded, limits, deadline);
if (res.has_value()) {
const std::vector<std::string> resource_kp{
resource.key_path.begin(), resource.key_path.end()};
5 changes: 3 additions & 2 deletions src/condition/cmdi_detector.hpp
@@ -15,12 +15,13 @@ class cmdi_detector : public base_impl<cmdi_detector> {
static constexpr unsigned version = 1;
static constexpr std::array<std::string_view, 2> param_names{"resource", "params"};

- explicit cmdi_detector(std::vector<condition_parameter> args, const object_limits &limits = {});
+ explicit cmdi_detector(std::vector<condition_parameter> args);

protected:
[[nodiscard]] eval_result eval_impl(const unary_argument<const ddwaf_object *> &resource,
const variadic_argument<const ddwaf_object *> &params, condition_cache &cache,
- const exclusion::object_set_ref &objects_excluded, ddwaf::timer &deadline) const;
+ const exclusion::object_set_ref &objects_excluded, const object_limits &limits,
+ ddwaf::timer &deadline) const;

friend class base_impl<cmdi_detector>;
};
12 changes: 8 additions & 4 deletions src/condition/exists.cpp
@@ -86,16 +86,18 @@ search_outcome exists(const ddwaf_object *root, std::span<const std::string> key

} // namespace

+ // NOLINTNEXTLINE(readability-convert-member-functions-to-static)
[[nodiscard]] eval_result exists_condition::eval_impl(
const variadic_argument<const ddwaf_object *> &inputs, condition_cache &cache,
- const exclusion::object_set_ref &objects_excluded, ddwaf::timer &deadline) const
+ const exclusion::object_set_ref &objects_excluded, const object_limits &limits,
+ ddwaf::timer &deadline) const
{
for (const auto &input : inputs) {
if (deadline.expired()) {
throw ddwaf::timeout_exception();
}

- if (exists(input.value, input.key_path, objects_excluded, limits_) ==
+ if (exists(input.value, input.key_path, objects_excluded, limits) ==
search_outcome::found) {
std::vector<std::string> key_path{input.key_path.begin(), input.key_path.end()};
cache.match = {{.args = {{.name = "input",
@@ -112,14 +114,16 @@ search_outcome exists(const ddwaf_object *root, std::span<const std::string> key
return {.outcome = false, .ephemeral = false};
}

+ // NOLINTNEXTLINE(readability-convert-member-functions-to-static)
[[nodiscard]] eval_result exists_negated_condition::eval_impl(
const unary_argument<const ddwaf_object *> &input, condition_cache &cache,
- const exclusion::object_set_ref &objects_excluded, ddwaf::timer & /*deadline*/) const
+ const exclusion::object_set_ref &objects_excluded, const object_limits &limits,
+ ddwaf::timer & /*deadline*/) const
{
// We need to make sure the key path hasn't been found. If the result is
// unknown, we can't guarantee that the key path isn't actually present in
// the data set
- if (exists(input.value, input.key_path, objects_excluded, limits_) !=
+ if (exists(input.value, input.key_path, objects_excluded, limits) !=
search_outcome::not_found) {
return {.outcome = false, .ephemeral = false};
}