Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Implement WAL plugin test API #25704

Merged
merged 2 commits into from
Jan 6, 2025
Merged

feat: Implement WAL plugin test API #25704

merged 2 commits into from
Jan 6, 2025

Conversation

pauldix
Copy link
Member

@pauldix pauldix commented Dec 23, 2024

This implements the WAL plugin test API. It also introduces a new API for the Python plugins to be called, get their data, and call back into the database server.

There are some things that I'll want to address in follow on work:

  • CLI tests, but will wait on feat: Cleanup CLI flags for InfluxDB 3 Core #25737 to land for a refactor of the CLI here
  • It uses a _testdb that it injects into the catalog, which might run afoul of the db limits, so need to get around that
  • Would be better to hook the Python logging to call back into the plugin return state like here: https://pyo3.rs/v0.23.3/ecosystem/logging.html#the-python-to-rust-direction
  • We should only load the LineBuilder interface once in a module, rather than on every execution of a WAL plugin'
  • Ideally, we'd be able to validate each line on the write and write_to_db part of the APIs so that a proper error can be returned to the Python code so that plugin authors would be able to handle them. But that will require moving the validator out of the influxdb3_write crate so it's a much larger piece of work
  • More tests all around

But I want to get this in so that the actual plugin and trigger system can get udated to build around this model.

@pauldix pauldix force-pushed the pd/plugin-development branch 3 times, most recently from e43581e to f5e251e Compare January 5, 2025 21:53
@pauldix pauldix changed the title WIP: plugin development flow feat: Implement WAL plugin test API Jan 5, 2025
@pauldix pauldix marked this pull request as ready for review January 5, 2025 21:56
@pauldix pauldix force-pushed the pd/plugin-development branch from f5e251e to 068d0d8 Compare January 5, 2025 22:00
Copy link
Contributor

@jacksonrnewhouse jacksonrnewhouse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Had a couple of questions as it seems to indicate a direction we're heading. One spelling mistake I caught.

/// If given, save the output to this file on the server in `<plugin-dir>/<name>_test/<save-output-to-file>`
#[clap(long = "save-output-to-file")]
pub save_output_to_file: Option<String>,
/// If given, validate the output against this file on the server in `<plugin-dir>/<name>_test/<validate-output-file>`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In end-to-end tests like this I've often seen an option on whether the system should sort the output data before comparing or not. Think this is worth adding? I do expect the execution to be sequential so it might not matter.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd expect the execution to be sequential, but if they end up doing multi-threaded Python maybe things could arrive in different orders? Something to worry about for later I think.

pub save_output_to_file: Option<String>,
/// If given, validate the output against this file on the server in `<plugin-dir>/<name>_test/<validate-output-file>`
#[clap(long = "validate-output-file")]
pub validate_output_file: Option<String>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and save_output_to_file don't actually seem to be used yet. Is that intended?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh yes, not actually wired up yet. I can remove because I don't think we'll have time to add before release.

use influxdb3_wal::Gen1Duration;

// parse the lp into a write batch
let namespace = NamespaceName::new("_testdb").unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's factor "_testdb" out into a static variable.

}

#[derive(Debug, Default)]
pub struct PluginReturnState {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want this to include errors? Currently it is constructed assuming successful execution of the plugin code, and then validation errors are tacked on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it would be great to capture errors in the Python here. Although I'm not sure where to do that. Something for follow on work.

@@ -171,6 +183,7 @@ pub struct WriteBufferImplArgs {
pub wal_config: WalConfig,
pub parquet_cache: Option<Arc<dyn ParquetCacheOracle>>,
pub metric_registry: Arc<Registry>,
pub plugin_dir: Option<PathBuf>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the plan here? Prior to this change we were just storing the python file as a string. Do we want the plugin dir to be mirrored to S3?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now this is just for local development (i.e. I have InfluxDB running on my laptop and I'm working on some_plugin.py file that I want the server to just read in quickly to run the test). Users will deploy plugins by using the API or CLI. Although if we find that users would rather have a directory and deploy files to that dir themselves, we could have that as an option.

// constant for the process writes call site string
const PROCESS_WRITES_CALL_SITE: &str = "process_writes";

const LINE_BUILDER_CODE: &str = r#"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a stop-gap for now so the line builder doesn't need to be installed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't want to use the existing influxdb3 Python client as it's geared towards making remote requests. I just wanted the simplest possible thing to have some Python lib already in the runtime so that plugin authors could use it.

I think there's going to be some fixed, built-in Python API that we ship with the runtime. For everything else it can be an external library that we can install. Ideally the server would install dependencies for the user automatically when they load the plugin.

)])),
};

let reesponse =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling: should be response.

@pauldix pauldix force-pushed the pd/plugin-development branch from 80c33ab to 0798fe8 Compare January 6, 2025 22:11
This implements the WAL plugin test API. It also introduces a new API for the Python plugins to be called, get their data, and call back into the database server.

There are some things that I'll want to address in follow on work:
* CLI tests, but will wait on #25737 to land for a refactor of the CLI here
* Would be better to hook the Python logging to call back into the plugin return state like here: https://pyo3.rs/v0.23.3/ecosystem/logging.html#the-python-to-rust-direction
* We should only load the LineBuilder interface once in a module, rather than on every execution of a WAL plugin
* More tests all around

But I want to get this in so that the actual plugin and trigger system can get udated to build around this model.
@pauldix pauldix force-pushed the pd/plugin-development branch from 0798fe8 to 20dcd32 Compare January 6, 2025 22:11
@pauldix pauldix merged commit 1ce6a24 into main Jan 6, 2025
13 checks passed
@pauldix pauldix deleted the pd/plugin-development branch January 6, 2025 22:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants