-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Implement WAL plugin test API #25704
Conversation
e43581e
to
f5e251e
Compare
f5e251e
to
068d0d8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Had a couple of questions as it seems to indicate a direction we're heading. One spelling mistake I caught.
/// If given, save the output to this file on the server in `<plugin-dir>/<name>_test/<save-output-to-file>` | ||
#[clap(long = "save-output-to-file")] | ||
pub save_output_to_file: Option<String>, | ||
/// If given, validate the output against this file on the server in `<plugin-dir>/<name>_test/<validate-output-file>` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In end-to-end tests like this I've often seen an option on whether the system should sort the output data before comparing or not. Think this is worth adding? I do expect the execution to be sequential so it might not matter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd expect the execution to be sequential, but if they end up doing multi-threaded Python maybe things could arrive in different orders? Something to worry about for later I think.
pub save_output_to_file: Option<String>, | ||
/// If given, validate the output against this file on the server in `<plugin-dir>/<name>_test/<validate-output-file>` | ||
#[clap(long = "validate-output-file")] | ||
pub validate_output_file: Option<String>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This and save_output_to_file
don't actually seem to be used yet. Is that intended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh yes, not actually wired up yet. I can remove because I don't think we'll have time to add before release.
use influxdb3_wal::Gen1Duration; | ||
|
||
// parse the lp into a write batch | ||
let namespace = NamespaceName::new("_testdb").unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's factor "_testdb" out into a static variable.
} | ||
|
||
#[derive(Debug, Default)] | ||
pub struct PluginReturnState { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want this to include errors? Currently it is constructed assuming successful execution of the plugin code, and then validation errors are tacked on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it would be great to capture errors in the Python here. Although I'm not sure where to do that. Something for follow on work.
@@ -171,6 +183,7 @@ pub struct WriteBufferImplArgs { | |||
pub wal_config: WalConfig, | |||
pub parquet_cache: Option<Arc<dyn ParquetCacheOracle>>, | |||
pub metric_registry: Arc<Registry>, | |||
pub plugin_dir: Option<PathBuf>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the plan here? Prior to this change we were just storing the python file as a string. Do we want the plugin dir to be mirrored to S3?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now this is just for local development (i.e. I have InfluxDB running on my laptop and I'm working on some_plugin.py
file that I want the server to just read in quickly to run the test). Users will deploy plugins by using the API or CLI. Although if we find that users would rather have a directory and deploy files to that dir themselves, we could have that as an option.
// constant for the process writes call site string | ||
const PROCESS_WRITES_CALL_SITE: &str = "process_writes"; | ||
|
||
const LINE_BUILDER_CODE: &str = r#" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a stop-gap for now so the line builder doesn't need to be installed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't want to use the existing influxdb3 Python client as it's geared towards making remote requests. I just wanted the simplest possible thing to have some Python lib already in the runtime so that plugin authors could use it.
I think there's going to be some fixed, built-in Python API that we ship with the runtime. For everything else it can be an external library that we can install. Ideally the server would install dependencies for the user automatically when they load the plugin.
)])), | ||
}; | ||
|
||
let reesponse = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spelling: should be response
.
80c33ab
to
0798fe8
Compare
This implements the WAL plugin test API. It also introduces a new API for the Python plugins to be called, get their data, and call back into the database server. There are some things that I'll want to address in follow on work: * CLI tests, but will wait on #25737 to land for a refactor of the CLI here * Would be better to hook the Python logging to call back into the plugin return state like here: https://pyo3.rs/v0.23.3/ecosystem/logging.html#the-python-to-rust-direction * We should only load the LineBuilder interface once in a module, rather than on every execution of a WAL plugin * More tests all around But I want to get this in so that the actual plugin and trigger system can get udated to build around this model.
0798fe8
to
20dcd32
Compare
This implements the WAL plugin test API. It also introduces a new API for the Python plugins to be called, get their data, and call back into the database server.
There are some things that I'll want to address in follow on work:
_testdb
that it injects into the catalog, which might run afoul of the db limits, so need to get around thatwrite
andwrite_to_db
part of the APIs so that a proper error can be returned to the Python code so that plugin authors would be able to handle them. But that will require moving the validator out of theinfluxdb3_write
crate so it's a much larger piece of workBut I want to get this in so that the actual plugin and trigger system can get udated to build around this model.