regex replace using a closure #628
Comments
I don't currently have time to put together a full PR, but here is a rough draft of a partial implementation of a replace function that uses a closure:

fn replace_with<T>(value: Value, pattern: Value, ctx: &mut Context, runner: closure::Runner<T>) -> Resolved {
    let value = value.try_bytes_utf8_lossy()?;
    // TODO: support count?
    match pattern {
        Value::Regex(regex) => {
            // Index of the current match, passed to the closure.
            let mut i: usize = 0;
            // First error returned by the closure, if any. The replacer
            // callback cannot return a Result, so the error is stashed here.
            let mut failure: Option<ExpressionError> = None;
            let replaced = regex.replace_all(&value, |captures: &Captures| {
                let captures_value = captures_to_value(captures);
                let result = runner.run_index_value(ctx, i, captures_value).and_then(|s| Ok(s.try_bytes_utf8_lossy()?));
                i += 1;
                match result {
                    Ok(v) => v,
                    Err(e) => {
                        failure.get_or_insert(e);
                        Cow::from("")
                    }
                }
            });
            if let Some(err) = failure {
                Err(err)
            } else {
                Ok(replaced.into_owned().into())
            }
        }
        // TODO: should we also support Value::Bytes?
        value => Err(ValueError::Expected {
            got: value.kind(),
            expected: Kind::regex()
        }.into())
    }
}

fn captures_to_value(captures: &Captures) -> Value {
    if captures.len() == 1 {
        // This is guaranteed not to panic because group 0 (the whole match) always exists.
        captures[0].into()
    } else {
        // Return an array of the capture groups; groups that did not participate become null.
        captures.iter().map(|m| m.map_or(Value::Null, |m| m.as_str().into())).collect::<Vec<Value>>().into()
    }
}
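To make the shape of the draft easier to see outside of VRL, here is a minimal, standard-library-only sketch of the same idea. It is an illustration under stated assumptions, not the VRL implementation: it matches a literal `&str` pattern instead of a regex, and the name and signature of `replace_with` below are hypothetical. Because this version drives the match loop itself, a closure error can be propagated directly with `?` rather than stashed in an `Option` as in the draft above.

```rust
/// Replaces every occurrence of the literal `pattern` in `input` with the
/// string computed by `f`. The closure receives the zero-based match index
/// and the matched text, and may fail; the first failure aborts the replace.
///
/// Hypothetical helper for illustration only (literal patterns, no regex).
fn replace_with<F, E>(input: &str, pattern: &str, mut f: F) -> Result<String, E>
where
    F: FnMut(usize, &str) -> Result<String, E>,
{
    let mut out = String::with_capacity(input.len());
    // Byte offset just past the previous match.
    let mut last = 0;
    for (i, (start, matched)) in input.match_indices(pattern).enumerate() {
        // Copy the unmatched text between the previous match and this one.
        out.push_str(&input[last..start]);
        // Ask the closure for the replacement; propagate its error if any.
        out.push_str(&f(i, matched)?);
        last = start + matched.len();
    }
    // Copy whatever follows the final match.
    out.push_str(&input[last..]);
    Ok(out)
}
```

For example, `replace_with("a-b-c", "-", |i, _| Ok::<_, String>(format!("[{i}]")))` numbers each separator, and a closure that returns `Err` stops the replacement at the first failing match.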
This is a neat idea. If you are interested in driving this to completion, your draft is headed in the right direction; it just needs some boilerplate code, unit tests, and a VRL test. Otherwise, we will add this to the backlog and prioritize accordingly.
tmccombs added a commit to tmccombs/vrl that referenced this issue on Jan 9, 2024:
This is similar to `replace`, but takes a closure to compute the replacement from the match and capture groups, instead of taking a replacement string. Fixes: vectordotdev#628
This was referenced Jan 9, 2024
tmccombs added a commit to tmccombs/vrl that referenced this issue on Jan 17, 2024:
This is similar to `replace`, but takes a closure to compute the replacement from the match and capture groups, instead of taking a replacement string. Fixes: vectordotdev#628
github-merge-queue bot pushed a commit that referenced this issue on Jan 24, 2024:
* feat(stdlib): Add replace_with function. This is similar to `replace`, but takes a closure to compute the replacement from the match and capture groups, instead of taking a replacement string. Fixes: #628
* Pull request feedback
* enhancement(replace_with): Pass object instead of array to closure. This allows us to expose the named capture groups with names.
* Add named capture groups directly to capture object
I filed #633, but now realize that this might have been a better place to report it.
For convenience I will copy over my text from there:
Use Cases
I want to redact certain personally identifiable information, such as an email address, from logs, but in such a way that if I already know the information (email address), I can search the logs for matching log entries.
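As a rough illustration of this use case (not a proposal for how Vector or VRL should implement it), the sketch below replaces each email-like token with a deterministic digest, so that someone who already knows an address can compute the same digest and search for it. Everything here is an assumption made for the example: the function names are hypothetical, the email check is deliberately naive, and `DefaultHasher` is only a standard-library stand-in for SHA-256 (it is not cryptographic, and its algorithm is not guaranteed to stay the same across Rust releases).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in digest for illustration. A real deployment would use SHA-256
/// (ideally salted or keyed) instead of `DefaultHasher`.
fn digest(s: &str) -> String {
    let mut hasher = DefaultHasher::new();
    s.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

/// Hypothetical, deliberately naive check: exactly one '@', a non-empty
/// local part, and a domain containing a dot.
fn looks_like_email(token: &str) -> bool {
    match token.split_once('@') {
        Some((local, domain)) => {
            !local.is_empty() && domain.contains('.') && !domain.contains('@')
        }
        None => false,
    }
}

/// Replaces every space-separated, email-like token with `<email:DIGEST>`.
fn redact_emails(line: &str) -> String {
    line.split(' ')
        .map(|token| {
            if looks_like_email(token) {
                format!("<email:{}>", digest(token))
            } else {
                token.to_string()
            }
        })
        .collect::<Vec<_>>()
        .join(" ")
}
```

To find log entries for a known address, one would search for `format!("<email:{}>", digest("alice@example.com"))` rather than for the address itself.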
Attempted Solutions
I think it would be possible to do this with a Lua transform. But the documentation for that says to create an issue if the remap transform doesn't meet my needs.
Lua also doesn't have a built-in sha256 function, so I would either need to use a Lua-native sha256 implementation, which I suspect would be slow, or pull in a shared library that adds such a function for Lua (is that possible with Vector?).
Proposal
I can think of a few ways this could be done:
I think that the last bullet point is the most general and could be useful in other cases as well.
P.S. Having a built-in redaction pattern for email addresses would be awesome.