Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement strstr #2393

Merged
merged 1 commit into from
Dec 27, 2022
Merged

Implement strstr #2393

merged 1 commit into from
Dec 27, 2022

Conversation

liutingjieni
Copy link
Contributor

Add a new built-in func like strstr()

Add the strstr () function to make fuzzy comparisons when there is no way to confirm the absolute path of the file or the full name of the process or

usage:
sudo bpftrace -e 't:syscalls:sys_enter_execve /strstr(comm,"sh") == 0/ { printf("%s called %s\n", comm, str(args->filename));}'

Problem background: #2306

Checklist
  • Language changes are updated in man/adoc/bpftrace.adoc and if needed in docs/reference_guide.md
  • User-visible and non-trivial changes updated in CHANGELOG.md
  • The new behaviour is covered by tests

@liutingjieni
Copy link
Contributor Author

Pr was here before ( add strstr() by liutingjieni · Pull Request #2344 · iovisor/bpftrace ). Because of the conflict with master, so based on the latest master commit changes

@liutingjieni
Copy link
Contributor Author

I see a check error, but I don't quite understand。what do I need to do now?

@viktormalik
Copy link
Contributor

Pr was here before ( add strstr() by liutingjieni · Pull Request #2344 · iovisor/bpftrace ). Because of the conflict with master, so based on the latest master commit changes

In future, please keep updating the same PR (by force-push into the branch), opening a new one after each rebase is not a good idea.

I see a check error, but I don't quite understand。what do I need to do now?

Your new test is failing with segmentation fault, so you'll have to debug. If you want to reproduce the CI environment, you can build and use the prepared Docker images. Check .github/workflows/ci.yml for exact Docker commands.

@liutingjieni
Copy link
Contributor Author

I found that the error is build_test (LLVM 15 Release). I tried to remove the commit about it( Latest Commit) and I found that it works.

@viktormalik
Copy link
Contributor

I found that the error is build_test (LLVM 15 Release). I tried to remove the commit about it( Latest Commit) and I found that it works.

Which could mean that your code doesn't work with LLVM 15 (very likely it doesn't work with opaque pointers). Please check and fix that.

@liutingjieni liutingjieni force-pushed the add_strstr branch 2 times, most recently from cdd20ae to b3298fa Compare October 19, 2022 13:26
@liutingjieni
Copy link
Contributor Author

all checks have passed. what else do I need to do now?

@viktormalik
Copy link
Contributor

This is a terrible solution. You can't just arbitrarily revert a commit from master if it doesn't suit you.

You have to make your code work with opaque pointers. Once you do that, the checks will pass.

@liutingjieni
Copy link
Contributor Author

This is a terrible solution. You can't just arbitrarily revert a commit from master if it doesn't suit you.

You have to make your code work with opaque pointers. Once you do that, the checks will pass.

I have changed my code with opaque pointers. The check for llvm15 has passed. The last two commit hashes from master have changed due to cherry-pick, but the code content has not changed.

@liutingjieni
Copy link
Contributor Author

This branch is 1 commit ahead of iovisor:master.

@liutingjieni
Copy link
Contributor Author

liutingjieni commented Oct 21, 2022

I want the code to be reviewed

@viktormalik
Copy link
Contributor

I have changed my code with opaque pointers. The check for llvm15 has passed. The last two commit hashes from master have changed due to cherry-pick, but the code content has not changed.

Ah, right, sorry. I'll review when I have time.

Copy link

@BurntBrunch BurntBrunch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine for what it is but I have a meta question - why are we unrolling the loop?

For a char[32] vs char[32] call, this emits ~500 instructions.

> src/bpftrace -vv -e 't:mac80211:wake_queue {if (strstr(ar
gs->wiphy_name, args->wiphy_name)) {exit()}}'
INFO: node count: 13
Attaching 1 probe...

Program ID: 1422

The verifier log: 
processed 522 insns (limit 1000000) max_states_per_insn 0 total_states 32 peak_states 32 mark_read 24

Given that bpf has had backward branches for ages, why don't we just emit this as an actual loop with known bounds and let the verifier do the hard work? It would dramatically improve complexity while potentially also allowing probed strings to work (by emitting the loop against the probe size limit or even against the runtime string length, assuming we can satisfy the verifier).

src/ast/irbuilderbpf.cpp Outdated Show resolved Hide resolved
src/ast/irbuilderbpf.cpp Outdated Show resolved Hide resolved
src/ast/irbuilderbpf.cpp Outdated Show resolved Hide resolved
docs/reference_guide.md Outdated Show resolved Hide resolved
src/ast/irbuilderbpf.cpp Outdated Show resolved Hide resolved
tests/semantic_analyser.cpp Outdated Show resolved Hide resolved
man/adoc/bpftrace.adoc Outdated Show resolved Hide resolved
man/adoc/bpftrace.adoc Outdated Show resolved Hide resolved
@fbs
Copy link
Member

fbs commented Nov 2, 2022

This looks fine for what it is but I have a meta question - why are we unrolling the loop?

For a char[32] vs char[32] call, this emits ~500 instructions.

> src/bpftrace -vv -e 't:mac80211:wake_queue {if (strstr(ar
gs->wiphy_name, args->wiphy_name)) {exit()}}'
INFO: node count: 13
Attaching 1 probe...

Program ID: 1422

The verifier log: 
processed 522 insns (limit 1000000) max_states_per_insn 0 total_states 32 peak_states 32 mark_read 24

Given that bpf has had backward branches for ages, why don't we just emit this as an actual loop with known bounds and let the verifier do the hard work? It would dramatically improve complexity while potentially also allowing probed strings to work (by emitting the loop against the probe size limit or even against the runtime string length, assuming we can satisfy the verifier).

Agree but maybe we shouldn't block on it.

For a future improvement it would also be nice if we compare more bytes at once

@liutingjieni liutingjieni force-pushed the add_strstr branch 3 times, most recently from 4043278 to c8a5028 Compare November 6, 2022 18:13
@liutingjieni liutingjieni requested review from BurntBrunch and viktormalik and removed request for BurntBrunch and viktormalik November 6, 2022 18:14
Copy link
Contributor

@viktormalik viktormalik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also merge the two commits and cleanup the commit message?
Thanks!

src/ast/irbuilderbpf.cpp Show resolved Hide resolved
src/ast/irbuilderbpf.cpp Outdated Show resolved Hide resolved
man/adoc/bpftrace.adoc Outdated Show resolved Hide resolved
@liutingjieni liutingjieni force-pushed the add_strstr branch 2 times, most recently from 11de8cc to 1f29ecc Compare November 24, 2022 18:03
@liutingjieni liutingjieni requested review from viktormalik and fbs and removed request for viktormalik and fbs November 25, 2022 06:47
src/ast/irbuilderbpf.cpp Outdated Show resolved Hide resolved
src/ast/irbuilderbpf.cpp Outdated Show resolved Hide resolved
@liutingjieni liutingjieni requested review from viktormalik and removed request for fbs November 28, 2022 08:32
Copy link
Contributor

@viktormalik viktormalik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there has been a lot of work, I'd like to have one more review before merging this.

@viktormalik
Copy link
Contributor

cc'ing @danobi and @ajor explicitly since this was opened before we added the CODEOWNERS file.

@xh4n3
Copy link
Contributor

xh4n3 commented Dec 6, 2022

Since this is similar to #2387, I guess you may need to add the codegen tests and change the CHANELOG.md.
And maybe a little change to the empty lines, see #2387 (comment)

tests/runtime/call Outdated Show resolved Hide resolved
tests/runtime/call Show resolved Hide resolved
Copy link
Member

@fbs fbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good! Just a few suggestions on the testing side

tests/runtime/call Outdated Show resolved Hide resolved
This function compares whether the string val1 contains the string val2.
It returns true if val2 is contained by val1, false if not contained.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants