Use strings.Cut where possible #4470

Open
wants to merge 9 commits into
base: main
from

Conversation

kolyshkin
Contributor

@kolyshkin kolyshkin commented Oct 23, 2024

For the most part, this is a switch from strings.Split[N] to strings.Cut.
Using strings.Cut (added in Go 1.18, see golang/go#46336) results in faster and
cleaner code with fewer allocations (as we're not using a slice).
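
As a rough illustration of the pattern being replaced, here is a minimal sketch. The helper names `splitKV` and `cutKV` and the `key=value` input are made up for this example and are not taken from the runc code base.

```go
package main

import (
	"fmt"
	"strings"
)

// splitKV is the older pattern (hypothetical helper): SplitN allocates a
// slice to hold the parts, and the caller has to check its length.
func splitKV(s string) (key, val string, ok bool) {
	parts := strings.SplitN(s, "=", 2)
	if len(parts) != 2 {
		return "", "", false
	}
	return parts[0], parts[1], true
}

// cutKV is the same idea with strings.Cut: no slice, no length check.
// Note one difference: when the separator is missing, Cut returns the
// whole input as the first result instead of an empty string.
func cutKV(s string) (key, val string, ok bool) {
	return strings.Cut(s, "=")
}

func main() {
	fmt.Println(splitKV("a=b=c"))
	fmt.Println(cutKV("a=b=c"))
}
```

Both helpers split only at the first separator, so the value may itself contain the separator.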

There are a few other cleanups and nits here and there.
Please see individual commits for details.

@kolyshkin kolyshkin added the kind/refactor refactoring label Oct 23, 2024
@kolyshkin kolyshkin requested review from dqminh and rata and removed request for dqminh November 2, 2024 00:37
@rata
Member

rata commented Nov 5, 2024

@kolyshkin I guess this is for performance? Did you run some numbers, or did we already establish that this is faster and are now just using strings.Cut in more places? IMHO, it would be nice to state the reason in the commit messages.

If something breaks due to this, it would be nice to know that the commit just changed to this due to performance, so a revert is probably safe until a new fixed version is cooked. But that is not that easy to know if we don't explain why we change it in the commit.

@kolyshkin
Contributor Author

kolyshkin commented Nov 6, 2024

> @kolyshkin I guess this is for performance? Did you run some numbers, or did we already establish that this is faster and are now just using strings.Cut in more places? IMHO, it would be nice to state the reason in the commit messages.
>
> If something breaks due to this, it would be nice to know that the commit just changed to this due to performance, so a revert is probably safe until a new fixed version is cooked. But that is not that easy to know if we don't explain why we change it in the commit.

Thanks, this makes sense. Individual commit messages as well as this PR description updated; PTAL.

@kolyshkin kolyshkin force-pushed the strings-cut branch 2 times, most recently from 819df22 to 62a4183 Compare December 5, 2024 00:04
@kolyshkin
Contributor Author

@opencontainers/runc-maintainers PTAL. This is a minor cleanup, aiming for more readable and faster code with fewer allocations.

@kolyshkin kolyshkin added this to the 1.3.0 milestone Dec 5, 2024
return nil, fmt.Errorf("invalid --cgroup argument: %s", c)
}
if len(cs) == 1 { // no controller: prefix
if ctr, path, ok := strings.Cut(c, ":"); ok {
Member

@lifubang lifubang Dec 20, 2024

I haven't checked whether this will lead to regressions. For example, if c == "a:b:c:d:e":
When using SplitN, cs[1] will be "b";
When using Cut, path will be "b:c:d:e".

Contributor Author

So, the old code would error out when c == "a:b:c:d:e".

The new code would not, assigning b:c:d:e to path.

To me, the new code is (ever so slightly) more correct, since : is a valid symbol which should be allowed in path.
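
The difference discussed in this thread can be sketched as follows. The helper names `splitOld` and `cutNew` are hypothetical, and the limit of 3 passed to SplitN is an assumption made for the illustration; the actual limit used by the old code is not shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// splitOld mimics the old approach (hypothetical helper).
// Assumption: the old code asked SplitN for at most 3 parts.
func splitOld(c string) []string {
	return strings.SplitN(c, ":", 3)
}

// cutNew mimics the new approach: split at the first ":" only, so any
// further colons stay in the path.
func cutNew(c string) (ctr, path string, ok bool) {
	return strings.Cut(c, ":")
}

func main() {
	c := "a:b:c:d:e"

	cs := splitOld(c)
	fmt.Printf("SplitN: %q (len %d)\n", cs, len(cs))

	ctr, path, ok := cutNew(c)
	fmt.Printf("Cut: ctr=%q path=%q ok=%v\n", ctr, path, ok)
}
```

With SplitN the extra colons produce a third element, which a strict length check would reject; with Cut they simply remain part of the path.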

Member

Yup, was looking at that as well; agreed, looks like new code is more correct here.

I still need to have a quick look at the splitting on commas, and the if len(args) != 1 { check below. Probably will check out this branch locally

Using strings.Cut (added in Go 1.18, see [1]) results in faster and
cleaner code with fewer allocations (as we're not using a slice).

This part of code is covered by tests in tests/integration/exec.bats.

[1]: golang/go#46336

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Nowadays strings.Fields is as fast as strings.SplitN, so remove the TODO.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Using strings.Cut (added in Go 1.18, see [1]) results in faster and
cleaner code with fewer allocations (as we're not using a slice). This
also drops the check for extra dash (we're unlikely to get it from the
kernel anyway).

While at it, rename min/max -> from/to to avoid collision with Go
min/max builtins.

This code is tested by TestCPUSetStats* tests.

[1]: golang/go#46336

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Remove extra global constants that are only used in a single place and
make it harder to read the code.

Rename nanosecondsInSecond -> nsInSec.

This code is tested by unit tests.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
For cgroup v2, we always expect /proc/$PID/cgroup contents like this:

> 0::/user.slice/user-1000.slice/user@1000.service/app.slice/vte-spawn-f71c3fb8-519d-4e2d-b13e-9252594b1e05.scope

So, it does not make sense to parse it using strings.Split, we can just
cut the prefix and return the rest.
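
A minimal sketch of that idea, assuming Go 1.20 or later for strings.CutPrefix. The function name `parseCgroupV2` and the error text are hypothetical and only loosely modeled on the diff in this PR.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// parseCgroupV2 (hypothetical) scans /proc/$PID/cgroup-style contents and
// returns the path from the first line that starts with "0::".
func parseCgroupV2(contents string) (string, error) {
	s := bufio.NewScanner(strings.NewReader(contents))
	for s.Scan() {
		// "0::/user.slice/user-1001.slice/session-1.scope"
		if path, ok := strings.CutPrefix(s.Text(), "0::"); ok {
			return path, nil
		}
	}
	if err := s.Err(); err != nil {
		return "", err
	}
	return "", errors.New("cgroup path not found")
}

func main() {
	path, err := parseCgroupV2("0::/user.slice/user-1000.slice/session-1.scope\n")
	fmt.Println(path, err)
}
```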

Code tested by TestParseCgroupFromReader.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Using strings.Cut (added in Go 1.18, see [1]) results in faster and
cleaner code with fewer allocations (as we're not using a slice).

The code is tested by testCgroupResourcesUnified.

[1]: golang/go#46336

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Using strings.Cut (added in Go 1.18, see [1]) results in faster and
cleaner code with fewer allocations (as we're not using a slice).

This code is tested by TestStatCPUPSI.

[1]: golang/go#46336

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Using strings.Cut (added in Go 1.18, see [1]) results in faster and
cleaner code with fewer allocations (as we're not using a slice).

Also, use switch in parseRdmaKV.

[1]: golang/go#46336

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Use strings.Cut in ParseKeyValue and GetValueByKey.

Using strings.Cut (added in Go 1.18, see [1]) results in faster and
cleaner code with fewer allocations (as we're not using a slice).

[1]: golang/go#46336

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Comment on lines +151 to +152
usageKernelMode = append(usageKernelMode, kernel)

Member

Accidental empty line added here

Comment on lines +136 to 138
fields := strings.SplitN(scanner.Text(), " ", 3)
if len(fields) != 3 {
continue
Member

I was considering if we had to check for more than 3 fields to happen here, but I guess that would error already once we try to strconv.ParseUint() below, so probably not adding anything.

OTOH, I wonder if there could ever be more columns added here? Old code would do a SplitN(.., 4), and would skip if there would be less than 3 or more than 3 columns. The new code would silently skip less than 3 columns, but now errors on more than 3 columns.

Not sure if that behavior was correct, or if it should've processed the first 3 columns (and only ignored the 4th+ columns)? Is there any chance there would ever be more columns added?
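
To make the question concrete, here is a small sketch of how a SplitN limit of 3 behaves on lines with fewer or more columns. The function `parseThree` and its return values are hypothetical; only the SplitN call and the length check mirror the lines under review.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseThree (hypothetical) parses a "<cpu> <user> <kernel>" line.
// Lines with fewer than 3 columns are skipped; with more than 3 columns
// the third field keeps the remainder and fails to parse as a number.
func parseThree(line string) (user, kernel uint64, skipped bool, err error) {
	fields := strings.SplitN(line, " ", 3)
	if len(fields) != 3 {
		return 0, 0, true, nil
	}
	if user, err = strconv.ParseUint(fields[1], 10, 64); err != nil {
		return 0, 0, false, err
	}
	if kernel, err = strconv.ParseUint(fields[2], 10, 64); err != nil {
		return 0, 0, false, err
	}
	return user, kernel, false, nil
}

func main() {
	for _, line := range []string{"0 10 20", "0 10", "0 10 20 30"} {
		user, kernel, skipped, err := parseThree(line)
		fmt.Printf("%q: user=%d kernel=%d skipped=%v err=%v\n", line, user, kernel, skipped, err)
	}
}
```

If extra trailing columns should be tolerated instead, splitting with strings.Fields and reading only the first three entries would be one option.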

Comment on lines -90 to +87
if len(parts) < 3 {
return "", fmt.Errorf("invalid cgroup entry: %q", text)
}
// text is like "0::/user.slice/user-1001.slice/session-1.scope"
if parts[0] == "0" && parts[1] == "" {
return parts[2], nil
// "0::/user.slice/user-1001.slice/session-1.scope"
if path, ok := strings.CutPrefix(s.Text(), "0::"); ok {
Member

Are we missing an error condition for lines that are invalid? Do we need to handle, say 0:::::::/hello.slice/world.scope, or is it fine to let those go through?
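
For reference, a sketch of what a plain prefix cut lets through, next to one possible stricter variant. Both function names are hypothetical, and the rule that a valid path must start with "/" is an assumption made for this example, not something the PR states.

```go
package main

import (
	"fmt"
	"strings"
)

// lenientPath (hypothetical) only strips the "0::" prefix, so any extra
// colons end up at the start of the returned path.
func lenientPath(line string) (string, bool) {
	return strings.CutPrefix(line, "0::")
}

// strictPath (hypothetical) additionally rejects entries whose remainder
// does not start with "/". That rule is an assumption for illustration.
func strictPath(line string) (string, error) {
	path, ok := strings.CutPrefix(line, "0::")
	if !ok {
		return "", fmt.Errorf("not a cgroup v2 entry: %q", line)
	}
	if !strings.HasPrefix(path, "/") {
		return "", fmt.Errorf("invalid cgroup entry: %q", line)
	}
	return path, nil
}

func main() {
	line := "0:::/hello.slice/world.scope"
	p, ok := lenientPath(line)
	fmt.Printf("lenient: %q %v\n", p, ok)
	_, err := strictPath(line)
	fmt.Println("strict:", err)
}
```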

case "avg10":
pv = &data.Avg10
case "avg60":
pv = &data.Avg60
case "avg300":
pv = &data.Avg300
case "total":
v, err := strconv.ParseUint(kv[1], 10, 64)
v, err := strconv.ParseUint(val, 10, 64)
Member

Not for this PR, but I started to look in some other codebases to dismantle these errors. The default error returned is something like;

invalid total PSI value: strconv.ParseUint: parsing "some invalid value": invalid syntax

Which tends to be a bit "implementation detail", and strconv.ParseUint is not very relevant to the user. So in some cases, dismantling the error can be useful; https://cs.opensource.google/go/go/+/refs/tags/go1.23.4:src/strconv/atoi.go;l=20-49

https://go.dev/play/p/E9Dx1YBzPiN

var numErr *strconv.NumError
if errors.As(err, &numErr) {
	err = fmt.Errorf("invalid %s PSI value (%s): %w", key, numErr.Num, numErr.Err)
}

Which would be something like;

invalid total PSI value (some invalid value): invalid syntax
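
A self-contained version of the idea above might look like this. The function name `parsePSITotal` and its signature are hypothetical; the %w verb is used with fmt.Errorf because it is only supported there.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// parsePSITotal (hypothetical) parses val and, on failure, rewraps the
// *strconv.NumError so the message does not mention strconv.ParseUint.
func parsePSITotal(key, val string) (uint64, error) {
	v, err := strconv.ParseUint(val, 10, 64)
	if err != nil {
		var numErr *strconv.NumError
		if errors.As(err, &numErr) {
			// Keep the underlying cause (for example strconv.ErrSyntax)
			// in the chain so callers can still use errors.Is.
			return 0, fmt.Errorf("invalid %s PSI value (%s): %w", key, numErr.Num, numErr.Err)
		}
		return 0, fmt.Errorf("invalid %s PSI value: %w", key, err)
	}
	return v, nil
}

func main() {
	_, err := parsePSITotal("total", "some invalid value")
	fmt.Println(err)
}
```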


if len(parts) != 2 {
if !ok {
return errors.New("Unable to parse RDMA entry")
Member

Some of these could be improved as well; in this case we know it's missing a =. Wondering if we must also check for empty values after; looks like the if/else below would fall through to trying to parse empty values as integer, which may produce an obscure error (strconv.ParseUint: parsing "": invalid syntax).

We could check here if it's empty and return that no value is specified;

if !ok || v == "" {
	return errors.New("Unable to parse RDMA entry: no value specified")
}
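
Put together, such a check could look roughly like the sketch below. The function `parseRdmaValue`, its signature, the error wording, and the handling of "max" are assumptions for illustration and may not match the real parseRdmaKV.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// parseRdmaValue (hypothetical) parses a "key=value" RDMA entry and
// reports a missing "=" and an empty value as two distinct errors.
func parseRdmaValue(raw string) (key string, val uint32, err error) {
	k, v, ok := strings.Cut(raw, "=")
	if !ok {
		return "", 0, fmt.Errorf("unable to parse RDMA entry %q: missing '='", raw)
	}
	if v == "" {
		return "", 0, fmt.Errorf("unable to parse RDMA entry %q: no value specified", raw)
	}
	// Assumption: "max" stands for the largest uint32 value.
	if v == "max" {
		return k, math.MaxUint32, nil
	}
	n, err := strconv.ParseUint(v, 10, 32)
	if err != nil {
		return "", 0, err
	}
	return k, uint32(n), nil
}

func main() {
	for _, raw := range []string{"hca_handle=2", "hca_handle=", "hca_handle"} {
		k, v, err := parseRdmaValue(raw)
		fmt.Printf("%q: key=%q val=%d err=%v\n", raw, k, v, err)
	}
}
```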

parts := strings.SplitN(t, " ", 3)
if len(parts) != 2 {
key, val, ok := strings.Cut(t, " ")
if !ok {
Member

Not necessarily for this PR, but we could check for empty values here as well to return a nicer error than "strconv.ParseUint failed to parse ...";

if !ok || key == "" || val == "" {

Comment on lines -86 to +87
arr := strings.Split(line, " ")
if len(arr) == 2 && arr[0] == key {
val, err := ParseUint(arr[1], 10, 64)
k, v, ok := strings.Cut(line, " ")
if ok && k == key {
Member

Not sure if relevant (and probably new code is more correct?); previously, we would ignore cases with more than 2 fields. I.e., key foo bar would be ignored; in the new code, we'd try to parse foo bar as value.

That said; see my other comment; this would potentially be an issue if we try to parse something that may currently have 2 cols, and in future 3 cols; not sure if that should error or silently ignore
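
The behavior change described here can be reproduced with a short sketch. `matchOld` and `matchNew` are hypothetical helpers that isolate the matching logic from the diff above and leave out the numeric parsing.

```go
package main

import (
	"fmt"
	"strings"
)

// matchOld (hypothetical) mirrors the old check: exactly two
// space-separated fields, the first of which is the key.
func matchOld(line, key string) (string, bool) {
	arr := strings.Split(line, " ")
	if len(arr) == 2 && arr[0] == key {
		return arr[1], true
	}
	return "", false
}

// matchNew (hypothetical) mirrors the new check: everything after the
// first space is treated as the value.
func matchNew(line, key string) (string, bool) {
	k, v, ok := strings.Cut(line, " ")
	if ok && k == key {
		return v, true
	}
	return "", false
}

func main() {
	for _, line := range []string{"key 42", "key foo bar"} {
		oldVal, oldOK := matchOld(line, "key")
		newVal, newOK := matchNew(line, "key")
		fmt.Printf("%q: old=(%q, %v) new=(%q, %v)\n", line, oldVal, oldOK, newVal, newOK)
	}
}
```

For a line with three columns, the old check ignores the line, while the new one hands "foo bar" to the number parser, which would then fail.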

Comment on lines +133 to 137
// No controller: prefix.
if len(args) != 1 {
return nil, fmt.Errorf("invalid --cgroup argument: %s (missing <controller>: prefix)", c)
}
paths[""] = c
Member

Was trying to somewhat grasp what this case is for;

  • if zero args are passed, we return early (start of function)
  • if 1 arg is passed, it MUST have a controller prefix
  • But if 2 (or more) args are passed, then we don't have to?

So;

# valid: single arg
--cgroup a,b:/hello-world

# valid: single arg without controller
--cgroup /hello-world

# invalid: multiple args, one without controller
--cgroup a,b:/hello-world --cgroup /hello-world
--cgroup /hello-world --cgroup a,b:/hello-world 

So, if no controller is passed, it's for everything which is only allowed if no other (per-controller) paths are specified, correct? (so not allowed to be combined).

☝️ Perhaps that's something we should capture in a comment on the function

Also;

  • perhaps we need to change the ok check to also check if controller is not empty, or is ,,,,,:/hello-world something that should be considered valid?
  • are duplicate controllers valid? They currently overwrite previous values (foo,foo,foo:/bar, foo:/baz)
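
The rules as understood above could be captured in a sketch like this one. The function `parseCgroupArgs` is a hypothetical reconstruction from the diff fragments in this review, not the actual runc function.

```go
package main

import (
	"fmt"
	"strings"
)

// parseCgroupArgs (hypothetical) maps controller names to paths.
// An argument without a "<controller>:" prefix applies to all controllers
// and is only accepted when it is the sole argument.
func parseCgroupArgs(args []string) (map[string]string, error) {
	if len(args) == 0 {
		return nil, nil
	}
	paths := make(map[string]string, len(args))
	for _, c := range args {
		if ctrs, path, ok := strings.Cut(c, ":"); ok {
			// "a,b:/path" assigns the same path to every listed controller.
			// Empty and duplicate names are not rejected in this sketch.
			for _, ctr := range strings.Split(ctrs, ",") {
				paths[ctr] = path
			}
			continue
		}
		// No controller: prefix.
		if len(args) != 1 {
			return nil, fmt.Errorf("invalid --cgroup argument: %s (missing <controller>: prefix)", c)
		}
		paths[""] = c
	}
	return paths, nil
}

func main() {
	fmt.Println(parseCgroupArgs([]string{"a,b:/hello-world"}))
	fmt.Println(parseCgroupArgs([]string{"/hello-world"}))
	fmt.Println(parseCgroupArgs([]string{"a,b:/hello-world", "/hello-world"}))
}
```

In this sketch, ",,,,,:/hello-world" is accepted and stored under the empty controller name, and a repeated controller silently overwrites the earlier path, which are the two open questions raised above.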

4 participants