Replaced ioutil.ReadAll with bufio.Scanner #1976
Conversation
Signed-off-by: chinhnc <chicknsoupuds@gmail.com>
@@ -60,13 +60,6 @@ func Test_parseTCPStatsError(t *testing.T) {

func TestTCPStat(t *testing.T) {

	noFile, _ := os.Open("follow the white rabbit")
This test should probably not be removed. We want to make sure we catch the error.
Because these lines are replaced:

return nil, err
}

that test is no longer valid.
The test is not obsolete, you need to check the scanner for errors.
OK, I forgot to check scanner.Err().
This was explicitly changed away from bufio.Scanner in #1380 in order to make sure we read the file in one syscall. Using a scanner leads to poor interaction with the kernel, which allows the contents of the file to change while you read it. If you have hundreds of thousands of entries in this file, that's going to be many megabytes of output. Is it fast enough to be practical to read this much data? Do you have any benchmarks for this?
Actually, the file was around 60MB (due to an application's connection leak), and the server was under heavy load. However, it is back to normal now, so I cannot run any benchmarks. I will try to do so when I see any of our servers in the same situation.
It sounds like this was working as intended. If the collector started to fail, it should report this. In situations where the node is failing, we don't want the exporter to contribute to the failure. We're intentionally short-circuiting the read in order to avoid situations exactly like the one you're describing.
It looks like this hasn't been converted to the new functions available in the procfs library. We should probably do that. The procfs library uses an
In my situation, I will try this when I have a server under high load.
This is still working as intended for Prometheus. We can't predict all failures, so by design we prefer to fail hard and be as noisy as possible.
I see, so I'm closing this PR.
When /proc/net/tcp and /proc/net/tcp6 contain hundreds of thousands of lines, Prometheus will not be able to scrape node_exporter with the tcpstat collector enabled (timeout exceeded). Using bufio.Scanner resolves the issue.
Signed-off-by: chinhnc chicknsoupuds@gmail.com