[improve][broker]Change the log level to reduce repeated error logs #23192

Merged on Aug 27, 2024 (2 commits)

@rayluoluo rayluoluo commented Aug 17, 2024

Fixes #23191

Motivation

If a pre-installed NIC is not activated, its driver is not supported, or a virtual NIC is used, the broker may fail to read the NIC speed. As a result, the Pulsar broker prints the same error log every minute, even though the system is not in a fault state in these cases. The repeated error logs also add performance overhead, consume log storage space, and make logs harder to read and analyze.

An example of the error log is as follows:

[pulsar-load-manager-1-1] ERROR o.a.p.b.loadbalance.impl.LinuxBrokerHostUsageImpl - Failed to read speed for nic enp180s0v0, maybe you can set broker config [loadBalancerOverrideBrokerNicSpeedGbps] to override it.
java.io.IOException: Invalid argument
	at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
	at sun.nio.ch.FileDispatcherImpl.read(FileDispatcherImpl.java:46)
	at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
	at sun.nio.ch.IOUtil.read(IOUtil.java:197)
	at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:159)
	at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:65)
	at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:109)
	at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
	at java.nio.file.Files.read(Files.java:3105)
	at java.nio.file.Files.readAllBytes(Files.java:3158)
	at org.apache.pulsar.broker.loadbalance.impl.LinuxBrokerHostUsageImpl.lambda$null$3(LinuxBrokerHostUsageImpl.java:255)
	at java.util.stream.ReferencePipeline$6$1.accept(ReferencePipeline.java:244)
	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.DoublePipeline.collect(DoublePipeline.java:500)
	at java.util.stream.DoublePipeline.sum(DoublePipeline.java:411)
	at org.apache.pulsar.broker.loadbalance.impl.LinuxBrokerHostUsageImpl.lambda$getTotalNicLimitKbps$4(LinuxBrokerHostUsageImpl.java:261)
	at java.util.Optional.orElseGet(Optional.java:267)
	at org.apache.pulsar.broker.loadbalance.impl.LinuxBrokerHostUsageImpl.getTotalNicLimitKbps(LinuxBrokerHostUsageImpl.java:252)
	at org.apache.pulsar.broker.loadbalance.impl.LinuxBrokerHostUsageImpl.calculateBrokerHostUsage(LinuxBrokerHostUsageImpl.java:104)
	at org.apache.pulsar.common.util.Runnables$CatchingAndLoggingRunnable.run(Runnables.java:54)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(Thread.java:750)
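
For readers unfamiliar with this code path, here is a minimal sketch of the pattern the stack trace suggests: the configured override is used when present, otherwise the per-NIC speed files are read and summed, and a failed read is treated as an expected condition. The class name, method names, and the assumption that speed files hold a value in Mbit/s are illustrative only; this is not the broker's actual implementation.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch, not the code in LinuxBrokerHostUsageImpl.
final class NicLimitSketch {

    /** Reads a speed file (assumed to hold Mbit/s); returns 0 when the speed is unknown. */
    static double readSpeedMbps(Path speedFile) {
        try {
            String text = new String(Files.readAllBytes(speedFile), StandardCharsets.UTF_8).trim();
            // Clamp negative values so an "unknown" marker such as -1 does not reduce the total.
            return Math.max(0d, Double.parseDouble(text));
        } catch (IOException | NumberFormatException e) {
            // Inactive or virtual NICs can fail here (e.g. "Invalid argument").
            // This is an expected condition, not a broker fault.
            return 0d;
        }
    }

    /** Uses the override (Gbit/s) when present, else sums the readable NIC speeds. */
    static double totalNicLimitKbps(Optional<Double> overrideGbps, List<Path> speedFiles) {
        return overrideGbps
                .map(gbps -> gbps * 1_000_000d)           // Gbit/s -> Kbit/s
                .orElseGet(() -> speedFiles.stream()
                        .mapToDouble(NicLimitSketch::readSpeedMbps)
                        .sum() * 1_000d);                 // Mbit/s -> Kbit/s
    }
}
```

With an override supplied, the speed files are never read, which is why setting `loadBalancerOverrideBrokerNicSpeedGbps` (as the log message suggests) avoids the failing read entirely.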

Modifications

The system is not actually in a faulty state in these cases, so the log level of this message is changed from error to debug.

Verifying this change

  • Make sure that the change passes the CI checks.

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Aug 17, 2024
@rayluoluo rayluoluo force-pushed the master branch 3 times, most recently from 9513ecb to 2fee54a Compare August 17, 2024 13:59

@lhotari lhotari left a comment

I think it's better to add a solution that logs it once on the first time at error level and after that at debug level.
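
One way to read this suggestion, as a minimal sketch only: remember which NICs have already been reported and pick the log level from that. The class and method names are hypothetical, the level is tracked per NIC name as an assumption, and a list of strings stands in for a real logger so the example has no dependencies. It is not the code merged in this PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper illustrating "error once, then debug".
final class NicSpeedFailureLog {
    // NICs whose read failure has already been reported at ERROR level.
    private final Set<String> alreadyReported = ConcurrentHashMap.newKeySet();
    // Recorded lines stand in for calls to a real logger.
    private final List<String> lines = new ArrayList<>();

    synchronized void onReadFailure(String nic, Exception cause) {
        // Set.add returns true only the first time a given NIC name is inserted.
        String level = alreadyReported.add(nic) ? "ERROR" : "DEBUG";
        lines.add(level + " Failed to read speed for nic " + nic + ": " + cause.getMessage());
    }

    synchronized List<String> lines() {
        return new ArrayList<>(lines);
    }
}
```

Operators still see the problem once at error level, with the hint about the override setting, while the per-minute repeats only appear when debug logging is enabled.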

@codecov-commenter

codecov-commenter commented Aug 23, 2024

Codecov Report

Attention: Patch coverage is 20.00000% with 4 lines in your changes missing coverage. Please review.

Project coverage is 74.49%. Comparing base (bbc6224) to head (8ed80db).
Report is 546 commits behind head on master.

Files                                                   Patch %   Lines
...ache/pulsar/broker/loadbalance/LinuxInfoUtils.java   20.00%    4 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##             master   #23192      +/-   ##
============================================
+ Coverage     73.57%   74.49%   +0.91%     
- Complexity    32624    33694    +1070     
============================================
  Files          1877     1922      +45     
  Lines        139502   144762    +5260     
  Branches      15299    15834     +535     
============================================
+ Hits         102638   107838    +5200     
+ Misses        28908    28647     -261     
- Partials       7956     8277     +321     
Flag        Coverage Δ
inttests    27.89% <20.00%> (+3.31%) ⬆️
systests    24.66% <20.00%> (+0.34%) ⬆️
unittests   73.85% <20.00%> (+1.01%) ⬆️

Flags with carried forward coverage won't be shown.

Files                                                   Coverage Δ
...ache/pulsar/broker/loadbalance/LinuxInfoUtils.java   53.33% <20.00%> (-3.02%) ⬇️

... and 537 files with indirect coverage changes

@rayluoluo

> I think it's better to add a solution that logs it once on the first time at error level and after that at debug level.

Changed as suggested.

@lhotari lhotari left a comment

LGTM. Good work @rayluoluo

@315157973 315157973 merged commit d9bd6b0 into apache:master Aug 27, 2024
51 checks passed
@lhotari lhotari added this to the 4.0.0 milestone Oct 14, 2024