-
Notifications
You must be signed in to change notification settings - Fork 9.1k
HDFS-16700. RBF: Record the real client IP carried by the Router in the NameNode log #4659
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: trunk
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Single line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Single line.
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we test with HADOOP_CALLER_CONTEXT_ENABLED_KEY set to true and false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @goiri for the comment and review.
I will update later.
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic is same as NameNode#getClientMachine()
, can you move it to the common module and can used by here and NameNode
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the heads up, I'll extract it separately and make it more generic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HDFS-13248 use isProxyUsers
to control whether obtain client's ip and port from CallerContext
. But in this PR, you plan ignore isProxyUsers
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there is a conflict with HDFS-13248 here.
For Router, it will always pass Clientip and port to CallerContext, but NameNode may not always use it, which is not a problem with isProxyUsers.
Logging the Clientip in the log file would look clearer.
210d1f4
to
cdbc91c
Compare
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
Here are some logs from the online cluster. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to refactor this into a general function and call it with the two configs?
The other interesting part would be to log the output too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @goiri for the comment and review.
If you want HADOOP_CALLER_CONTEXT_ENABLED_KEY to reset and take effect, MiniRouterDFSCluster needs to be restarted, so I refactored the process of starting MiniRouterDFSCluster. And use a static Configuration, which is for the NameNode.
You mean keep globalSetUp() unchanged and add static Configuration?
Another question is, do you think the new format of logging needs to be updated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small javadoc with an example
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add some javadoc later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is duplicate with NameNode#parseSpecialValue
, can you remove the method in NameNode.class
and use this one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with this suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CLIENT_ID_STR and CLIENT_CALL_ID_STR are involved here, so I keep them for now.
It might be more appropriate to create a new jira to solve this problem?
What do you think, @ZanderXu .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I didn't get you idea. Just public parseSpecialValue
method in CallerContext.java
and remove parseSpecialValue
method in NameNode.class
, and modify the caller place in NameNode.java
to use CallerContext.parseSpecialValue()
.
💔 -1 overall
This message was automatically generated. |
* If not, return null. | ||
*/ | ||
public static String getRealClientIp(String context) { | ||
if (context != null && !context.equals("")) { |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The string here is obtained from outside.
* If not, return null. | ||
*/ | ||
public static String getRealClientPort(String context) { | ||
if (context != null && !context.equals("")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
!context.equals("") -> context.isContextValid()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The string here is obtained from outside.
|
||
@Override | ||
public String toString() { | ||
boolean isCallerContextEnabled = conf.getBoolean( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is a good idea to get the value from conf every time. And HADOOP_CALLER_CONTEXT_ENABLED_KEY
is used to control whether output callerContext into audit log. We plan use this key here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, the essence is to obtain the real client ip, and it is not a problem to use HADOOP_CALLER_CONTEXT_ENABLED_KEY.
I have updated some, can you give some new suggestions, @goiri . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jianghuazhu .
Description of PR
When applying RBF, the ip recorded in the log file saved in the NameNode is still the Router. The real client ip should be logged.
Details: HDFS-16700
How was this patch tested?
When hadoop.caller.context.enabled=true, you should see the client ip recorded in the NameNode log file.