-
Notifications
You must be signed in to change notification settings - Fork 278
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[meta] Potential fix for issue #801 #802
Conversation
Signed-off-by: Rajkumar Pennadam Ramamoorthy <rpennadamram@marvell.com>
@@ -1268,7 +1268,8 @@ std::string sai_serialize_acl_resource_list( | |||
return j.dump(); | |||
} | |||
|
|||
json arr = json::array(); | |||
json::array_t a; | |||
json arr(a); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you explain why this fixes the existing issue? also if you grep this file by json::array() its used in 7 places, so it would require to be fixed in all of them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Line, "json arr = json::array();" calls constructor, "basic_json(init, false, value_t::array);" in file json.hpp:1615, whereas, "json::array_t a; json arr(a);" invokes different constructor "basic_json(const array_t& val)".
Here, the issue appears to be stack corruption, where passed parameter seems to have a value different from the one passed by the calling method. We see this issue only in this flow on API "sai_serialize_acl_resource_list".
When modified the constructor call to pass only 2 arg and assign 3 arg from default value as below, issue is not seen.
"basic_json(init, false);" in file json.hpp:1615
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if this is actual root cause, since the same exact line is used in different serialization methods in this file as well, and if they don't cause same error, then this could potentially by compiler bug or some memory alignment, for example same constructor us used in sai_serialize_json_fdb_event_notification_data and sai_serialize_fdb_event_ntf or sai_serialize_port_oper_status_ntf. So if you don't hit this error anywhere else, then it ether means that you don't receive those notifications in your case (if error is persistent), or the error is related to some compiler/optimization. Also, this error is not triggered on x86/64 platform, only on armhf, which would suggest armhf platform is flawed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will you be able to see if you will hit this error without -O2 enabled ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, do you have idea which actual line in json.hpp is causing this error? maybe it's better to fix this in json.hpp rather than here, if many different components may use json.hpp and this array constructor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
boost was recently added as dependency as am' aware, and sairedis library is not currently using any boost bindings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that was in azure pipeline build docker. Wondering if @rajkumar38 has mismatched binaries by any chance???
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that was in azure pipeline build docker. Wondering if @rajkumar38 has mismatched binaries by any chance???
I believe all packages are installed in build docker using apt-get command and I don't see any package mismatch (verified with dpkg -l in build docker env)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will you be able to see if you will hit this error without -O2 enabled ?
With changes from PR 6532, I recompiled libswsscommon, libsairedis and libsaimetadata debs with O2 flag disabled. With this, I'm not seeing the crash.
Issue is seen when PR 6532 and O2 flag enabled together.
I don't think any compiler settings changed recently and is not clear how PR 6532 is causing the regression.
do we need this for 202012 release? |
Yes this is needing backport to 202012 -because I believe the boost upgrade PR ( PR6532 ) was backported to 202012 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, do you have idea which actual line in json.hpp is causing this error? maybe it's better to fix this in json.hpp rather than here, if many different components may use json.hpp and this array constructor
We can have below changes in json.hpp. Let me know if we can have this fix, I will raise PR accordingly.
|
yes, seems good to me, please rise pr for swss-common |
Thanks Kamil. |
potential fixes: as analysis progresses on #801, probably cross compiler issue, if those 2 above will be successful, this one could be closed |
Closing this PR as analysis and progress tracked in #801. |
Signed-off-by: Rajkumar Pennadam Ramamoorthy rpennadamram@marvell.com
Issue:
#801