-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lab 5 SONiC training. #1
base: 201911
Are you sure you want to change the base?
Conversation
m_pollTimer->setInterval(interv); | ||
//is it ok to stop without having it started? | ||
m_pollTimer->stop(); | ||
m_pollTimer->start(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can use restart instead of start and stop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you mean reset.
{ | ||
auto interv = timespec { .tv_sec = interval, .tv_nsec = 0 }; | ||
|
||
SWSS_LOG_INFO("startTimer, find executor %p\n", m_pollTimer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why to print pointer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deleted.
{ | ||
SWSS_LOG_ENTER(); | ||
|
||
try |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure you need try catch here as i can't see what failure could happen in this scenario that will be caught by catch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stop and start can throw exceptions.
} | ||
} | ||
|
||
int TxMonOrch::handlePeriodUpdate(const vector<FieldValueTuple>& data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not return bool here if you have only 2 possible values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes it easier to call this and other functions and use the same status variable.
} | ||
} | ||
catch (...) { | ||
SWSS_LOG_ERROR("Failed to handle period update\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we return in this scenario?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since there are no function calls that can fail after needStart is set, we can be sure that needStart is false here, so effectively we will return.
int rc = 0; | ||
|
||
SWSS_LOG_ENTER(); | ||
SWSS_LOG_INFO("TxMonOrch doTask consumer\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the swss_log_enter pretty much does the same print only it is in DEBUG mode and not INFO.
for your consideration
|
||
/*should be accessed via an atomic approach?*/ | ||
uint32_t m_pollPeriod; | ||
int m_poolPeriodChanged; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not used anywhere
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deleted.
if (rc != 0) | ||
SWSS_LOG_ERROR("TX_ERR_APPL: got port %s tx_err_stat failed %d\n", i.first.c_str(), rc); | ||
fields.emplace_back(TXMONORCH_FIELD_APPL_STATI, to_string(tesStatistics(i.second))); | ||
fields.emplace_back(TXMONORCH_FIELD_APPL_TIMESTAMP, "0"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had trouble understanding what this code does.
std::stringstream stream; | ||
stream << "oid:0x" << std::hex << tesPortId(stat); | ||
std::string keystr( stream.str() ); | ||
m_countersTable.get(keystr, fvs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is a simpler cleaner way of doing this:
m_countersTable.get(sai_serialize_object_id(tesPortId(stat), fvs);
not needed all previous calls and vars
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was looking for something like that. I will try it.
if (tx_error_state < TXMONORCH_PORT_STATE_MAX) | ||
fvs.emplace_back(TXMONORCH_FIELD_STATE_TX_STATE, tx_status_name[tx_error_state]); | ||
else | ||
fvs.emplace_back(TXMONORCH_FIELD_STATE_TX_STATE, "invalid"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you have definitions to all other states. why not add a define also for invalid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
What I did
Implemented a solution for Lab 5 in SONiC training based on one of the solutions provided in the Wiki, with some changes. These include using the counters DB instead of calls to SAI to retrieve the tx error counter, passing the statistics data by reference so that the difference between successive counter readings would be used correctly, and added python code to handle some not found conditions.
Why I did it
I used a prepared solution, studied it with gdb and modified it based on what I observed in gdb, because so much of the way code is written in C++ is unfamiliar that I expected writing from scratch would take an order of magnitude more than the time allotted, and I would get more benefit from studying and modifying existing code.
How I verified it
Configure polling period, e.g.
sudo config tx_error_stat_poll_period 30
Configure error threshold, e.g.
sudo config interface tx_error_threshold set Ethernet0 10
Disable automatic refresh of counters, so that counter values can be injected into DB and stay there:
counterpoll port disable
Lookup OID of port being tested, e.g.
redis-cli -n 2 hgetall "COUNTERS_PORT_NAME_MAP" | grep -A1 "Ethernet0"
Ethernet0
oid:0x10000000008d4
Inject tx errors to that port and verify that DB was updated:
redis-cli -n 2 hset "COUNTERS:oid:0x10000000008d4" "SAI_PORT_STAT_IF_OUT_ERRORS" "20"
(integer) 0
redis-cli -n 2 hget "COUNTERS:oid:0x10000000008d4" "SAI_PORT_STAT_IF_OUT_ERRORS"
"20"
Show status within the poll period:
show interfaces tx_error
Port status statistics
Ethernet0 error 20
Show status after the poll period:
show interfaces tx_error
Port status statistics
Ethernet0 ok 20
Repeat sequences of the above commands to verify that status becomes error when the delta exceeds the threshold, and ok when it doesn't.
Details if related