Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lab 5 SONiC training. #1

Open
wants to merge 1 commit into
base: 201911
Choose a base branch
from
Open

Lab 5 SONiC training. #1

wants to merge 1 commit into from

Conversation

raphaelt-nvidia
Copy link
Owner

What I did

Implemented a solution for Lab 5 in SONiC training based on one of the solutions provided in the Wiki, with some changes. These include using the counters DB instead of calls to SAI to retrieve the tx error counter, passing the statistics data by reference so that the difference between successive counter readings would be used correctly, and added python code to handle some not found conditions.

Why I did it

I used a prepared solution, studied it with gdb and modified it based on what I observed in gdb, because so much of the way code is written in C++ is unfamiliar that I expected writing from scratch would take an order of magnitude more than the time allotted, and I would get more benefit from studying and modifying existing code.

How I verified it

Configure polling period, e.g.

sudo config tx_error_stat_poll_period 30

Configure error threshold, e.g.

sudo config interface tx_error_threshold set Ethernet0 10

Disable automatic refresh of counters, so that counter values can be injected into DB and stay there:

counterpoll port disable

Lookup OID of port being tested, e.g.

redis-cli -n 2 hgetall "COUNTERS_PORT_NAME_MAP" | grep -A1 "Ethernet0"
Ethernet0
oid:0x10000000008d4

Inject tx errors to that port and verify that DB was updated:

redis-cli -n 2 hset "COUNTERS:oid:0x10000000008d4" "SAI_PORT_STAT_IF_OUT_ERRORS" "20"
(integer) 0
redis-cli -n 2 hget "COUNTERS:oid:0x10000000008d4" "SAI_PORT_STAT_IF_OUT_ERRORS"
"20"

Show status within the poll period:

show interfaces tx_error
Port status statistics

Ethernet0 error 20

Show status after the poll period:

show interfaces tx_error
Port status statistics

Ethernet0 ok 20

Repeat sequences of the above commands to verify that status becomes error when the delta exceeds the threshold, and ok when it doesn't.

Details if related

m_pollTimer->setInterval(interv);
//is it ok to stop without having it started?
m_pollTimer->stop();
m_pollTimer->start();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can use restart instead of start and stop

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you mean reset.

{
auto interv = timespec { .tv_sec = interval, .tv_nsec = 0 };

SWSS_LOG_INFO("startTimer, find executor %p\n", m_pollTimer);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why to print pointer?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleted.

{
SWSS_LOG_ENTER();

try

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure you need try catch here as i can't see what failure could happen in this scenario that will be caught by catch

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stop and start can throw exceptions.

}
}

int TxMonOrch::handlePeriodUpdate(const vector<FieldValueTuple>& data)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not return bool here if you have only 2 possible values?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes it easier to call this and other functions and use the same status variable.

}
}
catch (...) {
SWSS_LOG_ERROR("Failed to handle period update\n");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we return in this scenario?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there are no function calls that can fail after needStart is set, we can be sure that needStart is false here, so effectively we will return.

int rc = 0;

SWSS_LOG_ENTER();
SWSS_LOG_INFO("TxMonOrch doTask consumer\n");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the swss_log_enter pretty much does the same print only it is in DEBUG mode and not INFO.
for your consideration


/*should be accessed via an atomic approach?*/
uint32_t m_pollPeriod;
int m_poolPeriodChanged;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not used anywhere

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleted.

if (rc != 0)
SWSS_LOG_ERROR("TX_ERR_APPL: got port %s tx_err_stat failed %d\n", i.first.c_str(), rc);
fields.emplace_back(TXMONORCH_FIELD_APPL_STATI, to_string(tesStatistics(i.second)));
fields.emplace_back(TXMONORCH_FIELD_APPL_TIMESTAMP, "0");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not needed

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had trouble understanding what this code does.

std::stringstream stream;
stream << "oid:0x" << std::hex << tesPortId(stat);
std::string keystr( stream.str() );
m_countersTable.get(keystr, fvs);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a simpler cleaner way of doing this:
m_countersTable.get(sai_serialize_object_id(tesPortId(stat), fvs);

not needed all previous calls and vars

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was looking for something like that. I will try it.

if (tx_error_state < TXMONORCH_PORT_STATE_MAX)
fvs.emplace_back(TXMONORCH_FIELD_STATE_TX_STATE, tx_status_name[tx_error_state]);
else
fvs.emplace_back(TXMONORCH_FIELD_STATE_TX_STATE, "invalid");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you have definitions to all other states. why not add a define also for invalid?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants