Clarify a Node's System API (re)discovery procedure #6

garethsb · 2020-02-11T15:22:09Z

(Copied from discussion elsewhere)

I would like IS-09 to clarify the relationship between the Node's discovery procedure for a System API and the discovery procedure for a Registration API defined by IS-04.

IS-04 is clear about how connection failures, errors and timeouts should be handled, including retry, exponential back-off, etc. This process doesn't prevent the Node starting up its other functions, Node API, senders and receivers, etc.
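For reference, the retry-with-exponential-back-off pattern mentioned above can be sketched as follows. This is only an illustration: the initial delay, multiplier, cap and the helper names `backoff_delays` and `retry_with_backoff` are assumptions made for the example, not values or names taken from IS-04.

```python
import time


def backoff_delays(initial=1.0, factor=2.0, maximum=30.0):
    """Yield retry delays in seconds, growing by `factor` up to `maximum`.

    All three defaults are placeholders, not figures from any specification.
    """
    delay = initial
    while True:
        yield min(delay, maximum)
        delay = min(delay * factor, maximum)


def retry_with_backoff(attempt, max_attempts, sleep=time.sleep):
    """Call `attempt()` until it succeeds or `max_attempts` calls have failed.

    `attempt` is any callable that raises OSError (which includes
    ConnectionError and TimeoutError) when the request fails.
    """
    for n, delay in enumerate(backoff_delays(), start=1):
        try:
            return attempt()
        except OSError:
            if n >= max_attempts:
                raise
            sleep(delay)  # wait before the next try
```

Because `sleep` is injectable, the retry loop can run on its own thread or task, so it does not have to block the Node from starting its other functions.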

I don't seem to be able to get the same level of detail from TR-1001-1 or IS-09 on interaction with the System API. In particular, it is unclear whether System API discovery may be performed simultaneously with Registration API discovery, and whether System API re-discovery should ever be performed, e.g. periodically, or perhaps when failures have been encountered with all Registration APIs, which might suggest that is04.heartbeat_interval has been changed in the system.

--

This may be partially addressed by PR #3.

andrewbonney · 2020-02-12T15:19:26Z

Perhaps the first thing to clarify is whether the system resource is intended purely for device startup, or whether it is for maintaining correct configuration over longer periods. The former is notionally simpler, but does present at least a couple of issues:

If a Media Node comes online before the system resource is available (after a major outage), it will blindly assume its previous config (although generally this is probably the correct behaviour in this condition).
If the system resource config is changed for any reason, only new or rebooted devices will obtain this configuration, with the others requiring manual intervention.

The latter definition is certainly more flexible, but likely requires a lot more to be defined as a result. The PR mentioned (or at least the TTL aspect) is likely only relevant in this case.

Perhaps v1.0 could be limited to the former, with room to expand into the latter behaviour in a v1.1 at a later date.

garethsb · 2020-02-12T16:53:43Z

Confirming which of those approaches is expected would be a great start. Thanks, Andrew.

Even in the former case - which is all that is checked by the JT-NM Tested criteria right now, I believe - I think there are still details to nail down, like how long the Node waits for the System API at start-up, or how many times it retries, and whether it is permitted to connect to a Registry, enable RTP transmission, etc. during this time period.

wsneijers · 2020-02-14T07:25:39Z

Good point. Personally I think the second approach makes more sense:

It is more robust in regard to high availability, resource updates and startup sequence.
It is more in line with existing discovery mechanisms (I'm referring to IS-04 registered and peer-to-peer operation).

But indeed it is more complex and it may be better to start simple and expand from there.

garethsb · 2020-02-19T13:38:14Z

The difference between a Node's communication with the System API and with the Registration API is that the former is currently a single GET request, whereas the latter involves regular heartbeat POST requests. Encountering an error in a Registration API request is the specified trigger to discover an alternative Registry. There is no such regular request mechanism defined between the Node and the System API, so it would need something else, such as a TTL or a time interval as used in API security/authorization. (The fact that Node registration behaviour is 'sticky' unless it encounters errors has sometimes been confusing.)

We have a prototype that uses a time interval to poll the System API. Before a System API is discovered at start-up, it also currently enables RTP senders/receivers and uses a Registry heartbeat interval taken from cached values.
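As a rough illustration of that kind of behaviour (not the prototype's actual code), one polling step with a cached fallback might look like the sketch below. The nested `is04` / `heartbeat_interval` layout of the settings, the 5-second default, and the names `heartbeat_interval` and `poll_once` are all assumptions made for the example.

```python
DEFAULT_HEARTBEAT_INTERVAL = 5  # seconds; assumed fallback when nothing is cached


def heartbeat_interval(settings):
    """Read the Registry heartbeat interval from a settings dict.

    Assumes a layout like {"is04": {"heartbeat_interval": 5}}.
    """
    return settings.get("is04", {}).get(
        "heartbeat_interval", DEFAULT_HEARTBEAT_INTERVAL
    )


def poll_once(fetch, cached):
    """Run one System API poll and return the settings to use.

    `fetch` is a hypothetical callable performing the single GET request and
    returning the parsed settings; it raises OSError if the System API cannot
    be reached. On failure the previously cached settings stay in force.
    """
    try:
        return fetch()
    except OSError:
        return cached
```

A Node following this sketch would call `poll_once` at start-up and then again on each tick of its chosen time interval, passing the previous result back in as `cached`, and would start its senders, receivers and heartbeats with whatever `heartbeat_interval` returns in the meantime.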

garethsb mentioned this issue Apr 30, 2020

Node start-up behaviour #9

Merged
