System Design Guide

Peter Brightwell edited this page Aug 17, 2018 · 1 revision

For many smaller systems it should be sufficient to purchase equipment which supports the relevant NMOS specifications and expect it to work successfully together. For larger systems it is recommended that some consideration is given to system design.

Minimising Configuration

The NMOS specifications are designed to operate in systems which require zero or near-zero configuration. To achieve this, the following technologies may be used:

  • DHCP

    • Dynamic Host Configuration Protocol is used primarily to permit hosts on a network to request an IP address from a central management service (DHCP server). Alongside an IP address, further configuration data may be provided. This typically includes the addresses of NTP servers, and importantly DNS servers (and DNS search domains) which are used below.
  • DNS

    • The Domain Name System provides the mechanism to translate human-friendly names (like nmos.tv) to IP addresses. In addition, it provides a directory to permit the lookup of various named services. On the Internet this is typically used to find the mail servers which are associated with a particular domain name, but in NMOS systems it can be used to identify the Registration and Query APIs which are authoritative for that domain or subdomain.
    • Multicast DNS provides a convenient mechanism to achieve discovery in small unrouted (layer 2) networks without any infrastructure services. In larger networks, however, it is preferable to use unicast DNS as the means to resolve discoverable services.
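
As a rough illustration of how DNS search domains and DNS-SD fit together, the sketch below builds the browse names a Node might query to find Registration and Query APIs. The service types `_nmos-register._tcp` and `_nmos-query._tcp` are assumptions based on recent IS-04 versions (earlier versions are understood to use `_nmos-registration._tcp`), and the domain `studio1.example.com` is hypothetical. Listing unicast names before the multicast `local.` name is an illustrative choice rather than a statement of required behaviour.

```python
# Assumed DNS-SD service types; check the IS-04 version in use.
REGISTRATION_SERVICE = "_nmos-register._tcp"
QUERY_SERVICE = "_nmos-query._tcp"


def browse_names(service_type, search_domains):
    """Return DNS-SD browse names (PTR queries) to try, in order.

    Unicast names built from the search domains come first; the
    multicast DNS name under 'local.' is listed last as a fallback.
    """
    names = []
    for domain in search_domains:
        domain = domain.strip(".")
        if domain and domain != "local":
            names.append(f"{service_type}.{domain}.")
    names.append(f"{service_type}.local.")
    return names


if __name__ == "__main__":
    # 'studio1.example.com' stands in for a DHCP-supplied search domain.
    for name in browse_names(REGISTRATION_SERVICE, ["studio1.example.com"]):
        print(name)
```

The function only constructs names; performing the actual PTR, SRV and TXT lookups is left to whichever DNS library the deployment uses.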

Split Control & Media Networks

IP media systems (those which transport raw video in particular) are typically deployed in one of three layouts:

  • Merged control and media in a single flat network
  • Split control and media into two logical or physically separate networks
  • Split control and two separate media networks (allowing for SMPTE ST 2022-7)

The NMOS specifications are intended to operate in any of these designs. Note however that where registries need to be deployed within these networks may be governed by the specific Nodes in use and any limitations they have with regard to network interfaces.

Where Nodes have multiple network interfaces, they will typically default to carrying NMOS-related control traffic over the 'control' or 'management' interface, which may operate at a lower data rate. Some Node implementations may also allow access to NMOS APIs via the 'media' interfaces, and may even use these as the default, but this is not guaranteed in all cases. A further class of Nodes may have only a single interface which is used for mixed control and media traffic.

IS-04 Registry Scaling & Failover

When planning an IS-04 registry deployment, it is worth considering the potential load which the system could come under, and what may happen in certain failure cases.

The following factors should help to inform the design of the registry deployment, and potentially which of the available registry implementations can meet your requirements.

Resource Load

The number of 'resources' which the system will contain should be determined (a 'resource' is a single Node, Device, Sender, Receiver, Source or Flow). Some Nodes will expose more resources than others.

This factor will determine the number of Registration API instances which your system requires, and the overall capacity required of the registry implementation. Registry vendors should be able to advise on how many resources an individual Registration API can handle. Bear in mind that in the 'steady state' the Registration API must handle periodic heartbeats from each Node along with any updates made to its resources. In a system failure situation a Registration API may need to handle re-registrations of all resources in the system in a short period of time. Systems should be tested for their handling of this scenario, and additional API instances (see Load Balancing & Redundancy) or greater segmentation (see Failure Domains) used to resolve any issues.
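
The arithmetic behind this sizing exercise can be sketched as follows. Every figure here is an assumption for illustration: the 5 second heartbeat interval is understood to be the IS-04 default but should be checked against the version in use, and the recovery window, per-instance request rate and spare instance count are hypothetical values which a registry vendor would need to supply or confirm.

```python
import math


def registration_load(node_count, resources_per_node,
                      heartbeat_interval_s=5.0, recovery_window_s=10.0):
    """Estimate request rates seen by the Registration API.

    Returns (steady-state heartbeats per second,
             re-registrations per second during a full recovery).
    resources_per_node is an average that includes the Node resource itself.
    """
    heartbeats_per_s = node_count / heartbeat_interval_s
    total_resources = node_count * resources_per_node
    reregistrations_per_s = total_resources / recovery_window_s
    return heartbeats_per_s, reregistrations_per_s


def instances_needed(peak_rate, per_instance_rate, spare=1):
    """Instances required to carry peak_rate, plus spare instances for failover."""
    return math.ceil(peak_rate / per_instance_rate) + spare


if __name__ == "__main__":
    steady, burst = registration_load(node_count=500, resources_per_node=20)
    print(f"steady state: {steady:.0f} heartbeats/s")
    print(f"recovery burst: {burst:.0f} registrations/s")
    # 400 requests/s per instance is a made-up vendor figure.
    print("instances:", instances_needed(max(steady, burst), per_instance_rate=400))
```

In this example the recovery burst, not the steady-state heartbeat rate, dominates the sizing, which is why testing the failure scenario matters.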

Client Load

The expected number of clients using the Query API at any time should be identified. This factor will determine the number of Query API instances which your system requires. Registry vendors should be able to advise on the number of clients which their implementation can support via a single instance. This figure may vary depending on the number of resources contained within the registry and the complexity of the query operations expected from clients.

Failure Domains

A key principle in IT system design is the segmentation of failure domains. When deploying a large system, consider which areas should continue to operate if others experience a major failure. By way of example, a facility containing multiple studios may be designed to isolate individual studios from each other such that the others can continue to operate if one experiences a failure.

Isolation of the Registration and Query APIs can be achieved using various mechanisms (some of which may be vendor and control system specific). A vendor-agnostic way of achieving segmentation involves using a different DNS subdomain for each 'failure domain' (assuming use of unicast DNS). This ensures that Nodes and other API clients within the scope of an individual subdomain only contact a specific set of Registration and Query APIs. If these APIs fail, other areas of the system may continue to operate.
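
The effect of subdomain-based segmentation can be modelled with a small sketch. All names below (the `example.com` subdomains, registry URLs and Node names) are hypothetical; the point is only that each Node is scoped to the registries of its own subdomain, so a registry failure in one subdomain leaves Nodes in the others untouched.

```python
# Hypothetical facility: one DNS subdomain (failure domain) per studio,
# each advertising its own Registration and Query API instances.
REGISTRIES = {
    "studio1.example.com": ["http://reg-a.studio1.example.com"],
    "studio2.example.com": ["http://reg-a.studio2.example.com"],
}

# Hypothetical assignment of Nodes to subdomains, e.g. via DHCP search domains.
NODE_DOMAINS = {
    "camera-1": "studio1.example.com",
    "mixer-1": "studio1.example.com",
    "camera-2": "studio2.example.com",
}


def registries_for(node, node_domains, registries):
    """Return the registry instances a Node would discover in its subdomain."""
    return registries.get(node_domains[node], [])


def nodes_affected_by(failed_domain, node_domains):
    """Return the Nodes that lose their registry if a subdomain's APIs fail."""
    return sorted(n for n, d in node_domains.items() if d == failed_domain)


if __name__ == "__main__":
    print(registries_for("camera-2", NODE_DOMAINS, REGISTRIES))
    print(nodes_affected_by("studio1.example.com", NODE_DOMAINS))
```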

Load Balancing & Redundancy

Any number of Registration and Query APIs can be operated alongside each other (assuming the specific registry implementation supports it). Whilst systems should at a minimum have enough Registration and Query API instances to handle their peak load, it is advisable to operate more instances than are strictly required. This serves several purposes. First, it spreads the average load across the system so that no single instance is permanently operating at capacity. Second, it ensures that if any single instance fails there is still enough capacity in the system to handle the load. Finally, when a single Registration API fails, Nodes can fail over seamlessly to a secondary instance, provided they can contact one immediately, rather than becoming unregistered and causing an outage. Further details on this topic and the use of different 'priorities' for APIs can be found in the IS-04 documentation.
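
A minimal sketch of priority-based failover is shown below. It assumes that each advertised API instance carries an integer priority and that lower values are preferred, which is how the IS-04 `pri` value is understood to work; the IS-04 documentation remains the authority. The `ApiInstance` type, the URLs and the `is_reachable` callback are hypothetical, and instances with equal priority are simply taken in list order here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiInstance:
    url: str
    priority: int  # assumed: lower value means more preferred


def choose_instance(instances, is_reachable):
    """Return the most preferred reachable instance, or None if all are down."""
    for instance in sorted(instances, key=lambda i: i.priority):
        if is_reachable(instance):
            return instance
    return None


if __name__ == "__main__":
    advertised = [
        ApiInstance("http://reg-b.example.com", priority=20),
        ApiInstance("http://reg-a.example.com", priority=10),
    ]
    down = {"http://reg-a.example.com"}  # simulate the primary failing
    chosen = choose_instance(advertised, lambda i: i.url not in down)
    print(chosen.url if chosen else "no registry available")
```

A real Node would replace `is_reachable` with an actual request to the API and would repeat the selection whenever its heartbeats start to fail.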

Advanced Techniques

Depending on the registry implementation in use, it may be possible to use additional techniques to scale registries as appropriate to your deployment. These include:

  • Hosting the Registration and Query APIs on different physical hosts to your registry (database) hosts.
  • Scaling the Registration and Query APIs independently of each other to match the side of the system with the greatest load.

Registries which are best suited to large deployments will include additional debug and administration functions to highlight which areas of the system may be under the greatest load and require additional instances, particularly as systems grow beyond their initial design parameters.

IS-04 & Higher Level Control Systems

In smaller systems clients may connect directly to an IS-04 registry, listing labels as they appear from Nodes and providing a capability for any client to control any Node in the system. In larger systems it may be desirable to customise labelling and provide custom views of subsets of the system via specific control interfaces.

IS-04 is not intended to meet all of these requirements directly, but it does provide the building blocks from which solutions can be created. The identifiers associated with each resource in the system (Nodes, Devices, Senders, Receivers, etc.) are intended to remain stable throughout a Node's lifetime. This provides a stable base on which to build more complex control systems, allowing for custom labelling and grouping of resources amongst other concerns.
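
One way a higher level control system might build on stable identifiers is to keep its own overlay of labels and groups keyed by resource ID, as in the sketch below. The `id` and `label` fields mirror those found on IS-04 resources; the UUIDs, labels, group names and the `apply_overlay` helper are all hypothetical.

```python
def apply_overlay(resources, labels, groups):
    """Return copies of registry resources with custom labels and groups applied.

    Resources with no overlay entry keep their own label and get group None.
    """
    view = []
    for resource in resources:
        entry = dict(resource)
        entry["label"] = labels.get(resource["id"], resource["label"])
        entry["group"] = groups.get(resource["id"])
        view.append(entry)
    return view


if __name__ == "__main__":
    # Resources as they might be returned by a Query API (trimmed to two fields).
    senders = [
        {"id": "3b8be755-08ff-452b-b217-c9151eb21193", "label": "SDI-IP-GW-01 Out 1"},
        {"id": "c72cca5b-01db-47aa-bb00-03893defbfae", "label": "SDI-IP-GW-01 Out 2"},
    ]
    custom_labels = {"3b8be755-08ff-452b-b217-c9151eb21193": "Studio A Camera 1"}
    custom_groups = {"3b8be755-08ff-452b-b217-c9151eb21193": "Studio A"}
    for sender in apply_overlay(senders, custom_labels, custom_groups):
        print(sender["group"], "|", sender["label"])
```

Because the overlay is keyed by ID rather than by label, it remains valid if a vendor label changes or a Node re-registers.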