-
Notifications
You must be signed in to change notification settings - Fork 5
/
Copy pathsystemenvironments.tex
357 lines (291 loc) · 15.5 KB
/
systemenvironments.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
% LICENSE: see LICENCE
\section{System Environment}
\label{systemenvironment}
In this section, we describe the system configuration requirements for the
continuous operation of the tools. The production environment differs in some
respects to those used for experimentation and in laboratory testing. Monitoring,
automatic recovery (in case of errors) and resilience are crucial in 24/7
operations. The term \emph{production environment} will be used here to refer to
such use.
\subsection{Processing requirements}
The tools have differing requirements regarding CPU performance and amount of
memory, and while the performance of most desktop PCs is sufficient to run the
tools, it is important to take the requirements in consideration when setting up
a system.
Memory requirements are easily met with 1GB of RAM, so we'll look at CPU more in
depth.
The most resource-consuming part is the modulator ODR-DabMod. The
following impact its CPU usage: number of sub-channels; enabling of the
resampler; enabling crest factor reduction; enabling the predistorter.
Compilation options to optimise ODR-DabMod for your system are described in the
README. While you should have no trouble running it even on an older desktop PC,
the computing power of embedded ARM boards (like the Raspberry Pi) could be
insufficient, especially if the resampler is needed.\footnote{See section
\ref{otherhardware} for an example.}
When using a USB SDR device, the USB controller can have a large impact on the
robustness of the transmission, even if CPU usage is low. Such issues are visible as
underruns during operation: with a good controller, less than one underrun per
day is easily achievable on a machine dedicated to only this task. When using
a graphical interface at the same time, interaction with the user interface can
also trigger underruns. For a production system, it is better if no graphical
user interface is running.
In any case, it is required to evaluate a given system over several days if
reliable operation is to be proven.
The multiplexer ODR-DabMux mostly rearranges data internally, and doesn't do
much processing. Its resource requirements are low and it runs well on small
systems. The same goes for ODR-PadEnc, ODR-EDI2EDI, ODR-zmq2edi and ODR-zmq2farsync.
Audio encoding using ODR-AudioEnc is in-between ODR-DabMux and ODR-DabMod in
terms of resource usage, and running one encoder is not a problem even on small
embedded ARM boards.
However, you might want to run a dozen encoders on a single machine, where you
will have to plan for more headroom.
In general, for a robust 24/7 system, you should strive for a CPU usage below
50\%, regardless of which tools you are using. This gives you headroom for
monitoring, remote administration and background jobs run by cron.
Once your system is in operation, monitoring performance and observing logs is
essential to assess the health of your transmission.
\subsection{Launching the tools}
Services running in a production environment are usually administered remotely,
and must be able to run without user intervention, or connection. Traditionally,
such services are implemented (in UNIX terminology) as `daemons'. These are
started and stopped using the init system contained within the distribution.
As the ODR-mmbTools cannot daemonise themselves, a process supervisor is used.
\paragraph{supervisord}
One possibility is to use
\texttt{supervisord}\footnote{\url{http://supervisord.org}}
which can launch the tools and monitor their proper execution. It will
restart the processes and optionally inform the operator by email.
Once installed, supervisord reads its configuration file in
\texttt{/etc/supervisor.conf}
and launches the processes that are to be monitored. Each process is described
by a file. The following example assumes the tools are run as user \texttt{odr},
and that the multiplex configuration is in \texttt{/home/odr/config.mux}, and
that ODR-DabMux is to be launched.
The standard output and standard error streams of ODR-DabMux are written to the
specified log files.
\begin{lstlisting}
[program:ODR-DabMux]
command=odr-dabmux config.mux
directory=/home/odr
user=odr
autostart=true
autorestart=true
stdout_logfile=/home/odr/logs/mux.out.log
stderr_logfile=/home/odr/logs/mux.err.log
\end{lstlisting}
Once this configuration has been added to the supervisord configuration, the
settings have to be re-read using:
\begin{lstlisting}
supervisorctl reread
\end{lstlisting}
In order for supervisord to start managing and running this process, it needs to
be added:
\begin{lstlisting}
supervisorctl add ODR-DabMux
\end{lstlisting}
Setting up more processes (such as any of the other tools) can be easily
achieved by customising the configuration template above. Examples are provided
in the \texttt{mmbtools-aux} repository, under the \texttt{supervisor} folder -
these need to be changed to reflect the paths that are in use on your system.
supervisord also includes a small web-server that can display the state of the
managed processes. It is enabled with the \verb+[inet_http_server]+ setting in
the configuration file.
\paragraph{systemd}
Most recent GNU/Linux distributions use \texttt{systemd} as init system, which
also can handle the supervision of processes. To achieve this, systemd unit
files have to be written for the tools. For more information, see the systemd
documentation.\sidenote{Give an example unit file}
\subsection{Logging}
Collecting information about events is essential within a production environment.
This information is essential forensic analysis, and tracing sources of trouble.
This is achieved through the logging of important messages that can be sent by
the tools.
ODR-DabMux and ODR-DabMod both support logging to standard error, to a file and
to the system logger \texttt{syslog}. Logging to syslog is the most flexible
approach; log information can be forwarded over the network to a
centralised logging server - where logs can then be filtered according to the
priority of each message. Both tools log to the LOCAL0 facility which in turn
can be redirected into an ODR-mmbtools specific log file.
\sidenote{Describe rsyslog configuration}
In order to avoid the log files from becoming undesirably large, \texttt{logrotate}
should be set to rotate the files automatically.
\sidenote{Describe logrotate configuration}
\subsection{Timing}
The ODR-mmbTools require the system time to be accurate in order for them to
function correctly - this is especially important when running a SFN, but is
also important for standalone transmitters when in a production environment. It
is also important to remember that most receivers have a clock that is
synchronised to the clock time which is being transmitted by the multiplex to
which it has been tuned.
The system needs to run a NTP client that synchronises the system time over the
network. Correct synchronisation can be checked using \texttt{chronyc tracking}
or the the \texttt{lpeers} command of the \texttt{ntpq} tool, depending on if
you use chrony or openntp.
The magnitude of the offset should be below $10$\ms.
The performance of the NTP synchronisation should also be monitored permanently
during operation.
ODR-DabMux can run a command at startup to verify if the NTP client is properly
working, using the \texttt{startupcheck} setting. This can be used to call
\texttt{ntp-wait} or \texttt{chronyc waitsync} to wait for proper NTP sync.
\subsection{Monitoring through SNMP}
There is ongoing work to make the monitoring of the tools possible using SNMP.
Please see \url{https://wiki.opendigitalradio.org/SNMP} for information about
this effort.
\subsection{Monitoring using munin}
\label{monitmunin}
The Munin\footnote{\url{http://munin-monitoring.org/}} monitoring tool can
create graphs for essential system health parameters. It can also send emails
if values transgress the defined bounds - this assists the operator in the
assessment of system status, as well as the health of the services.
In addition to basic system measurements like CPU, RAM and disk usage, NTP
synchronisation, disk and network performance, there are custom data sources
available for ODR-DabMux and ODR-DabMod.
The ODR-DabMux data sources include EDI and ZMQ input buffer monitoring (buffer
level, underruns and overruns) and the peak audio input levels (mono, or
stereo). The plugin for ODR-DabMod can monitor SDR device statistics among
others.
The plugins are written in python and are located in the \verb+doc+ folder in
the repositories. Copy them to \texttt{/etc/munin/plugins.d} and restart
munin-node. They require that the ODR-DabMux management
server, and the ZeroMQ remote control enabled on the ports give in the example
configurations.
\subsection{Monitoring using Xymon}
The xymon monitoring tool\footnote{\url{http://xymon.sourceforge.net/}} is used
to monitor the health of many types of systems. It can present the results in
text, tables and/or graphs. It supports the basic health checks directly out of
the box, and can be extended with scripts to perform non-standard health checks.
The default mode of operation is that clients retrieve data and send it to the
xymon server, which interprets the results, displays them and generates alerts
if thresholds are exceeded. An alert can be send in an e-mail, an SMS or a
tweet.
The Perl script \verb+retodrs.pl+\footnote{The script name stands for
''Retrieve Opendigitalradio Status``}, retrieves the status and
statistics of an Opendigitalradio service and it reports the results to xymon.
The information is retrieved from the management server within ODR-DabMux. The
information presented includes a table with the status of each sub-channel and
the underrun and overrun rates on the sub-channels. If needed an alert can be
generated depending on the subchannel status or a rate exceeding a threshold.
The script needs to be installed on the same server running ODR-DabMux, as the
management service within it is only accessible from the same computer. This
implies that the xymon client software also needs to be installed on the same
machine. The client is configured to run the script.
The configuration and the scripts can typically be found in subdirectory
\verb+/usr/lib/xymon/client+, although that may depend on your distribution.
Once the client is set up, it needs to connect to a xymon server, which may or
may not be on the same machine.
The server is configured to limit the altering to specific sub-channels, to
store the statistical data and to generate graphs.
The configuration and the scripts on a xymon server are usually stored in the
subdirectory \verb+/usr/lib/xymon/server+.
\subsubsection{Installation of the Xymon Client}
The perl script has additional requirements:
\texttt{App::cpanminus}, \texttt{ZMQ::LibZMQ3}, and \texttt{JSON::PP}. They can
be installed through your distribution packages or using CPAN.
Once the script has been copied to \verb+/usr/lib/xymon/client/ext+, the
configuration of the launcher within the xymon client needs to be extended.
Create a new file named \verb+odrmux.cfg+ in
\verb+/usr/lib/xymon/client/etc/clientlaunch.d+ containing the following lines:
\begin{lstlisting}
#
# Test odrmux checks the state and the statistics
# of the ODR-DabMux service.
#
[odrmux]
ENVFILE $XYMONCLIENTHOME/etc/xymonclient.cfg
CMD $XYMONCLIENTHOME/ext/retodrs.pl
LOGFILE $XYMONCLIENTLOGS/retodrs.plog
INTERVAL 5m
\end{lstlisting}
After a restart of the xymon client, the script \verb+retodrs.pl+ will
be invoked once every 5 minutes.
\subsubsection{Server Configuration}
By default all subchannels will be monitored, and will raise alerts if the
status or the statistics are in outside of a valid operational range. The
alerting can be limited to a subset of the sub-channels by adding a tag to the
hosts-entry in the configuration file \verb+/usr/lib/xymon/server/etc/hosts.cfg+.
The additional tag is:
\begin{lstlisting}
ODR:select(<SubChannelName0>;<SubChannelName1>;...)
\end{lstlisting}
The sub-channels not named will still be shown, but no alerts will be generated
for those sub-channels. This is visible as the green/yellow/red icons are
missing for those sub-channels.
Six statistic values are gathered by the script, namely
\texttt{BufferMin}, \texttt{BufferMax}, \texttt{PeakLeft}, \texttt{PeakRight},
\texttt{UnderRun} and \texttt{OverRun}. It is found that only the latter two
seem to contain sensible values all the time, so those values are the only
ones shown in a graph. Note that those values retrieved by the script are
ever-increasing counters, showing the total number of over-runs or under-runs.
In the graph, the average number of over-runs or under-runs per second, averaged
over a period of 5 minutes, is shown.
The first step is to have the collected statistics to be moved into a database,
a so-called \textit{Round Robin Database}. This is accomplished by adding a file
named \verb+odr.cfg+ in \verb+/usr/lib/xymon/server/etc/xymonserver.d+
containing the following lines:
\begin{lstlisting}
TEST2RRD+=",odr_mux=devmon"
GRAPHS+=",odr_mux::1"
\end{lstlisting}
The next step is to define the layout of the graph.
Create a file named \verb+graphs.odr.cfg+ in
\verb+/usr/lib/xymon/server/etc/graphs.d+ containing the following lines:
\begin{lstlisting}
#
# Graphs to show the statistics collected from an
# Opendigitalradio DabMux server.
#
[odr_mux]
FNPATTERN ^odr_mux\.(.+)\.rrd$
TITLE , Frame loss rate
YAXIS Rate [/s]
-l 0
DEF:ur@RRDIDX@=@RRDFN@:Underrun:AVERAGE
DEF:or@RRDIDX@=@RRDFN@:Overrun:AVERAGE
LINE1:ur@RRDIDX@#FF0000:@RRDPARAM@ underrun
GPRINT:ur@RRDIDX@:MIN:Min \: %5.1lf %s
GPRINT:ur@RRDIDX@:MAX:Max \: %5.1lf %s
GPRINT:ur@RRDIDX@:AVERAGE:Avg \: %5.1lf %s
GPRINT:ur@RRDIDX@:LAST:Cur \: %5.1lf %s\n
LINE1:or@RRDIDX@#00FF00:@RRDPARAM@ overrun
GPRINT:or@RRDIDX@:MIN: Min \: %5.1lf %s
GPRINT:or@RRDIDX@:MAX:Max \: %5.1lf %s
GPRINT:or@RRDIDX@:AVERAGE:Avg \: %5.1lf %s
GPRINT:or@RRDIDX@:LAST:Cur \: %5.1lf %s\n
\end{lstlisting}
\subsection{Real-time Scheduling}
As a general principle, it is prudent not to run tools (that do not need superuser
privileges) as the \texttt{root} user. The same principle also applies to the
ODR-mmbTools, but care has to be taken that the tools can still request real-time
scheduling when it is needed.
This is achieved by adding the following to \texttt{/etc/security/limits.conf},
assuming the tools are run under the user \texttt{odr}.
\begin{lstlisting}
odr - rtprio 65
odr - nice -10
\end{lstlisting}
If you have installed JACK with real-time privileges, you may find this has
already been configured for the `audio' group, written as \texttt{@audio}, which
should suffice providing your desired user is a member of the `audio' group.
\subsection{Accessing the USRP as Non-root}
Superuser privileges are not required to access USB-connected USRP devices, but
sometimes the system lacks the configuration to enable normal users to
communicate with the device.
In that case, it is necessary to add a rule file for \texttt{udev}. This file is
included in the UHD sources, but might not have been automatically installed.
The file is called \texttt{10-uhd-usrp.rules}, should be placed into
\texttt{/etc/udev/rules.d/} and should contain
\subsection{Authentication Support}
In order to be able to use the Internet as contribution network, some form of
protection has to be put in place to make sure the audio data cannot be altered
by third parties. Usually, some form of VPN is set up for this case.
{ \footnotesize
\begin{lstlisting}
#USRP1
SUBSYSTEMS=="usb", ATTRS{idVendor}=="fffe", ATTRS{idProduct}=="0002", MODE:="0666"
#B100
SUBSYSTEMS=="usb", ATTRS{idVendor}=="2500", ATTRS{idProduct}=="0002", MODE:="0666"
#B200
SUBSYSTEMS=="usb", ATTRS{idVendor}=="2500", ATTRS{idProduct}=="0020", MODE:="0666"
\end{lstlisting}
}
% vim: spl=en spell tw=80 et