If your setup needs to buffer log messages on their way to Graylog, or Graylog is not reachable from all network segments, you can use Apache Kafka as a message broker from which Graylog pulls messages as they become available.
Please be aware that Graylog will connect to Apache ZooKeeper and fetch the topics matching the configured regular expression. Adding SSL/TLS or authentication information is not possible with the latest stable version of Graylog (2.1.0 at the time of writing).
NOTE: This guide will not give you a complete copy & paste how-to, but it will walk you through the setup process and provide additional information where necessary.
Please do not follow the described steps blindly if you don't know how to deal with common issues yourself.
In the scenario used in this guide, a syslog message will run through the following stages:
- Message sent from rsyslog to Logstash via TCP or UDP
- Message sent from Logstash to Apache Kafka
- Message pulled and consumed from Apache Kafka by Graylog (via Kafka input)
- Structured syslog information extracted from JSON payload by Graylog
If you run rsyslog 8.7.0 or higher with support for Apache Kafka, the message can run through the following stages:
- Message sent from rsyslog to Apache Kafka
- Message pulled and consumed from Apache Kafka by Graylog (via Kafka input)
- Structured syslog information extracted from JSON payload by Graylog
We assume that there is an Apache Kafka instance running on kafka.int.example.org (192.168.100.10) and a Graylog instance running on graylog.int.example.org (192.168.1.10). Additionally, the logs will be generated by the Linux systems syslog.o1.example.org (192.168.50.30) and syslog.o2.example.org (192.168.2.30).
All systems are running Ubuntu Linux, so you might need to adjust some configuration paths on other operating systems.
If you do not have a running Apache Kafka cluster, you can follow the quickstart guide, but be aware that this is not a hardened production-ready setup!
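If you followed the quickstart guide, you will also need a topic for the log messages. With the Kafka command line tools of that era (0.9/0.10), the "logs" topic used throughout this guide could be created roughly like this - the installation path and the assumption that ZooKeeper runs on the Kafka host are ours, so adjust them to your setup:

```
# Create the "logs" topic (single partition, no replication - fine for testing,
# not for production). ZooKeeper is assumed to run on the Kafka host.
bin/kafka-topics.sh --create --zookeeper kafka.int.example.org:2181 \
  --replication-factor 1 --partitions 1 --topic logs

# Verify that the topic exists
bin/kafka-topics.sh --list --zookeeper kafka.int.example.org:2181
```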
With rsyslog, you can use templates to format messages. Formatting the messages directly at the source helps to keep the workflow clean and predictable.
In order to identify log messages via the fully qualified domain name (FQDN) of the system that created them, we use the configuration option PreserveFQDN - but you will need working DNS resolution for this. rsyslog will send the log messages via UDP to the local Logstash instance (listening on 127.0.0.1:5514).
PreserveFQDN on
template(name="ls_json"
type="list"
option.json="on") {
constant(value="{")
constant(value="\"@timestamp\":\"") property(name="timereported" dateFormat="rfc3339")
constant(value="\",\"@version\":\"1")
constant(value="\",\"message\":\"") property(name="msg")
constant(value="\",\"host\":\"") property(name="hostname")
constant(value="\",\"severity\":\"") property(name="syslogseverity-text")
constant(value="\",\"facility\":\"") property(name="syslogfacility-text")
constant(value="\",\"programname\":\"") property(name="programname")
constant(value="\",\"procid\":\"") property(name="procid")
constant(value="\"}\n")
}
*.* @127.0.0.1:5514;ls_json
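As an illustration, a message produced by the ls_json template would look roughly like this (the field values are made-up examples, not output from a real system):

```
{"@timestamp":"2016-10-21T14:45:12.345678+02:00","@version":"1","message":" Reached target Timers.","host":"syslog.o1.example.org","severity":"info","facility":"daemon","programname":"systemd","procid":"1"}
```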
The configuration above needs to be saved to /etc/rsyslog.d/90-logstash.conf on the syslog hosts (syslog.o1.example.org and syslog.o2.example.org in our example). Additionally, rsyslog must be restarted with the command service rsyslog restart to read the new configuration.
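Before wiring up Logstash, you can check that rsyslog actually emits valid JSON on 127.0.0.1:5514 - for example with a throwaway UDP listener like the following sketch (our own debugging aid, not part of the official setup; stop anything else bound to that port first, and the port number matches this guide's configuration):

```python
import json
import socket


def receive_json_datagram(host="127.0.0.1", port=5514, timeout=10.0):
    """Bind a UDP socket and return the first received datagram parsed as JSON.

    Raises socket.timeout if nothing arrives, and json.JSONDecodeError if the
    rsyslog template produces broken JSON (e.g. unescaped quotes in messages).
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.bind((host, port))
        data, _addr = sock.recvfrom(65535)
        return json.loads(data.decode("utf-8"))
```

Run it on one of the syslog hosts and trigger a message with `logger test`; the parsed dictionary should contain the fields defined in the ls_json template (host, severity, facility, and so on).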
If you have rsyslog 8.7.0 or higher, you can use the rsyslog Kafka output module omkafka to send the messages from rsyslog directly to Apache Kafka:
$ModLoad omkafka
action(type="omkafka" topic="logs" broker=["192.168.100.10:9092"] template="ls_json")
If your rsyslog does not support the Kafka output module, you can use Logstash to forward the messages to Apache Kafka. Logstash will listen on localhost (127.0.0.1) on port 5514/udp for messages coming from rsyslog and will forward them to the Apache Kafka cluster.
input {
  udp {
    port => 5514
    host => "127.0.0.1"
    type => "syslog"
    codec => "json"
  }
}

output {
  kafka {
    bootstrap_servers => "192.168.100.10:9092"
    topic_id => "logs"
  }
}
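To verify that messages actually arrive in Kafka before configuring Graylog, you can read the topic with the console consumer shipped with Kafka (the ZooKeeper address below is taken from this guide's example setup):

```
# Print every message from the "logs" topic to stdout; stop with Ctrl-C
bin/kafka-console-consumer.sh --zookeeper kafka.int.example.org:2181 --topic logs
```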
Additional information about the configuration options can be found in the Kafka output module documentation of Logstash.
Now the log messages need to be pulled and consumed by Graylog.
Create a Syslog Kafka input and configure it according to the information from the previous steps in this guide (ZooKeeper address and topic filter regular expression). Also set the option Allow overwrite date.
Start the newly created Syslog Kafka input to consume the first messages and create a JSON extractor. Additionally, create a second extractor of type Copy input on the field host and store it in the field source. You might want a third Copy input extractor to store Logstash's @timestamp field in the timestamp message field used by Graylog.
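The extractor chain described above does, in essence, the following to every consumed message - a rough Python sketch of the logic (not Graylog code; the field names follow the ls_json template from earlier in this guide):

```python
import json


def apply_extractors(raw_message):
    """Mimic the three extractors on one raw Kafka message.

    1. JSON extractor: turn every key of the JSON payload into a message field.
    2. Copy input: copy "host" into "source" (the field Graylog displays).
    3. Copy input: copy Logstash's "@timestamp" into Graylog's "timestamp".
    """
    fields = json.loads(raw_message)
    fields["source"] = fields["host"]
    fields["timestamp"] = fields["@timestamp"]
    return fields
```

With Allow overwrite date set on the input, the copied timestamp replaces the receive time, so messages keep the time they were originally reported by rsyslog.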
You could use the rsyslog Linux systems as syslog proxies for every possible source in the same network and add more systems to your setup.
Thanks to:
- untergeek for the rsyslog JSON template and the accompanying blog post
- IETF for the documentation IPs