bulklog

Collects, buffers, and outputs logs across multiple sources and destinations.

bulklog is written in go and requires little resource.

bulklog supports memory and redis buffering. bulklog also supports failover and can be set up for high availability.

Concepts

bulklog tries to structure data as JSON since it has enough structure to be accessible while providing felxibility.

Collection

A collection is a set of declarative informations about how bulklog should process data.

Output

bulklog outputs JSON docuemts to destinations such as Elasticsearch, MongoDB, etc...

Install

Docker

docker.pkg.github.com/khezen/bulklog/bulklog

docker run -p 5017:5017 -v /etc/bulklog:/etc/bulklog docker.pkg.github.com/khezen/bulklog/bulklog:stable

Supported tags

latest
2.0.1, 2.0, 2, stable

ENV

key	Description	Default Value
CONFIG_PATH	path to the configuration folder	/etc/bulklog

Kubernetes

Helm

Deploy bulklog to a kubernetes cluster using Helm.

helm repo add khezen https://khezen.github.com/charts
helm install khezen/bulklog --name bulklog

Config

Default config.yaml.

Persistence

Peristence is disabled by default in which case data is buffered in memory. If enabled, it uses Redis(>= 2.4) to persist documents buffer. Learn how to tune Redis persistence for your requirements.

persistence:
  enabled: true
  redis:
    endpoint: localhost:6379
    password: changeme #(optional)
    db: 0 #(optional, default:0)
    idle_conn: 2 #(optional, default: 0)
    max_conn: 10 #(optional, defaut: no limit)

Output

provides declarative information about bulklog output.

output:
  elasticsearch:
    enabled: true
    endpoint: localhost:9200
    scheme: http
#   aws_auth:
#     access_key_id: changeme
#     secret_access_key: changeme
#     region: eu-west-1
#   basic_auth:
#     username: elastic
#     password: changeme

from version 2.0.0 bulklog supports Elasticsearch 7.0.0 and above

Collections

examples:

collections:
  - name: logs
    flush_period: 5 seconds # hours|minutes|seconds|milliseconds
    retention_period: 45 minutes
    schema: {}

bulklog is schema free but we encourage you to provide some base structure since it might enbale output destination to process data more efficiently.

collections:
  - name: logs
    flush_period: 5 seconds # hours|minutes|seconds|milliseconds
    retention_period: 45 minutes
    shards: 5
    replicas: 1
    schema:
      source: 
        type: string
        max_length: 64
      stream: 
        type: string
        length: 6
      event: 
        type: string
      time:
        type: datetime
        date_format: 2006-01-02T15:04:05.999999999Z07:00

Even in the case above, bulklog remains schema free enabling log decoration with additional field.

collection

name: {collection name}
flush_period: {duration}
- flush buffer to output every {duration}
retention_period: {duration}
- if an output is unavailable, retention_period set how long bulklog tries to output data to this output
- if the output is unavailable for too long, retention_period ensure that bulklog will not accumulate too much data and will be able to serve other outputs.
shards: the number of shards to allocate this collection to
replicas: the number of replicas to allocate this collection to
schema: {map of fields by field name}

Learn more about Elasticsearch sharding and replication.

field

type: {field type}
- see supported types
length: {field exact length} (optional,string only)
max_length: {field maximum length} (optional, string only)
date_format: {date time formatting} (optional, datetime only)

API

push document

POST bulklog/v1/{collectionName} HTTP/1.1
Content-Type: application/json
{
  ...
}

HTTP/1.1 200 OK

example:

POST bulklog/v1/logs HTTP/1.1
Content-Type: application/json
{
  "source":"service1",
  "stream": "stderr",
  "event": "divizion by zero",
  "time": "2018-11-15T14:12:12Z"
}

### push documents in batches

```http
POST /v1/{collectionName}/batch HTTP/1.1
Content-Type: application/json
{...}
{...}

HTTP/1.1 200 OK

example:

POST bulklog/v1/logs/batch HTTP/1.1
Content-Type: application/json
{"source":"service1","stream": "stderr","event": "divizion by zero","time" : "2019-01-13T19:30:12"}
{"source":"service1","stream": "stdout","event": "successfully processed","time" : "2019-01-13T19:35:12"}

HTTP/1.1 200 OK

health

GET bulklog/liveness HTTP/1.1

HTTP/1.1 200 OK

GET bulklog/readiness HTTP/1.1

HTTP/1.1 200 OK

supported types

bool : True or False
unint8 : 0 to 255
uint16 : 0 to 65535
uint32 : 0 to 4294967295
unit64 : 0 to 18446744073709551615
int8 : -128 to 127
int16 : -32768 to 32767
int32 : -2147483648 to 2147483647
int64 : -9223372036854775808 to 9223372036854775807
float32 : -3.40282346638528859811704183484516925440e+38 to 3.40282346638528859811704183484516925440e+38
float64 : -1.797693134862315708145274237317043567981e+308 to 1.797693134862315708145274237317043567981e+308
string : sequence of characters
- lenght: string exact length
- max_length: string maximum length
datetime : 1970-01-01T00:00:00.000000000Z (example)
- bulklog doesn't check the date format. Most outputs accept any even if it defers from the configured one
- date_format: date format string
  - Mon Jan _2 15:04:05 2006
  - Mon Jan _2 15:04:05 MST 2006
  - Mon Jan 02 15:04:05 -0700 2006
  - 02 Jan 06 15:04 MST
  - 02 Jan 06 15:04 -0700
  - Monday, 02-Jan-06 15:04:05 MST
  - Mon, 02 Jan 2006 15:04:05 MST
  - Mon, 02 Jan 2006 15:04:05 -0700
  - 2006-01-02T15:04:05Z07:00
  - 2006-01-02T15:04:05.999999999Z07:00 (default)
  - 3:04PM
  - Jan _2 15:04:05
  - Jan _2 15:04:05.000
  - Jan _2 15:04:05.000000
  - Jan _2 15:04:05.000000000
  - 2006-01-02 15:04:05 MST
  - 2006-01-02 15:04:05.999999999 MST
object : inner document

Issues

If you have any problems or questions, please ask for help through a GitHub issue.

Contributions

Help is always welcome! For example, documentation (like the text you are reading now) can always use improvement. There's always code that can be improved. If you ever see something you think should be fixed, you should own it. If you have no idea what to start on, you can browse the issues labeled with help wanted.

As a potential contributor, your changes and ideas are welcome at any hour of the day or night, weekdays, weekends, and holidays. Please do not ever hesitate to ask a question or send a pull request.

Code of conduct.

Name		Name	Last commit message	Last commit date
Latest commit History 163 Commits
.doc		.doc
.github		.github
cmd/srv		cmd/srv
pkg		pkg
test		test
vendor		vendor
.gitignore		.gitignore
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
codecov.yml		codecov.yml
config.yaml		config.yaml
entrypoint.sh		entrypoint.sh
go.mod		go.mod
go.sum		go.sum

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

bulklog

Concepts

Collection

Output

Install

Docker

Supported tags

ENV

Kubernetes

Helm

Config

Persistence

Output

Collections

collection

field

API

push document

health

supported types

Issues

Contributions

About

Releases 13

Packages

Languages

License

khezen/bulklog

Folders and files

Latest commit

History

Repository files navigation

bulklog

Concepts

Collection

Output

Install

Docker

Supported tags

ENV

Kubernetes

Helm

Config

Persistence

Output

Collections

collection

field

API

push document

health

supported types

Issues

Contributions

About

Topics

Resources

License

Code of conduct

Stars

Watchers

Forks

Releases 13

Packages 0

Languages

Packages