Add support for non-keyed, regular hash functions #37

Closed
16 changes: 7 additions & 9 deletions docs/index.asciidoc
@@ -27,9 +27,6 @@ This can e.g. be used to create consistent document ids when inserting
events into Elasticsearch, allowing events in Logstash to cause existing
documents to be updated rather than new documents to be created.

NOTE: When using any method other than 'UUID', 'PUNCTUATION' or 'MURMUR3'
you must set the key, otherwise the plugin will raise an exception

NOTE: When the `target` option is set to `UUID` the result won't be
a consistent hash but a random
https://en.wikipedia.org/wiki/Universally_unique_identifier[UUID].
@@ -99,8 +96,7 @@ source fields given.
* There is no default value for this setting.

When used with the `IPV4_NETWORK` method fill in the subnet prefix length.
Key is required with all methods except `MURMUR3`, `PUNCTUATION` or `UUID`.
With other methods fill in the HMAC key.
With other methods, optionally fill in the HMAC key.

[id="plugins-{type}s-{plugin}-method"]
===== `method`
@@ -111,10 +107,12 @@ With other methods fill in the HMAC key.

The fingerprint method to use.

If set to `SHA1`, `SHA256`, `SHA384`, `SHA512`, or `MD5` the
cryptographic keyed-hash function with the same name will be used to
generate the fingerprint. If set to `MURMUR3` the non-cryptographic
MurmurHash function will be used.
If set to `SHA1`, `SHA256`, `SHA384`, `SHA512`, or `MD5`, the
cryptographic hash function with the same name will be used to generate
the fingerprint. When a key is set, the keyed-hash (HMAC) digest function
will be used instead.

If set to `MURMUR3` the non-cryptographic MurmurHash function will be used.

If set to `IPV4_NETWORK` the input data needs to be an IPv4 address and
the hash value will be the masked-out address using the number of bits
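The documentation change above distinguishes a plain cryptographic hash from its keyed (HMAC) counterpart. A minimal Ruby sketch of that difference, using only the standard `openssl` library, might look as follows. The helper names `plain_hash` and `keyed_hash` and the sample key are illustrative assumptions, not part of the plugin.

```ruby
require "openssl"

# Hypothetical helpers for illustration; the plugin does not define these.

# Plain (non-keyed) digest: the result depends only on the input data.
def plain_hash(data, algorithm = "SHA256")
  OpenSSL::Digest.new(algorithm).hexdigest(data.to_s)
end

# Keyed digest (HMAC): the result depends on both the key and the input data.
def keyed_hash(data, key, algorithm = "SHA256")
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new(algorithm), key, data.to_s)
end

ip = "123.123.123.123"
puts plain_hash(ip)                 # same value for anyone hashing this input
puts keyed_hash(ip, "example-key")  # changes whenever the key changes
```

Both calls return hex strings of the same length for a given algorithm. Only the keyed variant requires knowledge of the secret to reproduce the value.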
57 changes: 29 additions & 28 deletions lib/logstash/filters/fingerprint.rb
@@ -34,8 +34,7 @@ class LogStash::Filters::Fingerprint < LogStash::Filters::Base
config :target, :validate => :string, :default => 'fingerprint'

# When used with the `IPV4_NETWORK` method fill in the subnet prefix length.
# Key is required with all methods except `MURMUR3`, `PUNCTUATION` or `UUID`.
# With other methods fill in the HMAC key.
# With other methods, optionally fill in the HMAC key.
config :key, :validate => :string

# When set to `true`, the `SHA1`, `SHA256`, `SHA384`, `SHA512` and `MD5` fingerprint methods will produce
@@ -44,10 +43,12 @@ class LogStash::Filters::Fingerprint < LogStash::Filters::Base

# The fingerprint method to use.
#
# If set to `SHA1`, `SHA256`, `SHA384`, `SHA512`, or `MD5` the
# cryptographic keyed-hash function with the same name will be used to
# generate the fingerprint. If set to `MURMUR3` the non-cryptographic
# MurmurHash function will be used.
# If set to `SHA1`, `SHA256`, `SHA384`, `SHA512`, or `MD5`, the
# cryptographic hash function with the same name will be used to generate
# the fingerprint. When a key is set, the keyed-hash (HMAC) digest function
# will be used instead.
#
# If set to `MURMUR3` the non-cryptographic MurmurHash function will be used.
#
# If set to `IPV4_NETWORK` the input data needs to be an IPv4 address and
# the hash value will be the masked-out address using the number of bits
@@ -79,7 +80,7 @@ def register
# convert to symbol for faster comparisons
@method = @method.to_sym

# require any library and set the anonymize function
# require any library and set the fingerprint function
case @method
when :IPV4_NETWORK
if @key.nil?
@@ -90,23 +91,15 @@ def register
:error => "Key value is empty. please fill in a subnet prefix length"
)
end
class << self; alias_method :anonymize, :anonymize_ipv4_network; end
class << self; alias_method :fingerprint, :fingerprint_ipv4_network; end
when :MURMUR3
class << self; alias_method :anonymize, :anonymize_murmur3; end
class << self; alias_method :fingerprint, :fingerprint_murmur3; end
when :UUID
# nothing
when :PUNCTUATION
# nothing
else
if @key.nil?
raise LogStash::ConfigurationError, I18n.t(
"logstash.runner.configuration.invalid_plugin_register",
:plugin => "filter",
:type => "fingerprint",
:error => "Key value is empty. Please fill in an encryption key"
)
end
class << self; alias_method :anonymize, :anonymize_openssl; end
class << self; alias_method :fingerprint, :fingerprint_openssl; end
@digest = select_digest(@method)
end
end
@@ -137,14 +130,14 @@ def filter(event)
end
to_string << "|"
@logger.debug? && @logger.debug("String built", :to_checksum => to_string)
event.set(@target, anonymize(to_string))
event.set(@target, fingerprint(to_string))
else
@source.each do |field|
next unless event.include?(field)
if event.get(field).is_a?(Array)
event.set(@target, event.get(field).collect { |v| anonymize(v) })
event.set(@target, event.get(field).collect { |v| fingerprint(v) })
else
event.set(@target, anonymize(event.get(field)))
event.set(@target, fingerprint(event.get(field)))
end
end
end
@@ -154,22 +147,30 @@ def filter(event)

private

def anonymize_ipv4_network(ip_string)
def fingerprint_ipv4_network(ip_string)
# in JRuby 1.7.11 outputs as US-ASCII
IPAddr.new(ip_string).mask(@key.to_i).to_s.force_encoding(Encoding::UTF_8)
end

def anonymize_openssl(data)
def fingerprint_openssl(data)
# in JRuby 1.7.11 outputs as ASCII-8BIT
if @base64encode
hash = OpenSSL::HMAC.digest(@digest, @key, data.to_s)
Base64.strict_encode64(hash).force_encoding(Encoding::UTF_8)
if @key.nil?
if @base64encode
@digest.base64digest(data.to_s).force_encoding(Encoding::UTF_8)
else
@digest.hexdigest(data.to_s).force_encoding(Encoding::UTF_8)
end
else
OpenSSL::HMAC.hexdigest(@digest, @key, data.to_s).force_encoding(Encoding::UTF_8)
if @base64encode
hash = OpenSSL::HMAC.digest(@digest, @key, data.to_s)
Base64.strict_encode64(hash).force_encoding(Encoding::UTF_8)
else
OpenSSL::HMAC.hexdigest(@digest, @key, data.to_s).force_encoding(Encoding::UTF_8)
end
end
end

def anonymize_murmur3(value)
def fingerprint_murmur3(value)
case value
when Fixnum
MurmurHash3::V32.int_hash(value)
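The reworked `fingerprint_openssl` branches on two settings: whether a key is present and whether `base64encode` is enabled. The sketch below is a compact standalone variant of that four-way choice, not the code from this diff: it computes the raw digest bytes first and then picks the encoding. The function name `sketch_fingerprint` and its keyword arguments are assumptions made for illustration.

```ruby
require "openssl"
require "base64"

# Hypothetical standalone function; names and signature are illustrative only.
def sketch_fingerprint(data, algorithm: "SHA1", key: nil, base64encode: false)
  digest = OpenSSL::Digest.new(algorithm)

  # Step 1: raw bytes, either a plain digest (no key) or an HMAC (key given).
  raw = if key.nil?
          digest.digest(data.to_s)
        else
          OpenSSL::HMAC.digest(digest, key, data.to_s)
        end

  # Step 2: encode the raw bytes as base64 or as lowercase hex.
  base64encode ? Base64.strict_encode64(raw) : raw.unpack("H*").first
end

puts sketch_fingerprint("123.123.123.123")
puts sketch_fingerprint("123.123.123.123", base64encode: true)
puts sketch_fingerprint("123.123.123.123", key: "example-key")
```

Separating the hashing step from the encoding step keeps the keyed and non-keyed paths symmetrical, at the cost of not using the one-call `hexdigest` and `base64digest` helpers that the diff relies on.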
113 changes: 102 additions & 11 deletions spec/filters/fingerprint_spec.rb
@@ -35,7 +35,22 @@
end
end

describe "fingerprint string with SHA1 alogrithm" do
describe "fingerprint string with SHA1 algorithm" do
config <<-CONFIG
filter {
fingerprint {
source => ["clientip"]
method => 'SHA1'
}
}
CONFIG

sample("clientip" => "123.123.123.123") do
insist { subject.get("fingerprint") } == "3a5076c520b4b463f43806896ea0b3978d09dcae"
end
end

describe "fingerprint string with SHA1 HMAC algorithm" do
config <<-CONFIG
filter {
fingerprint {
@@ -51,7 +66,7 @@
end
end

describe "fingerprint string with SHA1 alogrithm on all event fields" do
describe "fingerprint string with SHA1 HMAC algorithm on all event fields" do
config <<-CONFIG
filter {
fingerprint {
@@ -68,7 +83,23 @@
end
end

describe "fingerprint string with SHA1 alogrithm and base64 encoding" do
describe "fingerprint string with SHA1 algorithm and base64 encoding" do
config <<-CONFIG
filter {
fingerprint {
source => ["clientip"]
method => 'SHA1'
base64encode => true
}
}
CONFIG

sample("clientip" => "123.123.123.123") do
insist { subject.get("fingerprint") } == "OlB2xSC0tGP0OAaJbqCzl40J3K4="
end
end

describe "fingerprint string with SHA1 HMAC algorithm and base64 encoding" do
config <<-CONFIG
filter {
fingerprint {
@@ -85,7 +116,22 @@
end
end

describe "fingerprint string with SHA256 alogrithm" do
describe "fingerprint string with SHA256 algorithm" do
config <<-CONFIG
filter {
fingerprint {
source => ["clientip"]
method => 'SHA256'
}
}
CONFIG

sample("clientip" => "123.123.123.123") do
insist { subject.get("fingerprint") } == "4dabcab210766e35f03e77120e6986d6e6d4752b2a9ff22980b9253d026080d8"
end
end

describe "fingerprint string with SHA256 HMAC algorithm" do
config <<-CONFIG
filter {
fingerprint {
@@ -101,7 +147,7 @@
end
end

describe "fingerprint string with SHA256 alogrithm and base64 encoding" do
describe "fingerprint string with SHA256 HMAC algorithm and base64 encoding" do
config <<-CONFIG
filter {
fingerprint {
@@ -118,7 +164,22 @@
end
end

describe "fingerprint string with SHA384 alogrithm" do
describe "fingerprint string with SHA384 algorithm" do
config <<-CONFIG
filter {
fingerprint {
source => ["clientip"]
method => 'SHA384'
}
}
CONFIG

sample("clientip" => "123.123.123.123") do
insist { subject.get("fingerprint") } == "fd605b0a3af3e04ce0d7a0b0d9c48d67a12dab811f60072e6eae84e35d567793ffb68a1807536f11c90874065c2a4392"
end
end

describe "fingerprint string with SHA384 HMAC algorithm" do
config <<-CONFIG
filter {
fingerprint {
@@ -134,7 +195,7 @@
end
end

describe "fingerprint string with SHA384 alogrithm and base64 encoding" do
describe "fingerprint string with SHA384 HMAC algorithm and base64 encoding" do
config <<-CONFIG
filter {
fingerprint {
@@ -151,7 +212,22 @@
end
end

describe "fingerprint string with SHA512 alogrithm" do
describe "fingerprint string with SHA512 algorithm" do
config <<-CONFIG
filter {
fingerprint {
source => ["clientip"]
method => 'SHA512'
}
}
CONFIG

sample("clientip" => "123.123.123.123") do
insist { subject.get("fingerprint") } == "5468e2dc64ea92b617782aae884b35af60041ac9e168a283615b6a462c54c13d42fa9542cce9b7d76a8124ac6616818905e3e5dd35d6e519f77c3b517558639a"
end
end

describe "fingerprint string with SHA512 HMAC algorithm" do
config <<-CONFIG
filter {
fingerprint {
@@ -167,7 +243,7 @@
end
end

describe "fingerprint string with SHA512 alogrithm and base64 encoding" do
describe "fingerprint string with SHA512 HMAC algorithm and base64 encoding" do
config <<-CONFIG
filter {
fingerprint {
@@ -184,7 +260,22 @@
end
end

describe "fingerprint string with MD5 alogrithm" do
describe "fingerprint string with MD5 algorithm" do
config <<-CONFIG
filter {
fingerprint {
source => ["clientip"]
method => 'MD5'
}
}
CONFIG

sample("clientip" => "123.123.123.123") do
insist { subject.get("fingerprint") } == "ccdd8d3d940a01b2fb3258c059924c0d"
end
end

describe "fingerprint string with MD5 HMAC algorithm" do
config <<-CONFIG
filter {
fingerprint {
@@ -200,7 +291,7 @@
end
end

describe "fingerprint string with MD5 alogrithm and base64 encoding" do
describe "fingerprint string with MD5 HMAC algorithm and base64 encoding" do
config <<-CONFIG
filter {
fingerprint {
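The non-keyed SHA1 specs above expect one hex value and one base64 value for the same input. Since both are encodings of the same digest bytes, they can be cross-checked against each other without recomputing the hash. A small sketch of that check follows; the constant and helper names are illustrative, while the two literal values are copied from the specs in this diff.

```ruby
require "base64"

# Expected values copied from the non-keyed SHA1 specs in this diff.
HEX_FINGERPRINT    = "3a5076c520b4b463f43806896ea0b3978d09dcae"
BASE64_FINGERPRINT = "OlB2xSC0tGP0OAaJbqCzl40J3K4="

# Hypothetical helper: decode base64 to raw bytes, then render them as hex.
def base64_to_hex(encoded)
  Base64.strict_decode64(encoded).unpack("H*").first
end

puts base64_to_hex(BASE64_FINGERPRINT) == HEX_FINGERPRINT # prints "true"
```

A check like this catches a spec where the hex and base64 expectations were generated from different inputs or keys.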