
elasticsearch output - failed with response of 400 #144

Closed
jordansissel opened this issue May 18, 2015 · 59 comments

@jordansissel
Contributor

(This issue was originally filed by @MarkusMayer at elastic/logstash#3131)


Hi,
I am using 1.5rc3 on Windows. I tried to specify a variable index-naming scheme in my elasticsearch output like this in my output section:

    elasticsearch { 
        host => localhost 
        cluster => "eVtgLogs"
        embedded => false
        index => "logstash-%{environment}-%{+YYYY.MM.dd}"
    } 

When using this config my events won't ship to ES and I get the following error in my logstash output:

"failed action with response of 400, dropping action" (full message is below)

Any idea what I've misconfigured?

Thanks in advance,
Markus

Data: https://gist.githubusercontent.com/jordansissel/9d06c66d27f4d09c72d5/raw/36f6d8bda7ee5475c66914b0469140fe9b95361e/data.log

@blakeja

blakeja commented May 18, 2015

I'm seeing the same problem running Logstash 1.5.0 on Windows 2012 R2.

@naisanza

I'm running Logstash 1.5.0 on Ubuntu server 14.10. I noticed this when I had uppercase letters in my index name, but --configtest passed it as OK.

output {
  if [type] == "TestData" {
    elasticsearch {
      index => "TestData"
    }
  }
}

@elvarb

elvarb commented May 22, 2015

I'm seeing the same: an extremely long log line for a single Windows event ID, sent with nxlog from a single server. This is using Logstash 1.5.0 on Windows.

For some reason the log keeps repeating the message again and again. Is it possible that for some reason it doesn't send the event right away but buffers it, and then the buffer fills up? That single log line is about 45023 characters long.

There are also other log lines since I upgraded to 1.5, with the same event ID from the same server, that are about 22510 characters long. It's also possible this bug crashed the indexing process for me yesterday, but I'm not sure since I was not around.

15:14 Service started
15:19 A probably unrelated JSON parsing error; what is strange about that one is that in the log file it takes up two lines and that it's from the Logstash server itself.
15:28 "failed action with response of 400, dropping action" 45023 characters long.
16:00 Reported last event processed through this Logstash Process.
17:13 "failed action with response of 400, dropping action" 22511 characters long.

The last event is still coming through every now and then.

@naisanza

@elvarb do you have a copy of your config file?

@elvarb

elvarb commented May 26, 2015

@naisanza here it is, with all its odds and ends, details redacted.

The event causing this is, I think, from our only Windows 2008 32-bit server, and the event ID is 4616. nxlog is used as a shipper and converts the messages to JSON before shipping to Logstash.

input {
  rabbitmq {
    exchange => "logstash" 
    key => "logs.winserv"
    queue => "logs.winserv"
    host => "rabbitmq-server"
    auto_delete => false
    exclusive => false
    durable => true
  }
}

filter {
  if "winserv" in [tags] {
    json {
      source => "message"
      target => "winevent"
    }
    mutate {
      remove_field => [ "message" ]
    }
    date {
      match => ["[winevent][EventTime]", "YYYY-MM-dd HH:mm:ss"]
    }

    ## LogonType
    if [winevent][LogonType] {
      translate {
        field => "[winevent][LogonType]"
        destination => "[winevent][LogonTypeDesc]"
        dictionary_path => "winevent_logontype.txt"
      }
    }

    ## Add sourcegroup field to group similar sourcenames together
    if [winevent][SourceName] {
      translate {
        field => "[winevent][SourceName]"
        destination => "[winevent][SourceGroup]"
        dictionary_path => "winevent_sourcegroups.txt"
        add_tag => [ "%{[winevent][SourceGroup]}" ]
      }
    }

    ## Add servicegroup field to group similar services together
    if [winevent][SourceName] == "Service Control Manager" {
      translate {
        field => "[winevent][param1]"
        destination => "[winevent][ServiceGroup]"
        dictionary_path => "winevent_servicegroups.txt"
        add_tag => [ "%{[winevent][ServiceGroup]}" ]
      }
    }

    ## Add firewallgroup field to group similar firewall rules together
    if [winevent][Category] == "MPSSVC Rule-Level Policy Change" {
      translate {
        field => "[winevent][RuleName]"
        destination => "[winevent][FirewallGroup]"
        dictionary_path => "winevent_firewallgroups.txt"
        add_tag => [ "%{[winevent][FirewallGroup]}" ]
      }
    }

    ## Add filtergroup field to group similar filters together
    if [winevent][Category] == "Other Policy Change Events" {
      translate {
        field => "[winevent][FilterName]"
        destination => "[winevent][FilterGroup]"
        dictionary_path => "winevent_filtergroups.txt"
        add_tag => [ "%{[winevent][FilterGroup]}" ]
      }
    }

    ## Parse InternalApp JSON
    if [winevent][SourceGroup] == "InternalApp" {
      if [winevent][Message] =~ "^{" {
        range {
          ranges => [
            "[winevent][Message]", 0, 30000, "tag:InternalAppNormal",
            "[winevent][Message]", 30001, 1000000, "tag:InternalAppLong"
          ]
          #ranges => [ "%{[winevent][Message]}", 31839, 1e1000, "tag:InternalAppLong" ]
        }
        if "InternalAppNormal" in [tags] {
          json {
            source => "[winevent][Message]"
            target => "InternalApp"
          }
        }
      } else {
        mutate {
          add_tag => "InternalAppBrokenJSON"
          remove_tag => "InternalApp"
        }
      }
    }

    ## Throttle
    ## For InternalApp Applications
    if [InternalApp][Level] == "Error" and [InternalApp][Environment] == "PROD" {
      throttle {
        period => 60
        before_count => 2
        after_count => -1

        # Events with the same key will be throttled as a group. Field substitutions are allowed, so you can combine multiple fields.
        # For InternalApp events use maybe system,subsystem and logger
        #key => "%{host}%{message}"
        key => "%{[InternalApp][System]}"
        #max_counters => ... # number (optional), default: 100000
        #remove_field => ... # array (optional), default: []
        #remove_tag => ... # array (optional), default: []
        #add_field => ... # hash (optional), default: {}
        add_tag => "InternalAppalert"
      }
    }
  }
}

output {
  if "winserv" in [tags] {
    elasticsearch {
      protocol => "http"
      host => "logstash-server"
    }
    if "AV" in [tags] {
      elasticsearch {
        protocol => "http"
        host => "logstash-server"
        index => "log365-%{+YYYY.MM.dd}"
      }
    }
    if [InternalApp][Environment] == "DEV" or [InternalApp][Environment] == "TEST" {
      elasticsearch {
        protocol => "http"
        host => "elasticsearch-test-server"
      }
    }


    if "InternalAppalert" in [tags] {
      email {
        from => "elk@domain.com"
        to => "error@domain.com"
        subject => "%{[InternalApp][System]} - %{[InternalApp][Level]} in %{[InternalApp][SubSystem]}"
        via => "smtp"
        options => {
          smtpIporHost => "smtp-server"
          port => 25
        }
        htmlbody => "
          <body  offset='0' class='body' style='padding:0; margin:0; display:block; background:#ECEFF1; font-family: Arial, sans-serif; -webkit-text-size-adjust:none' bgcolor='#ECEFF1'>
              <center>
                <table cellspacing='0' cellpadding='0' style='background:#607D8B;' bgcolor='#607D8B' width='95%'>
                  <tr>
                    <td style='font-size:24px; font-weight: 600; font-family: Arial, sans-serif; color: #FFFFFF; text-align:center;' >
                      &nbsp;<br>
                      InternalApp Application Log<br>
                    </td>
                  </tr>
                  <tr>
                    <td style='font-size:15px; font-family: Arial, sans-serif; text-align:center; padding: 0 75px; color:#ECEFF1;'>
                     Throttle triggered, please view Kibana for more details<br>
                      &nbsp;
                    </td>
                  </tr>
                </table>
              </center>
              <center>
                <table cellspacing='0' cellpadding='0' width='95%' style='font-family: Arial, sans-serif;font-size:12px;'>
                  <tr style='background:#CFD8DC'>
                    <td style='border-bottom:1px solid #e3e3e3; font-weight: bold; text-align: left;'>
                      Field
                    </td>
                    <td style='border-bottom:1px solid #e3e3e3; font-weight: bold;'>
                      Value
                    </td>
                  </tr>
                  <tr>
                    <td style='text-align: left;'>
                      EventTime
                    </td>
                    <td >
                     %{[winevent][EventTime]}
                    </td>
                  </tr>
                  <tr>
                    <td style='text-align: left;'>
                      Hostname
                    </td>
                    <td >
                     %{[winevent][Hostname]}
                    </td>
                  </tr>
                  <tr>
                    <td style='text-align: left;'>
                      Environment
                    </td>
                    <td >
                     %{[InternalApp][Environment]}
                    </td>
                  </tr>
                  <tr>
                    <td style='text-align: left;'>
                      Level
                    </td>
                    <td >
                     %{[InternalApp][Level]}
                    </td>
                  </tr>
                  <tr>
                    <td style='text-align: left;'>
                      System
                    </td>
                    <td >
                     %{[InternalApp][System]}
                    </td>
                  </tr>
                  <tr>
                    <td style='text-align: left;'>
                      SubSystem
                    </td>
                    <td >
                     %{[InternalApp][SubSystem]}
                    </td>
                  </tr>
                  <tr>
                    <td style='text-align: left;'>
                      Logger
                    </td>
                    <td >
                     %{[InternalApp][Logger]}
                    </td>
                  </tr>
                </table>
              </center>

              <center>
                <table style='margin: 0 auto; font-family: Arial, sans-serif; font-size:12px; cellspacing='0' cellpadding='0' width='95%'>
                  <tr style='background:#CFD8DC'>
                    <td style='border-bottom:1px solid #e3e3e3; font-weight: bold; text-align: left;'>
                      Message
                    </td>
                  </tr>
                  <tr>
                    <td>
                      %{[InternalApp][Message]}
                    </td>
                  </tr>
                </table>
              </center>
              &nbsp;<br>
              <center>
              <table style='margin:0 auto; font-family: Arial, sans-serif;' cellspacing='0' cellpadding='10' width='20%'>
                <tr>
                  <td style='text-align:center; margin:0 auto;background-color:#f5774e;color:#ffffff;display:inline-block;font-family: Arial, sans-serif;;font-size:18px;font-weight:400;line-height:25px;text-align:center;text-decoration:none;width:180px;-webkit-text-size-adjust:none;'>
                    <a href='http://kibana.domain.com/#/dashboard/elasticsearch/domain%20%{[InternalApp][System]}' style='background-color:#f5774e;color:#ffffff;display:inline-block;font-family: Arial, sans-serif;;font-size:18px;font-weight:400;line-height:25px;text-align:center;text-decoration:none;width:180px;-webkit-text-size-adjust:none;'>
                      Go to Kibana
                    </a>
                  </td>
                </tr>
              </table>
              </center>
          </body>"
      }
    }
  }
}

@avcherkasov

I have logstash 1.5.0 on debian.
I have the same error in my cluster: failed action with response of 400, dropping action
My config looks like:

input {

  heartbeat {
    interval => 300
    type => "heartbeat"
  }

  udp {
    port => 60606
    type => syslog
  }

  tcp {
    port => 10514
    type => syslog
  }

  tcp {
    port => 1514
    type => syslog
  }

  gelf {
    codec => "plain"
    host => "0.0.0.0"
    port => 12201
    type => "gelf"
  }

  udp {
    codec => "json"
    host => "0.0.0.0"
    port => 12211
    type => "erlang"
  }

}

filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "ISO8601", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
      add_tag => "Syslog Timestamped"
      locale => "en"

    }
  }
}


output {

  elasticsearch {
    cluster => "elasticsearch"
  }

}

I'm getting this error in the output for the gelf input.

@elvarb

elvarb commented May 27, 2015

Same problem happened to me last night.

Getting "failed action with response of 400" errors all the time, once every hour or so.

Also getting "Got error to send bulk of actions to elasticsearch server at servername : Read timed out"
Followed by "Failed to flush outgoing items", :outgoing_count=>7, :exception=>#<Manticore::Timeout: Read timed out>, :backtrace=>"
Though that is not as common.

But last night the Logstash process stopped again with the same pattern.
The normal 400-error log lines are about 22000 characters long, but when a 45020-character line comes through, the process stops.

Edit:

Seems that the 45k lines are actually two events of the same type on the same line. The first half is from when the process stopped and the second half from when the service was restarted. Strange that when the service is started again it doesn't start on a new line here.

@elvarb

elvarb commented May 27, 2015

After analyzing the log file more closely, it seems that somewhere a loop is not being closed in the right place. There are nested objects all over.

The line contains the following objects

@cancelled
@data
@metadata
@accessors
@store
@lut

The lut one is special because each object there contains the whole data object instead of the actual item.

It starts in the tags object with the whole @DaTa object (I will just use @DaTa to refer to its contents). Then each object within @DaTa is listed and references the whole @DaTa object.

@lut={\"
        [tags]\"=> 
            @data
            \"message\"=> @data
            \"winevent\"=> @data
            \"[winevent][EventTime]\"=> @data

What is strange is that within the logs the data inside tags comes up like this

\"tags\"=>#<Java::JavaUtil::ArrayList:0x22150fe7>, 

Is this normal?

@elvarb

elvarb commented May 27, 2015

I added an output file filter to write the raw event to a file before it gets shipped to Elasticsearch.
It's possible that this is caused by a mapping error, but I can't find any mapping errors in Elasticsearch.
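For reference, that debug output is just a plain file output placed alongside the elasticsearch output, roughly like this (a sketch; the path here is made up):

output {
  file {
    # write the raw event to disk so it can be compared with what ES rejects
    path => "/var/log/logstash/raw-events-%{+YYYY.MM.dd}.log"
  }
}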

For the one that keeps giving me an error, this field is added:
"NewTime": "13:01:11",

For the one that works, this field is added:
"NewTime": "2015-05-27T12:46:12.798000000Z",

The mapping in Elasticsearch is:
"NewTime" : {
  "format" : "dateOptionalTime",
  "type" : "date"
},

This is the only difference worth noting between them. There is only one tag defined for both and it is exactly the same.
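A possible workaround (just a sketch, I have not verified it; the tag name is made up and I'm assuming the field sits under [winevent]) would be to rename the value whenever NewTime is only a time of day, so it never hits the date-mapped field:

filter {
  # "13:01:11" style values cannot be parsed by the dateOptionalTime mapping,
  # so move them aside and tag the event instead of letting ES reject it
  if [winevent][NewTime] =~ /^\d{2}:\d{2}:\d{2}$/ {
    mutate {
      rename => { "[winevent][NewTime]" => "[winevent][NewTimeRaw]" }
      add_tag => [ "newtime_not_a_date" ]
    }
  }
}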

@avcherkasov

This error is spamming my log every second! +1 GB of file size every hour!
Please give some advice: how can I turn off detailed logging?

@jsvd
Member

jsvd commented May 29, 2015

@avcherkasov can you provide the versions of logstash and elasticsearch you're using, the logstash configuration and one of those errors?

@avcherkasov

I have this cluster: http://prntscr.com/7aouan
There you can see my versions of the programs.
My config file looks like this: http://pastebin.com/ihK2CXqz
these errors starts like: {:timestamp=>"2015-05-29T12:55:14.145000+0300", :message=>"failed action with response of 400, dropping action: ["index", {:_id=>nil, :_index=>"logstash-2015.05.29", :_type=>"gelf", :_routing=>nil}, #<LogStash::Event:0x64c1fbd6 @metadata_accessors=#<LogStash::Util::Accessors:0x44a78bd @store={"retry_count"=>0}, @lut={}>, @cancelled=false, @DaTa=
After that comes my data, which is confidential, sorry ;)

@jsvd
Member

jsvd commented May 29, 2015

are you able to reduce the number of inputs and determine which one is causing the 400 errors? for example keeping only the tcp { } inputs

@avcherkasov

I know that 98% of messages come in through

gelf {
  codec => "plain"
  host => "0.0.0.0"
  port => 12201
  type => "gelf"
}

and this error appeared from them.
I'm not certain that I'm sending the correct format to this port. Maybe you have an example of a valid gelf message?

@MarkusMayer

hi,
in my case the reason was straightforward. apparently elasticsearch index names must not contain uppercase letters, among other things (as outlined here for example: elastic/elasticsearch#6736)

when I fixed my index name pattern it worked fine. a somewhat nicer error message would be helpful for non-ES pros, however.
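for reference, the relevant change is just lowercasing the field that feeds the index name before the output, roughly like this (a sketch; "environment" is the field from my original config):

filter {
  mutate {
    # index names must be lowercase, so normalize the field used in the index pattern
    lowercase => [ "environment" ]
  }
}
output {
  elasticsearch {
    index => "logstash-%{environment}-%{+YYYY.MM.dd}"
  }
}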

thanks,
markus

@avcherkasov

@MarkusMayer thanks for the answer.
My index pattern is named: logstash-*
So do I need to rename it? Because:
These characters are already illegal: , /, *, ?, ", <, >, |, , ,. We should also add the null byte.
How can I do that?

@elvarb

elvarb commented May 29, 2015

@avcherkasov That index pattern is OK; it just means that the index name is prefixed with "logstash-" and followed by a datestamp.

@elvarb

elvarb commented May 29, 2015

I modified my config to drop the specific message causing the 400 error; that makes no difference to me since the message never reached the elasticsearch cluster anyway.

What I did notice is that when this error comes up and I kill the service, the java.exe process does not die with it. If there have been no 400 errors since the java.exe process started, the process dies normally. This means that it does not release the log file, among other things.

@naisanza

I would have said the same thing: a better error message. I ran into this same problem. However, it is documented directly in Logstash; you would have had to read the documentation to know how to use the index option.

Index

Value type is string
Default value is "logstash-%{+YYYY.MM.dd}"

The index to write events to. This can be dynamic using the %{foo} syntax. The default value will partition your indices by day so you can more easily delete old data or only search specific date ranges. Indexes may not contain uppercase characters.


@PavelPolyakov

Was there any update here?

I've updated ES to 1.5.2 and logstash to 1.5.0 and started to get this error.
And I don't know where to go :)

My logstash config looks like this:

        elasticsearch {
                host => "xxx"
                protocol => "http"
                user => "xxx"
                password => "xxx"
                port => 9200
                flush_size => 5000
                workers => 4
                manage_template => false
                index => "logstash-%{+YYYY.MM.dd}"
        }

upd 1
As soon as I renamed the index pattern to

            index => "logstash2-%{+YYYY.MM.dd}"

It started to work better, but I still receive the same message from time to time, which means that a lot of the messages are not hitting ES.

I assume we had an issue because an index named logstash-2015.06.08 was already created, and logstash (ES) doesn't want to write there, or something.

upd 2
Here is my error, complete:

{:timestamp=>"2015-06-08T23:37:34.955000+0200", :message=>"failed action with response of 400, dropping action: ["index", {:_id=>nil, :_index=>"logstash-2015.06.08", :_type=>"yyy", :_routing=>nil}, #<LogStash::Event:0x36be02ca @metadata={"retry_count"=>0}, @accessors=#<LogStash::Util::Accessors:0x1191c365 @store={"message"=>"iii", "@Version"=>"1", "@timestamp"=>"2015-06-08T21:37:31.836Z", "host"=>"www", "type"=>"yyy", "client"=>"qqq", "country"=>"es", "level"=>"INFO", "data"=>{"userUuid"=>"zzz", "appUuid"=>"ppp", "tags"=>#Java::JavaUtil::ArrayList:0x6e500547, "sessionId"=>"63784859", "login"=>"xxx", "result"=>{"passwordMetadata"=>{"name"=>"ooo", "mask"=>false}}}}, @lut={"[message]"=>[{"message"=>"iii", "@Version"=>"1", "@timestamp"=>"2015-06-08T21:37:31.836Z", "host"=>"www", "type"=>"yyy", "client"=>"qqq", "country"=>"es", "level"=>"INFO", "data"=>{"userUuid"=>"zzz", "appUuid"=>"ppp", "tags"=>#Java::JavaUtil::ArrayList:0x6e500547, "sessionId"=>"63784859", "login"=>"xxx", "result"=>{"passwordMetadata"=>{"name"=>"ooo", "mask"=>false}}}}, "message"], "[type]"=>[{"message"=>"iii", "@Version"=>"1", "@timestamp"=>"2015-06-08T21:37:31.836Z", "host"=>"www", "type"=>"yyy", "client"=>"qqq", "country"=>"es", "level"=>"INFO", "data"=>{"userUuid"=>"zzz", "appUuid"=>"ppp", "tags"=>#Java::JavaUtil::ArrayList:0x6e500547, "sessionId"=>"63784859", "login"=>"xxx", "result"=>{"passwordMetadata"=>{"name"=>"ooo", "mask"=>false}}}}, "type"], "message"=>[{"message"=>"iii", "@Version"=>"1", "@timestamp"=>"2015-06-08T21:37:31.836Z", "host"=>"www", "type"=>"yyy", "client"=>"qqq", "country"=>"es", "level"=>"INFO", "data"=>{"userUuid"=>"zzz", "appUuid"=>"ppp", "tags"=>#Java::JavaUtil::ArrayList:0x6e500547, "sessionId"=>"63784859", "login"=>"xxx", "result"=>{"passwordMetadata"=>{"name"=>"ooo", "mask"=>false}}}}, "message"], "[tags]"=>[{"message"=>"iii", "@Version"=>"1", "@timestamp"=>"2015-06-08T21:37:31.836Z", "host"=>"www", "type"=>"yyy", "client"=>"qqq", "country"=>"es", "level"=>"INFO", "data"=>{"userUuid"=>"zzz", "appUuid"=>"ppp", "tags"=>#Java::JavaUtil::ArrayList:0x6e500547, "sessionId"=>"63784859", "login"=>"xxx", "result"=>{"passwordMetadata"=>{"name"=>"ooo", "mask"=>false}}}}, "tags"], "type"=>[{"message"=>"iii", "@Version"=>"1", "@timestamp"=>"2015-06-08T21:37:31.836Z", "host"=>"www", "type"=>"yyy", "client"=>"qqq", "country"=>"es", "level"=>"INFO", "data"=>{"userUuid"=>"zzz", "appUuid"=>"ppp", "tags"=>#Java::JavaUtil::ArrayList:0x6e500547, "sessionId"=>"63784859", "login"=>"xxx", "result"=>{"passwordMetadata"=>{"name"=>"ooo", "mask"=>false}}}}, "type"]}>, @DaTa={"message"=>"iii", "@Version"=>"1", "@timestamp"=>"2015-06-08T21:37:31.836Z", "host"=>"www", "type"=>"yyy", "client"=>"qqq", "country"=>"es", "level"=>"INFO", "data"=>{"userUuid"=>"zzz", "appUuid"=>"ppp", "tags"=>#Java::JavaUtil::ArrayList:0x6e500547, "sessionId"=>"63784859", "login"=>"xxx", "result"=>{"passwordMetadata"=>{"name"=>"ooo", "mask"=>false}}}}, @metadata_accessors=#<LogStash::Util::Accessors:0x777eb8cf @store={"retry_count"=>0}, @lut={}>, @cancelled=false>]", :level=>:warn}

Any advice on how to overcome these issues?

@T0aD

T0aD commented Jun 9, 2015

Hi,
We have a similar issue while writing to ES and, of course, no explicit error message to tell us what's going on. I tried to elevate the error priority level to DEBUG and check the ES logs for errors: nothing. Any hint? Thanks!

It's quite frustrating, since we thought it would be easy to convert a kafka river to a logstash kafka-input to ES-output.. :)

@popsikle

popsikle commented Jun 9, 2015

This was happening to me as well. I was setting the tags field to an empty array instead of none and that seems to kill the transport. Started dropping tags if empty and stuff started working again.

@PavelPolyakov

@popsikle
You mean it's not possible to pass empty arrays to ES?
I had one issue with an array, but did not understand the logic of it.

@popsikle

popsikle commented Jun 9, 2015

Well, the empty array only fixed a handful of my messages; it turns out I still have a problem.

{:timestamp=>"2015-06-09T14:35:56.928000+0000", :message=>"failed action with response of 400, dropping action: [\"index\", {:_id=>nil, :_index=>\"logstash-2015.06.09\", :_type=>\"logstash\", :_routing=>nil}, #<LogStash::Event:0x655f1d2a @metadata={\"retry_count\"=>0}, @accessors=#<LogStash::Util::Accessors:0x99785e2 @store={\"environment\"=>\"production-hawk\", \"tags\"=>#<Java::JavaUtil::ArrayList:0x68293c3c>, \"@version\"=>1, \"@timestamp\"=>\"2015-06-09T14:35:55.538Z\", \"appname\"=>\"crick\", \"celery\"=>{\"chord\"=>nil, \"retries\"=>0, \"args\"=>#<Java::JavaUtil::ArrayList:0x63c8b558>, \"is_eager\"=>false, \"correlation_id\"=>\"f895188b-2fc3-45f3-b783-1d6fa42a58b8\", \"errbacks\"=>nil, \"taskset\"=>nil, \"id\"=>\"f895188b-2fc3-45f3-b783-1d6fa42a58b8\", \"headers\"=>{}, \"called_directly\"=>false, \"utc\"=>true, \"task\"=>\"crick.tasks.queue.rate_limited_crawl\", \"group\"=>nil, \"callbacks\"=>nil, \"delivery_info\"=>{\"priority\"=>nil, \"redelivered\"=>false, \"routing_key\"=>\"crawler\", \"exchange\"=>\"crawler\"}, \"hostname\"=>\"celery@dda45c212157.crawler\", \"expires\"=>nil, \"timelimit\"=>#<Java::JavaUtil::ArrayList:0x708f60ea>, \"eta\"=>nil, \"kwargs\"=>{\"source_id\"=>13567, \"client_name\"=>\"outland_queue_all_include_rate_limited\", \"namespace\"=>\"ol_vp\", \"enqueue_time\"=>1433830056.703436, \"client_id\"=>dsfsdfwww}, \"reply_to\"=>\"bb116919-cf36-30d4-9ad5-750a868d7e45\", \"_protected\"=>1}, \"host\"=>\"dda45c212157\", \"path\"=>\"/opt/crick/releases/f5d3814775b59b3a317108d30f6e674a4223ccfb/crick/tasks/queue.py\", \"logger\"=>\"crick.tasks.queue\", \"type\"=>\"logstash\", \"message\"=>\"Starting crawl of url http://www.xxxx.com/ccc-ffff (client_id: ffff, priority: low_priority, job: f895188b-2fc3-45f3-b783-1d6fa42a58b8, client_name: outland_queue_all_include_rate_limited)\", \"levelname\"=>\"INFO\"}, @lut={\"host\"=>[{\"environment\"=>\"production-hawk\", \"tags\"=>#<Java::JavaUtil::ArrayList:0x68293c3c>, \"@version\"=>1, \"@timestamp\"=>\"2015-06-09T14:35:55.538Z\", \"appname\"=>\"crick\", \"celery\"=>{\"chord\"=>nil, \"retries\"=>0, \"args\"=>#<Java::JavaUtil::ArrayList:0x63c8b558>, \"is_eager\"=>false, \"correlation_id\"=>\"f895188b-2fc3-45f3-b783-1d6fa42a58b8\", \"errbacks\"=>nil, \"taskset\"=>nil, \"id\"=>\"f895188b-2fc3-45f3-b783-1d6fa42a58b8\", \"headers\"=>{}, \"called_directly\"=>false, \"utc\"=>true, \"task\"=>\"crick.tasks.queue.rate_limited_crawl\", \"group\"=>nil, \"callbacks\"=>nil, \"delivery_info\"=>{\"priority\"=>nil, \"redelivered\"=>false, \"routing_key\"=>\"crawler\", \"exchange\"=>\"crawler\"}, \"hostname\"=>\"celery@dda45c212157.crawler\", \"expires\"=>nil, \"timelimit\"=>#<Java::JavaUtil::ArrayList:0x708f60ea>, \"eta\"=>nil, \"kwargs\"=>{\"source_id\"=>13567, \"client_name\"=>\"outland_queue_all_include_rate_limited\", \"namespace\"=>\"ol_vp\", \"enqueue_time\"=>1433830056.703436, \"client_id\"=>sdsdsdee}, \"reply_to\"=>\"bb116919-cf36-30d4-9ad5-750a868d7e45\", \"_protected\"=>1}, \"host\"=>\"dda45c212157\", \"path\"=>\"/opt/crick/releases/f5d3814775b59b3a317108d30f6e674a4223ccfb/crick/tasks/queue.py\", \"logger\"=>\"crick.tasks.queue\", \"type\"=>\"logstash\", \"message\"=>\"Starting crawl of url http://www.zazzle.com/its_cold_outside-235404158988585988 (client_id: sssss, priority: low_priority, job: f895188b-2fc3-45f3-b783-1d6fa42a58b8, client_name: outland_queue_all_include_rate_limited)\", \"levelname\"=>\"INFO\"}, \"host\"], \"[type]\"=>[{\"environment\"=>\"production-hawk\", \"tags\"=>#<Java::JavaUtil::ArrayList:0x68293c3c>, \"@version\"=>1, 
\"@timestamp\"=>\"2015-06-09T14:35:55.538Z\", \"appname\"=>\"crick\", \"celery\"=>{\"chord\"=>nil, \"retries\"=>0, \"args\"=>#<Java::JavaUtil::ArrayList:0x63c8b558>, \"is_eager\"=>false, \"correlation_id\"=>\"f895188b-2fc3-45f3-b783-1d6fa42a58b8\", \"errbacks\"=>nil, \"taskset\"=>nil, \"id\"=>\"f895188b-2fc3-45f3-b783-1d6fa42a58b8\", \"headers\"=>{}, \"called_directly\"=>false, \"utc\"=>true, \"task\"=>\"crick.tasks.queue.rate_limited_crawl\", \"group\"=>nil, \"callbacks\"=>nil, \"delivery_info\"=>{\"priority\"=>nil, \"redelivered\"=>false, \"routing_key\"=>\"crawler\", \"exchange\"=>\"crawler\"}, \"hostname\"=>\"celery@dda45c212157.crawler\", \"expires\"=>nil, \"timelimit\"=>#<Java::JavaUtil::ArrayList:0x708f60ea>, \"eta\"=>nil, \"kwargs\"=>{\"source_id\"=>13567, \"client_name\"=>\"outland_queue_all_include_rate_limited\", \"namespace\"=>\"ol_vp\", \"enqueue_time\"=>1433830056.703436, \"client_id\"=>779350233124440647}, \"reply_to\"=>\"bb116919-cf36-30d4-9ad5-750a868d7e45\", \"_protected\"=>1}, \"host\"=>\"dda45c212157\", \"path\"=>\"/opt/crick/releases/f5d3814775b59b3a317108d30f6e674a4223ccfb/crick/tasks/queue.py\", \"logger\"=>\"crick.tasks.queue\", \"type\"=>\"logstash\", \"message\"=>\"Starting crawl of url http://www.yyyy.com/cccc-xxxx (client_id: ccc, priority: low_priority, job: f895188b-2fc3-45f3-b783-1d6fa42a58b8, client_name: outland_queue_all_include_rate_limited)\", \"levelname\"=>\"INFO\"}, \"type\"], \"type\"=>[{\"environment\"=>\"production-hawk\", \"tags\"=>#<Java::JavaUtil::ArrayList:0x68293c3c>, \"@version\"=>1, \"@timestamp\"=>\"2015-06-09T14:35:55.538Z\", \"appname\"=>\"crick\", \"celery\"=>{\"chord\"=>nil, \"retries\"=>0, \"args\"=>#<Java::JavaUtil::ArrayList:0x63c8b558>, \"is_eager\"=>false, \"correlation_id\"=>\"f895188b-2fc3-45f3-b783-1d6fa42a58b8\", \"errbacks\"=>nil, \"taskset\"=>nil, \"id\"=>\"f895188b-2fc3-45f3-b783-1d6fa42a58b8\", \"headers\"=>{}, \"called_directly\"=>false, \"utc\"=>true, \"task\"=>\"crick.tasks.queue.rate_limited_crawl\", \"group\"=>nil, \"callbacks\"=>nil, \"delivery_info\"=>{\"priority\"=>nil, \"redelivered\"=>false, \"routing_key\"=>\"crawler\", \"exchange\"=>\"crawler\"}, \"hostname\"=>\"celery@dda45c212157.crawler\", \"expires\"=>nil, \"timelimit\"=>#<Java::JavaUtil::ArrayList:0x708f60ea>, \"eta\"=>nil, \"kwargs\"=>{\"source_id\"=>13567, \"client_name\"=>\"outland_queue_all_include_rate_limited\", \"namespace\"=>\"ol_vp\", \"enqueue_time\"=>1433830056.703436, \"client_id\"=>rrr}, \"reply_to\"=>\"bb116919-cf36-30d4-9ad5-750a868d7e45\", \"_protected\"=>1}, \"host\"=>\"dda45c212157\", \"path\"=>\"/opt/crick/releases/f5d3814775b59b3a317108d30f6e674a4223ccfb/crick/tasks/queue.py\", \"logger\"=>\"crick.tasks.queue\", \"type\"=>\"logstash\", \"message\"=>\"Starting crawl of url http://www.ccc.com/ddd-rrrrr (client_id: xssss, priority: low_priority, job: f895188b-2fc3-45f3-b783-1d6fa42a58b8, client_name: outland_queue_all_include_rate_limited)\", \"levelname\"=>\"INFO\"}, \"type\"]}>, @data={\"environment\"=>\"production-hawk\", \"tags\"=>#<Java::JavaUtil::ArrayList:0x68293c3c>, \"@version\"=>1, \"@timestamp\"=>\"2015-06-09T14:35:55.538Z\", \"appname\"=>\"crick\", \"celery\"=>{\"chord\"=>nil, \"retries\"=>0, \"args\"=>#<Java::JavaUtil::ArrayList:0x63c8b558>, \"is_eager\"=>false, \"correlation_id\"=>\"f895188b-2fc3-45f3-b783-1d6fa42a58b8\", \"errbacks\"=>nil, \"taskset\"=>nil, \"id\"=>\"f895188b-2fc3-45f3-b783-1d6fa42a58b8\", \"headers\"=>{}, \"called_directly\"=>false, \"utc\"=>true, \"task\"=>\"crick.tasks.queue.rate_limited_crawl\", 
\"group\"=>nil, \"callbacks\"=>nil, \"delivery_info\"=>{\"priority\"=>nil, \"redelivered\"=>false, \"routing_key\"=>\"crawler\", \"exchange\"=>\"crawler\"}, \"hostname\"=>\"celery@dda45c212157.crawler\", \"expires\"=>nil, \"timelimit\"=>#<Java::JavaUtil::ArrayList:0x708f60ea>, \"eta\"=>nil, \"kwargs\"=>{\"source_id\"=>13567, \"client_name\"=>\"outland_queue_all_include_rate_limited\", \"namespace\"=>\"ol_vp\", \"enqueue_time\"=>1433830056.703436, \"client_id\"=>rrr}, \"reply_to\"=>\"bb116919-cf36-30d4-9ad5-750a868d7e45\", \"_protected\"=>1}, \"host\"=>\"dda45c212157\", \"path\"=>\"/opt/crick/releases/f5d3814775b59b3a317108d30f6e674a4223ccfb/crick/tasks/queue.py\", \"logger\"=>\"crick.tasks.queue\", \"type\"=>\"logstash\", \"message\"=>\"Starting crawl of url http://www.ddd.com/sss-rrrr (client_id: rrr, priority: low_priority, job: f895188b-2fc3-45f3-b783-1d6fa42a58b8, client_name: outland_queue_all_include_rate_limited)\", \"levelname\"=>\"INFO\"}, @metadata_accessors=#<LogStash::Util::Accessors:0x52662096 @store={\"retry_count\"=>0}, @lut={}>, @cancelled=false>]", :level=>:warn}

@PavelPolyakov

@popsikle
You still have

\"tags\"=>#<Java::JavaUtil::ArrayList:0x68293c3c>

But this time the array is not empty?

I have messages like this as well; I am also trying to pass an array through logstash to ES.

@popsikle

popsikle commented Jun 9, 2015

The real issue was not the tags. The real issue was that I am sending a bunch of extra fields from my apps to ES via logstash, and some of the types clashed between applications. I have auto-mapping turned on and one app was trying to send "client_id" as a string while another was sending it as a bool. I figured this out by moving the logstash box that was having an issue to a new index (logstash-appname-DATE) and comparing the mappings of my new index with my old index.
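If anyone wants to do the same check, it was basically just pulling the mapping of each index and diffing the field types, something like this (the index names here are examples, not my real ones):

curl -XGET 'http://localhost:9200/logstash-2015.06.09/_mapping?pretty'
curl -XGET 'http://localhost:9200/logstash-appname-2015.06.09/_mapping?pretty'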

@PavelPolyakov

@popsikle

Correct, I think we have the same issue, which I've explained above.

Thanks for your input.

@nullbuddy1243

I was encountering this same issue when trying to create a template for handling netflow data I was putting into ES. I'm not sure which field was causing the issue, but once I removed the template, everything worked fine.

To circumvent possible Kibana type issues (i.e. a field that should be an integer is a string), I used logstash to convert the fields to their proper type. I guess elasticsearch figured out the mapping on its own.
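The conversion itself is just the mutate filter's convert option, roughly like this (a sketch; the field names are examples, not my actual netflow fields):

filter {
  mutate {
    # force a consistent type so dynamic mapping cannot guess differently per event
    convert => {
      "[netflow][in_bytes]" => "integer"
      "[netflow][in_pkts]"  => "integer"
    }
  }
}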

I have to +1 the idea of better logging for this kind of error. A dream scenario would be if the error log mentioned which field transgressed upon the elasticsearch field mapping's / template's type expectation.

@julienbeclin, do you have any ES field mappings / templates that could be interfering? Try removing any field mappings / templates you may have. It worked for me.

@fninja

fninja commented Jun 17, 2015

i had this, too.
reverted to logstash 1.5 RC and everything is fine now, at least as far as i can see.

@soichih

soichih commented Jun 18, 2015

@MarkusMayer So the issue here is that the index name must not contain any uppercase characters or spaces (or others?).

I am having trouble coming up with a working logstash filter config that cleanses my index name. I tried something like this, but logstash's lowercase/gsub doesn't seem to be doing anything.

filter {
    mutate {
        add_field => { "es_index" => "%{Modality}.%{ManufacturerModelName}.%{StationName}.%{SoftwareVersions}" }
        gsub => [
            "es_index", " ", "_"
        ]
        lowercase => [ "es_index" ]
    }
    date {
        match => [ "_series_timestamp", "YYYYMMdd HHmmss.SSSSSS" ]
    }
}

output {
    elasticsearch {
        host => elasticsearch
        index => "dicom-%{es_index}.%{+YYYY}"
    }
    stdout { codec => rubydebug }
}

I am using logstash 1.5.1
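One thing I have not verified (just a guess): add_field is a common filter option and, as far as I understand, it is applied after the mutate-specific operations in the same block, so gsub and lowercase may be running before es_index exists. Splitting it into two mutate blocks would look roughly like this:

filter {
    # first create the field...
    mutate {
        add_field => { "es_index" => "%{Modality}.%{ManufacturerModelName}.%{StationName}.%{SoftwareVersions}" }
    }
    # ...then clean it up in a separate mutate so gsub/lowercase can see it
    mutate {
        gsub => [ "es_index", " ", "_" ]
        lowercase => [ "es_index" ]
    }
}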

@MarkusMayer

@soichih unfortunately there is a problem with mutate lowercase not working for utf-8 strings in the current version (see elastic/logstash#2973)

i had to do some (ugly) workaround for that to cleanse the index name.

look here for naming convention: elastic/elasticsearch#6736

@TinLe

TinLe commented Jul 2, 2015

I am using logstash 1.5.2. I do not think this has anything to do with index name case. I've stripped down my logstash config and the input comes from the exact same data source. Output goes to two different index names: one works (ei1_mlogstash-%{+YYYY.MM.dd}), while the other one (mserver-%{+YYYY.MM.dd}) gives me "failed action with response of 400, dropping action" errors in the log.

The index ei1_mlogstash-%{+YYYY.MM.dd} gets created and data added, but the other one does not even get created -- all its data are dropped.

I am using elasticsearch v1.4.5 at the moment. I was using logstash v1.4.2, which works perfectly; upgrading to 1.5.2 breaks it.

This is very frustrating.

input {
  file {
    path => "/var/log/syslog"
    start_position => "beginning"
    type => "mobile-server"
  }
}

filter {
  date { match => ["timestamp", "UNIX_MS"] }
}

output {
   elasticsearch {
     host => "localhost"
     port => "9200"
     protocol => "http"
     index => "ei1_mlogstash-%{+YYYY.MM.dd}"
   }
   elasticsearch {
     host => "localhost"
     port => "9200"
     protocol => "http"
     index => "mserver-%{+YYYY.MM.dd}"
   }
}

@ph
Contributor

ph commented Jul 3, 2015

I've tested the output with this configuration

input {
  generator {}
}
output {
  elasticsearch {
    host => "localhost"
    port => "9200"
    protocol => "http"
    index => "ei1_mlogstash-%{+YYYY.MM.dd}"
  }

  elasticsearch {
    host => "localhost"
    port => "9200"
    protocol => "http"
    index => "mserver-%{+YYYY.MM.dd}"
  }
}

When I query ES I see that they are correctly created:

/t/logstash-1.5.2 ❯❯❯ curl -XGET http://localhost:9200/_aliases\?pretty\=1
{
  "mserver-2015.07.03" : {
    "aliases" : { }
  },
  "ei1_mlogstash-2015.07.03" : {
    "aliases" : { }
  }
}

Can you remove the elasticsearch output and use the stdout output with the rubydebug codec, and see if there are any date parsing errors?
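i.e. something like:

output {
  stdout { codec => rubydebug }
}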

@johnarnold

See my case #3561 for a detailed example of a dirty string field being dynamically mapped to datetime, causing this error to spam the logs and fill up the disk. I think the plugin needs to handle these 400 errors more gracefully and not spam multiple copies of the event unless verbose is on. It should be something like "Elasticsearch indexing failed action with response of 400, dropping action. Check your index naming, type mappings and elasticsearch logs."

@talevy
Contributor

talevy commented Jul 7, 2015

thanks @johnarnold. I followed the progress that was made in #3561 between you and @jordansissel.

I understand that such mapping errors would result in many repeated warnings (if the issue exists in many of your documents). Would you prefer to only see such errors in debug mode, or would you prefer a configuration option instead? These are not debug warnings, but rather real errors occurring which prevent you from successfully indexing messages.

@johnarnold

I think they are real errors indicating dropped events, which is important to log at the default logging level. However, I think that logging the whole event (multiple times per attempt/retry?) makes it harder to understand what is actually happening as a user. A simpler, shorter and more succinct log message would be easier to root-cause and less likely to blow up your disk. I think that if the logging level is configurable on the plugin and set to debug, then it would be worthwhile to log the whole event.

@johnarnold

Perhaps someone can clarify the intended logic of the plugin when errors occur. Here's my confusion:

In https://github.com/logstash-plugins/logstash-output-elasticsearch/blob/master/lib/logstash/outputs/elasticsearch/protocol.rb#L84 there's a method def self.normalize_bulk_response(bulk_response) which flattens the elasticsearch response to just the status codes for each item (nuking the actual 'error' property which tells us what's going on).

Then in https://github.com/logstash-plugins/logstash-output-elasticsearch/blob/master/lib/logstash/outputs/elasticsearch.rb#L516 the caller handles any bulk responses that contain error codes by logging the whole action, but not the actual error returned from ES.

Wouldn't it make more sense for the client to bubble up the errors rather than normalizing them, and actually log the error returned from ES rather than the response code and action? Also, couldn't the logging of the whole action (event) be moved to logger.debug so as to avoid filling up the disk?

@talevy
Contributor

talevy commented Jul 13, 2015

this can be done, yes. A normalization step will likely occur regardless, to make the responses from both the node and http clients consistent. Please file a ticket with this request and we can work on that.

@stemid

stemid commented Jul 28, 2015

What's the status on this? I think a lot of different people are reporting in this thread, but I can find some things in common with some of you. This problem appeared after upgrading ES and LS to 1.5. The problem only affects the gelf input listener in my case, and this also happens to be where all my MS Windows EventLogs are coming from. Nxlog sends gelf to my logstash listener.

So I've disabled the gelf listener for now to avoid filling up disk space, but what is the solution?

@dray0n

dray0n commented Jul 29, 2015

Posting in case this helps anyone... I was banging my head against a wall trying to figure out what was breaking logstash. I searched for the "Error 400" and came across this post. I had a basic output to elasticsearch but have recently been adding more items for indexing. I added the "index =>" option and it began spamming the errors.

I found the issue was that I had set index => "PFSense"... When I changed it to lowercase it fixed my issues. Hope it helps someone.

@4n3w

4n3w commented Aug 4, 2015

What is the status of this? At the risk of editorializing I'm not sure why lowercasing json input would fix this. Seems like a major bug to me, and it's blocking our team (at least from our desire to record activity in elasticsearch).

@talevy
Contributor

talevy commented Aug 4, 2015

@4n3w could you please explain your specific scenario that is an issue?

@suyograo
Contributor

suyograo commented Aug 4, 2015

@4n3w what exactly is your issue? Please open a new ticket.

I am closing this ticket, please open new issues when you see them

@suyograo suyograo closed this as completed Aug 4, 2015
@elvarb

elvarb commented Aug 4, 2015

Why close this ticket?

My issue gives the same error when logstash tries to insert into a field in elasticsearch with a wrong mapping. This results in a failed insert and, over time, crashes logstash without any errors.

@jordansissel
Contributor Author

@elvarb there are a ton of responses on this issue with some not-entirely-related problems. As a result, it's hard for us to follow exactly what specific issue (one, not many!) needs resolving. To help sort this out, we'd love for new tickets to be filed specific to each person's issue. We can help sort through the problem reports and combine tickets if there are duplicates filed.

@elvarb

elvarb commented Aug 4, 2015

Two things to solve:

  1. Have logstash log the insert error better, preferably by creating a separate event in elasticsearch. Include error details in the file log.
  2. Solve the logstash internal bug that results in increased thread counts and the inevitable crash/lockup.

@elvarb

elvarb commented Aug 4, 2015

@jordansissel

Makes sense, will make a new ticket, soonish. On vacation 😁

@jordansissel
Contributor Author

@elvarb Thanks! Please enjoy your vacation! :)

@Gergues

Gergues commented Aug 13, 2015

I had the same problem with previous versions of the LogStash output to Elasticsearch.

In short, a lowercase index name is needed. For your case:

elasticsearch {
  host => localhost
  cluster => "eVtgLogs"
  embedded => false
  index => "logstash-%{environment}-%{+YYYY.MM.dd}"
}

%{environment} can generate uppercase characters, causing this hang / 400 error.

You can use the mutate filter in your logstash config:

input {
  ...
}
filter {
  ...
  mutate {
    lowercase => ["environment"]
  }
  ...
}
output {
  ...
  elasticsearch {
    host => localhost
    cluster => "eVtgLogs"
    embedded => false
    index => "logstash-%{environment}-%{+YYYY.MM.dd}"
  }
}

Best of Luck
George Gergues

@yodog

yodog commented Nov 27, 2015

had the same problem yesterday.
and logstash.log was generating 1G of logs per minute.

it all happened because i turned on numeric_detection.
setting it back to false (the default) AND deleting the index solved the issue.

PS: just setting it back to false, without deleting the index, will NOT resolve the problem.

"numeric_detection" : false

@Gergues

Gergues commented Dec 1, 2015

@yodog
are you using Elasticsearch as the index store?

If so, can you share the Elasticsearch section of your Logstash config?

In some cases, if the tokens passed to the index name are not resolving properly, they may include characters that cause a similar error.

@somu12

somu12 commented Mar 22, 2016

I am facing the same problem with my log events, so I tried creating the same log event (the one causing the issue) multiple times and found that some logs reached elasticsearch just fine while some failed with this exception.
logstash version: 1.5.6
Elasticsearch: 1.7.3.
"_index": "logstash-2016.03.22"

Exception:
{:timestamp=>"2016-03-22T06:10:34.511000+0000", :message=>"failed action with response of 400, dropping action: ["index", {:_id=>nil, :_index=>"logstash-2016.03.22", :_type=>"logs", :_routing=>nil}, #<LogStash::Event:0x16b25934 @metadata={"retry_count"=>0}, @accessors=#<LogStash::Util::Accessors:0x1a9009bb @store={"message"=>
