Useragent plugin not parsing some strings correctly #1669

Closed
arctica opened this issue Aug 27, 2014 · 5 comments

@arctica
arctica commented Aug 27, 2014

Hi, I've been using the useragent plugin of logstash and found it very useful but encountered a problem with some user agents not being parsed properly.

A useragent string like "Mozilla/5.0 (Windows NT 6.3; Win64; x64; Trident/7.0; rv:11.0) like Gecko" will have the ua.os field set to just "Windows" while it should be "Windows 8.1".

The regexes.yaml file contains an appropriate regex, shown below, and testing the string on http://www.whatsmyua.info/ also identifies it correctly.

- regex: '(Windows NT 6\.3)' 
  os_replacement: 'Windows 8.1' 

Also, I could not figure out what the difference is between "ua.os" and "ua.os_name". They seem to always contain the same values.

@electrical

Interesting bug. It could be that the library itself has some issues. We will do some investigation on this.
If you are able to find out what is wrong, that would be much appreciated.

@arctica
Author

arctica commented Aug 27, 2014

Seems like the regexes.yaml file just needs updating. Even though the currently shipped regexes.yaml file has definitions for Windows 8.1 (NT 6.3), it doesn't seem to take priority over the generic "Windows" one.

Using the latest file from https://raw.githubusercontent.com/tobie/ua-parser/master/regexes.yaml fixes this issue, and luckily it can be easily used via the filter config. Though I'm still not sure about ua.os vs ua.os_name.
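
To illustrate the priority problem described above, here is a minimal sketch. It assumes the parser walks the rule list in order and returns the first rule that matches. Only the NT 6.3 rule comes from this thread; the generic "Windows" rule and the `parse_os` helper are hypothetical stand-ins, not the plugin's or the gem's actual code.

```python
import re

UA = "Mozilla/5.0 (Windows NT 6.3; Win64; x64; Trident/7.0; rv:11.0) like Gecko"

# Hypothetical rule lists of (pattern, os_replacement).
# The NT 6.3 rule is quoted from the thread; the generic rule is assumed.
GENERIC_FIRST = [
    (r"(Windows)", None),
    (r"(Windows NT 6\.3)", "Windows 8.1"),
]
SPECIFIC_FIRST = list(reversed(GENERIC_FIRST))


def parse_os(ua, rules):
    """Return the OS name from the first rule whose pattern matches."""
    for pattern, replacement in rules:
        match = re.search(pattern, ua)
        if match:
            # Use the replacement if the rule has one, else the captured text.
            return replacement or match.group(1)
    return "Other"


print(parse_os(UA, GENERIC_FIRST))   # Windows
print(parse_os(UA, SPECIFIC_FIRST))  # Windows 8.1
```

Under that assumption, a Windows 8.1 rule that exists in the file is still never reached if a broader rule sits above it, which would match the behaviour seen here.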

@electrical

Ahh okay, we will make sure it gets updated with the next release.
I'm not sure myself about the different names either, to be very honest.

suyograo self-assigned this Aug 29, 2014
@deverton

The ua.os field is basically the full operating system identification string and should be the full name and version of the operating system, e.g. Windows 8.1.

The ua.os_name field should probably be renamed to ua.os_family, as it's supposed to be the "family" of operating systems and is part of the breakdown of the operating system details into family and version.
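
A rough sketch of that split, assuming the parser yields a family plus optional version parts. The `ParsedOS` class and `to_event_fields` helper are hypothetical names for illustration, not the plugin's real code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedOS:
    """Hypothetical parse result: an OS family plus optional version parts."""
    family: str
    major: Optional[str] = None
    minor: Optional[str] = None

    def full_name(self) -> str:
        # Join whichever version parts are present, e.g. "8" and "1" -> "8.1".
        version = ".".join(p for p in (self.major, self.minor) if p is not None)
        return f"{self.family} {version}" if version else self.family


def to_event_fields(parsed: ParsedOS) -> dict:
    """Map a parse result onto the two fields discussed in this thread."""
    return {"os": parsed.full_name(), "os_name": parsed.family}


print(to_event_fields(ParsedOS("Windows", "8", "1")))
print(to_event_fields(ParsedOS("Windows")))
```

When no version is extracted, both fields come out identical, which would explain why they always seemed to contain the same values.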

@untergeek
Member

The original complaint, that regexes.yaml needs updating, is outside the scope of Logstash's control, as the file ships as part of the user_agent_parser Ruby gem (https://rubygems.org/gems/user_agent_parser/). Updating the plugin, which is now possible with Logstash 1.5, will pull the latest version of the user_agent_parser Ruby gem.

You can obtain the latest YAML file from the uap-core GitHub repo: https://github.com/ua-parser/uap-core/blob/master/regexes.yaml

If a change needs to be made to the user agent plugin code, please create a pull request or a new issue at http://github.com/logstash-plugins/logstash-filter-useragent
