Parser and group determiner optimized for robots.txt, X-Robots-Tag and robots meta tag use cases.
- PHP 5.5+, 7.0+ or 8.0+
The library is available for installation via Composer. Add this to your composer.json file:
{
    "require": {
        "vipnytt/useragentparser": "^1.0"
    }
}
Then run composer update.
- Stripping of the version tag.
- List any rule groups the User-Agent belongs to.
- Determine the correct group of records by finding the group with the most specific User-agent that still matches.
- When parsing robots.txt rule sets, for online robots.
- When parsing the X-Robots-Tag HTTP header.
- When parsing robots meta tags in HTML / XHTML documents.
Note: Full User-agent strings, such as those sent by web browsers, are not supported. This is by design.
The supported User-agent string format is UserAgentName/version, with or without the version tag, e.g. MyWebCrawler/2.0 or just MyWebCrawler.
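To make the supported format concrete, here is a minimal plain-PHP sketch of how a UserAgentName/version string could be split into its two parts. The function name splitUserAgent is hypothetical and is not part of this library; it assumes PHP 7.0+ and only illustrates the format, not the library's internals.

```php
<?php
// Hypothetical helper (not part of the library): split "Product/version"
// into a product token and an optional version tag.
function splitUserAgent(string $userAgent): array
{
    // Limit to 2 parts so everything after the first slash is the version.
    $parts = explode('/', trim($userAgent), 2);

    return [
        'product' => $parts[0],
        'version' => $parts[1] ?? null, // null when no version tag is given
    ];
}
```

For a string without a slash, such as MyWebCrawler, the version entry is simply null.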
use vipnytt\UserAgentParser;
$parser = new UserAgentParser('googlebot/2.1');
$product = $parser->getProduct(); // googlebot
use vipnytt\UserAgentParser;
$parser = new UserAgentParser('googlebot-news/2.1');
$userAgents = $parser->getUserAgents();
// Result, from most to least specific:
array(
    'googlebot-news/2.1',
    'googlebot-news/2',
    'googlebot-news',
    'googlebotnews',
    'googlebot'
);
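As a rough illustration of how a list like this could be derived, the sketch below first drops trailing version segments and then drops trailing hyphen-separated words from the product name. This is an assumption about the general idea, not the library's actual algorithm: the function name candidateUserAgents is hypothetical, and the sketch does not produce the hyphen-less variant (googlebotnews) shown above. It assumes PHP 7.0+.

```php
<?php
// Hypothetical helper (not part of the library): list fallback candidates
// for a "Product/version" string, from most to least specific.
function candidateUserAgents(string $userAgent): array
{
    $parts = explode('/', $userAgent, 2);
    $product = $parts[0];
    $version = $parts[1] ?? '';

    $candidates = [];

    // Version fallbacks: "2.1" -> "2".
    $segments = $version === '' ? [] : explode('.', $version);
    while (count($segments) > 0) {
        $candidates[] = $product . '/' . implode('.', $segments);
        array_pop($segments);
    }

    // Product fallbacks: "googlebot-news" -> "googlebot".
    $words = explode('-', $product);
    while (count($words) > 0) {
        $candidates[] = implode('-', $words);
        array_pop($words);
    }

    return $candidates;
}

print_r(candidateUserAgents('googlebot-news/2.1'));
```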
Determine the correct group of records by finding the group with the most specific User-agent that still matches your rule sets.
use vipnytt\UserAgentParser;
$parser = new UserAgentParser('googlebot-news');
$match = $parser->getMostSpecific(['googlebot/2.1', 'googlebot-images', 'googlebot']); // googlebot
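The selection step can be pictured as walking a list of candidates, ordered from most to least specific, and returning the first one that appears among the groups in your rule set. The sketch below is a plain-PHP illustration of that idea under stated assumptions: the function name mostSpecificMatch is hypothetical, the candidate list is passed in by the caller, and comparison is assumed to be case-insensitive. It is not the library's implementation.

```php
<?php
// Hypothetical helper (not part of the library): return the first candidate
// that matches one of the rule-set groups, or null if none match.
function mostSpecificMatch(array $candidates, array $groups)
{
    // Index groups by lowercase name for case-insensitive lookup.
    $lookup = [];
    foreach ($groups as $group) {
        $lookup[strtolower($group)] = $group;
    }

    // Candidates are assumed to be ordered from most to least specific.
    foreach ($candidates as $candidate) {
        $key = strtolower($candidate);
        if (isset($lookup[$key])) {
            return $lookup[$key];
        }
    }

    return null;
}

// Candidates for "googlebot-news", written out by hand for this example.
$match = mostSpecificMatch(
    ['googlebot-news', 'googlebot'],
    ['googlebot/2.1', 'googlebot-images', 'googlebot']
);
var_dump($match);
```

Returning null when nothing matches leaves it to the caller to fall back to a default group such as the wildcard.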
$parser = new UserAgentParser('MyCustomCrawler/1.2');
// Determine the correct rule set (robots.txt / robots meta tag / x-robots-tag)
$parser->getMostSpecific($array); // string
// Parse
$parser->getUserAgent(); // string 'MyCustomCrawler/1.2'
$parser->getProduct(); // string 'MyCustomCrawler'
$parser->getVersion(); // string '1.2'
// Crunch the data into groups, from most to least specific
$parser->getUserAgents(); // array
$parser->getProducts(); // array
$parser->getVersions(); // array