[PHP] Harvest statistics and meta data from a URL or its source code (SEO oriented).

PiedWeb/UrlHarvester

Url Meta Data Harvester

Harvest statistics and meta data from a URL or its source code (SEO oriented).

Used by Seo Pocket Crawler (source on GitHub).

Install

Via Packagist

$ composer require piedweb/url-harvester

Usage

Harvest methods (listed as one chain for brevity; call each on the Harvest instance):

use \PiedWeb\UrlHarvester\Harvest;
use \PiedWeb\UrlHarvester\Link;

$url = 'https://piedweb.com';

Harvest::fromUrl($url)
    ->getResponse()->getInfo('total_time') // load time
    ->getResponse()->getInfo('size_download')
    ->getResponse()->getStatusCode()
    ->getResponse()->getContentType()
    ->getRes...

    ->getTag('h1') // @return first tag content (could be html)
    ->getUniqueTag('h1') // @return first tag content in utf8 (could contain html)
    ->getMeta('description') // @return string from content attribute or NULL
    ->getCanonical() // @return string|NULL
    ->isCanonicalCorrect() // @return bool
    ->getRatioTxtCode() // @return int
    ->getTextAnalysis() // @return \PiedWeb\TextAnalyzer\Analysis
    ->getKws() // @return the 10 most used words
    ->getBreadCrumb()
    ->indexable($userAgent = 'googlebot') // @return int corresponding to a const from Indexable

    ->getLinks()
    ->getLinks(Link::LINK_SELF)
    ->getLinks(Link::LINK_INTERNAL)
    ->getLinks(Link::LINK_SUB)
    ->getLinks(Link::LINK_EXTERNAL)
    ->getLinkedRessources() // @return array of all elements that have an href or a src attribute
    ->mayFollow() // checks headers and meta tags, @return bool

    ->getDomain()
    ->getBaseUrl()

    ->getRobotsTxt() // @return \Spatie\Robots\RobotsTxt or empty string
    ->setRobotsTxt($content) // @param string or RobotsTxt
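
Several of the getters listed above return plain values (for example, getMeta() returns a string or NULL), so in practice each one is called on the Harvest instance separately. The sketch below gathers a few of them into an array. The helper name summarizeHarvest and the array keys are hypothetical, not part of the package, and the parameter is typed as object only so the sketch can be tried without the library installed.

```php
<?php

declare(strict_types=1);

/**
 * Collect a few SEO fields from a harvested page into a plain array.
 *
 * $harvest is expected to be a \PiedWeb\UrlHarvester\Harvest instance.
 * Only methods listed in this README are called; the helper itself and
 * the array keys are illustrative assumptions.
 */
function summarizeHarvest(object $harvest): array
{
    return [
        'status'            => $harvest->getResponse()->getStatusCode(),
        'h1'                => $harvest->getTag('h1'),
        'description'       => $harvest->getMeta('description'),
        'canonical'         => $harvest->getCanonical(),
        'canonical_correct' => $harvest->isCanonicalCorrect(),
        'may_follow'        => $harvest->mayFollow(),
    ];
}

// Possible call site (needs the package and network access):
// $harvest = \PiedWeb\UrlHarvester\Harvest::fromUrl('https://piedweb.com');
// if ($harvest instanceof \PiedWeb\UrlHarvester\Harvest) {
//     print_r(summarizeHarvest($harvest));
// }
```

The instanceof check in the commented call site is a defensive assumption: it skips the summary if fromUrl() does not hand back a Harvest object, for instance when the request fails.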

Testing

$ composer test

Contributing

Please see contributing

Credits

License

The MIT License (MIT). Please see License File for more information.
