Harvest statistics and metadata from a URL or its source code (SEO oriented).
Implemented in Seo Pocket Crawler (source on GitHub).
Via Packagist
$ composer require piedweb/url-harvester
Harvest Methods:
use \PiedWeb\UrlHarvester\Harvest;
use \PiedWeb\UrlHarvester\Link;
$url = 'https://piedweb.com';
Harvest::fromUrl($url)
->getResponse()->getInfo('total_time') // load time
->getResponse()->getInfo('size_download')
->getResponse()->getStatusCode()
->getResponse()->getContentType()
->getRes...
->getTag('h1') // @return first tag content (could contain HTML)
->getUniqueTag('h1') // @return first tag content in UTF-8 (could contain HTML)
->getMeta('description') // @return string from content attribute or NULL
->getCanonical() // @return string|NULL
->isCanonicalCorrect() // @return bool
->getRatioTxtCode() // @return int
->getTextAnalysis() // @return \PiedWeb\TextAnalyzer\Analysis
->getKws() // @return the 10 most used words
->getBreadCrumb()
->indexable($userAgent = 'googlebot') // @return int corresponding to a const from Indexable
->getLinks()
->getLinks(Link::LINK_SELF)
->getLinks(Link::LINK_INTERNAL)
->getLinks(Link::LINK_SUB)
->getLinks(Link::LINK_EXTERNAL)
->getLinkedRessources() // @return array of all elements containing an href or a src attribute
->mayFollow() // checks headers and meta robots, @return bool
->getDomain()
->getBaseUrl()
->getRobotsTxt() // @return \Spatie\Robots\RobotsTxt or empty string
->setRobotsTxt($content) // @param string or RobotsTxt
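A minimal end-to-end sketch combining several of the methods listed above. It assumes Composer autoloading is in place, that Harvest::fromUrl() returns a Harvest instance on success (error handling is omitted), and that getLinks() returns a countable array; the $harvest variable name is illustrative only.

require 'vendor/autoload.php';

use \PiedWeb\UrlHarvester\Harvest;
use \PiedWeb\UrlHarvester\Link;

// Fetch and parse the page (assumes the request succeeds).
$harvest = Harvest::fromUrl('https://piedweb.com');

// Basic on-page data.
echo 'Status: '.$harvest->getResponse()->getStatusCode().PHP_EOL;
echo 'H1: '.$harvest->getUniqueTag('h1').PHP_EOL;
echo 'Description: '.$harvest->getMeta('description').PHP_EOL;
echo 'Canonical: '.$harvest->getCanonical().PHP_EOL;

// Link extraction, assuming getLinks() returns an array.
echo 'Internal links: '.count($harvest->getLinks(Link::LINK_INTERNAL)).PHP_EOL;
echo 'External links: '.count($harvest->getLinks(Link::LINK_EXTERNAL)).PHP_EOL;

// Indexability for a given user agent (int mapped to an Indexable constant).
echo 'Indexable (googlebot): '.$harvest->indexable('googlebot').PHP_EOL;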
$ composer test
Please see the contributing guidelines for details.
The MIT License (MIT). Please see License File for more information.