From 4fe3e7b392f7928f0b5b0dfed3c725799d9b7845 Mon Sep 17 00:00:00 2001
From: Felix Boehm <188768+fb55@users.noreply.github.com>
Date: Fri, 9 Aug 2024 19:54:38 +0100
Subject: [PATCH] chore(docs): Add 1.0 Announcement Post (#3984)

Let's get this out!
---
 Readme.md                            |  21 ----
 website/blog/2024-08-07-version-1.md | 148 +++++++++++++++++++++++++++
 website/blog/authors.yml             |   4 +-
 website/docs/advanced/extract.md     |  12 +--
 4 files changed, 157 insertions(+), 28 deletions(-)
 create mode 100644 website/blog/2024-08-07-version-1.md

diff --git a/Readme.md b/Readme.md
index fa1339ede0..e45ed57a52 100644
--- a/Readme.md
+++ b/Readme.md
@@ -242,27 +242,6 @@ support for Cheerio and help us maintain and improve this open source project.
-## Special Thanks
-
-This library stands on the shoulders of some incredible developers. A special
-thanks to:
-
-**• @fb55 for htmlparser2 & css-select:** Felix has a knack for writing
-speedy parsing engines. He completely re-wrote both @tautologistic's
-`node-htmlparser` and @harry's `node-soupselect` from the ground up, making both
-of them much faster and more flexible. Cheerio would not be possible without his
-foundational work
-
-**• @jQuery team for jQuery:** The core API is the best of its class and
-despite dealing with all the browser inconsistencies the code base is extremely
-clean and easy to follow. Much of cheerio's implementation and documentation is
-from jQuery. Thanks guys.
-
-**• @tj:** The style, the structure, the open-source"-ness" of this
-library comes from studying TJ's style and using many of his libraries. This
-dude consistently pumps out high-quality libraries and has always been more than
-willing to help or answer questions. You rock TJ.
-
 ## License
 
 MIT
diff --git a/website/blog/2024-08-07-version-1.md b/website/blog/2024-08-07-version-1.md
new file mode 100644
index 0000000000..9ade7fc191
--- /dev/null
+++ b/website/blog/2024-08-07-version-1.md
@@ -0,0 +1,148 @@
+---
+slug: cheerio-1.0
+title: Cheerio 1.0 Released, Batteries Included 🔋
+authors: fb55
+tags: [release, announcement]
+---
+
+# Cheerio 1.0 Released, Batteries Included 🔋
+
+Cheerio 1.0 is out! After 12 release candidates and just a short seven years
+after the initial 1.0 release candidate, it is finally time to call Cheerio 1.0
+complete. The theme for this release is "batteries included", with common use
+cases now supported out of the box.
+
+So grab a pair of double-As, and read below for what's new, what's changed, and
+how to upgrade!
+
+
+
+## New Website and Documentation
+
+Since the last release, we've published a new website and documentation for
+Cheerio. The new site features detailed guides and API documentation to get the
+most from Cheerio. Check it out at [cheerio.js.org](https://cheerio.js.org/).
+
+## A new way to load documents
+
+Loading documents into Cheerio has been revamped. Cheerio now supports multiple
+loading methods, each tailored to different use cases:
+
+- `load`: The classic method for parsing HTML or XML strings.
+- `loadBuffer`: Works with binary data, automatically detecting the document
+  encoding.
+- `stringStream` and `decodeStream`: Parse HTML directly from streams.
+- `fromURL`: Fetch and parse HTML from a URL in one go.
+
+Dive deeper into these methods in the
+[Loading Documents](/docs/basics/loading) tutorial.
+
+## Simplified Data Extraction
+
+The new `extract` method allows you to extract data from an HTML document and
+store it in an object.
To fetch the latest release of Cheerio from GitHub and
+extract the release date and the release notes from the release page is now as
+simple as:
+
+```ts
+import * as cheerio from 'cheerio';
+
+const $ = await cheerio.fromURL(
+  'https://github.com/cheeriojs/cheerio/releases',
+);
+const data = $.extract({
+  releases: [
+    {
+      // First, we select individual release sections.
+      selector: 'section',
+      // Then, we extract the release date, name, and notes from each section.
+      value: {
+        // Selectors are executed within the context of the selected element.
+        name: 'h2',
+        date: {
+          selector: 'relative-time',
+          // The actual release date is stored in the `datetime` attribute.
+          value: 'datetime',
+        },
+        notes: {
+          selector: '.markdown-body',
+          // We are looking for the HTML content of the element.
+          value: 'innerHTML',
+        },
+      },
+    },
+  ],
+});
+```
+
+Read more about all of the available options in the
+[Extracting Data](/docs/advanced/extract) guide.
+
+## Breaking Changes and Upgrade Guide
+
+Cheerio 1.0 introduces several breaking changes, most notably:
+
+- The minimum NodeJS version is now 18.17 or higher.
+- Import paths were simplified. For example, use `cheerio/slim` instead of
+  `cheerio/lib/slim`.
+- The deprecated default Cheerio instance and static methods were removed.
+
+  Before, it was possible to write code like this:
+
+  ```ts
+  import cheerio, { html } from 'cheerio';
+
+  html(cheerio('')); // ~ '' -- NO LONGER WORKS
+  ```
+
+  Make sure to always load documents first:
+
+  ```ts
+  import * as cheerio from 'cheerio';
+
+  cheerio.load('').html();
+  ```
+
+- htmlparser2 options now reside exclusively under the `xml` key:
+
+  ```ts
+  const $ = cheerio.load('', {
+    xml: {
+      withStartIndices: true,
+    },
+  });
+  ```
+
+- Node types previously re-exported by Cheerio must now be imported directly
+  from [`domhandler`](https://github.com/fb55/domhandler).
+
+For a comprehensive list of changes, please consult
+[the changelog](https://github.com/cheeriojs/cheerio/releases).
+
+## Upgrading to Cheerio 1.0
+
+To upgrade to Cheerio 1.0, just run:
+
+```bash npm2yarn
+npm install cheerio@latest
+```
+
+## Get Involved
+
+Explore the new features and let us know what you think! Encounter an issue?
+Report it on our
+[GitHub issue tracker](https://github.com/cheeriojs/cheerio/issues). Have an
+idea for an improvement? Pull requests welcome :)
+
+## Thank You
+
+Thanks to [@jugglinmike](https://github.com/jugglinmike) for kick-starting
+Cheerio 1.0, and to all the contributors who have helped shape this release. We
+couldn't have done it without you.
+
+Thanks to our
+[sponsors and backers](https://github.com/cheeriojs/cheerio?sponsor) for
+supporting Cheerio's development. If you use Cheerio at work, consider asking
+your company to support us!
+
+And finally, thank you for using Cheerio 🙇🙇‍♀️
diff --git a/website/blog/authors.yml b/website/blog/authors.yml
index 98fa5af9f9..bcfb6865b4 100644
--- a/website/blog/authors.yml
+++ b/website/blog/authors.yml
@@ -1,5 +1,7 @@
 fb55:
   name: Felix Boehm
   title: Maintainer of Cheerio
-  url: https://github.com/fb55
+  url: https://feedic.com/
   image_url: https://github.com/fb55.png
+  socials:
+    github: fb55
diff --git a/website/docs/advanced/extract.md b/website/docs/advanced/extract.md
index 600e1f3414..1ea2872a65 100644
--- a/website/docs/advanced/extract.md
+++ b/website/docs/advanced/extract.md
@@ -6,10 +6,10 @@ description: Extract multiple values at once.
 
 # Extracting Data with the `extract` Method
 
-The `extract` method in Cheerio allows you to extract data from an HTML document
-and store it in an object. The method takes a `map` object as a parameter, where
-the keys are the names of the properties to be created on the object, and the
-values are the selectors or descriptors to be used to extract the values.
+The `extract` method allows you to extract data from an HTML document and store
+it in an object. The method takes a `map` object as a parameter, where the keys
+are the names of the properties to be created on the object, and the values are
+the selectors or descriptors to be used to extract the values.
 
 To use the `extract` method, you first need to import the library and load an
 HTML document. For example:
@@ -168,11 +168,11 @@ const data = $.extract({
       selector: 'section',
       // Then, we extract the release date, name, and notes from each section.
       value: {
-        // Selectors are executed whitin the context of the selected element.
+        // Selectors are executed within the context of the selected element.
         name: 'h2',
         date: {
           selector: 'relative-time',
-          // The actual date of the release is stored in the `datetime` attribute.
+          // The actual release date is stored in the `datetime` attribute.
           value: 'datetime',
         },
         notes: {