From 1737f8365793d0a050102ce3672a1490eeed3d85 Mon Sep 17 00:00:00 2001
From: Shahid Karimi
Date: Fri, 12 Aug 2022 17:26:51 +0500
Subject: [PATCH 1/3] documentation for selector .get() text

---
 docs/usage.rst | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/docs/usage.rst b/docs/usage.rst
index d0a6fb0b..b3a9111e 100644
--- a/docs/usage.rst
+++ b/docs/usage.rst
@@ -120,6 +120,15 @@ pseudo-elements::
     >>> selector.css('title::text').get()
     'Example website'
 
+Extract text witout ::text
+==========================
+You can extract inner text without specifying ``::text`` in your selctor instead
+an optional paramter text=True in the ``get()`` or ``getall()`` methods.
+
+    >>> selector.css('title').get(text=True)
+
+You can pass additional paramter ``guess_punct_space``, ``guess_layout`` and ``guess_layout``
+
 As you can see, ``.xpath()`` and ``.css()`` methods return a
 :class:`~parsel.selector.SelectorList` instance, which is a list of new
 selectors. This API can be used for quickly selecting nested data::

From 17ae5e0506da45d1355fb78d614d23c43228c57d Mon Sep 17 00:00:00 2001
From: Shahid Karimi
Date: Fri, 26 Aug 2022 12:19:33 +0500
Subject: [PATCH 2/3] suggested changes in the PR fixed

---
 docs/usage.rst | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/docs/usage.rst b/docs/usage.rst
index b3a9111e..08e4ec5c 100644
--- a/docs/usage.rst
+++ b/docs/usage.rst
@@ -120,12 +120,10 @@ pseudo-elements::
     >>> selector.css('title::text').get()
     'Example website'
 
-Extract text witout ::text
-==========================
-You can extract inner text without specifying ``::text`` in your selctor instead
+You can extract inner text without specifying ``::text`` in your selector instead
 an optional paramter text=True in the ``get()`` or ``getall()`` methods.
 
-    >>> selector.css('title').get(text=True)
+    >>> selector.css('#images').get(text=True)
 
 You can pass additional paramter ``guess_punct_space``, ``guess_layout`` and ``guess_layout``
 

From c6580cc466ca0df3f27edc69a0e0feb0996ad9cf Mon Sep 17 00:00:00 2001
From: Mikhail Korobov
Date: Sun, 13 Nov 2022 17:41:57 +0500
Subject: [PATCH 3/3] Update docs/usage.rst
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-authored-by: Adrián Chaves
---
 docs/usage.rst | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/docs/usage.rst b/docs/usage.rst
index 08e4ec5c..0b41a5ea 100644
--- a/docs/usage.rst
+++ b/docs/usage.rst
@@ -120,12 +120,17 @@ pseudo-elements::
     >>> selector.css('title::text').get()
     'Example website'
 
-You can extract inner text without specifying ``::text`` in your selector instead
-an optional paramter text=True in the ``get()`` or ``getall()`` methods.
+To extract all text of one or more element and all their child elements,
+formatted as plain text taking into account HTML tags (e.g. ``<br>`` is
+translated as a line break), set ``text=True`` in your call to
+:meth:`~Selector.get` or :meth:`~Selector.getall` instead of including
+``::text`` (CSS) or ``/text()`` (XPath) in your query:
 
-    >>> selector.css('#images').get(text=True)
+>>> selector.css('#images').get(text=True)
+'Name: My image 1\nName: My image 2\nName: My image 3\nName: My image 4\nName: My image 5'
 
-You can pass additional paramter ``guess_punct_space``, ``guess_layout`` and ``guess_layout``
+See :meth:`Selector.get` for additional parameters that you can use to change
+how the extracted plain text is formatted.
 
 As you can see, ``.xpath()`` and ``.css()`` methods return a
 :class:`~parsel.selector.SelectorList` instance, which is a list of new
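
Reviewer note: the behaviour the final paragraph describes (all descendant text gathered into one string, with ``<br>`` turned into a line break) can be illustrated without parsel. The block below is a minimal stdlib-only sketch, not parsel's implementation. `PlainTextExtractor` and `extract_text` are hypothetical names, the sample markup is an assumed, shortened fragment modelled on the `#images` example, and the only layout rule modelled is the `<br>` one mentioned in the patch. The formatting actually applied by `text=True` may differ.

```python
from html.parser import HTMLParser


class PlainTextExtractor(HTMLParser):
    """Collect the text of an HTML fragment, turning <br> into a newline.

    Hypothetical helper for illustration only; it is not parsel's code.
    """

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # The only layout rule modelled here: <br> becomes a line break.
        if tag == "br":
            self.parts.append("\n")

    def handle_data(self, data):
        # Collapse runs of whitespace inside a text node to single spaces.
        text = " ".join(data.split())
        if text:
            self.parts.append(text)


def extract_text(html):
    """Return the text of the fragment and all of its descendants."""
    parser = PlainTextExtractor()
    parser.feed(html)
    parser.close()
    # Join the pieces with spaces (a simplification), then strip the
    # spaces left around the newlines inserted for <br>.
    joined = " ".join(parser.parts)
    return "\n".join(line.strip() for line in joined.split("\n")).strip()


# Assumed sample markup, shortened to two entries.
sample = (
    '<div id="images">'
    '<a href="image1.html">Name: My image 1 <br><img src="image1_thumb.jpg"></a>'
    '<a href="image2.html">Name: My image 2 <br><img src="image2_thumb.jpg"></a>'
    "</div>"
)

print(repr(extract_text(sample)))  # 'Name: My image 1\nName: My image 2'
```

By contrast, a `::text` or `/text()` query yields each text node as a separate result and leaves any joining or line-break handling to the caller, which is the gap the documented `text=True` option is meant to close.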