Russian Webpage parsing support. #263

@mrgodhani

  • Platform: Mac
  • Mercury Parser Version: Web-based API (at the moment)
  • Node Version (if a Node bug):
  • Browser Version (if a browser bug):

Expected Behavior

Russian-language (Cyrillic) text should come back correctly encoded.

Current Behavior

Parsing the link https://www.finam.ru/analysis/newsitem/putin-nagradil-grefa-ordenom-20190208-203615/?utm_source=rss&utm_medium=new_compaigns&utm_campaign=news_to_finamb does not return correctly encoded output, so the text is garbled when rendered as HTML.

Steps to Reproduce

  1. Parse the link https://www.finam.ru/analysis/newsitem/putin-nagradil-grefa-ordenom-20190208-203615/?utm_source=rss&utm_medium=new_compaigns&utm_campaign=news_to_finamb
  2. Check the content output.
  3. Try to render that content with a Cyrillic font.
  4. Instead of properly formatted text, you will see a run of '�' characters.
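
For illustration, here is a minimal sketch of one way '�' characters like these can arise. It assumes, without confirmation from this report, that the page is served in a legacy Cyrillic encoding such as windows-1251 and that the bytes are then decoded as UTF-8. The sample text and the `decode_page` helper are hypothetical and are not part of Mercury Parser.

```python
# Hypothetical sample text (based on the URL slug); not taken from the parser output.
SAMPLE = "Путин наградил Грефа орденом"


def decode_page(raw: bytes, charset: str) -> str:
    """Decode raw response bytes, substituting U+FFFD for invalid sequences."""
    return raw.decode(charset, errors="replace")


# Assumption: the server sends windows-1251 (cp1251) bytes.
raw_bytes = SAMPLE.encode("cp1251")

# Decoding with the wrong charset yields replacement characters.
wrong = decode_page(raw_bytes, "utf-8")

# Decoding with the charset the bytes were actually written in restores the text.
right = decode_page(raw_bytes, "cp1251")

print(wrong)
print(right)
```

If this is the cause, decoding the response body with the charset declared in the `Content-Type` header or the page's `<meta>` tag, instead of assuming UTF-8, should restore the text.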

Detailed Description

I use this API to parse articles in my reader app. Some of the Russian news feeds I try to use do not produce properly formatted output.

Possible Solution
