[YouTube] Fix hashtags links extraction and escape HTML links
The webCommandMetadata object is nested inside a commandMetadata object, so it is not accessible from the root of the navigationEndpoint object. The corresponding check has been moved to the end of the endpoint-specific parsing: because the webCommandMetadata object is present almost everywhere, checking it earlier would have changed the URLs of some endpoints, such as uploader URLs (from channel IDs to handles).

getUrlFromNavigationEndpoint no longer throws a ParsingException, and therefore neither do getTextFromObject and getUrlFromObject, so the methods that caught ParsingExceptions thrown by these methods have been updated.

URLs in the HTML output of getTextFromObject are now escaped properly to provide valid HTML to clients. As YouTube descriptions are in HTML format (except for the fallback on the JSON player response, which is plain text and is only used when there is no visual metadata or after a breaking change), the returned URLs are escaped, so tests that check for URLs containing escaped characters had to be updated (this was only the case for YoutubeStreamExtractorDefaultTest.DescriptionTestUnboxing).
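The lookup order and the escaping described above can be illustrated with a minimal, self-contained sketch. It is not the code from this commit: the class name, the helper names, the use of plain `Map` objects instead of the extractor's JSON types, the `browseEndpoint`/`browseId`/`url` field names, and the `https://www.youtube.com` prefix are all assumptions made for illustration. Only the `commandMetadata` / `webCommandMetadata` nesting, the "generic lookup last" ordering, the absence of a thrown exception, and the escaping of URLs in HTML output come from the commit message.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical sketch; names and JSON shapes are illustrative assumptions.
public final class NavigationEndpointSketch {

    private NavigationEndpointSketch() { }

    /** Escapes the characters that have a special meaning in HTML text and attributes. */
    public static String escapeHtml(final String s) {
        final StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            final char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                case '\'': sb.append("&#39;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Returns the nested object stored under key, or an empty map if it is absent. */
    @SuppressWarnings("unchecked")
    private static Map<String, Object> child(final Map<String, Object> parent,
                                             final String key) {
        final Object value = parent.get(key);
        return value instanceof Map
                ? (Map<String, Object>) value
                : Collections.<String, Object>emptyMap();
    }

    /** Resolves a URL from an endpoint; returns null instead of throwing when none is found. */
    public static String urlFromNavigationEndpoint(final Map<String, Object> endpoint) {
        // 1. Endpoint-specific parsing first, so channel links keep their channel ID
        //    (assumed field names: browseEndpoint.browseId).
        final Object browseId = child(endpoint, "browseEndpoint").get("browseId");
        if (browseId instanceof String && ((String) browseId).startsWith("UC")) {
            return "https://www.youtube.com/channel/" + browseId;
        }

        // 2. Generic fallback last: webCommandMetadata lives inside commandMetadata,
        //    not at the root of the navigation endpoint (assumed field name: url).
        final Object url =
                child(child(endpoint, "commandMetadata"), "webCommandMetadata").get("url");
        if (url instanceof String) {
            return "https://www.youtube.com" + url;
        }
        return null;
    }

    /** Builds an HTML link with an escaped URL, or escaped plain text if there is no URL. */
    public static String toHtmlLink(final String text, final String url) {
        if (url == null) {
            return escapeHtml(text);
        }
        return "<a href=\"" + escapeHtml(url) + "\">" + escapeHtml(text) + "</a>";
    }

    public static void main(final String[] args) {
        final Map<String, Object> hashtag = Map.of("commandMetadata",
                Map.of("webCommandMetadata", Map.of("url", "/hashtag/example")));
        System.out.println(toHtmlLink("#example", urlFromNavigationEndpoint(hashtag)));
    }
}
```

Checking the generic `webCommandMetadata` URL only after the specific endpoints is what keeps a channel link resolving to its channel ID even when the same endpoint also carries a handle-based web URL. Escaping the URL before placing it in the `href` attribute is what turns `&` in query strings into `&amp;`, which is why tests that compare against description HTML need the escaped form.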
Showing 4 changed files with 58 additions and 73 deletions.