ITV: metadata now (may) contain HTML tags #27399

Closed
5 tasks done
Vangelis66 opened this issue Dec 13, 2020 · 1 comment

Vangelis66 commented Dec 13, 2020

Checklist

  • I'm reporting a broken site support
  • I've verified that I'm running youtube-dl version 2020.12.12
  • I've checked that all provided URLs are alive and playable in a browser
  • I've checked that all URLs and arguments with special characters are properly quoted or escaped
  • I've searched the bugtracker for similar issues including closed ones

Verbose log

(Using SSH Tunnel to the UK... 😜 )
youtube-dl --proxy="https://localhost:1080" --console-title "https://www.itv.com/hub/family-guy/2a4259a0327" --get-description --skip-download -v =>

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--proxy=https://localhost:1080', '--console-title', 'https://www.itv.com/hub/family-guy/2a4259a0327', '--get-description', '--skip-download', '-v']
[debug] Encodings: locale cp1253, fs mbcs, out cp737, pref cp1253
[debug] youtube-dl version 2020.12.12
[debug] Python version 3.4.4 (CPython) - Windows-Vista-6.0.6003-SP2
[debug] exe versions: ffmpeg 4.3.1, ffprobe 4.3.1, phantomjs 2.1.1, rtmpdump 2.4
[debug] Proxy map: {'http': 'https://localhost:1080', 'https': 'https://localhost:1080'}
[debug] Default format spec: bestvideo+bestaudio/best

<strong class="episode-info__series">Series 17 - Episode 15</strong> - The guys volunteer to chaperone the high school prom, where Quagmire hits it off with Courtney, only to discover she is in fact his daughter.

Description

In the latest version, 2020.12.12, commit 225646c restored metadata (i.e. description) support for ITV Hub episodes; many thanks to the devs! 🥇
The code line responsible is

'description': strip_or_none(get_element_by_class('episode-info__synopsis', webpage)),

Sadly, this solution now ends up with the description containing HTML tags, as posted in the log:

<strong class="episode-info__series">Series 17 - Episode 15</strong> - The guys volunteer to chaperone the high school prom, where Quagmire hits it off with Courtney, only to discover she is in fact his daughter.

Now, I understand those tags are there in the page source,

[screenshot: ITV-METADATA]

but when actually downloading an episode with the flags --write-description --add-metadata, the description gets added into the MP4 file's tag (as expected), but my player of choice (MPC-BE) can't handle the tags:

[screenshot: MPCBE]

Thus, I'm humbly requesting that the metadata extraction code be slightly amended to remove those HTML tags; many thanks for your highly-praiseworthy efforts, take care, best festive wishes! 🤶 😄
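
For illustration only, here is a minimal sketch of the kind of tag stripping being requested, using just the Python standard library. The names `_TagStripper` and `strip_html_tags` are hypothetical and are not part of youtube-dl; the extractor would more plausibly reuse a helper from `youtube_dl.utils` (one called `clean_html` exists there), and this sketch makes no claim about how the maintainers actually fixed it.

```python
from html.parser import HTMLParser


class _TagStripper(HTMLParser):
    """Hypothetical helper: keeps only the text nodes of an HTML fragment."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into plain text.
        super().__init__(convert_charrefs=True)
        self._chunks = []

    def handle_data(self, data):
        # Called for text between tags; the tags themselves are ignored.
        self._chunks.append(data)

    def text(self):
        return ''.join(self._chunks)


def strip_html_tags(fragment):
    """Return fragment without tags and with whitespace collapsed.

    None (element not found) and empty results both come back as None.
    """
    if fragment is None:
        return None
    parser = _TagStripper()
    parser.feed(fragment)
    parser.close()  # flush any text still buffered by the parser
    return ' '.join(parser.text().split()) or None


# The description exactly as reported in the log above.
raw = ('<strong class="episode-info__series">Series 17 - Episode 15</strong>'
       ' - The guys volunteer to chaperone the high school prom, where '
       'Quagmire hits it off with Courtney, only to discover she is in '
       'fact his daughter.')

print(strip_html_tags(raw))
```

Applied to the description from the log, this leaves the "Series 17 - Episode 15 - The guys volunteer..." text without the `<strong>` wrapper, which is what a player such as MPC-BE would then display.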

@Vangelis66 (Author)

Many thanks indeed for the swift fix! 👍

ThirumalaiK pushed a commit to ThirumalaiK/youtube-dl that referenced this issue Jan 28, 2021