youtube-dl solution for YouTube throttling DASH streams #199
Comments
Pafy uses its own download function for downloading streams, so fixes made to the downloader in youtube-dl won't be inherited. As a side note, it would be nice if we could use the youtube-dl downloader to download the streams, but I have no idea whether that's possible. |
I thought pafy used youtube-dl for something, but if that something is not downloading then I wonder what it is. |
Pafy uses youtube-dl for retrieving info about a video (title, description, stream info, etc.).
Pafy uses YouTube API for playlist and channel retrieval (I don't know about the internal backend, maybe in that too). Do you have any examples where the throttling is evident? I can try to port the fix over here. |
Hello, I have exactly the same problem: pafy downloads have been very slow for a few weeks. When I download with youtube-dl there is no problem. |
@vn-ki you can try to download any m4a dash audio stream. |
We could either port youtube-dl fix onto our download function or use I am currently trying to implement the latter. It seems doable, but I'm quite busy over the next few weeks and this seems to be an important issue. If I hit a breakthrough and if @ids1024 accepts this approach, I will make a PR. |
This sounds like a good idea. But using something from youtube-dl (without copying the code) isn't an option with the internal backend, which doesn't depend on youtube-dl. Pafy originally did not use youtube-dl, but instead implemented similar functionality. But youtube-dl is better maintained and did not have the various issues pafy had, so I changed it to use youtube-dl (#109). Some people complained about this (for some legitimate reasons), so I added back the original code as a separate backend. I'm not sure what to do about the internal backend; perhaps it should just be removed. I disabled it by default in the last release (726c1a7) because it has bugs, and people were reporting issues that would not have happened if they had youtube-dl installed. I don't know if many/any people still rely on it (or, for that matter, if it is currently working). |
Youtube-dl is used to get the stream urls; then pafy just downloads them normally. |
Is there any progress with this? |
I have my exams till 1st of March. I'll surely work on this after that. If someone can work on this before that, then cheers! |
That's great! |
I have no idea what's going on but I came across something interesting (while wondering about ytdl-org/youtube-dl#15271). Passing headers
>>> import requests
>>> import pafy
>>> content = pafy.new('https://www.youtube.com/watch?v=sJa-1MKCx3w')
>>> audio = content.getbestaudio()
>>> # without header
>>> slow_resp = requests.get(audio.url)
>>> with open('slow_download.webm', 'wb') as fout:
...     fout.write(slow_resp.content)
...
>>> # with header
>>> lucky_header = {'Range': 'bytes=0-'}
>>> fast_resp = requests.get(audio.url, headers=lucky_header)
>>> with open('fast_download.webm', 'wb') as fout:
...     fout.write(fast_resp.content)
...
Just compare how long the fetch takes without and with the header and you'll see. |
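To put a number on the comparison above, here is a minimal timing sketch. It is only an illustration: `timed_fetch` and its `opener` parameter are hypothetical names that are not part of pafy or youtube-dl, and it uses Python 3's `urllib.request` rather than `requests`.

```python
import time
import urllib.request


def timed_fetch(url, headers=None, opener=urllib.request.urlopen):
    """Fetch `url` and return (bytes_received, elapsed_seconds).

    `opener` is a hypothetical hook so the function can be exercised
    without network access; by default it is urllib's urlopen.
    """
    request = urllib.request.Request(url, headers=headers or {})
    start = time.monotonic()
    with opener(request) as response:
        body = response.read()
    return len(body), time.monotonic() - start


# Usage (needs network access and a stream URL such as audio.url):
# print(timed_fetch(stream_url))                         # no Range header
# print(timed_fetch(stream_url, {"Range": "bytes=0-"}))  # with Range header
```

Reading the whole body into memory keeps the sketch short; for large streams a chunked read would be more appropriate.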
Wow! |
@ritiek Yup. It should work for a single, uninterrupted download. But I don't know whether this would work when you are resuming the download (it might). Did you try resuming the download? From what I understood from looking at the http_downloader of youtube-dl, you have to do more work if you want to resume. Their downloader looked more robust, so I thought using their downloader would be a better choice than trying to port the fix over. But if this works, doing this would be a lot easier than trying to use their downloader class. |
@embryo10 Yep, I actually made a local working pafy fork. I'll make a PR soon. EDIT:
Since YouTube supports resuming, we can request partial content using
@vn-ki That should probably work as well but gotta test that. |
OK did it too..
else:
    resuming_opener = build_opener()
    resuming_opener.addheaders = [("Range", "bytes=%s-" % offset)]
    response = resuming_opener.open(self.url)
and everything works fast again.... |
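The snippet above always sends a `Range` header, even when nothing has been downloaded yet. A self-contained sketch of that idea, written for Python 3 (the thread itself uses Python 2), could look like this. The helper names `resume_offset`, `build_resume_request` and `resume_download` are hypothetical and do not exist in pafy.

```python
import os
import urllib.request


def resume_offset(path):
    """Return how many bytes of `path` already exist on disk (0 if none)."""
    return os.path.getsize(path) if os.path.exists(path) else 0


def build_resume_request(url, offset):
    """Build a request that asks the server for bytes from `offset` onward."""
    request = urllib.request.Request(url)
    # Sent even when offset is 0, since the thread reports that
    # "bytes=0-" alone already avoids the slowdown in some cases.
    request.add_header("Range", "bytes=%d-" % offset)
    return request


def resume_download(url, path, block_size=64 * 1024):
    """Append the remaining bytes of `url` to `path`."""
    offset = resume_offset(path)
    with urllib.request.urlopen(build_resume_request(url, offset)) as response:
        # 206 means the range was honoured; anything else restarts the file.
        mode = "ab" if response.status == 206 and offset else "wb"
        with open(path, mode) as fout:
            while True:
                block = response.read(block_size)
                if not block:
                    break
                fout.write(block)
```

Checking for a 206 status before appending is a design choice of this sketch: if the server ignores the range and sends the full file, appending would corrupt the partial download.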
Actually, I think this is not a great solution either. Long videos like https://www.youtube.com/watch?v=ffQM8ALVJV8 throttle with pafy even when passing Range header (but downloads at full speed with youtube-dl). |
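One thing that might be worth trying for long videos is to request the stream in bounded chunks instead of one open-ended range. This is only a guess that the slowdown applies to long open-ended requests; nothing in this thread confirms it, and nothing here confirms that youtube-dl works this way (it does expose an `--http-chunk-size` option). The 10 MiB default and the helper names below are assumptions made for illustration.

```python
def chunk_ranges(total_size, chunk_size=10 * 1024 * 1024):
    """Yield inclusive (start, end) byte ranges that cover `total_size` bytes.

    The 10 MiB default is an arbitrary choice for this sketch.
    """
    start = 0
    while start < total_size:
        end = min(start + chunk_size, total_size) - 1
        yield start, end
        start = end + 1


def range_header(start, end):
    """Build a bounded Range header for one chunk."""
    return {"Range": "bytes=%d-%d" % (start, end)}


# Example: a 25-byte resource fetched in 10-byte chunks.
for start, end in chunk_ranges(25, 10):
    print(range_header(start, end))
```

Each chunk would then be fetched with its own request and appended to the output file in order.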
Check it and sadly you are right... |
@vn-ki Any news from the throttling front? :o) |
@embryo10 I have implemented the basic http downloader from youtube-dl. This means throttling is fixed (Yay!). But I have to make sure the This week is pretty heavy for me, so please wait for 1 week (I'll try to do it before this weekend). Within that time frame, I will fix this (at least for the youtube-dl backend). EDIT: I did take a look at porting their fix onto our download function. It is doable (not too complex) but would require a lot of rewriting. I want to eventually port that over here. |
This is great news! :o) |
I'm getting this:
Traceback (most recent call last):
File "D:\Apps\DEV\PROJECTS\KataLib\secondary.py", line 1185, in process
self.get_stream()
File "D:\Apps\DEV\PROJECTS\KataLib\secondary.py", line 1240, in get_stream
stream.download(self.m4a_file, quiet=True, callback=self.progress_down)
File "D:\Apps\DEV\PYTHON\Python27\lib\site-packages\pafy\backend_shared.py", line 575, in download
self._youtubedl_download(*args, **kwargs)
File "D:\Apps\DEV\PYTHON\Python27\lib\site-packages\pafy\backend_shared.py", line 642, in _youtubedl_download
downloader.real_download(filename, infodict)
File "D:\Apps\DEV\PYTHON\Python27\lib\site-packages\youtube_dl\downloader\http.py", line 341, in real_download
return download()
File "D:\Apps\DEV\PYTHON\Python27\lib\site-packages\youtube_dl\downloader\http.py", line 298, in download
'elapsed': now - ctx.start_time,
File "D:\Apps\DEV\PYTHON\Python27\lib\site-packages\youtube_dl\downloader\common.py", line 372, in _hook_progress
ph(status)
File "D:\Apps\DEV\PYTHON\Python27\lib\site-packages\pafy\backend_shared.py", line 600, in progress_hook
rate = s['speed']/1024
TypeError: unsupported operand type(s) for /: 'NoneType' and 'int'
I just overwrote the pafy files with yours. |
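The traceback shows `s['speed']` being `None` when the progress hook divides it by 1024. youtube-dl reports the speed as `None` when it is not yet known, so the hook needs a guard. The following is a minimal sketch of such a guard, not the actual fix that was pushed; `rate_kib_per_s` is a hypothetical helper name.

```python
def rate_kib_per_s(status):
    """Return the download rate in KiB/s, or None when the speed is unknown.

    `status` is the dict passed to a youtube-dl progress hook; its
    'speed' entry is in bytes per second and may be None or missing.
    """
    speed = status.get("speed")
    if speed is None:
        return None
    return speed / 1024.0


print(rate_kib_per_s({"speed": 2048}))  # known speed
print(rate_kib_per_s({"speed": None}))  # speed not yet measured
```

The caller then decides how to display an unknown rate instead of crashing on the first progress update.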
I used the https://www.youtube.com/watch?v=ffQM8ALVJV8 link that you used. |
@embryo10 I did some changes locally and forgot to push! Sorry for that! Try the code now! |
@ritiek Can you try against the new HEAD? It should be fixed now (it was a math error on my part). |
@vn-ki Wow, yes. It works great now. Amazing. Thanks for the great work! |
The download speed seems to be OK now (!!!!) |
It seems that the |
Added
if savedir:
    filename = os.path.join(savedir, filename)
before the |
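As a standalone illustration of the two added lines, the path handling can be written as a small function. `resolve_filepath` is a hypothetical name, and where exactly this belongs inside pafy is not shown in the thread.

```python
import os


def resolve_filepath(filename, savedir=None):
    """Return `filename` placed inside `savedir` when one is given."""
    if savedir:
        return os.path.join(savedir, filename)
    return filename


print(resolve_filepath("audio.m4a"))
print(resolve_filepath("audio.m4a", "downloads"))
```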
@embryo10 Fixed and created PR! |
Great! Waiting for a new release... |
Wait for the merge, at least! |
OK ;o) |
Seems like this issue returned. Anybody else noticed the slowdown lately? |
Nothing wrong here yet.. |
@embryo10 I tried this and got a super slow download on my end (latest version installed):
youtube-dl -f140,264 https://www.youtube.com/watch?v=xegAZE0ez04 |
I've just updated to youtube-dl-2018.3.26.1 |
@embryo10 Same version here. Weird. I'll try a few more and report back. |
@embryo10 Try this one: Painfully slow on my end. I pulled this one much quicker a few weeks ago. |
After the recent problems with the throttling of DASH streams (audio or video), youtube-dl seems to have solved the problem at last!
Pafy 0.5.4 still has it though.
Trying to get an m4a with pafy (m4astreams[-1]) still gets throttled, but getting the same stream directly with youtube-dl 2018.02.11 is as fast as it was before the speed limiting.
Am I doing something wrong?
Any ideas?