Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate JSON files to be consumed by Vue.js UI #219

Merged
merged 3 commits into from
Jun 11, 2024
Merged

Conversation

dan-niles
Copy link
Collaborator

@dan-niles dan-niles commented Jun 4, 2024

This PR adds the functionality for generating JSON files to be consumed by Vue.js UI as mentioned in #212. The JSON files being generated are:

  • channel.json : Contains information about the channel
  • playlists.json : Contains a list of all playlists in the channel
  • Seperate JSON files under playlists folder for each playlist : Contains info about the playlist and the videos it contains. (This includes the "Uploads" of the channel as well)
  • Seperate JSON files under videos folder for each video : Contains info about the specific video

The JSON files are added "on the fly" to the ZIM as mentioned in #209. However the video files and branding files are still being copied from the build_dir.

Closes #212
Closes #220

@dan-niles dan-niles marked this pull request as ready for review June 4, 2024 06:11
@dan-niles dan-niles self-assigned this Jun 4, 2024
@dan-niles dan-niles requested a review from benoit74 June 4, 2024 06:11
Copy link
Collaborator

@benoit74 benoit74 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you.

Main changes required are around the usage of type hints. I know mostly no type hints are used so far in current codebase, but we would like to have as many as possible. So please try at least to add as many as reasonably possible in the code you add or modify.

Second concern is that I don't know how to test this, do we have at least a test channel or set of playlists where we could check how this code behaves? If not, we should probably build one so that we avoid reinventing the test scenario every time we test this. It should be small enough so that we can scrape it quite quickly, but big enough to exercise most scenarii of the scraper.

Finally, I'm a bit ashamed to continue to create code with absolutely no tests. I don't know if you have any suggestion regarding how we could automate testing this at least a bit. Maybe an issue for later, scraper has worked without tests for year, we can maybe continue like this for few more years ^^

CHANGELOG Outdated Show resolved Hide resolved
scraper/src/youtube2zim/scraper.py Outdated Show resolved Hide resolved
scraper/src/youtube2zim/scraper.py Outdated Show resolved Hide resolved
scraper/src/youtube2zim/scraper.py Outdated Show resolved Hide resolved
scraper/src/youtube2zim/scraper.py Outdated Show resolved Hide resolved
scraper/src/youtube2zim/scraper.py Outdated Show resolved Hide resolved
scraper/src/youtube2zim/scraper.py Outdated Show resolved Hide resolved
scraper/src/youtube2zim/scraper.py Outdated Show resolved Hide resolved
scraper/src/youtube2zim/scraper.py Outdated Show resolved Hide resolved
@dan-niles
Copy link
Collaborator Author

dan-niles commented Jun 4, 2024

@benoit74 I've included a zimdump of a ZIM file I generated using this new code here. Can you check the JSON files and let me know if everything is ok?

zimdump.zip

(I removed the video files since there is a limit of 25MB for things I can upload here)

@dan-niles
Copy link
Collaborator Author

Additionally, you can try running the scraper on the new testing channel created in #221 with the following command:

youtube2zim --api-key "<YOUR-API-KEY>" --type channel --id "UC8elThf5TGMpQfQc_VE917Q" --name "openZIM_testing" --output ../output

@dan-niles dan-niles requested a review from benoit74 June 4, 2024 15:24
@dan-niles dan-niles marked this pull request as draft June 7, 2024 14:20
@dan-niles
Copy link
Collaborator Author

dan-niles commented Jun 7, 2024

I added a new videoSlug property to the PlaylistPreview schema, that points to the video to be redirected to when clicking on the preview.

Edit 1 - Added joined_date, channel_name and channel_description to channel.json.
Edit 2 - I found a way to get the video durations from the YouTube API. I updated the scraper to fetch the durations and include them in the JSON files.

@dan-niles dan-niles marked this pull request as ready for review June 7, 2024 15:21
Copy link
Collaborator

@benoit74 benoit74 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more things to fix (I'm still very pleased by the PR, I just dug a bit more and found more stuff ^^):

  • see inline comments
  • remove what we now know for sure is obsolete code and files:
    • make_html_files function (and maybe others I've missed)
    • the whole templates folder and files
  • translation of the UI was pretty minimalist, most probably partially broken, now it is completely broken by this PR (and previous one) ; I suggest to remove all stuff linked to the previous translation since it is just useless/missleading now (locale subfolder + installation of locales-all in Dockerfile) and open a new issue "Create structure for new Youtube UI translations" (I can open the issue with details if you want)

scraper/src/youtube2zim/scraper.py Outdated Show resolved Hide resolved
scraper/src/youtube2zim/schemas.py Outdated Show resolved Hide resolved
scraper/src/youtube2zim/schemas.py Outdated Show resolved Hide resolved
scraper/src/youtube2zim/schemas.py Outdated Show resolved Hide resolved
@dan-niles
Copy link
Collaborator Author

dan-niles commented Jun 10, 2024

Removed obsolete code and files:

  • make_html_files method and related imports in a605e72
  • Unused templates (home.html, article.html) in a605e72
  • locale folder in 725720d

Copy link
Collaborator

@benoit74 benoit74 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two more small comments regarding the CHANGELOG.

Please take them into account and you are now good to go to rewrite the Git history of your branch to keep only a handful of meaningful commits, as discussed yesterday. I don't really mind about the exact number, but I don't want to keep stuff which is only linked to the back-and-forth exchanges we had on this PR and do not really make much sense for the history (long term goal of commit history is to understand why changes have been done, not how). Force push will be needed after the rewrite obviously. Do not hesitate to ask for help if what I expect is too blurry for you or if you face issues.

Ask for a new review once this has been done and I will merge this.

CHANGELOG Outdated
@@ -14,6 +14,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Initialize new Vue.js project in `zimui` subfolder
- Update dependencies in pyproject.toml (pydantic, pyhumps, python-slugify)
- Update scraper to generate JSON files for `zimui` (#212)
- Remove template files (home.html, article.html) and `make_html_files` method in scraper.py
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please be more specific with something like Remove old UI files and methods: template files (home.html, article.html) and make_html_files method in scraper.py

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed in 9981785

CHANGELOG Outdated
@@ -14,6 +14,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Initialize new Vue.js project in `zimui` subfolder
- Update dependencies in pyproject.toml (pydantic, pyhumps, python-slugify)
- Update scraper to generate JSON files for `zimui` (#212)
- Remove template files (home.html, article.html) and `make_html_files` method in scraper.py
- Remove locale folder and files
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please be more specific with something like Remove broken locale folder and files used for translation ; translation will be restored with #222

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed in 0d7050f

Copy link
Collaborator

@benoit74 benoit74 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

@benoit74 benoit74 merged commit dff542d into main Jun 11, 2024
7 checks passed
@benoit74 benoit74 deleted the generate_json branch June 11, 2024 12:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fetch video durations when scraping Generate JSON files to be consumed by Vue.js UI
2 participants