ko: refresh all messages #1058
This refreshes all messages with the latest messages from the course:
MDBOOK_OUTPUT='{"xgettext": {"pot-file": "messages.pot"}}' mdbook build -d po
msgmerge --update po/ko.po po/messages.pot
Part of #925.
After this PR, there are new fuzzy messages:
% msgfmt -o /dev/null --statistics po/ko.po
1328 translated messages, 225 fuzzy translations, 217 untranslated messages.
We should fix those in smaller follow-up PRs. |
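To track progress across the follow-up PRs, the summary line printed by `msgfmt --statistics` can be turned into numbers. The following is a minimal sketch, not part of the project's tooling: the helper name `parse_msgfmt_stats` is hypothetical, and it assumes the English-locale wording shown above (a category may be missing from the line when its count is zero).

```python
import re


def parse_msgfmt_stats(line: str) -> dict:
    """Extract message counts from a `msgfmt --statistics` summary line."""
    patterns = {
        "translated": r"(\d+) translated message",
        "fuzzy": r"(\d+) fuzzy translation",
        "untranslated": r"(\d+) untranslated message",
    }
    counts = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, line)
        # Treat a category that does not appear in the line as zero.
        counts[key] = int(match.group(1)) if match else 0
    return counts


# The summary line quoted in this PR.
stats = parse_msgfmt_stats(
    "1328 translated messages, 225 fuzzy translations, 217 untranslated messages."
)
total = sum(stats.values())
print(f"{stats['translated'] / total:.1%} of {total} messages are translated")
```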
@jiyongp and @jooyunghan, would it be useful if I set up a job which does this every week or every two weeks? |
Every two weeks, or perhaps monthly? |
That could be a good frequency as well! In general, running |
re: automatic refresh. If we can avoid it, I'd rather skip the automatic refresh and do manual refreshes, roughly monthly (possibly more or less often depending on my bandwidth), since I think a slightly outdated doc is better than an up-to-date doc where English and Korean are mixed. FYI: I was in the process of fixing the 200+ fuzzy translations first. When I am done, I will submit the change. Adding the missing translations will then be done in a separate PR. |
The infrastructure does not support this right now: we publish all languages using the up-to-date English Markdown files. We could of course change this and this is the topic of google/mdbook-i18n-helpers#16.
I'm happy to drop the PR here: it was auto-generated and has no special value to me. I only put it up to get the PO file aligned with the current texts in case that was easier for people to work with. However, when fixing the 200+ fuzzy messages, I still suggest doing this in two steps: one PR with the auto-generated update (essentially this PR) and one or more PRs which remove the fuzzy markers and translate new messages. |
Publishing all translations at the same time doesn't sound right. Why can't we publish each language independently from each other? Just to clarify: does the below process work?
Step 1 is triggered manually by a human translator (e.g., me). Correct? |
We could do this, it would just complicate the publishing pipeline. GitHub pages are published by essentially uploading a zip file. We currently build this zip file using what we find in To publish older versions of the translation, we would likely
Yes, that is the correct workflow. Should we document this better in
In step 2, removing the fuzzy markers means "look at the English text and update the translation to match". In the ideal case, the changes will be small and so the diff from
+#, fuzzy
 msgid ""
- "The course takes four days"
+ "The course takes three days"
and thus immediately tell you what to fix in the translation. This is an argument for letting translators do the updating (instead of a cron job). However, this tactic only works if a) the translations are complete or nearly complete and b) the translators run this regularly. The PR here shows the problematic case where a ton of updates go in at once and then the diff is no longer useful to anybody. |
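When many entries become fuzzy at once, it can help to list which messages carry the marker before starting on them. The sketch below is an illustration only: `fuzzy_msgids` is a hypothetical helper, it handles just the common `#, fuzzy` / `msgid` / `msgstr` layout, it does not decode escape sequences, and the sample entries are made up for the example.

```python
def fuzzy_msgids(po_text: str) -> list[str]:
    """Return the msgid of every entry flagged fuzzy in the text of a PO file."""
    fuzzy_ids = []
    is_fuzzy = False  # a "#, ..." flag line containing "fuzzy" preceded this entry
    in_msgid = False  # currently inside the (possibly multi-line) msgid string
    parts = []
    for raw in po_text.splitlines():
        line = raw.strip()
        if line.startswith("#,"):
            flags = [flag.strip() for flag in line[2:].split(",")]
            is_fuzzy = "fuzzy" in flags
        elif line.startswith("msgid "):
            in_msgid = True
            # Drop the surrounding double quotes of the first string.
            parts = [line[len("msgid "):].strip()[1:-1]]
        elif in_msgid and line.startswith('"'):
            # Continuation line of a multi-line msgid.
            parts.append(line[1:-1])
        elif line.startswith("msgstr") or line.startswith("msgid_plural"):
            if in_msgid and is_fuzzy:
                fuzzy_ids.append("".join(parts))
            in_msgid = False
            is_fuzzy = False
    return fuzzy_ids


# Made-up sample: one fuzzy entry and one ordinary entry.
sample = '''
#, fuzzy
msgid ""
"The course takes three days"
msgstr "(existing translation)"

msgid "Welcome"
msgstr "(existing translation)"
'''
print(fuzzy_msgids(sample))
```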
Sorry, I don't understand. Let's assume Later on, I follow the four steps and update Can this be done? |
I think the missing piece here is that the Korean translation (the To do what you suggest, we need to archive the translations when we publish them. We need the archive to be able to publish the site since the publish action overwrites any existing content previously published. If we automatically publish a
That would populate the working directory with the new English HTML and the HTML from all the translations (a separate GitHub action would generate these zip files). The actual publication to GitHub Pages can then happen afterwards. |
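The archive idea discussed above could be sketched as follows. This is an assumption-heavy illustration, not the project's publishing action: it supposes one zip file per language named `<lang>.zip` in a hypothetical archive directory, and unpacks each into a per-language subdirectory of the working directory before the actual upload to GitHub Pages.

```python
import zipfile
from pathlib import Path


def assemble_site(archive_dir: Path, site_dir: Path) -> list[str]:
    """Unpack every <lang>.zip found in archive_dir into site_dir/<lang>/.

    Returns the language codes that were unpacked, in sorted order.
    """
    languages = []
    for archive in sorted(archive_dir.glob("*.zip")):
        lang = archive.stem  # "ko.zip" -> "ko" (naming scheme is an assumption)
        target = site_dir / lang
        target.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        languages.append(lang)
    return languages
```

With such a step, a translation would stay at the version stored in its archive until a translator publishes a newer one, while the English pages are rebuilt on every change.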
Aha, that's what I was missing. The markdown is like an app, and these po files are localized resources that the app uses. So, we can't build an up-to-date app showing English and 1 month-old app showing Korean at the same time. It however is a bit odd since in our case the po files have most (or all?) information. Having to depend on the markdown feels unfortunate. I feel like the rendered HTML pages in the translated languages should be stored in this git project (i.e. the final step of the translation is to generate the pages). Then the act of releasing the language will be just copying the HTML pages to the web server. |
Yes, that's the right analogy!
You are correct: today, the PO files happen to be lossless. However, I'm working on removing most of the Markdown from the files. I have an example in this
into
without having any
Precisely, I think we can do exactly that. I think many people are surprised about the current system, and trading stale information for a more complete translation seems good. We write on each page when it was last updated and we could possibly put a bit of JavaScript code on the page to detect if the English page is newer. |
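The staleness check mentioned here comes down to comparing two "last updated" dates. The comment proposes doing this with JavaScript on the page; the sketch below only shows the comparison logic, in Python, with a hypothetical function name and the assumption that both dates are available as ISO `YYYY-MM-DD` strings.

```python
from datetime import date


def translation_is_stale(english_updated: str, translation_updated: str) -> bool:
    """Return True when the English page changed after the translated page."""
    return date.fromisoformat(english_updated) > date.fromisoformat(translation_updated)
```

A page could then show a notice linking to the English version whenever the check returns `True`.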
Let me close this for now since the update will happen together with the new translations. |