-
Notifications
You must be signed in to change notification settings - Fork 63
Use JSON5 for en.json
#1949
Use JSON5 for en.json
#1949
Conversation
Storybook and Tailwind configuration previews: Ready Storybook: https://wordpress.github.io/openverse-frontend/_preview/1949 Please note that GitHub pages takes a little time to deploy newly pushed code, if the links above don't work or you see old versions, wait 5 minutes and try again. You can check the GitHub pages deployment action list to see the current status of the deployments. |
Size Change: +3 kB (0%) Total Size: 817 kB
ℹ️ View Unchanged
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having comments in the POT files will be really great for translators! The file generation works well.
Does it also save the en.json
to the branch that we use for translations? I'm not sure how to test that, but we definitely need to check (if you haven't yet) that it's working, and the translations are uploaded to GlotPress correctly.
Could you also add some documentation saying that en.json5
is the main file where we should put all the strings and comments, and en.json
and all other translated files should not be edited?
I think we also need to add the en.json5
-> en.json
conversion step to pnpm dev
. en.json
is git-ignored. So, when anyone downloads the repo and try to run pnpm dev
, they will not have en.json
, so will not be able to run the app. I tried removing all of the locale JSON files, and confirmed that pnpm dev
cannot run without them.
One more thing: to make sure that the Playwright tests always use the same translation files, we saved 3 locale JSON files to tests/locales
. When running the tests, we run i18n:copy-test-locales
to copy them to the locales
folder. Could you add en.json
creation to that step, too, so that the tests always have the English locales as well?
The Thanks @obulat for pointing out the usages I missed! I've made the other improvements you suggested. |
src/locales/scripts/read-i18n.js
Outdated
/** | ||
* In `en.json` we use a limited version of JSON5. It is valid JSON5 but only | ||
* uses a limited set of features and syntax. This package provides functions | ||
* to read this format and convert it into regular JSON. | ||
* | ||
* - All values must be strings. This is okay because we're only dealing with | ||
* i18n translation files. | ||
* - Only single line `// ...` comments are allowed. These comments describe the | ||
* content that follows it. Inline comments are not allowed. | ||
* | ||
* ```json5 | ||
* { | ||
* // documentation about `key-a` | ||
* key: 'value', | ||
* 'key-b': { | ||
* 'key-c': 'value-c', | ||
* }, | ||
* } | ||
* ``` | ||
* | ||
* @typedef {{ [key: string]: string | SimJson }} SimJson | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This module is clever. Reading it, however, it struck me that you've essentially re-written a very basic version of a JavaScript AST parser. We could use instead use an existing AST parser and treat the json5
file as regular JavaScript and get the following benefits:
- No need to maintain a file parser
- Typed AST
- Support for multi-line comments
@babel/parser
does a great job at all of these and is thoroughly documented. Check out this AST Explorer example: https://astexplorer.net/#/gist/fd2baa6817ccaa87c23704ead7d3e7f8/fb97d60a32d361ab1da42d9a19c88a2b22762733
You'll notice the object had to be wrapped in parens. This could be done in code by opening the file and adding an open and closing paren to the object. Alternatively, we could write the file as plain JavaScript and have it like so:
module.exports = {
// ... the translation object
Instead of using the JSON5 extension but only supporting a limited subset of it.
If you click on the object properties in the AST explorer link, you'll see a leadingComments
array that includes all the leading comments. Navigating through nested blocks is also easy, thanks to the uniform AST. We could transform this into the Entry
AST for simplicity’s sake and to pare down the unnecessary parts of the babel AST: the goal of using babel's parser is mostly to avoid having to maintain our own AST parser 🙂
Finally, for generating the en.json
, if we use json5
we can use the json5
reference implementation to get the comment-free version of the object. If we use JavaScript for the file instead, then we can just import the object and write it out as JSON without the need for an additional package.
None of these suggestions are blocking, but I am hesitant to approve without further clarification for why we should add a simplified JSON5 parser to our codebase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I searched everywhere for a JSON-AST package but didn't find any that supported JSON5 and had been actively maintained. The JSON5 parsers I found didn't have the option for reading to AST, only parsing it which strips all comments.
I hadn't considered the idea of treating the file as JS and using @babel/parser
on it. Thank you for the suggestion! That might be worth investigating for the consistency of having a standard AST, although
- the rudimentary parser is only a tiny bit of a maintenance burden because it's only a bunch of
if
statements - Babel's parser is overkill for a JSON file and then getting the information we want from the AST it outputs will also add a fair bit of code
For generating the en.json
I didn't want to use the json5
parser because we already have all the information we need to write the JSON in the Entry
object so it made sense to me to just use that to output the JSON.
To be honest, I'm being biased towards the hack parser because I spent a lot of time writing and testing it but I'm open to using a full-fledged parser like Babel if it improves the quality of the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I refactored read-i18n.js
to use the @babel/parser
package and it while the LoC didn't change much, the code looks much cleaner. Thanks again for the suggestion, wish I had asked you for pointers sooner!
Not necessarily requesting changes yet, just blocking the PR for further discussion. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! I had a comment typed out yesterday responding to some of your points (which I think are all pretty good) but the one I will still share is that yes, I think while it doesn't reduce the overall amount of code it is code that is (a) more readable and (b) more familiar to most people. Parsers are just a bit more niche compared to tree-walking code, which most people learn pretty early on. It's nice to support multi-line comments as well 😁 I appreciate your willingness to explore other avenues here 🚀
Fixes
Fixes #1815 by @dhruvkb
Description
This PR
en.json
withen.json5
and reformats iten.json5
fileen.json5
file#.
vs#:
)en.json
fileen.json
to be like any other languagei18n:get-translations
scriptCode in many places depends on
en.json
. This continues to work once theen.json
file has been generated by runningi18n:get-translations
.Screenshots
Testing Instructions
pnpm i18n
.Additionally CI is passing and unit tests that depend on locales are passing. So it's good to start.
Checklist
Update index.md
).main
) or a parent feature branch.Developer Certificate of Origin
Developer Certificate of Origin