Skip to content
This repository has been archived by the owner on Jul 27, 2024. It is now read-only.

Add check for valid HTML translation #63

Merged
merged 3 commits into from
Dec 10, 2020
Merged

Conversation

yishus
Copy link
Contributor

@yishus yishus commented Dec 7, 2020

Fixes #59

I'm running into issues validating HTML with Nokogiri (img, br, svg tags are flagged as errors), I've switched it out to Nokogumbo after checking this issue. Getting good results with that so far.

en.default.json

{
    "test_html": "<img src='something.jpg'>",
    "hello_html": "<h1>wutwuti<p><img><h2",
    "images": {
        "assets": {
            "svg": "<svg viewBox='0 0 10 10' x='200' width='100'><circle cx='5' cy='5' r='4' /></svg>"
        }
    },
    "accounts": {
        "one": "<h3>Your account</h3>",
        "other": "<h3>Your accounts</h3>"
    }
}

dev check path/to/my/theme

suggestion: ValidHTMLTranslation: 'hello_html' contains invalid HTML:
1:23: ERROR: End of input in tag.
<h1>wutwuti<p><img><h2
                      ^
1:20: ERROR: Premature end of file  Currently open tags: html, h1, p.
<h1>wutwuti<p><img><h2
                   ^.

4 files inspected, 1 offenses detected

Copy link
Contributor

@macournoyer macournoyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good!

lib/theme_check/checks/valid_html_translation.rb Outdated Show resolved Hide resolved
lib/theme_check/checks/valid_html_translation.rb Outdated Show resolved Hide resolved
Gemfile Outdated Show resolved Hide resolved
@yishus yishus force-pushed the valid_translation_html branch 2 times, most recently from 7d1e9b2 to 412d622 Compare December 8, 2020 09:05
@yishus yishus marked this pull request as ready for review December 8, 2020 09:05
@yishus
Copy link
Contributor Author

yishus commented Dec 8, 2020

@macournoyer okay ready for review!

Add tests and fixes

Fix style
Copy link
Contributor

@macournoyer macournoyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work! We can now replace the old theme-lint 🎉

lib/theme_check/checks/valid_html_translation.rb Outdated Show resolved Hide resolved
offenses = analyze_theme(
ThemeCheck::ValidHTMLTranslation.new,
"locales/en.default.json" => JSON.dump(
hello_html: "<h1>Hello, world</h1>"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! One more thing. Can you test it will not cause errors w/ self-closing tags: <img src="...> or <br>.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated to use Nokogumbo, details in the PR description above ☝️

@yishus yishus requested a review from jbourassa December 9, 2020 07:16
Copy link
Contributor

@macournoyer macournoyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice find! Didn't know about nokogumbo. Looks perfect.

Copy link
Contributor

@jbourassa jbourassa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spent some time looking for a parser that would be easy to use and did not come across Nokogumbo, but looks great. Well done!

Left some minor comments but from my perspective this is a great addition.


def html_key?(keys)
pluralized_key = keys[-2] if keys.length > 1
keys[-1].end_with?('_html') || pluralized_key.end_with?('_html')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check for "is this a pluralization set of keys?" is a bit different than what we had in ThemeCheck::MatchingTranslations but I think that's a good enough heuristic, and we can tweak those later if need be.

def pluralization?(hash)
hash.all? do |key, value|
PLURALIZATION_KEYS.include?(key) && !value.is_a?(Hash)
end
end

'hello_html' contains invalid HTML:
1:17: ERROR: Premature end of file Currently open tags: html, h1.
<h1>Hello, world
^
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error messages are a bit odd when we're parsing single lines, wonder if we can tweak them? Here for instance

1:17:

Line 1 most likely isn't line 1

Premature end of file

From a consumer's perspective, a string is being parsed, not a file, right?

Currently open tags: html, h1.

Looks like html is automatically injected here.

Do you think there's a way for us to improve these?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with all the points above, but sadly most of these information are only available through the to_s method. We are able to obtain the line and column values, and also str1 which returns the HTML standard error code. (source)

If we are printing only the error code, it will look like

suggestion: ValidHTMLTranslation: 'hello_html' contains invalid HTML: eof-in-tag, generic-parser.

Which is still okay, it doesn't point to the exact location of the error, but can be helpful

@macournoyer
Copy link
Contributor

Merging this. We can tweak the error message in another PR. I've used dirty regexes in other places to do it :/

@macournoyer macournoyer merged commit 1ba8539 into main Dec 10, 2020
@yishus yishus deleted the valid_translation_html branch December 10, 2020 13:16
@macournoyer macournoyer temporarily deployed to rubygems January 6, 2021 18:53 Inactive
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Lint HTML inside translation keys
3 participants