Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #1266: modifies sanitize.js to allow Telescope to parse img tags with data uris #1282

Merged
merged 3 commits into from
Nov 6, 2020

Conversation

chrispinkney
Copy link
Contributor

@chrispinkney chrispinkney commented Nov 3, 2020

Issue This PR Addresses

Fixes #1266

Type of Change

  • Bugfix: Change which fixes an issue
  • New Feature: Change which adds functionality
  • Documentation Update: Change which improves documentation
  • UI: Change which improves UI

Description

  • Modified sanitize.js to allow for img tags to use data uris as src so they can be successfully parsed and indexed by Telescope.
  • Adds a test to ensure data uris are sanitized properly.
  • Adds two more http and https schema tests for img tag with a data uri src.
  • Adds blockquote test and cite sanitation.

Hey everyone!

As per issue #1266, this PR allows images which have a src data uri type to be properly indexed. It passes all the tests on my end thus showing that data uris are being allowed by Telescope.

I tried to ensure that this works as expected by purging my redis db via docker exec -it telescope_redis_1 redis-cli flushall but it seems that once a post is indexed, it is how it is and forever shall be the way it was initially indexed. So instead of going back in time I disallowed all html attributes and waited for a post to come in. Once one arrived I noticed that it had no tags at all, just plaintext, so I reverted back to normal and allowed imgs with a data uri.

While this isn't exactly a perfect sign that everything works as expected, given that npm test sanitize-html runs and returns that all tests pass, and specifically: <img src="data:image/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==" /> returned as expected, I feel that Telescope should now properly parse imgs with a data uri.

Let me know, thanks!

Checklist

  • Quality: This PR builds and passes our npm test and works locally
  • Tests: This PR includes thorough tests or an explanation of why it does not
  • Screenshots: This PR includes screenshots or GIFs of the changes made or an explanation of why it does not (if applicable)
  • Documentation: This PR includes updated/added documentation to user exposed functionality or configuration variables are added/changed or an explanation of why it does not(if applicable)

  * added two more http and https schema tests for img tag with a data uri src
  * added blockquote test and cite sanitation
@chrispinkney
Copy link
Contributor Author

chrispinkney commented Nov 3, 2020

So this is my first solo rebase that I did because I had like 3 commits and figured it'd be too messy. I also pushed them to my origin by accident, so I git rebase HEAD~3 -i and squashed the other two, then force pushed this to my branch and made a PR. I think I did all the steps right so let me know if I didn't!

@humphd
Copy link
Contributor

humphd commented Nov 3, 2020

@chrispinkney can you fix the title of this PR? fixes issue-1266 is not useful, as it doesn't reference what you're working on (the issue number is good, but describe the work too).

@manekenpix manekenpix changed the title fixes issue-1266 Fixes #1266: Missing src attribute on img elements in posts when using data URIs Nov 3, 2020
Copy link
Contributor

@humphd humphd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add a test for protocol-less URLs? (i.e., <img src="//www.telescope.com/image.jpg" />)

src/backend/utils/html/sanitize.js Outdated Show resolved Hide resolved
test/sanitize-html.test.js Show resolved Hide resolved
test/sanitize-html.test.js Outdated Show resolved Hide resolved
@chrispinkney
Copy link
Contributor Author

chrispinkney commented Nov 3, 2020

Can you also add a test for protocol-less URLs? (i.e., <img src="//www.telescope.com/image.jpg" />)

I was going to add that too but there's already one in place that does the same thing here.

Ah nvm, misread. Will do.

@chrispinkney chrispinkney changed the title Fixes #1266: Missing src attribute on img elements in posts when using data URIs Fixes #1266: modifies sanitize.js to allow Telescope to parse img tags with data uris Nov 3, 2020
@chrispinkney
Copy link
Contributor Author

@chrispinkney can you fix the title of this PR? fixes issue-1266 is not useful, as it doesn't reference what you're working on (the issue number is good, but describe the work too).

Is this better?

@humphd humphd requested review from c3ho and tonyvugithub November 6, 2020 15:54
@tonyvugithub
Copy link
Contributor

tonyvugithub commented Nov 6, 2020

@chrispinkney @humphd I wonder why the github action for macos failed. Should we retry with another push with a revised single commit and see if all are passed? Right now one check has failed.

This is the error in the macos build
[HPM] Proxy rewrite rule created: "^/admin/build/log" ~> "/log" [ 2020-11-03 23:27:02.081 ] ERROR (3824 on Mac-1604444996117.local): Something went wrong with search indexing

@humphd
Copy link
Contributor

humphd commented Nov 6, 2020

That failure is not actually causing the tests to fail (i.e, the test expects the failure). What's causing the failure is some process hanging and blocking the tests from exiting correctly (see #1284):

A worker process has failed to exit gracefully and has been force exited. This is likely caused by tests leaking due to improper teardown. Try running with --detectOpenHandles to find leaks.

Test Suites: 26 passed, 26 total
Tests:       169 passed, 169 total
Snapshots:   0 total
Time:        15.689 s
Ran all test suites.
Error: The operation was canceled.

@tonyvugithub
Copy link
Contributor

tonyvugithub commented Nov 6, 2020

@humphd I noticed that too. So I guess this PR is good to go then if thats the case

@humphd humphd merged commit 2f76f77 into Seneca-CDOT:master Nov 6, 2020
manekenpix pushed a commit to manekenpix/telescope that referenced this pull request Nov 11, 2020
…rse img tags with data uris (Seneca-CDOT#1282)

* fixes issue-1266, adds a test to ensure data URIs are sanitized properly
  * added two more http and https schema tests for img tag with a data uri src
  * added blockquote test and cite sanitation

* removed irrelevant test, removed allowed attribute, and added a protocolless test.

* added helpful comment regarding mixed content to img tag over http test
@chrispinkney chrispinkney deleted the issue-1266 branch April 22, 2021 21:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Missing src attribute on img elements in posts when using data URIs
4 participants