.css() splits a property declaration if it contains a semi-colon #1134

Closed
carrieje opened this issue Jan 27, 2018 · 7 comments · Fixed by #2521

carrieje commented Jan 27, 2018

Hello!

I am reporting a bug I found while running cheerio 1.0.0-rc.2: the .css() function behaves unexpectedly.
The returned value should run all the way to the closing parenthesis of my url() statement.
Instead, it stops at the semicolon (;), as if that were the end of the property declaration.

Initial settings

const cheerio = require('cheerio')
// The data URI content after the semicolon is elided as "..."
const $ = cheerio.load(`<button style="background-image: url(data:image/png;...)"></button>`)

Faulty function

$('button').css('background-image')

Expected output

'url(data:image/png;...)'

Given output

'url(data:image/png'
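
The truncation can be reproduced with a simplified model. The sketch below assumes the inline style is split on every semicolon; that is an assumption for illustration, not cheerio's confirmed implementation. `naiveParseStyle` is a hypothetical helper and `AAAA` is a placeholder payload.

```javascript
// Hypothetical helper: parse an inline style by splitting on every semicolon.
function naiveParseStyle(styleText) {
  const result = {};
  for (const declaration of styleText.split(';')) {
    const colon = declaration.indexOf(':');
    if (colon === -1) continue; // fragments without a colon are dropped
    const name = declaration.slice(0, colon).trim();
    const value = declaration.slice(colon + 1).trim();
    if (name) result[name] = value;
  }
  return result;
}

// 'AAAA' stands in for a real base64 payload.
const sampleStyle = 'background-image: url(data:image/png;base64,AAAA)';
console.log(naiveParseStyle(sampleStyle)['background-image']);
// prints: url(data:image/png
```

The second fragment, `base64,AAAA)`, contains no colon, so it is silently discarded and the value ends at the semicolon.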

carrieje commented Jan 27, 2018

It's a duplicate of #907, sorry.
I mis-searched.

But that issue has had no activity since 2016?
In the meantime, I'm going to look at how #315 was resolved, as I think it may be related.

carrieje commented Jan 27, 2018

A data URI has the following form:

 data:[<media type>][;base64],<data>

So I figured I could rely on base64 being present after a semicolon to avoid triggering the split.
I am opening a PR shortly.

EDIT: about <media type>:

The media type part may include one or more parameters, in the format attribute=value, separated by semicolons (;)

So checking for base64 alone does not cover every semicolon that can appear inside a data URI.
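
One way to handle media-type parameters and other semicolons inside url() is to split only on semicolons that sit outside parentheses and quotes. The following is a minimal sketch of that idea, not the change made in any cheerio PR; `splitDeclarations` is a hypothetical name, `AAAA` is a placeholder payload, and escaped quotes are not handled.

```javascript
// Hypothetical helper: split an inline style into declarations,
// ignoring semicolons inside parentheses or quoted strings.
function splitDeclarations(styleText) {
  const declarations = [];
  let current = '';
  let depth = 0;    // nesting level of parentheses
  let quote = null; // the open quote character, if any
  for (const ch of styleText) {
    if (quote) {
      if (ch === quote) quote = null; // closing quote
    } else if (ch === '"' || ch === "'") {
      quote = ch;
    } else if (ch === '(') {
      depth += 1;
    } else if (ch === ')') {
      depth = Math.max(0, depth - 1);
    } else if (ch === ';' && depth === 0) {
      // A top-level semicolon ends the current declaration.
      if (current.trim()) declarations.push(current.trim());
      current = '';
      continue;
    }
    current += ch;
  }
  if (current.trim()) declarations.push(current.trim());
  return declarations;
}

console.log(splitDeclarations(
  'color: red; background-image: url(data:image/png;charset=utf-8;base64,AAAA); margin: 0'
));
```

With this input the splitter returns three declarations, and the url() value keeps both of its internal semicolons.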

@carrieje

PR opened #1135

@Brandoning

This is also an issue when a URL includes a semicolon (;), even when it isn't part of a base64-encoded URL.

My client's website is old and includes semicolons in its URLs. As far as I can tell, the original URL spec treats the semicolon as a supported, reserved character, just as / and ? are. I'm not sure whether that has since been superseded, but it indicates to me that semicolons should be supported, and that there are use cases beyond base64-encoded images.

See below for an example of the source and how it's parsed. Parsing of that CSS rule ends completely at the ; character, which breaks it.

[Screenshot 2019-03-25 at 11 33 30: source and parsed output; image not available]

bruceCzK commented Jul 2, 2019

I've had the same problem. For now I'm working around it by replacing ;base64, with something like #base64, and then replacing it back before outputting the HTML string.
I hope this issue will be solved in the future.
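
The workaround can be sketched as two plain string helpers applied before loading the HTML and after serializing it. The names `protectBase64` and `restoreBase64` and the exact placeholder are assumptions for illustration, and `AAAA` is a placeholder payload.

```javascript
// Assumed placeholder; it must not otherwise occur in the document.
const PLACEHOLDER = '#base64,';

// Hide the semicolon of every ";base64," marker before parsing.
function protectBase64(html) {
  return html.split(';base64,').join(PLACEHOLDER);
}

// Put the original marker back in the serialized output.
function restoreBase64(html) {
  return html.split(PLACEHOLDER).join(';base64,');
}

const input = '<button style="background-image: url(data:image/png;base64,AAAA)"></button>';
const safe = protectBase64(input);  // load this string into the parser
const output = restoreBase64(safe); // apply to the serialized HTML afterwards
console.log(output === input);      // prints: true
```

This only hides the ;base64, marker. Values read through .css() between the two steps still contain the placeholder, and semicolons from media-type parameters or plain URLs are not covered.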

@iloginow

I am experiencing the same issue. @bruceCzK thank you for the handy workaround.

fb55 added the ❌ Bug label Dec 22, 2020
@5saviahv

I would suggest the cssom package; it gives a very browser-like interface. All parsing and serializing is done by the package.

One caveat: if the content being parsed is faulty, it throws.
