<td> tags with no <tr> are not properly handled #422

Closed
defunctzombie opened this issue Mar 24, 2014 · 10 comments · Fixed by #985

Comments

@defunctzombie

I am being passed some broke-ass HTML where a <tr> tag is missing before a <td>. This causes my queries for table rows to fail, because cheerio (or maybe the htmlparser) does not scope the <td> inside a logical <tr> the way a browser does.

var cheerio = require('cheerio');
// Note the missing opening <tr> before the <td>.
var str = '<table><td>bar</td></tr></table>';
var $ = cheerio.load(str);
console.log($.html());

@defunctzombie
Author

Cheerio v0.13.1 btw (latest from what I can tell)

@fb55
Member

fb55 commented Mar 25, 2014

The parser doesn't add opening tags. Known issue.
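
To illustrate what "adding the opening tag" would mean for the input above, here is a minimal pre-processing sketch. It is not how cheerio or htmlparser2 work internally, and it is not a spec-compliant tree builder: it is a naive regex pass that inserts a <tr> whenever a <td> or <th> opens inside a table with no row currently open. The helper name insertMissingRowOpeners is hypothetical, and the sketch assumes simple markup (no comments, scripts, or attribute values containing ">").

```javascript
// Hypothetical helper for illustration only; not part of cheerio or htmlparser2.
function insertMissingRowOpeners(html) {
  // Matches opening and closing tags; group 1 is "/" for closing tags,
  // group 2 is the tag name.
  var tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>/g;
  var out = '';
  var last = 0;
  // One entry per open <table>: true while a <tr> is open in that table.
  var stack = [];
  var match;

  while ((match = tagPattern.exec(html)) !== null) {
    var closing = match[1] === '/';
    var name = match[2].toLowerCase();

    // Copy the text between the previous tag and this one unchanged.
    out += html.slice(last, match.index);
    last = tagPattern.lastIndex;

    var top = stack.length - 1;
    if (name === 'table') {
      if (closing) {
        stack.pop();
      } else {
        stack.push(false);
      }
    } else if (name === 'tr' && top >= 0) {
      stack[top] = !closing;
    } else if ((name === 'td' || name === 'th') && !closing && top >= 0 && !stack[top]) {
      // A cell opened with no row open: supply the missing <tr>.
      out += '<tr>';
      stack[top] = true;
    }
    out += match[0];
  }
  return out + html.slice(last);
}

console.log(insertMissingRowOpeners('<table><td>bar</td></tr></table>'));
// prints "<table><tr><td>bar</td></tr></table>"
```

The repaired string could then be passed to cheerio.load as usual. Markup that already has its rows is returned unchanged, and cells outside any table are left alone.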

@imshengli

I have the same question!

@defunctzombie
Author

@fb55 Has there been any work on this issue? Is it an issue that can be fixed or should be closed?

@fb55
Member

fb55 commented Oct 18, 2014

@defunctzombie This can definitely be fixed, but it'll take a lot of effort. I'd like to keep this issue open as long as it's valid.

@rpedela

rpedela commented Jun 23, 2015

+1

Are there any known workarounds? I have the exact same problem.

@rpedela

rpedela commented Jun 23, 2015

I answered my own question. jsdom does not have this problem as it just uses jQuery. The API is a bit ugly, but it works. If you use Node.js instead of io.js, use jsdom 3.x.x. It might be worth looking at how jQuery handles this case.

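npm install jsdom@3

var jsdom = require('jsdom').jsdom;
// `body` is assumed to be a string holding the HTML to parse.
var window = jsdom(body).parentWindow;
// Load jQuery into the jsdom window, then print the parsed document body.
jsdom.jQueryify(window, 'http://code.jquery.com/jquery-2.1.1.js', function () {
    var $ = window.$;
    console.log($('body').html());
});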

@defunctzombie
Author

What is the perf like for jsdom now? I remember that being a bit of a showstopper for my needs when evaluating solutions.

@rpedela

rpedela commented Jun 23, 2015

I don't know. It seems fine, but performance isn't that important for my use case. You can load a local version of jQuery to eliminate one HTTP call.

@fb55
Member

fb55 commented Dec 22, 2020

Fixed by #985

@fb55 fb55 closed this as completed Dec 22, 2020