-
-
Notifications
You must be signed in to change notification settings - Fork 905
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Memory access violation / Crash during GC #1985
Comments
@aleksandrs-ledovskis Thanks for opening this issue, and sorry you're experiencing this problem. I'll take a deeper look at this over the next few days. |
I couldn't reproduce with Ruby, but the explanation makes sense. I think the XSLT parser shouldn't touch memory that it didn't allocate. Anyway, I was able to write a C program that uses libxml2 to demonstrate the problem: #include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <libxml/parser.h>
#include <libxml/xmlschemas.h>
static xmlNode *
find_text_node(xmlNode * a_node)
{
xmlNode *cur_node = NULL;
for (cur_node = a_node; cur_node; cur_node = cur_node->next) {
if (cur_node->type == XML_TEXT_NODE) {
return cur_node;
} else {
xmlNode *found = find_text_node(cur_node->children);
if (found) {
return found;
}
}
}
return NULL;
}
int main(int argc, char *argv[]) {
char * source = "<?xml version=\"1.0\" encoding=\"UTF-8\" ?><xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">\n<xs:element name=\"foo\" type=\"xs:string\"/></xs:schema>";
xmlDocPtr doc = xmlReadMemory(source, strlen(source), NULL, NULL, 0);
xmlNodePtr root = xmlDocGetRootElement(doc);
xmlSchemaParserCtxtPtr ctx;
xmlSchemaPtr schema;
xmlNode * text = find_text_node(root);
printf("node type: Element, name: %s %d\n", text->name, xmlIsBlankNode(text));
ctx = xmlSchemaNewDocParserCtxt(doc);
schema = xmlSchemaParse(ctx);
printf("node type: Element, name: %s %d\n", text->name, xmlIsBlankNode(text));
exit(0);
} If I enable malloc scribbling on MacOS, it will segv:
I think upstream should probably not call that free function when a schema is created from a document. |
@tenderlove Thanks for taking a time to investigate and building a C repro program. It is somewhat strange that you couldn't repeat the crash. While I was testing initially with CRuby 2.5.7 (to match with PROD version), I have just retested with
Env and build chain (if that matters):
Interestingly enough, 2.8.0 crashed less (3 out of 5 runs). But the traces still pointed out to same Regarding your note that this looks like a problem in upstream/Libxml2, their "official guideline" is quite involved (and seems outdated), so I would like to ask @nwellnhof to confirm if mailing list is still the way to go or reporting on GitLab instance (https://gitlab.gnome.org/GNOME/libxml2/issues) is sufficient? Nokogiri is using Libxml2 |
You can report bugs either on the libxml2 GitLab instance or the mailing list. Regarding this issue, |
If per Nick's comment it's unfeasible to change the way how Libxml2 operates, maybe at least Nokogiri could detect (and I would prefer to know right away that unsafe operation is attempted rather than to come back to crashed app's carcass. |
tbf, I only ran the Ruby program twice. Your description was clear enough and reading the code made it obvious there's a bug. I figured we'd need a pure C test case anyway if we plan to fix upstream. I will try to think of a fix for libxml2, but even if I can't I'll still file a ticket upstream.
I think this is a possible alternative. The document should know what nodes have been exposed to Ruby, and if any of them are "blank", we can raise an exception. It'll make constructing the schema a little more expensive, but I agree it's better to get an exception than find a segv later. |
Probably needs a better exception, but this would prevent the segvs: diff --git a/ext/nokogiri/xml_schema.c b/ext/nokogiri/xml_schema.c
index da2774ba..42bbeb8d 100644
--- a/ext/nokogiri/xml_schema.c
+++ b/ext/nokogiri/xml_schema.c
@@ -133,6 +133,31 @@ static VALUE read_memory(VALUE klass, VALUE content)
return rb_schema;
}
+/* Schema creation will remove and deallocate "blank" nodes.
+ * If those blank nodes have been exposed to Ruby, they could get freed
+ * out from under the VALUE pointer. This function checks to see if any of
+ * those nodes have been exposed to Ruby, and if so we should raise an exception.
+ */
+static int has_blank_nodes_p(VALUE cache)
+{
+ long i;
+
+ if (NIL_P(cache)) {
+ return 0;
+ }
+
+ for (i = 0; i < RARRAY_LEN(cache); i++) {
+ xmlNodePtr node;
+ VALUE element = rb_ary_entry(cache, i);
+ Data_Get_Struct(element, xmlNode, node);
+ if (xmlIsBlankNode(node)) {
+ return 1;
+ }
+ }
+
+ return 0;
+}
+
/*
* call-seq:
* from_document(doc)
@@ -152,6 +177,10 @@ static VALUE from_document(VALUE klass, VALUE document)
/* In case someone passes us a node. ugh. */
doc = doc->doc;
+ if (has_blank_nodes_p(DOC_NODE_CACHE(doc))) {
+ rb_raise(rb_eRuntimeError, "Creating a schema from a document that has blank nodes exposed to Ruby is dangerous");
+ }
+
ctx = xmlSchemaNewDocParserCtxt(doc);
errors = rb_ary_new(); |
Something I think is interesting is that you can pretty easily demonstrate that turning a document in to a schema removes the node: require 'nokogiri'
doc = <<~doc
<?xml version="1.0" encoding="UTF-8" ?><xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="foo" type="xs:string"/></xs:schema>
doc
xsd_doc = Nokogiri::XML(doc)
p xsd_doc.root.children.find(&:blank?) # Finds a node
xsd_doc = Nokogiri::XML(doc)
Nokogiri::XML::Schema.from_document(xsd_doc)
p xsd_doc.root.children.find(&:blank?) # nil Also this will reliably SEGV with MallocScribble: require 'nokogiri'
doc = <<~doc
<?xml version="1.0" encoding="UTF-8" ?><xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="foo" type="xs:string"/></xs:schema>
doc
xsd_doc = Nokogiri::XML(doc)
node = xsd_doc.root.children.find(&:blank?) # Finds a node
Nokogiri::XML::Schema.from_document(xsd_doc)
p node.empty?
|
Sorry for the wait. I submitted an issue upstream because I couldn't figure out a good way to fix this: https://gitlab.gnome.org/GNOME/libxml2/issues/148 I'll submit a PR that has my workaround patch. |
This commit works around a bug in libxml2 where parsing schemas can result in dangling pointers which can lead to a segv. Upstream bug is here: https://gitlab.gnome.org/GNOME/libxml2/issues/148 Fixes #1985
@aleksandrs-ledovskis I'm curious if the PR @tenderlove wrote will help your case; it raises an exception if any blank nodes have been wrapped by a Ruby object. Your original description indicated this was a production problem for you, from which I infer you're creating Ruby objects for the DOM in your schema. Before I merge it, just wanted to gut-check whether it's helpful or not. |
@flavorjones probably getting an exception is better than a segv for a case we know we can't handle. 😬 |
@tenderlove Thanks for the patch and upstream report! 🍪 I can confirm that locally built 74abb4f is correctly failing with
In fact not. The failing app in question deals with incoming SOAP requests and validates them against defined XML schema (read from file). A quirk in current app's implementation is that Approximately like this: xsd_doc = File.open(path_to_xsd_file) { |file| Nokogiri::XML(file) }
if xsd_doc.xpath("//xs:import[@namespace='http://schemas.xmlsoap.org/soap/envelope/']").empty?
xsd_doc.xpath("//xs:schema")[0].children.first.add_previous_sibling("import tag with envelope schemaLocation")
end
xsd_schema = Nokogiri::XML::Schema.from_document(xsd_doc) Obviously, the offending call sequence is The latter variant is safe/doesn't cause an exception with 74abb4f patch - while
Yes, the @tenderlove's patch is helpful. If in some other code path/project the |
@aleksandrs-ledovskis Great, thanks for the explanation - I didn't want to merge this if you didn't have a path forward to do what you need to do. |
I've merged the PR. I'll be cutting a bugfix release today - likely v1.10.9 - with this fix. |
This commit works around a bug in libxml2 where parsing schemas can result in dangling pointers which can lead to a segv. Upstream bug is here: https://gitlab.gnome.org/GNOME/libxml2/issues/148 Fixes #1985
v1.10.9 was released yesterday |
Was tasked with fixing crash in one production application that was pointing to Nokogiri.
Two day's worth of debugging resulted in following.
My understanding of the issue
Check relevant
# --N--
markers in dehydrated reproduction code.--1--
XML schema document is parsed and by Nokogiri as Libxml2's
xmlDoc
.--2--
xsd_doc.first_element_child.children.to_a
generates axmlNodeSet
with bunch ofxmlNode
's, including one withnode->type == XML_TEXT_NODE
andnode->content == '\n'
.--3--
Schema::from_document
executesxmlSchemaParse
in Libxml2xmlschemas.c
whichcalls
and later
The
xmlNode
representing'\n'
is flushed away usingxmlFree
(ruby_xfree
).--4--
A GC pressure is built up by doing some random array generation. One of
RData
VALUE
s will almost certainly hit place occupied by prunedxmlNode
.During on of array capacity increases, a GC is auto-started which executes
push_mark_stack
ingc.c
with one ofdata
being aRData
pointing toxmlNode
This is direct cause of crash.
While GC runs mark & sweep routines, during one of iterations of which, the
gc_mark_stacked_objects
topop_mark_stack
call is done which returnsRData
which has been pushed on mark stack. It is detected asT_DATA
with customdmark
(mark
in Nokogiri'sxml_node.c
).As node was free'd and its memory was reused by others objects,
xmlDocPtr doc = node->doc;
returns inaccessible memory location. Any attempt to use it down the run yields segfault.To Reproduce
Environment
master
nokogiri and bundledlibxml2/libxslt
, compiled with-O0
(it was a royal pain to get working)Full C level backtrace
The text was updated successfully, but these errors were encountered: