There might be hidden GC problem in ext/pg.c #47

ged · 2010-09-19T20:20:10Z

In totally different project (rocaml) I tracked down and identified (and fixed) this GC bug relating to rb_ary_new2. Then I went looking similar issues in the net and found this http://redmine.ruby-lang.org/issues/show/3174 which relates to pg and segfault.

This is kind of bug, which only happens when GC is triggered while code is executing in C code. And rb_ary_new2 has already created. And initialization of it is not complete yet! And code is using rb_newobj (maybe thru some non-visible way; like rb_tainted_str_new or rb_tainted_str_new2 and ending up in str_alloc/NEWOBJ).

So, in normal "toy" examples, you cannot reproduce it, because GC is not hit in trivial, short toy examples. You need memory shortage and big return array (created in foreign code side, not Ruby side; that means long row in ruby-pg case) to trigger this one.

I have identified two possible places where this could happen in pg.c. Search for rb_ary_new2 there. Ignore ones with rb_ary_new2(0) calls. You should find make_column_result_array and pgresult_fields methods (lines 3593 and 3652). In those methods, array is created and code starts to fill it. Problem is, code uses method which end up calling NEWOBJ macro in Ruby code, which might GC. And in that case, GC will find uninitialized memory pointer, tries to follow it and then it segfaults. Oops.

How to fix it? Simply initializing all those array fields before calling any rb__new methods. Qnil is enough. Something like following code (adjust to match your coding standards):

  ary = rb_ary_new2(n);
  for(i = 0; i < n; i++) {
      RARRAY(ary)->ptr[i] = Qnil;
  }

At least this worked with rocaml project and stopped it segfaulting while passing large arrays from OCaml to Ruby via rocaml generated C code.

Use your own judgement to decide if this actually affects ruby-pg code. And please note, I'm not expect on Ruby GC. This report might be total bs.

JMP

The text was updated successfully, but these errors were encountered:

ged · 2010-09-20T14:24:18Z

Original comment by Michael Granger (Bitbucket: ged, GitHub: ged).

Thanks for bringing this to my attention! I'll familiarize myself with the problem a bit more and then apply the fix.

ged · 2011-01-06T22:50:57Z

Original comment by Anonymous.

Perhaps rb_ary_new2() is calling malloc() for RARRAY(ary)->ptr.
malloc() does not bzero() the buffer on most platforms.
I'd say it would be a bug in MRI rb_ary_new2().
When you call Array.new(100) you get an Array with 100 nils.

ged · 2011-01-19T16:18:40Z

Original comment by Michael Granger (Bitbucket: ged, GitHub: ged).

I'll look at this for 0.11.0.

ged · 2012-03-10T02:08:32Z

Original comment by Michael Granger (Bitbucket: ged, GitHub: ged).

Avoid use of uninitialized Arrays (fixes #47).

Instead of declaring an Array at the beginning of fetching results and pushing values onto it to return at the end (which could result in a segfault if the GC runs during a NEWOBJ), collect an array of VALUEs on the stack and create the Array on return with rb_ary_new4().

ged closed this as completed Mar 10, 2012

ged added major bug 0.10.1 labels Oct 8, 2019

ged mentioned this issue Oct 8, 2019

Change URLs to point to Github #43

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

There might be hidden GC problem in ext/pg.c #47

There might be hidden GC problem in ext/pg.c #47

ged commented Sep 19, 2010

ged commented Sep 20, 2010

ged commented Jan 6, 2011

ged commented Jan 19, 2011

ged commented Mar 10, 2012

There might be hidden GC problem in ext/pg.c #47

There might be hidden GC problem in ext/pg.c #47

Comments

ged commented Sep 19, 2010

ged commented Sep 20, 2010

ged commented Jan 6, 2011

ged commented Jan 19, 2011

ged commented Mar 10, 2012