Ruby test suite failures with libffi-3.4.2 #102

voxik · 2022-01-11T19:02:27Z

Hi, libffi-3.4.2 recently landed in Fedora, and since then I have been observing strange failures in the Ruby test suite:

  1) Failure:
TestAutoload#test_autoload_fork [/builddir/build/BUILD/ruby-3.1.0/test/ruby/test_autoload.rb:380]:
[ruby-core:86410] [Bug #14634].
Expected #<Test::Unit::AssertionFailedError: Expected #<Process::Status: pid 3249430 SIGABRT (signal 6) (core dumped)> to be success?.> to be nil.
  2) Failure:
TestAutoload#test_autoload_fork [/builddir/build/BUILD/ruby-3.1.0/tool/lib/zombie_hunter.rb:6]:
Expected [[3249431, #<Process::Status: pid 3249431 SIGABRT (signal 6) (core dumped)>]] to be empty.
  3) Failure:
TestRand#test_fork_shuffle [/builddir/build/BUILD/ruby-3.1.0/test/ruby/test_rand.rb:276]:
#<Process::Status: pid 3249809 SIGABRT (signal 6) (core dumped)>.
Expected #<Process::Status: pid 3249809 SIGABRT (signal 6) (core dumped)> to be success?.
  4) Failure:
TestRand#test_rand_reseed_on_fork [/builddir/build/BUILD/ruby-3.1.0/test/ruby/test_rand.rb:306]:
[ruby-core:41209]
pid 3249817 killed by SIGABRT (signal 6) (core dumped)
  5) Failure:
TestIO#test_copy_stream_socket7 [/builddir/build/BUILD/ruby-3.1.0/test/ruby/test_io.rb:995]:
Expected #<Process::Status: pid 3251775 SIGABRT (signal 6) (core dumped)> to be success?.
  6) Failure:
JSONGeneratorTest#test_broken_bignum [/builddir/build/BUILD/ruby-3.1.0/test/json/json_generator_test.rb:305]:
Failed assertion, no message given.
  7) Failure:
TestBeginEndBlock#test_internal_errinfo_at_exit [/builddir/build/BUILD/ruby-3.1.0/test/ruby/test_beginendblock.rb:175]:
Expected #<Process::Status: pid 3252235 SIGABRT (signal 6) (core dumped)> to not be signaled?.
  8) Failure:
TestProcess#test_signals_work_after_exec_fail [/builddir/build/BUILD/ruby-3.1.0/test/ruby/test_process.rb:2428]:
Expected #<Process::Status: pid 3252558 SIGABRT (signal 6) (core dumped)> to be success?.
  9) Failure:
TestProcess#test_threading_works_after_exec_fail [/builddir/build/BUILD/ruby-3.1.0/test/ruby/test_process.rb:2464]:
Expected #<Process::Status: pid 3252910 SIGABRT (signal 6) (core dumped)> to be success?.
 10) Failure:
TestProcess#test_process_detach [/builddir/build/BUILD/ruby-3.1.0/test/ruby/test_process.rb:2348]:
Expected #<Process::Status: pid 3253011 SIGABRT (signal 6) (core dumped)> to be success?.
 11) Failure:
TestThread#test_blocking_mutex_unlocked_on_fork [/builddir/build/BUILD/ruby-3.1.0/test/ruby/test_thread.rb:1223]:
[ruby-core:55102] [Bug #8433].
<false> expected but was
<nil>.
 12) Failure:
TestThread#test_fork_in_thread [/builddir/build/BUILD/ruby-3.1.0/test/ruby/test_thread.rb:1243]:
[ruby-core:62070] [Bug #9751]
pid 3253221 killed by SIGABRT (signal 6) (core dumped).
Expected #<Process::Status: pid 3253221 SIGABRT (signal 6) (core dumped)> to not be signaled?.
 13) Failure:
TestThread#test_fork_while_locked [/builddir/build/BUILD/ruby-3.1.0/test/ruby/test_thread.rb:1254]:
[ruby-core:85940] [Bug #14578].
Expected #<Process::Status: pid 3253241 SIGABRT (signal 6) (core dumped)> to be success?.
 14) Failure:
TestThread#test_fork_while_locked [/builddir/build/BUILD/ruby-3.1.0/tool/lib/zombie_hunter.rb:6]:
Expected [[3253249, #<Process::Status: pid 3253249 SIGABRT (signal 6) (core dumped)>],
 [3253250, #<Process::Status: pid 3253250 SIGABRT (signal 6) (core dumped)>]] to be empty.
Finished tests in 885.270988s, 24.2186 tests/s, 3106.1596 assertions/s.
21440 tests, 2749793 assertions, 14 failures, 0 errors, 56 skips
ruby -v: ruby 3.1.0p0 (2021-12-25 revision fb4df44d16) [x86_64-linux]
make: *** [uncommon.mk:822: yes-test-all] Aborted (core dumped)

Run on their own, these tests pass just fine, but running e.g. test/ruby/test_autoload.rb -v -n '/TestAutoload#test_autoload_fork/' together with test/fiddle/test_import.rb gives a smaller reproducer.

Trying to reduce the issue even further, I have reduced test/fiddle/test_import.rb to the following shape:

# coding: US-ASCII
# frozen_string_literal: true
begin
  require_relative 'helper'
  require 'fiddle/import'
rescue LoadError
end

module Fiddle
  module LIBC
    extend Importer
    dlload LIBC_SO, LIBM_SO

    CallCallback = bind("void call_callback(void*, void*)"){ | ptr1, ptr2|
#      f = Function.new(ptr1.to_i, [TYPE_VOIDP], TYPE_VOID)
#      f.call(ptr2)
    }
  end


end if defined?(Fiddle)

And use just miniruby:

$ gdb --args ./miniruby -I./lib -I. -I.ext/common  ./tool/runruby.rb --extout=.ext  -- --disable-gems "./test/runner.rb" --ruby="./miniruby -I./lib -I. -I.ext/common  ./tool/runruby.rb --extout=.ext  -- --disable-gems" --excludes-dir=./test/excludes --name='!/memory_leak/'  test/fiddle/test_import.rb test/ruby/test_autoload.rb -v -n '/TestAutoload#test_autoload_fork/'
GNU gdb (GDB) Fedora 11.1-6.fc36
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./miniruby...
warning: File "/builddir/build/BUILD/ruby-3.1.0/.gdbinit" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load".
To enable execution of this file add
	add-auto-load-safe-path /builddir/build/BUILD/ruby-3.1.0/.gdbinit
line to your configuration file "/builddir/.config/gdb/gdbinit".
To completely disable this security protection add
	set auto-load safe-path /
line to your configuration file "/builddir/.config/gdb/gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual.  E.g., run from the shell:
	info "(gdb)Auto-loading safe path"
(gdb) r
Starting program: /builddir/build/BUILD/ruby-3.1.0/miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems ./test/runner.rb --ruby=./miniruby\ -I./lib\ -I.\ -I.ext/common\ \ ./tool/runruby.rb\ --extout=.ext\ \ --\ --disable-gems --excludes-dir=./test/excludes --name=\!/memory_leak/ test/fiddle/test_import.rb test/ruby/test_autoload.rb -v -n /TestAutoload\#test_autoload_fork/
Download failed: No route to host.  Continuing without debug info for /builddir/build/BUILD/ruby-3.1.0/system-supplied DSO at 0x7ffff7fc4000.
Download failed: No route to host.  Continuing without debug info for /lib64/libz.so.1.
Download failed: No route to host.  Continuing without debug info for /lib64/libgmp.so.10.
Download failed: No route to host.  Continuing without debug info for /lib64/libcrypt.so.2.
Download failed: No route to host.  Continuing without debug info for /lib64/libm.so.6.
Download failed: No route to host.  Continuing without debug info for /lib64/libc.so.6.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
process 13364 is executing new program: /builddir/build/BUILD/ruby-3.1.0/ruby
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.34.9000-36.fc36.x86_64 gmp-6.2.1-1.fc36.x86_64 libxcrypt-4.4.27-1.fc36.x86_64 zlib-1.2.11-30.fc35.x86_64
Download failed: No route to host.  Continuing without debug info for /lib64/libz.so.1.
Download failed: No route to host.  Continuing without debug info for /lib64/libgmp.so.10.
Download failed: No route to host.  Continuing without debug info for /lib64/libcrypt.so.2.
Download failed: No route to host.  Continuing without debug info for /lib64/libm.so.6.
Download failed: No route to host.  Continuing without debug info for /lib64/libc.so.6.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Run options: 
  --seed=54837
  "--ruby=./miniruby -I./lib -I. -I.ext/common  ./tool/runruby.rb --extout=.ext  -- --disable-gems"
  --excludes-dir=./test/excludes
  --name=!/memory_leak/
  -v
  -n
  /TestAutoload#test_autoload_fork/

# Running tests:

[Detaching after vfork from child process 13401]
[1/0] TestAutoload#test_autoload_fork[New Thread 0x7ffff4ccf640 (LWP 13402)]
[New Thread 0x7ffff4bae640 (LWP 13403)]
[New Thread 0x7ffff4a8d640 (LWP 13404)]
[New Thread 0x7ffff496c640 (LWP 13405)]
[New Thread 0x7ffff484b640 (LWP 13406)]
[New Thread 0x7ffff472a640 (LWP 13407)]
[Detaching after fork from child process 13408]
[Detaching after fork from child process 13409]
[Detaching after fork from child process 13410]
 = 0.39 s

  1) Failure:
TestAutoload#test_autoload_fork [/builddir/build/BUILD/ruby-3.1.0/test/ruby/test_autoload.rb:380]:
[ruby-core:86410] [Bug #14634].
Expected #<Test::Unit::AssertionFailedError: Expected #<Process::Status: pid 13409 SIGABRT (signal 6) (core dumped)> to be success?.> to be nil.

  2) Failure:
TestAutoload#test_autoload_fork [/builddir/build/BUILD/ruby-3.1.0/tool/lib/zombie_hunter.rb:6]:
Expected [[13410, #<Process::Status: pid 13410 SIGABRT (signal 6) (core dumped)>]] to be empty.

Finished tests in 0.392854s, 2.5455 tests/s, 12.7274 assertions/s.
1 tests, 5 assertions, 2 failures, 0 errors, 0 skips

ruby -v: ruby 3.1.0p0 (2021-12-25 revision fb4df44d16) [x86_64-linux]

Thread 1 "ruby" received signal SIGABRT, Aborted.
0x00007ffff78a764c in __pthread_kill_implementation () from /lib64/libc.so.6
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.34.9000-36.fc36.x86_64 gmp-6.2.1-1.fc36.x86_64 libxcrypt-4.4.27-1.fc36.x86_64 zlib-1.2.11-30.fc35.x86_64
(gdb) bt
#0  0x00007ffff78a764c in __pthread_kill_implementation () from /lib64/libc.so.6
#1  0x00007ffff785a656 in raise () from /lib64/libc.so.6
#2  0x00007ffff7844833 in abort () from /lib64/libc.so.6
#3  0x00007ffff4d0a5b8 in dlfree (mem=0x7ffff7bf1010) at ../src/dlmalloc.c:4350
#4  0x00007ffff4d190b1 in dealloc (ptr=0x5555558c1c00) at /builddir/build/BUILD/ruby-3.1.0/ext/fiddle/closure.c:32
#5  0x00007ffff7cb7801 in run_final (zombie=140737300557440, objspace=0x55555555d800) at /builddir/build/BUILD/ruby-3.1.0/gc.c:4011
#6  finalize_list (objspace=objspace@entry=0x55555555d800, zombie=140737300557440) at /builddir/build/BUILD/ruby-3.1.0/gc.c:4030
#7  0x00007ffff7cb80cc in rb_objspace_call_finalizer (objspace=0x55555555d800) at /builddir/build/BUILD/ruby-3.1.0/gc.c:4194
#8  0x00007ffff7ca56eb in rb_ec_finalize (ec=0x55555555dd70) at /builddir/build/BUILD/ruby-3.1.0/eval.c:164
#9  rb_ec_cleanup (ec=ec@entry=0x55555555dd70, ex0=<optimized out>) at /builddir/build/BUILD/ruby-3.1.0/eval.c:256
#10 0x00007ffff7ca5c14 in ruby_run_node (n=0x7ffff7699660) at /builddir/build/BUILD/ruby-3.1.0/eval.c:321
#11 0x000055555555518f in main (argc=<optimized out>, argv=<optimized out>) at ./main.c:47
(gdb) f 3
#3  0x00007ffff4d0a5b8 in dlfree (mem=0x7ffff7bf1010) at ../src/dlmalloc.c:4350
4350	      USAGE_ERROR_ACTION(fm, p);
(gdb) l
4345	          check_free_chunk(fm, p);
4346	          goto postaction;
4347	        }
4348	      }
4349	    erroraction:
4350	      USAGE_ERROR_ACTION(fm, p);
4351	    postaction:
4352	      POSTACTION(fm);
4353	    }
4354	  }

voxik · 2022-01-13T18:19:17Z

I have reduced the reproducer:

$ cat fiddle_fork.rb 
require 'fiddle/import'

module Fiddle
  module LIBC
    extend Importer
    dlload "libc.so.6", "libm.so.6"

    CallCallback = bind("void call_callback(void*, void*)"){ | ptr1, ptr2| }
  end
end

error, pid, status = IO.pipe do |r, w|
  pid = fork {}
  w.close
  [r.read, *Process.wait2(pid)]
end


$ RUBYLIB=.:lib:.ext/common:.ext/x86_64-linux:tool/lib LD_LIBRARY_PATH=. ./ruby fiddle_fork.rb
Aborted (core dumped)

I think this might be similar to ffi/ffi/issues/621

I have also reported this issue in Fedora:

https://bugzilla.redhat.com/show_bug.cgi?id=2040380

tenderlove · 2022-07-14T20:16:00Z

I tested this on Ubuntu (with clang and gcc) and macOS with clang using libffi version 3.4.2 but I'm not able to reproduce this. Since the problem started happening after upgrading libffi, could it be a problem with libffi rather than a problem with Fiddle?

jackorp · 2022-09-09T09:57:01Z

We looked at this issue more closely, and it seems to require a combination of SELinux, an FFI closure, and forking the Ruby process.

From the linked bug: "In modern libffi 3.4.2 we use memfd_create quite eagerly for SELinux protected systems and such closures are shared between the parent and child process." and "Closures may be inherited by the child, and so the parent and child must coordinate the usage.".

Which would be consistent with our findings.

See the following minimal reproducer that triggers the bug with Fedora 36 and later:

require 'fiddle/closure'
require 'fiddle/struct'

Fiddle::Closure.new(Fiddle::TYPE_VOID, [])

fork { }

GC.start

This covers everything required to trigger the bug on the Ruby side. Note that SELinux plays a role as well, since libffi takes a different approach to allocating the closure on SELinux systems.

The reproducer demonstrates the following:
First, we allocate a closure, then we fork the process. The forked process inherits the shared FFI memory. When that child process ends, I think Ruby frees what was present in the child, which corrupts the memory still in use by the parent. Then we start GC in the parent (to make sure GC runs; in bigger programs this would happen automatically in the background), which results in a SIGABRT with the backtrace noted in the first comment.

The failure we observe in the Ruby test suite follows a very similar pattern to the reproducer when it fails for us in the build. First, some Fiddle tests that allocate Closures run, then some other test forks, and then the GC needs to be triggered. Some tests trigger them manually (e.g. TestHash#test_replace_bug15358), and some are very likely to trigger the GC simply by their memory usage (e.g. TestTrick2015#test_kinaba).

We can see the problem is that this memory usage is not coordinated properly between the parent and child processes.
On the Fedora side we currently disable the closure tests (ruby.spec) to prevent this kind of failure.

The possible mitigations for this are:

Enable static trampolines in libffi. As noted in the Bugzilla it seems to improve the situation, but there are some blockers we need to resolve first before we use this approach in Fedora.
Coordinate memory usage between the child and parent processes
Do not run tests using Fiddle::Closure in the main process that might fork later on.

On the last point: we also found that allocating the Fiddle::Closure in a child process does not affect the overall status of the test suite, so this approach could be used for tests that use closures, ensuring they can run even with SELinux and the like. Additionally, with this approach, one more test would be created from the mentioned reproducer to test the integration of libffi on the current system.

Using the previous reproducer:

require 'fiddle/closure'
require 'fiddle/struct'

fork do
  Fiddle::Closure.new(Fiddle::TYPE_VOID, [])
end

fork { }

GC.start

This does not crash the process, unlike the previous reproducer.

I noticed that Ruby tests include assert_separately, which would help with this. I am unsure if it is usable here as it seems unique to the Ruby test suite.
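
Outside the Ruby test suite, a plain fork can give similar isolation. The following is a minimal sketch, not something Fiddle or test-unit provides: `run_isolated` is a hypothetical helper name, and it assumes a platform where `fork` is available.

```ruby
# Hypothetical helper (not part of Fiddle or test-unit): run the block in a
# forked child so anything it allocates never exists in the parent process.
def run_isolated
  pid = fork do
    yield
    # Reaching this point means the block finished without raising.
  end
  _, status = Process.wait2(pid)
  status # Process::Status of the child
end

# The closure is created and collected entirely inside the child.
status = run_isolated do
  require 'fiddle'
  require 'fiddle/closure'
  Fiddle::Closure.new(Fiddle::TYPE_VOID, [])
  GC.start
end

puts "child succeeded: #{status.success?}"
```

The parent only sees the child's exit status, so a crash in the child shows up as a failed or signaled `Process::Status` instead of taking down the parent.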

kou · 2022-09-09T15:02:11Z

@jackorp Thanks for your detailed description! Could you try 1343ac7 ?

voxik · 2022-09-09T15:52:15Z

Thx for the patch. I leave the testing to @jackorp thx 😉

Nevertheless, this fixes the specific use case in the test suite, while this is a generic issue. Therefore I wonder if there is a way to e.g.:

Prohibit the fork if some closure is allocated?
Warn that this is not the right thing to do.
Make the closure work with fork on Fiddle / ffi level?
Document this behavior in documentation.

I am not an ffi expert, so apologies if I am completely off.
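
To make the "warn" option concrete, here is a rough sketch of what an application-level check could look like. It is not a Fiddle feature: `ClosureForkWarning` is a hypothetical name, and the sketch assumes Ruby 3.1 or later, where `Process._fork` is the hook called by `Kernel#fork`, `Process.fork`, and `IO.popen("-")`.

```ruby
# Sketch only: warn when the process forks while Fiddle::Closure objects are
# still alive. ClosureForkWarning is a hypothetical name, not a Fiddle API.
module ClosureForkWarning
  def _fork
    if defined?(Fiddle::Closure)
      # each_object also sees unreferenced closures that GC has not collected
      # yet, which are exactly the ones that would be freed after the fork.
      live = ObjectSpace.each_object(Fiddle::Closure).count
      warn "forking with #{live} live Fiddle::Closure object(s)" if live > 0
    end
    super
  end
end

Process.singleton_class.prepend(ClosureForkWarning)

# Forking still works; the hook only adds a check before the real fork.
pid = fork { exit!(0) }
Process.wait(pid)
```

Raising instead of warning in the hook would correspond to the "prohibit" option, at the cost of breaking programs that fork safely.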

jackorp · 2022-09-09T17:19:34Z

Thanks for the patch!

It improved the situation somewhat but not completely. After uncommenting and re-enabling the tests for Ruby and applying the patch, I observe a new issue in the build logs:

  1) Failure:
Fiddle::TestClosure#test_conversion_unsigned_char [/builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:14]:
<0> expected but was
<1>.
  2) Failure:
Fiddle::TestClosure#test_conversion_unsigned_long [/builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:14]:
<0> expected but was
<1>.
  3) Failure:
Fiddle::TestClosure#test_const_string [/builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:14]:
<0> expected but was
<1>.
  4) Failure:
Fiddle::TestClosure#test_conversion_long [/builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:14]:
<0> expected but was
<1>.
  5) Failure:
Fiddle::TestClosure#test_conversion_char [/builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:14]:
<0> expected but was
<1>.
  6) Failure:
Fiddle::TestClosure#test_conversion_unsigned_long_long [/builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:14]:
<0> expected but was
<1>.
  7) Failure:
Fiddle::TestClosure#test_conversion_int [/builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:14]:
<0> expected but was
<1>.
  8) Failure:
Fiddle::TestClosure#test_conversion_long_long [/builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:14]:
<0> expected but was
<1>.
  9) Failure:
Fiddle::TestClosure#test_conversion_unsigned_int [/builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:14]:
<0> expected but was
<1>.
 10) Failure:
Fiddle::TestClosure#test_conversion_short [/builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:14]:
<0> expected but was
<1>.
 11) Failure:
Fiddle::TestClosure#test_block_caller [/builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:14]:
<0> expected but was
<1>.
 12) Failure:
Fiddle::TestClosure#test_conversion_unsigned_short [/builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:14]:
<0> expected but was
<1>.

The Ruby test suite still failed, although without a segfault, which is an improvement.

Interestingly there seems to be some other issue with this, as I have another build log:

  1) Failure:
Fiddle::TestFunc#test_qsort1 [/builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_func.rb:87]:
<0> expected but was
<1>.

Here, only a single test failed. Some builds do pass, but there are more of those that do not pass.

I think sometimes there is an (uncollectable maybe?) reference to Fiddle::Closure introduced in the tests. I'll try some local testing and poking around to find what that reference is.

Maybe asserting that there is either 0 or 1 reference in the ObjectSpace would be the way to go, but it requires lengthier testing on my side to confirm that this approach would not lead to a segfault due to the "random" nature of GC.
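
A check of that kind could be sketched as below. This is only an illustration: `live_instances` is a hypothetical helper, and `FakeClosure` is a stand-in class so the sketch runs without Fiddle; in the Fiddle tests the argument would be `Fiddle::Closure`.

```ruby
# Hypothetical helper: run a GC cycle, then count the instances of klass
# (including subclasses) that ObjectSpace still knows about.
def live_instances(klass)
  GC.start
  ObjectSpace.each_object(klass).count
end

# Stand-in for Fiddle::Closure so the sketch does not depend on libffi.
class FakeClosure; end

kept = Array.new(3) { FakeClosure.new } # still referenced, so never collected
FakeClosure.new                         # unreferenced, may or may not be collected

puts "live FakeClosure objects: #{live_instances(FakeClosure)}"
```

Because Ruby scans the machine stack conservatively, an unreferenced object can survive a `GC.start`, so the count is only a lower bound on what will eventually be collected. That is why an exact "equals 0" assertion can be flaky.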

kou · 2022-09-09T20:32:21Z

Thanks for confirming the patch.
It seems that we need a feature that frees a Fiddle::Closure explicitly without GC. e.g.:

closure = Fiddle::Closure.new(...)
closure.free # Free the closure immediately.

Fiddle::Closure.allocate(...) do |closure|
  # ...
end # Free the allocated closure when the block is exited.

But I don't have a good name for the feature for now... I'll think about the name at RubyKaigi 2022... (I give a talk today. I'll think about the name after my talk is finished.)

kou · 2022-09-14T04:01:55Z

@jackorp Could you try a0ccc6b ?

jackorp · 2022-09-14T09:10:06Z

Hmm, not better. I also applied the previous commit with GC.start so that a0ccc6b applies without modifying the patch too much. I am seeing similar failures as before.

However, I have to note that in order to run the whole Ruby test suite together with the patches, I patch the fiddle shipped with Ruby 3.1.2. I may be missing a commit that makes all this work together.

If my patch-only approach fails, I'll try importing fiddle from the newest master branch commit as a gem and see if that improves the situation.

The work is happening on my downstream Fedora Ruby fork branch: https://src.fedoraproject.org/fork/jackorp/rpms/ruby/commits/fiddle_closure_test

Failed build: https://koji.fedoraproject.org/koji/taskinfo?taskID=91998265
Relevant test failures; note failure number 14, which is new:

  1) Failure:
Fiddle::TestFunc#test_qsort1 [/builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_func.rb:87]:
<0> expected but was
<1>.
  2) Failure:
Fiddle::TestClosure#test_conversion_unsigned_long_long [/builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:14]:
<0> expected but was
<1>.
  3) Failure:
Fiddle::TestClosure#test_const_string [/builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:14]:
<0> expected but was
<1>.
  4) Failure:
Fiddle::TestClosure#test_conversion_unsigned_char [/builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:14]:
<0> expected but was
<1>.
  5) Failure:
Fiddle::TestClosure#test_conversion_unsigned_long [/builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:14]:
<0> expected but was
<1>.
  6) Failure:
Fiddle::TestClosure#test_conversion_long [/builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:14]:
<0> expected but was
<1>.
  7) Failure:
Fiddle::TestClosure#test_conversion_char [/builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:14]:
<0> expected but was
<1>.
  8) Failure:
Fiddle::TestClosure#test_conversion_unsigned_short [/builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:14]:
<0> expected but was
<1>.
  9) Failure:
Fiddle::TestClosure#test_conversion_int [/builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:14]:
<0> expected but was
<1>.
 10) Failure:
Fiddle::TestClosure#test_block_caller [/builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:14]:
<0> expected but was
<1>.
 11) Failure:
Fiddle::TestClosure#test_conversion_short [/builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:14]:
<0> expected but was
<1>.
 12) Failure:
Fiddle::TestClosure#test_conversion_unsigned_int [/builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:14]:
<0> expected but was
<1>.
 13) Failure:
Fiddle::TestClosure#test_conversion_long_long [/builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:14]:
<0> expected but was
<1>.
 14) Error:
Fiddle::TestClosure#test_free:
ArgumentError: wrong number of arguments (given 0, expected 1+)
    /builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:83:in `block in test_free'
    /builddir/build/BUILD/ruby-3.1.2/.ext/common/fiddle/closure.rb:20:in `create'
    /builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:82:in `test_free'
    /builddir/build/BUILD/ruby-3.1.2/tool/test/runner.rb:23:in `<top (required)>'
    ./test/runner.rb:14:in `require_relative'
    ./test/runner.rb:14:in `<main>'

kou · 2022-09-14T22:10:11Z

Sorry... I missed switching some tests to the new API...

Could you also apply the following patches?

GitHub: GH-102

kou · 2022-09-14T23:54:17Z

Sorry. One more commit: 2530496

jackorp · 2022-09-15T10:35:18Z

Thanks! The state is much better now.

However this error from the logs is puzzling:

 1) Error:
Fiddle::TestClosure#test_free:
NoMethodError: undefined method `assert_false' for #<Fiddle::TestClosure:0x00007f5f73a35fa0 @__name__=:test_free, @__io__=nil, @passed=false, @_assertions=0, @options={:job_status=>nil, :retry=>true, :hide_skip=>true, :repeat_count=>nil, :excludes=>["./test/excludes"], :ruby=>["./miniruby", "-I./lib", "-I.", "-I.ext/common", "./tool/runruby.rb", "--extout=.ext", "--", "--disable-gems"], :filter=>/\A(?=.*)(?!.*(?-mix:(?-mix:memory_leak)|(?-mix:TestAddressResolve#test_socket_getnameinfo_domain_blocking)))/, :verbose=>true, :seed=>10492}, @__gc_disabled__=false, @tracepoint_captured_singlethread=true, @tracepoint_captured_stat=[[#<RubyVM:0x00007f5f7deb3af0>, 0, 0]], @libc=#<Fiddle::Handle:0x00007f5f73a3a2d0>, @libm=#<Fiddle::Handle:0x00007f5f73a3a2a8>>
Did you mean?  assert_file
               assert_raise
               assert_same
               assert_raises
               assert_all?
        assert_false(closure.freed?)
        ^^^^^^^^^^^^
    /builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:86:in `block in test_free'
    /builddir/build/BUILD/ruby-3.1.2/.ext/common/fiddle/closure.rb:20:in `create'
    /builddir/build/BUILD/ruby-3.1.2/test/fiddle/test_closure.rb:85:in `test_free'

I don't understand how it does not know about assert_false... I assume Ruby only has the test-unit from tool/lib/test/unit available during the test phase. That one does not have assert_false; however, it has refute, which seems to fulfill a similar role and is also available in the upstream test-unit.

Alternatively, consider something like assert(closure.freed? == false).

This seems like the single error so far. I'll try out more builds in parallel later, just to be more sure.

jackorp · 2022-09-15T13:15:01Z

For now I am using the following patch for e1221297:

From e1221297fb0177d98c8670e5490c8131227621d5 Mon Sep 17 00:00:00 2001
From: Sutou Kouhei <kou@clear-code.com>
Date: Thu, 15 Sep 2022 06:40:31 +0900
Subject: [PATCH 2/6] test: don't use power-assert

It seems that we can't use it in ruby/ruby.
---
 test/fiddle/test_closure.rb | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/test/fiddle/test_closure.rb b/test/fiddle/test_closure.rb
index 1726db7..13dfa5b 100644
--- a/test/fiddle/test_closure.rb
+++ b/test/fiddle/test_closure.rb
@@ -83,13 +83,9 @@ module Fiddle
 
     def test_free
       Closure.create(:int, [:void]) do |closure|
-        assert do
-          not closure.freed?
-        end
+        assert_equal(closure.freed?, false)
         closure.free
-        assert do
-          closure.freed?
-        end
+        assert_equal(closure.freed?, true)
         closure.free
       end
     end
-- 
2.37.3

Note I changed the assert_false to assert_equal(closure.freed?, false).

So far 4 arch specific builds passed with the changes: https://koji.fedoraproject.org/koji/taskinfo?taskID=92035826 I'll run more builds to confirm the results, but this seems successful so far.

jackorp · 2022-09-15T19:41:58Z

Ran 5 builds with 4 different arches each; the only problem was an unrelated failure on aarch64 on a few of those. Otherwise, nothing fiddle related!

kou · 2022-09-15T20:56:13Z

Thanks!
Please use ced671e for assert_false.
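
For reference, the explicit-free API exercised by the tests in this thread (`Fiddle::Closure.create` with a block, `free`, `freed?`) could be used roughly as follows. This is a hedged sketch: it assumes a Fiddle version that includes these commits, and it skips itself when Fiddle or `Closure.create` is not available.

```ruby
begin
  require 'fiddle'
  require 'fiddle/closure'
rescue LoadError
  # Fiddle is not installed; the example below is skipped.
end

if defined?(Fiddle::Closure) && Fiddle::Closure.respond_to?(:create)
  # Block form: the closure is freed when the block exits, without waiting
  # for the garbage collector.
  freed_inside = Fiddle::Closure.create(Fiddle::TYPE_VOID, []) do |closure|
    closure.freed? # still alive while the block runs
  end
  puts "freed inside the block: #{freed_inside}"

  # Manual form: free the closure explicitly, e.g. before the process forks.
  closure = Fiddle::Closure.new(Fiddle::TYPE_VOID, [])
  closure.free
  puts "freed after free: #{closure.freed?}"
end
```

The point of freeing explicitly is that no closure is left for the GC to release at some later, unpredictable time after a fork.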

GitHub: fix GH-102 We can't use Fiddle::Closure before we fork the process. If we do it, the process may be crashed with SELinux. See ruby/fiddle#102 (comment) for details. Reported by Vít Ondruch. Thanks!!! ruby/fiddle@1343ac7a95

GitHub: fix GH-102 It's for freeing a closure explicitly. We can't use Fiddle::Closure before we fork the process. If we do it, the process may be crashed with SELinux. See ruby/fiddle#102 (comment) for details. Reported by Vít Ondruch. Thanks!!! ruby/fiddle@a0ccc6bb1b

Fiddle::Closure object is making use of FFI closure from libffi. When such an object is created (instantiated) in Ruby, and then the process forks on an SELinux-enabled system, the memory will become corrupted. That is usually not a problem until the garbage collector sweeps the object and tries to free it, in which case the Ruby process will fail with signal SIGABRT.

Tests in test/fiddle/test_closure.rb, test/fiddle/test_func.rb, and test/fiddle/test_function.rb use the `Fiddle::Closure` class directly, and fiddle/test_import.rb uses the class indirectly through the `bind_function` method. They are therefore disabled to keep the problematic object out of the Ruby GC during test suite execution, instead of relying on a fork and subsequent garbage collection not happening.

If an FFI closure object is allocated in Ruby and the `fork` function is used afterward, the memory pointing to the closure gets corrupted, and if Ruby GC tries to collect the object in that state, a SIGABRT error occurs.

The minimal Ruby reproducer for the issue is the following:

~~~
$ cat fiddle_fork.rb
require 'fiddle/closure'
require 'fiddle/struct'

Fiddle::Closure.new(Fiddle::TYPE_VOID, [])

fork { }

GC.start
~~~

We allocate an unused Closure object, so it is free for the GC to pick up. Before we call `GC.start` we fork the process as that corrupts the memory.

Running this with ruby-3.1.2-167.fc37.x86_64 on an SELinux-enabled system:

~~~
$ ruby fiddle_fork.rb
Aborted (core dumped)
~~~

Such issues may appear at random (depending on the use of forking and GC) in larger applications that use Fiddle::Closure but can be spotted by the following functions appearing in the coredump backtrace:

~~~
0x00007f6284d3e5b3 in dlfree (mem=<optimized out>) at ../src/dlmalloc.c:4350
0x00007f6284d6d0b1 in dealloc () from /usr/lib64/ruby/fiddle.so
0x00007f6295e432ec in finalize_list () from /lib64/libruby.so.3.1
0x00007f6295e43420 in finalize_deferred.lto_priv () from /lib64/libruby.so.3.1
0x00007f6295e4ff1c in gc_start_internal.lto_priv () from /lib64/libruby.so.3.1
~~~

Possible solutions to prevent Ruby from crashing:

* Do not use Fiddle::Closure.
* Use the Fiddle::Closure object only in an isolated subprocess that will not fork further.
* Enable static trampolines in libffi as noted in bugzilla comment: <https://bugzilla.redhat.com/show_bug.cgi?id=2040380#c9>

See related discussion on <https://bugzilla.redhat.com/show_bug.cgi?id=2040380>
Ruby upstream ticket: <https://bugs.ruby-lang.org/issues/18914>
Ruby Fiddle ticket: <ruby/fiddle#102>

kou closed this as completed in 1343ac7 Sep 9, 2022

voxik mentioned this issue Sep 9, 2022

SIGABRT on Centos/RHEL 6+7 ffi/ffi#621

Open

kou reopened this Sep 12, 2022

kou closed this as completed in a0ccc6b Sep 14, 2022

kou added a commit that referenced this issue Sep 14, 2022

test: ensure freeing closure

f6431f3

GitHub: GH-102 This also improves freed closures assertions.

kou added a commit that referenced this issue Sep 14, 2022

test: ensure freeing closure

0495624

GitHub: GH-102 This also improves freed closures assertions.

kou added a commit that referenced this issue Sep 14, 2022

test: ensure freeing closure

b2fef17

GitHub: GH-102

kou added a commit that referenced this issue Sep 14, 2022

closure: free resources when an exception is raised in Closure.new

81a8a56

GitHub: GH-102

kou added a commit that referenced this issue Sep 14, 2022

closure: follow variable name change

2530496

GitHub: GH-102

kou added a commit that referenced this issue Sep 15, 2022

test: don't use assert_true/assert_false

ced671e

GitHub: GH-102 They aren't available in ruby/ruby.

ylecuyer mentioned this issue May 1, 2024

[Bug 20541] Revert 84f2aabd272a54e79979795d2d405090704a1d07 to build fiddle with libffi < 3.2 ruby/ruby#10695

Closed
