Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segmentation fault with trunk-recorder on Debian #59

Open
dlaw opened this issue Nov 2, 2019 · 11 comments
Open

Segmentation fault with trunk-recorder on Debian #59

dlaw opened this issue Nov 2, 2019 · 11 comments

Comments

@dlaw
Copy link

dlaw commented Nov 2, 2019

I'm getting a segfault from trunk-recorder:

~snip~
[2019-11-02 18:13:03.580708] (info)   Driver: osmosdr
[2019-11-02 18:13:03.580738] (info)   Center: 8.555000e+08
[2019-11-02 18:13:03.580805] (info)   Rate: 10000000
[2019-11-02 18:13:03.580849] (info)   Error: 500
[2019-11-02 18:13:03.580885] (info)   PPM Error: 0
[2019-11-02 18:13:03.580914] (info)   Gain: 20
[2019-11-02 18:13:03.580950] (info)   IF Gain: 0
[2019-11-02 18:13:03.580979] (info)   BB Gain: 0
[2019-11-02 18:13:03.581009] (info)   LNA Gain: 0
[2019-11-02 18:13:03.581044] (info)   PGA Gain: 0
[2019-11-02 18:13:03.581082] (info)   TIA Gain: 0
[2019-11-02 18:13:03.581115] (info)   MIX Gain: 0
[2019-11-02 18:13:03.581147] (info)   VGA1 Gain: 0
[2019-11-02 18:13:03.581180] (info)   VGA2 Gain: 0
[2019-11-02 18:13:03.581210] (info)   Squelch: 0
[2019-11-02 18:13:03.581257] (info)   Idle Silence: false
[2019-11-02 18:13:03.581287] (info)   Digital Recorders: 3
[2019-11-02 18:13:03.581322] (info)   Debug Recorders: 0
[2019-11-02 18:13:03.581360] (info)   SigMF Recorders: 0
[2019-11-02 18:13:03.581391] (info)   Analog Recorders: 0
[2019-11-02 18:13:03.581426] (info)   Modulation: qpsk
[2019-11-02 18:13:03.581468] (info)   Source Device: soapy=0,driver=sdrplay
gr-osmosdr v0.1.4-127-g4d83c606 (0.1.5git) gnuradio 3.7.13.4
built-in source types: file fcd rtl rtl_tcp uhd hackrf rfspace soapy redpitaya 
[INFO] Using format CF32.
[2019-11-02 18:13:03.989391] (info)   SOURCE TYPE OSMOSDR (osmosdr)
[2019-11-02 18:13:03.989568] (info)   Setting sample rate to: 10000000
[2019-11-02 18:13:03.989851] (info)   Actual sample rate: 10000000
[2019-11-02 18:13:03.990037] (info)   Tunning to 8.555005e+08
[2019-11-02 18:13:03.990996] (info)   Gain Stage: IFGR supported values: 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 
[2019-11-02 18:13:03.991232] (info)   Gain Stage: RFGR supported values: 0 1 2 3 4 5 6 7 8 9 
[2019-11-02 18:13:03.991290] (info)   Max Freqency: 8.605000e+08
[2019-11-02 18:13:03.991365] (info)   Min Freqency: 8.505000e+08
[2019-11-02 18:13:03.991460] (info)   LNA Gain set to: 0
[2019-11-02 18:13:03.991527] (info)   TIA Gain set to: 0
[2019-11-02 18:13:03.991588] (info)   PGA Gain set to: 0
[2019-11-02 18:13:03.991659] (info)   Gain set to: 20
[2019-11-02 18:13:03.991747] (info)   Setting antenna to [RX]
~snip~
[2019-11-02 19:03:39.872087] (info)   	- Starting P25 Recorder Num [0]	TG: 1059	Freq: 8.523000e+08 	TDMA: false	Slot: 0
[2019-11-02 19:03:39.873122] (info)   [comirs]	TG: 1059	Freq: 8.523000e+08	Starting Recorder on Src: soapy=0,driver=sdrplay
[2019-11-02 19:03:44.910787] (info)   [comirs]	TG: 1059	Freq: 8.523000e+08	Update Retuning - New Freq: 8.511250e+08	Elapsed: 5s 	Since update: 2s
[2019-11-02 19:03:49.887167] (info)   	- Starting P25 Recorder Num [1]	TG: 1109	Freq: 8.516250e+08 	TDMA: false	Slot: 0
[2019-11-02 19:03:49.887714] (info)   [comirs]	TG: 1109	Freq: 8.516250e+08	Starting Recorder on Src: soapy=0,driver=sdrplay
[2019-11-02 19:03:52.083655] (info)   [comirs]	TG: 1059	Freq: 8.511250e+08	Update Retuning - New Freq: 8.563625e+08	Elapsed: 13s 	Since update: 2s
[2019-11-02 19:03:53.729387] (info)   [comirs]	TG: 1109	Freq: 8.516250e+08	Update Retuning - New Freq: 8.537875e+08	Elapsed: 4s 	Since update: 1s
[2019-11-02 19:03:57.224294] (info)   	- Starting P25 Recorder Num [2]	TG: 1227	Freq: 8.526875e+08 	TDMA: false	Slot: 0
[2019-11-02 19:03:57.224887] (info)   [comirs]	TG: 1227	Freq: 8.526875e+08	Starting Recorder on Src: soapy=0,driver=sdrplay
Segmentation fault (core dumped)

I poked around the backtrace and it appears to point to SoapySDRPlay:

(gdb) bt
#0  __memmove_sse2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:262
#1  0x00007f38ba5cc366 in SoapySDRPlay::readStream(SoapySDR::Stream*, void* const*, unsigned long, int&, long long&, long) ()
   from /usr/local/lib/SoapySDR/modules0.8/libsdrPlaySupport.so
#2  0x00007f38be5d9c98 in soapy_source_c::work(int, std::vector<void const*, std::allocator<void const*> >&, std::vector<void*, std::allocator<void*> >&) ()
   from /usr/local/lib/libgnuradio-osmosdr.so.0.1.5git
#3  0x00007f38bfbca6f7 in gr::sync_block::general_work(int, std::vector<int, std::allocator<int> >&, std::vector<void const*, std::allocator<void const*> >&, std::vector<void*, std::allocator<void*> >&) () from /lib/x86_64-linux-gnu/libgnuradio-runtime.so.3.7.13
#4  0x00007f38bfb900a0 in gr::block_executor::run_one_iteration() () from /lib/x86_64-linux-gnu/libgnuradio-runtime.so.3.7.13
#5  0x00007f38bfbd3b29 in gr::tpb_thread_body::tpb_thread_body(boost::shared_ptr<gr::block>, boost::shared_ptr<boost::barrier>, int) ()
   from /lib/x86_64-linux-gnu/libgnuradio-runtime.so.3.7.13
#6  0x00007f38bfbc5ec4 in ?? () from /lib/x86_64-linux-gnu/libgnuradio-runtime.so.3.7.13
#7  0x00007f38bfb7b342 in ?? () from /lib/x86_64-linux-gnu/libgnuradio-runtime.so.3.7.13
#8  0x00007f38be549f65 in ?? () from /lib/x86_64-linux-gnu/libboost_thread.so.1.67.0
#9  0x00007f38be51bfa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#10 0x00007f38bdcca4cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

I have the 600 megabyte core dump file if anyone would like to examine it in more detail. (Or let me know what else you would like to see... happy to run any gdb commands to poke at the core dump on request.)

For what it's worth, I've reproduced this issue five times so far (and I have core dumps for four of those times... they all look the same to me). I have only been able to reproduce the issue when the radio system is busy and trunk-recorder is using 3 or more recorder threads. I've run for quite a long time with 2 recorder threads with no segfaults; when I switch back to 3 recorder threads, a segfault comes along within an hour or so (whenever the radio system gets busy).

Version information:

  • SoapySDR 189bc69217846581e34456f391683cc07c43ae0d
  • SoapySDRPlay 14ec39e
  • gr-osmosdr 4d83c6067f059b0c5015c3f59f8117bbd361e877
  • trunk-recorder 668fe62224ef7fb772d4263ed6f8c773683c0626
  • All other software is the latest from Debian 10.1 repositories

I am not sure whether to file this bug report under SoapySDRPlay, gr-osmosdr, or trunk-recorder... based on the backtrace, I decided to start with SoapySDRPlay.

Please let me know if there's any additional testing I can do, or any additional information I can provide.

@SDRplay
Copy link
Contributor

SDRplay commented Nov 2, 2019

Something is very confusing, the debug at the bottom...

#1 0x00007f38ba5cc366 in SoapySDRPlay::readStream(SoapySDR::Stream*, void* const*, unsigned long, int&, long long&, long) ()
from /usr/local/lib/SoapySDR/modules0.8/libsdrPlaySupport.so
#2 0x00007f38be5d9c98 in soapy_source_c::work(int, std::vector<void const*, std::allocator<void const*> >&, std::vector<void*, std::allocator<void*> >&) ()
from /usr/local/lib/libgnuradio-osmosdr.so.0.1.5git

how can that be the case - why is it trying to use both?? Unless I'm misreading this? Can you not just get this to point to SoapySDR/SoapySDRPlay?

@dlaw
Copy link
Author

dlaw commented Nov 2, 2019

osmosdr supports SoapySDR as a source... so any radio that SoapySDR supports can be used with osmosdr acting as a "pass-through". The data flow is SDRPlay -> SoapySDR -> osmosdr -> trunk-recorder. (Unfortunately trunk-recorder does not have native SoapySDR support.)

Here is the osmosdr implementation:
http://git.osmocom.org/gr-osmosdr/tree/lib/soapy/soapy_source_c.cc

@SDRplay
Copy link
Contributor

SDRplay commented Nov 3, 2019

Set the sample rate to 2 MHz and see if the problem still exists.

@dlaw
Copy link
Author

dlaw commented Nov 3, 2019

Unfortunately I won't be able to reproduce the problem with 2 MHz of bandwidth, as that is not wide enough for me to activate 3 or more recorder threads in trunk-recorder. I am recompiling everything with debugging symbols now in order to gather some more information.

@dlaw
Copy link
Author

dlaw commented Nov 3, 2019

(gdb) frame 1
#1  0x00007f74f750881e in SoapySDRPlay::readStream (this=0x5632e4750c60, stream=0x5632e4750c60, buffs=0x7f74e00010e0, numElems=4096, 
    flags=@0x7f74ef1b8334: 0, timeNs=@0x7f74ef1b8338: 0, timeoutUs=100000) at /home/dlaw/SoapySDRPlay/Streaming.cpp:315
315	        std::memcpy(buff0, (float *)_currentBuff, returnedElems * 2 * sizeof(float));
(gdb) p/x $_siginfo
$1 = {si_signo = 0xb, si_errno = 0x0, si_code = 0x1, _sifields = {_pad = {0xf44ccac0, 0x7f74, 0x0 <repeats 26 times>}, _kill = {si_pid = 0xf44ccac0, 
      si_uid = 0x7f74}, _timer = {si_tid = 0xf44ccac0, si_overrun = 0x7f74, si_sigval = {sival_int = 0x0, sival_ptr = 0x0}}, _rt = {si_pid = 0xf44ccac0, 
      si_uid = 0x7f74, si_sigval = {sival_int = 0x0, sival_ptr = 0x0}}, _sigchld = {si_pid = 0xf44ccac0, si_uid = 0x7f74, si_status = 0x0, si_utime = 0x0, 
      si_stime = 0x0}, _sigfault = {si_addr = 0x7f74f44ccac0, _addr_lsb = 0x0, _addr_bnd = {_lower = 0x0, _upper = 0x0}}, _sigpoll = {
      si_band = 0x7f74f44ccac0, si_fd = 0x0}}}
(gdb) info frame
Stack level 1, frame at 0x7f74ef1b8320:
 rip = 0x7f74f750881e in SoapySDRPlay::readStream (/home/dlaw/SoapySDRPlay/Streaming.cpp:315); saved rip = 0x7f74fb562c98
 called by frame at 0x7f74ef1b8350, caller of frame at 0x7f74ef1b8290
 source language c++.
 Arglist at 0x7f74ef1b8310, args: this=0x5632e4750c60, stream=0x5632e4750c60, buffs=0x7f74e00010e0, numElems=4096, flags=@0x7f74ef1b8334: 0, 
    timeNs=@0x7f74ef1b8338: 0, timeoutUs=100000
 Locals at 0x7f74ef1b8310, Previous frame's sp is 0x7f74ef1b8320
 Saved registers:
  rbx at 0x7f74ef1b8308, rbp at 0x7f74ef1b8310, rip at 0x7f74ef1b8318
(gdb) info locals
buff0 = 0x7f74f42e0ab0
returnedElems = 4096
(gdb) print _currentBuff
$2 = (short *) 0x7f74f44ccac0

The siginfo si_addr is equal to _currentBuff, i.e. _currentBuff is set to an invalid address and SoapySDRPlay segfaults when we attempt to copy from that address into buff0.

@guruofquality
Copy link
Contributor

Attempt 2, I stared at this for a while and the buffer increment stuff in readstream looks fine. I'm a little suspicious of the buffer accounting in rx_callback:

  1. This code releases a buffer to the reader thread (aka readStream)
if ((_buffs[_buf_tail].size() + spaceReqd) >= (bufferLength / decM))
    {
  1. The next section starts to fill the next tail buffer, which using a resize. Resize could cause an allocation, which is something i'm a little weary about.
// get current fill buffer
    auto &buff = _buffs[_buf_tail];
    buff.resize(buff.size() + spaceReqd);

But I'm not sure the _buf_count is safe. We check that we have an available write buffer before 1) with if (_buf_count == numBuffers) but for section 2) this may not be the case if we just incremented _buf_count.

So now the current tail buffer could be a buffer already given to the read thread, which in turn will now get resized and re-allocated larger than we expected to a new pointer because its size didnt get cleared because it was not yet released by the read thread. -- just a guess, but it seems like a buffer accounting issue.

@dlaw
Copy link
Author

dlaw commented Nov 4, 2019

I just got a segfault with trunk-recorder running 2 recorder threads after about 30 hours of continuous operation. Normally, with 3 recorder threads, I get a segfault after somewhere between 10 seconds and 1 hour depending on how busy the receive threads are. The segfault always seems to happen when all available recorders are recording. My guess right now would be that the segfault is correlated with peak CPU usage, which makes sense if it is a race condition between the read and write threads.

@sciurus
Copy link

sciurus commented Jun 3, 2020

I can reproduce this issue.

SoapySDRPlay: ccc0746
trunk-recorder: fd9553a6716a2a86a6ef2eef2b82433aa831f043
SDRPlay RSP API: 2.13.1

All other software is from Debian 10.4 x86-64.

So far it seems that if I set "digitalRecorders": 1 I do not see a crash, but with any higher values I'll get a segmentation fault within a couple minutes. I've tested with 6 and 10 MHz sample rates.

For the core file, a gdb backtrace shows

(gdb) bt
#0  __memmove_sse2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:262
#1  0x00007f50e74c7ed4 in SoapySDRPlay::readStream(SoapySDR::Stream*, void* const*, unsigned long, int&, long long&, long) () from /usr/local/lib/SoapySDR/modules0.6/libsdrPlaySupport.so
#2  0x00007f50ef159898 in ?? () from /lib/x86_64-linux-gnu/libgnuradio-osmosdr.so.0.1.4
#3  0x00007f50f03286f7 in gr::sync_block::general_work(int, std::vector<int, std::allocator<int> >&, std::vector<void const*, std::allocator<void const*> >&, std::vector<void*, std::allocator<void*> >&) ()
   from /lib/x86_64-linux-gnu/libgnuradio-runtime.so.3.7.13
#4  0x00007f50f02ee0a0 in gr::block_executor::run_one_iteration() () from /lib/x86_64-linux-gnu/libgnuradio-runtime.so.3.7.13
#5  0x00007f50f0331b29 in gr::tpb_thread_body::tpb_thread_body(boost::shared_ptr<gr::block>, boost::shared_ptr<boost::barrier>, int) () from /lib/x86_64-linux-gnu/libgnuradio-runtime.so.3.7.13
#6  0x00007f50f0323ec4 in ?? () from /lib/x86_64-linux-gnu/libgnuradio-runtime.so.3.7.13
#7  0x00007f50f02d9342 in ?? () from /lib/x86_64-linux-gnu/libgnuradio-runtime.so.3.7.13
#8  0x00007f50f0793615 in ?? () from /lib/x86_64-linux-gnu/libboost_thread.so.1.67.0
#9  0x00007f50f0765fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#10 0x00007f50eebdf4cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

@joshuarubin
Copy link

I'm getting a similar segfault using linhpsdr (on Arch): g0orx/linhpsdr#84

@gvanem
Copy link

gvanem commented Nov 17, 2020

I also have this issue on Windows (x86, MSVC). I get a crash in both GQrx and Quisk using SoapSDR. Here is the calll-stack when using SDRPlay in Quisk:

VCRUNTIME140!memcpy(unsigned char * dst = 0x1ae9a170 "???", unsigned char * src = 0x6efc40c4 "@`???", 
unsigned long count = 0x80e8)+0x4e
SdrPlaySupport!SoapySDRPlay::readStream(class SoapySDR::Stream * stream = 0x1ae9a170,
void ** buffs = 0x6efc40c4, unsigned int numElems = 0x80e8, int * flags = 0x2294f2fc, int64 * timeNs = 0x2294f300, 
long timeoutUs = 0n30000)+0xf5
SoapySDR!SoapySDRDevice_readStream(struct SoapySDRDevice * device = 0x1ae9a170, struct SoapySDRStream * stream = 0x1ae9a170, 
void ** buffs = 0x6efc40c4, unsigned int numElems = 0x80e8, int * flags = 0x2294f2fc, int64 * timeNs = 0x2294f300, 
long timeoutUs = 0n30000)+0x5d
soapy+0x265a
python36!_PyCFunction_FastCallKeywords(struct _object * func = 0x03f327b0, struct _object ** stack = 0x04d794a0, int nargs = 0n150000, struct _object * kwnames = 0x00000000)+0x3e
...

Very annoying since I have to unplug the SDRPlay from the USB-port and restart the SDRPlayAPI service to make the radio work again in e.g. SDRUno.

Edit:
Oh, and a !analyze -v in WinDbg says APPLICATION_FAULT_INVALID_POINTER_READ for the std::memcpy()
in SoapySDRPlay/Streamin.cpp (at line 310).

@dlaw
Copy link
Author

dlaw commented Jan 3, 2021

For what it's worth, I updated to https://github.com/pothosware/SoapySDRPlay3/ and this issue appears to be fixed. I also pulled and rebuilt trunk-recorder, gr-osmosdr, and all of the soapy components.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants