undefined symbol: clock_gettime #606

Closed
pshareghi opened this issue May 15, 2015 · 110 comments

@pshareghi

Using
Centos 6
GCC 4.9 using devtools-3
Java 7
Kernel 2.6.32-504.12.2.el6.x86_64

I am getting the following error after running a test for a while.
/usr/java/latest/bin/java: symbol lookup error: /tmp/librocksdbjni2974434001434564758..so: undefined symbol: clock_gettime

The same test runs fine on RocksDb 3.6. My investigation shows that librt.so is not linked with RocksDb correctly. My test (with 3.10.2) worked correctly after I ran
export LD_PRELOAD=/lib64/rtkaio/librt.so.1

To investigate the 3.10.2 library, I ran nm
$nm /tmp/librocksdbjni2974434001434564758..so | grep clock
0000000000332260 T _ZNSt6chrono3_V212steady_clock3nowEv
000000000037e33c R _ZNSt6chrono3_V212steady_clock9is_steadyE
0000000000332230 T _ZNSt6chrono3_V212system_clock3nowEv
000000000037e33d R _ZNSt6chrono3_V212system_clock9is_steadyE
U clock_gettime

I ran nm on the older rocksdb-3.6
nm /tmp/librocksdbjni1323312933457066341..so |grep clock
0000000000287390 T _ZNSt6chrono3_V212steady_clock3nowEv
00000000002c783c R _ZNSt6chrono3_V212steady_clock9is_steadyE
0000000000287360 T _ZNSt6chrono3_V212system_clock3nowEv
00000000002c783d R _ZNSt6chrono3_V212system_clock9is_steadyE

You can see that clock_gettime is undefined in 3.10.2: it appears with type U on the last line of the first nm output, and it does not appear at all in the 3.6 output. Looking at the code, the single call to this function is only compiled in if OS_LINUX or OS_FREEBSD is defined.

Judging from the above nm results, do you think that in 3.6 neither of the two flags was set, whereas in 3.10 at least one somehow gets set?
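
The nm comparison above can be automated. The following is a minimal sketch, assuming nm's default output layout (optional address, one-letter type, symbol name). The function name `undefined_symbols` is made up for illustration, and the sample lines are copied from the 3.10.2 output above.

```python
def undefined_symbols(nm_output):
    """Return names that nm marks 'U' (referenced but not defined in this object)."""
    names = []
    for line in nm_output.splitlines():
        fields = line.split()
        # Undefined entries have no address column: "U clock_gettime".
        if len(fields) == 2 and fields[0] == "U":
            # Drop a version suffix such as "@@GLIBC_2.2.5" if present.
            names.append(fields[1].split("@")[0])
    return names


SAMPLE_3_10_2 = """\
0000000000332260 T _ZNSt6chrono3_V212steady_clock3nowEv
000000000037e33c R _ZNSt6chrono3_V212steady_clock9is_steadyE
U clock_gettime
"""

print(undefined_symbols(SAMPLE_3_10_2))
```

An undefined symbol is not an error by itself. It only fails at run time when none of the libraries loaded into the process provides it.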

@igorcanadi
Collaborator

Interesting. We do set -lrt when we compile: https://github.com/facebook/rocksdb/blob/master/build_tools/build_detect_platform#L125

Do we need to set something other than -lrt? Do you see that flag in your compile log?
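
One way to answer the compile-log question is to scan the build output for link commands that lack the flag. This is only a sketch: it assumes that link steps can be recognised by `-shared`, and the log excerpt is a hypothetical example, not taken from a real RocksDB build.

```python
import shlex


def link_lines_missing(log_text, flag="-lrt"):
    """Return link commands (those passing -shared) that do not include `flag`."""
    missing = []
    for line in log_text.splitlines():
        try:
            args = shlex.split(line)
        except ValueError:
            continue  # skip lines with unbalanced quotes, e.g. compiler diagnostics
        if "-shared" in args and flag not in args:
            missing.append(line)
    return missing


# Hypothetical log excerpt for illustration only.
LOG = """\
c++ -shared -fPIC -o libexample.so a.o b.o -lpthread -lrt
c++ -shared -fPIC -o libexamplejni.so jni.o libz.a
"""

for cmd in link_lines_missing(LOG):
    print("no -lrt:", cmd)
```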

@DerekSchenk
Contributor

I'm experiencing the same issue using the rocksdbjni 3.10.2 jar from Maven, running on Centos 6.6.

Using
Centos 6.6
2.6.32-504.16.2.el6.x86_64 #1 SMP Wed Apr 22 06:48:29 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
java version "1.8.0_40"

I was using version 3.9.1 previously and had no issues. These are the nm results:

3.9.1: # nm librocksdbjni-linux64.so | grep clock
00000000002408b0 T _ZNSt6chrono12system_clock3nowEv

3.10.1: # nm librocksdbjni-linux64.so | grep clock
0000000000295690 T _ZNSt6chrono12system_clock3nowEv
                 U clock_gettime

I built the library from the latest source, and these are the results:

** To save others from possible frustration contrary to the INSTALL.md details...

  1. gcc version 4.8+ is required on Centos, not 4.7. Without this compiling fails with:
  CC       util/thread_status_impl.o
util/thread_status_impl.cc: In static member function ‘static std::map<std::basic_string<char>, long unsigned int> rocksdb::ThreadStatus::InterpretOperationProperties(rocksdb::ThreadStatus::OperationType, const uint64_t*)’:
util/thread_status_impl.cc:88:20: error: ‘class std::map<std::basic_string<char>, long unsigned int>’ has no member named ‘emplace’
util/thread_status_impl.cc:90:20: error: ‘class std::map<std::basic_string<char>, long unsigned int>’ has no member named ‘emplace’
util/thread_status_impl.cc:94:20: error: ‘class std::map<std::basic_string<char>, long unsigned int>’ has no member named ‘emplace’
util/thread_status_impl.cc:96:20: error: ‘class std::map<std::basic_string<char>, long unsigned int>’ has no member named ‘emplace’
util/thread_status_impl.cc:98:20: error: ‘class std::map<std::basic_string<char>, long unsigned int>’ has no member named ‘emplace’
util/thread_status_impl.cc:101:20: error: ‘class std::map<std::basic_string<char>, long unsigned int>’ has no member named ‘emplace’
make: *** [util/thread_status_impl.o] Error 1
  2. 'yum install gcc47-c++' does not work. Follow the instructions at http://superuser.com/questions/381160/how-to-install-gcc-4-7-x-4-8-x-on-centos for 4.8
wget http://people.centos.org/tru/devtools-2/devtools-2.repo -O /etc/yum.repos.d/devtools-2.repo
yum install devtoolset-2-gcc devtoolset-2-binutils devtoolset-2-gcc-c++
export PATH=/opt/rh/devtoolset-2/root/usr/bin:/sbin:/bin:/usr/sbin:/usr/bin
export CC=/opt/rh/devtoolset-2/root/usr/bin/gcc
export CPP=/opt/rh/devtoolset-2/root/usr/bin/cpp
export CXX=/opt/rh/devtoolset-2/root/usr/bin/c++

When running 'make shared_lib' it does appear to include the -lrt flag, but clock_gettime is still undefined:

/opt/rh/devtoolset-2/root/usr/bin/c++ -shared -Wl,-soname -Wl,librocksdb.so.3.11  -g -W -Wextra -Wall -Wsign-compare -Wshadow -Wno-unused-parameter -I. -I./include -std=c++11  -DROCKSDB_PLATFORM_POSIX  -DOS_LINUX -fno-builtin-memcmp -DROCKSDB_FALLOCATE_PRESENT -DSNAPPY -DGFLAGS=google -DZLIB -DBZIP2 -march=native   -isystem ./third-party/gtest-1.7.0/fused-src -O2 -fno-omit-frame-pointer -momit-leaf-frame-pointer -DNDEBUG -Woverloaded-virtual -Wnon-virtual-dtor -Wno-missing-field-initializers -fPIC db/builder.cc db/c.cc db/column_family.cc db/compaction.cc db/compaction_job.cc db/compaction_picker.cc db/db_filesnapshot.cc db/dbformat.cc db/db_impl.cc db/db_impl_debug.cc db/db_impl_readonly.cc db/db_impl_experimental.cc db/db_iter.cc db/experimental.cc db/event_logger_helpers.cc db/file_indexer.cc db/filename.cc db/flush_job.cc db/flush_scheduler.cc db/forward_iterator.cc db/internal_stats.cc db/log_reader.cc db/log_writer.cc db/managed_iterator.cc db/memtable_allocator.cc db/memtable.cc db/memtable_list.cc db/merge_helper.cc db/merge_operator.cc db/repair.cc db/slice.cc db/table_cache.cc db/table_properties_collector.cc db/transaction_log_impl.cc db/version_builder.cc db/version_edit.cc db/version_set.cc db/wal_manager.cc db/write_batch.cc db/write_batch_base.cc db/write_controller.cc db/write_thread.cc port/stack_trace.cc port/port_posix.cc table/adaptive_table_factory.cc table/block_based_filter_block.cc table/block_based_table_builder.cc table/block_based_table_factory.cc table/block_based_table_reader.cc table/block_builder.cc table/block.cc table/block_hash_index.cc table/block_prefix_index.cc table/bloom_block.cc table/cuckoo_table_builder.cc table/cuckoo_table_factory.cc table/cuckoo_table_reader.cc table/flush_block_policy.cc table/format.cc table/full_filter_block.cc table/get_context.cc table/iterator.cc table/merger.cc table/meta_blocks.cc table/plain_table_builder.cc table/plain_table_factory.cc table/plain_table_index.cc table/plain_table_key_coding.cc 
table/plain_table_reader.cc table/table_properties.cc table/two_level_iterator.cc util/arena.cc util/auto_roll_logger.cc util/bloom.cc util/build_version.cc util/cache.cc util/coding.cc util/comparator.cc util/crc32c.cc util/db_info_dumper.cc util/dynamic_bloom.cc util/env.cc util/env_hdfs.cc util/env_posix.cc util/file_util.cc util/filter_policy.cc util/hash.cc util/hash_cuckoo_rep.cc util/hash_linklist_rep.cc util/hash_skiplist_rep.cc util/histogram.cc util/instrumented_mutex.cc util/iostats_context.cc utilities/backupable/backupable_db.cc utilities/convenience/convenience.cc utilities/checkpoint/checkpoint.cc utilities/compacted_db/compacted_db_impl.cc utilities/document/document_db.cc utilities/document/json_document_builder.cc utilities/document/json_document.cc utilities/flashcache/flashcache.cc utilities/geodb/geodb_impl.cc utilities/leveldb_options/leveldb_options.cc utilities/merge_operators/put.cc utilities/merge_operators/string_append/stringappend2.cc utilities/merge_operators/string_append/stringappend.cc utilities/merge_operators/uint64add.cc utilities/redis/redis_lists.cc utilities/spatialdb/spatial_db.cc utilities/ttl/db_ttl_impl.cc utilities/write_batch_with_index/write_batch_with_index.cc utilities/write_batch_with_index/write_batch_with_index_internal.cc util/event_logger.cc util/ldb_cmd.cc util/ldb_tool.cc util/log_buffer.cc util/logging.cc util/memenv.cc util/murmurhash.cc util/mutable_cf_options.cc util/options_builder.cc util/options.cc util/options_helper.cc util/perf_context.cc util/rate_limiter.cc util/skiplistrep.cc util/slice.cc util/sst_dump_tool.cc util/statistics.cc util/status.cc util/string_util.cc util/sync_point.cc util/thread_local.cc util/thread_status_impl.cc util/thread_status_updater.cc util/thread_status_updater_debug.cc util/thread_status_util.cc util/thread_status_util_debug.cc util/vectorrep.cc util/xfunc.cc util/xxhash.cc   -lpthread -lrt -lsnappy -lgflags -lz -lbz2 -o librocksdb.so.3.11.0
# nm librocksdb.so | grep clock
                 U clock_gettime@@GLIBC_2.2.5
0000000000250630 T _ZNSt6chrono3_V212steady_clock3nowEv
0000000000262be0 R _ZNSt6chrono3_V212steady_clock9is_steadyE
0000000000250600 T _ZNSt6chrono3_V212system_clock3nowEv
0000000000262be1 R _ZNSt6chrono3_V212system_clock9is_steadyE

I'm not sure what to check next, however if you have suggestions I'm happy to try.

@pshareghi
Author

One thing I don't understand is how clock_gettime was not included at all in Rocksdb 3.6, but now it shows up unlinked.

I don't have time to try and build but one suggestion I found online is to use
-Wl,--no-as-needed

http://stackoverflow.com/questions/17150075/undefined-reference-to-clock-gettime-although-lrt-is-given

From ld man:

--as-needed
       --no-as-needed
           This option affects ELF DT_NEEDED tags for dynamic libraries
           mentioned on the command line after the --as-needed option.
           Normally the linker will add a DT_NEEDED tag for each dynamic
           library mentioned on the command line, regardless of whether the
           library is actually needed or not.  --as-needed causes a DT_NEEDED
           tag to only be emitted for a library that satisfies an undefined
           symbol reference from a regular object file or, if the library is
           not found in the DT_NEEDED lists of other libraries linked up to
           that point, an undefined symbol reference from another dynamic
           library.  --no-as-needed restores the default behaviour.
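
The man page excerpt above can be read as a filtering rule over the libraries on the command line. The sketch below is a deliberately simplified toy model: it ignores command-line ordering and the clause about references from other dynamic libraries, and the function name and symbol sets are illustrative assumptions rather than anything produced by ld.

```python
def dt_needed(object_undefined, libraries, as_needed):
    """Toy model: pick the libraries that would get a DT_NEEDED tag.

    object_undefined: set of symbols left undefined by the regular object files.
    libraries: list of (name, exported_symbols) in command-line order.
    """
    needed = []
    for name, exports in libraries:
        # Without --as-needed every listed library is recorded.
        # With it, only libraries that satisfy an undefined reference are kept.
        if not as_needed or object_undefined & set(exports):
            needed.append(name)
    return needed


# Illustrative export sets, not a full symbol listing.
libs = [
    ("libpthread.so.0", {"pthread_create"}),
    ("librt.so.1", {"clock_gettime"}),
]

print(dt_needed({"clock_gettime"}, libs, as_needed=False))
print(dt_needed({"clock_gettime"}, libs, as_needed=True))
```

In this model a library that resolves a symbol the objects actually reference is kept in both modes.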

@pshareghi
Author

@igorcanadi if you can explain how clock_gettime was not included at all in Rocksdb 3.6 or 3.9.1, but now it shows up unlinked, that would be very helpful. Also, it looks like the code in older Rocksdb versions did not reference clock_gettime at all. I looked at build_detect_platform, but I don't see a big change that may have caused the code to start using clock_gettime. Any idea?

@DerekSchenk
Contributor

I found the same article, and I tried to change the build_detect_platform and set the following:

Linux)
        PLATFORM=OS_LINUX
        COMMON_FLAGS="$COMMON_FLAGS -DOS_LINUX"
        if [ -z "$USE_CLANG" ]; then
            COMMON_FLAGS="$COMMON_FLAGS -fno-builtin-memcmp"
        fi
        PLATFORM_LDFLAGS="$PLATFORM_LDFLAGS -Wl,--no-as-needed -lpthread -lrt"

But the output of nm is still:

# nm librocksdb.so | grep clock
                 U clock_gettime@@GLIBC_2.2.5
0000000000250630 T _ZNSt6chrono3_V212steady_clock3nowEv
0000000000262be0 R _ZNSt6chrono3_V212steady_clock9is_steadyE
0000000000250600 T _ZNSt6chrono3_V212system_clock3nowEv
0000000000262be1 R _ZNSt6chrono3_V212system_clock9is_steadyE

I also see that clock_gettime was not included in the 3.9.1 release, so it looks new to 3.10.

It's worth noting that the workaround you described,
export LD_PRELOAD=/lib64/rtkaio/librt.so.1
did let me start up the code.

@pshareghi
Author

@DerekSchenk What are you running that you get the clock_gettime error? I didn't see the error in our production env. I only noticed it when I tried to write a million dummy KVs to a dummy db to see the effect of Snappy compression vs NO_COMPRESSION. Do you know if it happens during read, write, or is it the statistics gathering piece of code that is causing this issue?

@DerekSchenk
Contributor

@pshareghi I'm not 100% certain, but it seems to happen when I initialize the database. I think it's caused because I use ".createStatistics()", but I do have compression enabled as well. I'd need to try a few scenarios to see for sure. If it's in fact the statistics I can disable that and work without it, but at the moment I can't upgrade at all.

@pshareghi
Author

I think it is statistics calculation that triggers the issue. I checked my LOG file when the problem happened, and it was in the middle of printing out the statistics.

@DerekSchenk
Contributor

@pshareghi Thanks for the tip - I can confirm that the issue is triggered when including statistics. I removed the call to .createStatistics() and I can use the 3.10 version. This was not an issue in the 3.9 release. While I do use the stats, I can live without them for now.

@igorcanadi
Collaborator

> @igorcanadi if you can explain how clock_gettime was not included at all in Rocksdb 3.6 or 3.9.1, but now it shows up unlinked, that would be very helpful. Also, it looks like the code in older Rocksdb versions did not reference clock_gettime at all. I looked at build_detect_platform, but I don't see a big change that may have caused the code to start using clock_gettime. Any idea?

RocksDB 3.6 did call clock_gettime: https://github.com/facebook/rocksdb/blob/rocksdb-3.6.1/util/env_posix.cc#L1362 so I have no idea why it's showing up as unlinked right now and it worked before :(

@igorcanadi
Collaborator

@DerekSchenk thanks for the report that our compilation doesn't work on gcc 4.7. We fixed this with 74f3832. We should probably set up contbuild using 4.7 to make sure this doesn't happen again.

@igorcanadi
Collaborator

Are both of you running with the shared library?

@DerekSchenk
Contributor

@igorcanadi Something that occurred to me - is it possible that the newer build running against gcc 4.8 has linked a newer version of glibc (i.e. 2.2.5), and all my Centos 6.6 machines are running glibc 2.12? That may explain how it worked in 3.9 (using gcc 4.7), and why it works on other platforms?

I am pulling the jar from Maven, and it uses the shared library. When I built it myself I also built shared.
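
One way to sanity-check this hypothesis is to compare the version tag on the symbol with the runtime glibc version. A tag such as GLIBC_2.2.5 compares numerically per component, so 2.2.5 is older than 2.12. The sketch below is a simplification (the dynamic loader looks up the named version rather than comparing numbers), and the helper names are made up for illustration.

```python
def parse_version(text):
    """Turn 'GLIBC_2.2.5' or '2.12' into a tuple of ints, e.g. (2, 2, 5)."""
    digits = text.split("_")[-1]
    return tuple(int(part) for part in digits.split("."))


def runtime_satisfies(symbol, runtime_glibc):
    """True if the version tag on `symbol` is not newer than the runtime glibc."""
    required = symbol.split("@")[-1]
    return parse_version(required) <= parse_version(runtime_glibc)


# Symbol as printed by nm above, glibc version as printed by ldd --version.
print(runtime_satisfies("clock_gettime@@GLIBC_2.2.5", "2.12"))
```

Comparing tuples of integers avoids the trap of comparing "2.2.5" and "2.12" as strings or decimals.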

@igorcanadi
Collaborator

@DerekSchenk can you try building with gcc 4.7 on master now? If this works, it will partially verify your hypothesis.

@igorcanadi
Collaborator

What does ldd on the binary give you?

@DerekSchenk
Contributor

# ldd --version 
ldd (GNU libc) 2.12
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.

@igorcanadi
Collaborator

ldd librocksdb.so

@DerekSchenk
Contributor

This is the output:

Version 3.9

r3.9 # ldd librocksdbjni-linux64.so 
    linux-vdso.so.1 =>  (0x00007fff074e8000)
    libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007f4172820000)
    libm.so.6 => /lib64/libm.so.6 (0x00007f417259c000)
    libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f4172385000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f4171ff1000)
    /lib64/ld-linux-x86-64.so.2 (0x0000003862e00000)

Version 3.10

r3.10 # ldd librocksdbjni-linux64.so 
    linux-vdso.so.1 =>  (0x00007ffff0e3d000)
    libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007f5fcafd1000)
    libm.so.6 => /lib64/libm.so.6 (0x00007f5fcad4d000)
    libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f5fcab36000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f5fca7a2000)
    /lib64/ld-linux-x86-64.so.2 (0x0000003862e00000)

And my attempt to build from head last week

# ldd librocksdb.so
    linux-vdso.so.1 =>  (0x00007fff09bff000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f3e8c21a000)
    librt.so.1 => /lib64/librt.so.1 (0x00007f3e8c012000)
    libsnappy.so.1 => /usr/lib64/libsnappy.so.1 (0x00007f3e8be0c000)
    libgflags.so.2 => not found
    libz.so.1 => /lib64/libz.so.1 (0x00007f3e8bbf6000)
    libbz2.so.1 => /lib64/libbz2.so.1 (0x00007f3e8b9e4000)
    libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007f3e8b6de000)
    libm.so.6 => /lib64/libm.so.6 (0x00007f3e8b45a000)
    libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f3e8b243000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f3e8aeaf000)
    /lib64/ld-linux-x86-64.so.2 (0x0000003862e00000)

I just updated on my build machine and will try building again using 4.7.
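
The ldd listings above differ in one important way: the 3.9 and 3.10 JNI libraries list no librt.so.1, while the build from head does. A small sketch for checking this programmatically follows. It assumes the usual `name => path (address)` layout, and the sample lines are a subset of the head-build output above.

```python
def parse_ldd(ldd_output):
    """Map each library name to its resolved path (None when 'not found')."""
    deps = {}
    for line in ldd_output.splitlines():
        if "=>" not in line:
            continue  # e.g. the dynamic loader line has no "=>"
        name, target = (part.strip() for part in line.split("=>", 1))
        if target.startswith("not found"):
            deps[name] = None
        else:
            # Drop the trailing load address "(0x...)"; vdso leaves an empty path.
            deps[name] = target.split("(")[0].strip()
    return deps


SAMPLE = """\
    librt.so.1 => /lib64/librt.so.1 (0x00007f3e8c012000)
    libgflags.so.2 => not found
    libc.so.6 => /lib64/libc.so.6 (0x00007f3e8aeaf000)
    /lib64/ld-linux-x86-64.so.2 (0x0000003862e00000)
"""

deps = parse_ldd(SAMPLE)
print("librt linked:", "librt.so.1" in deps)
print("unresolved:", [name for name, path in deps.items() if path is None])
```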

@DerekSchenk
Contributor

@igorcanadi The master build 'make shared_lib' now works on gcc 4.7, however it still looks as though it's not linked properly.

I should probably have known this already, as gcc version 4.7 is not fully installed either. The most up-to-date version of gcc you can get for Centos 6 through yum is 4.6, so I've got 4.7 and 4.8 installed in alternate locations. The yum command 'yum install gcc47-c++' just gets your hopes up, then yum lets you down ;)

This is the output of the version I just built:

# nm librocksdb.so.3.11.0 | grep clock
                 U clock_gettime@@GLIBC_2.2.5
                 U _ZNSt6chrono12system_clock3nowEv@@GLIBCXX_3.4.11

# ldd librocksdb.so.3.11.0
    linux-vdso.so.1 =>  (0x00007fff779b1000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f569d50f000)
    librt.so.1 => /lib64/librt.so.1 (0x00007f569d307000)
    libsnappy.so.1 => /usr/lib64/libsnappy.so.1 (0x00007f569d101000)
    libgflags.so.2 => not found
    libz.so.1 => /lib64/libz.so.1 (0x00007f569ceeb000)
    libbz2.so.1 => /lib64/libbz2.so.1 (0x00007f569ccd9000)
    libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007f569c9d3000)
    libm.so.6 => /lib64/libm.so.6 (0x00007f569c74f000)
    libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f569c538000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f569c1a4000)
    /lib64/ld-linux-x86-64.so.2 (0x0000003862e00000)

I wanted to run the test cases to see if it still fails, but 'make check' fails:

# make check
  GEN      util/build_version.cc
  CC       db/log_writer.o
db/log_writer.cc: In member function ‘rocksdb::Status rocksdb::log::Writer::AddRecord(const rocksdb::Slice&)’:
db/log_writer.cc:55:5: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]

@igorcanadi
Collaborator

@DerekSchenk I'm not sure why our compiler doesn't complain about that. I fixed it here: 04feaee

@DerekSchenk
Contributor

@igorcanadi Thanks - that fix worked. I was able to compile and run this time. I was hoping one of the tests would hit the clock_gettime call and confirm whether it was working or not. Do you know which test would be a good one to confirm?

Off topic, not sure if you want this in a different issue, but the check run fails on one of the environment tests.

[ RUN      ] EnvPosixTest.AllocateTest
util/env_test.cc:686: Failure
Expected: ((f_stat.st_size + kPageSize + kBlockSize - 1) / kBlockSize) >= ((unsigned int)f_stat.st_blocks), actual: 2056 vs 204800
terminate called after throwing an instance of 'testing::internal::GoogleTestFailureException'
  what():  util/env_test.cc:686: Failure
Expected: ((f_stat.st_size + kPageSize + kBlockSize - 1) / kBlockSize) >= ((unsigned int)f_stat.st_blocks), actual: 2056 vs 204800
/bin/sh: line 5: 15873 Aborted                 (core dumped) ./$t
make: *** [check] Error 1

@igorcanadi
Collaborator

Yes, please create a separate issue for this one.

db_test calls clock_gettime a lot, so if it succeeded, then all is fine.

@pshareghi
Author

@igorcanadi I am not building shared libraries. My build command is
make rocksdbjavastatic

Also, I am using GCC 4.9

yum install -y devtoolset-3-gcc-c++
ldd ./librocksdbjni-linux64.so
    linux-vdso.so.1 =>  (0x00007fffbb6f1000)
    libm.so.6 => /lib64/libm.so.6 (0x00007f025d0ac000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f025cd0b000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f025d952000)

@pshareghi
Author

@DerekSchenk try this super simple test. I shamelessly modified RocksDbSample.java. Even though I commented out createStatistics(), it still dies halfway.
https://gist.github.com/pshareghi/4f8852fc92272098b8da

@DerekSchenk
Contributor

@pshareghi I pulled down the RocksDbSample.java, and using the copy I built it worked (completed the test) if I removed the call to .createStatistics(), as well as the Statistics creation 'Statistics stats = options.statisticsPtr();', and the lines following 'db.getProperty("rocksdb.stats");'.

With those lines included the test fails every time with the missing symbol error.

I looked at the build rule for rocksdbjavastatic, which is the one that doesn't work, and realized that LDFLAGS, notably -lrt, isn't being passed when the library is created.

@igorcanadi Not sure if this is correct or not, but when I modified the Makefile and changed the compile command to the line below, I was able to build the Java library and run the unmodified RocksDbSample.java file, and it works properly

$(CXX) $(CXXFLAGS) -I./java/. $(JAVA_INCLUDE) -shared -fPIC \
          -o ./java/target/$(ROCKSDBJNILIB) $(JNI_NATIVE_SOURCES) \
          $(java_libobjects) $(COVERAGEFLAGS) \
          libz.a libbz2.a libsnappy.a liblz4.a $(LDFLAGS)
  • Note the inclusion of $(LDFLAGS)

Original Makefile output

# java -cp java/target/rocksdbjni-3.11.0-linux64.jar:. TestNoCompression /tmp/test.db
You get a car test!
Wrote 0 keys (cumulative).
Wrote 10000 keys (cumulative).
....
Wrote 200000 keys (cumulative).
java: symbol lookup error: /datex/nexus/tmp/librocksdbjni5613367140043562332..so: undefined symbol: clock_gettime

Output with the LDFLAGS modification

# java -cp java/target/rocksdbjni-3.11.0-linux64.jar:. TestNoCompression /tmp/test.db
You get a car test!
Wrote 0 keys (cumulative).
Wrote 10000 keys (cumulative).
....
Wrote 990000 keys (cumulative).
stats:

** Compaction Stats [default] **
Level    Files   Size(MB) Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) Comp(cnt) Avg(sec) Stall(cnt)  KeyIn KeyDrop
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
  L0      2/1         90   0.5      0.0     0.0      0.0       0.4      0.4       0.0   0.0      0.0    137.7         3        10    0.327          0       0      0
  L1     10/3         94   0.8      0.4     0.4      0.1       0.4      0.4       0.0   1.2    145.9    145.8         3         7    0.437          6    734K      0
  L2     21/0        266   0.3      0.0     0.0      0.0       0.0      0.0       0.2   1.5    125.0    125.0         0         1    0.290          0     49K      0
 Sum     33/4        450   0.0      0.5     0.4      0.1       0.9      0.8       0.2   2.1     72.9    140.9         7        18    0.368          6    784K      0
 Int      0/0          0   0.0      0.5     0.4      0.1       0.8      0.7       0.2   2.3     78.3    136.6         6        16    0.386          6    784K      0
Flush(GB): cumulative 0.439, interval 0.351
Stalls(count): 0 level0_slowdown, 0 level0_numfiles, 0 memtable_compaction, 6 leveln_slowdown_soft, 0 leveln_slowdown_hard

** DB Stats **
Uptime(secs): 11.0 total, 8.2 interval
Cumulative writes: 1000K writes, 1000K keys, 1000K batches, 1.0 writes per batch, ingest: 0.49 GB, 45.15 MB/s
Cumulative WAL: 1000K writes, 1000K syncs, 1.00 writes per sync, written: 0.49 GB, 45.15 MB/s
Cumulative stall: 00:00:0.000 H:M:S, 0.0 percent
Interval writes: 794K writes, 794K keys, 794K batches, 1.0 writes per batch, ingest: 394.56 MB, 47.86 MB/s
Interval WAL: 794K writes, 794K syncs, 1.00 writes per sync, written: 0.39 MB, 47.86 MB/s
Interval stall: 00:00:0.000 H:M:S, 0.0 percent

@igorcanadi
Collaborator

Oh interesting. I'm actually not familiar with RocksJava compilation, but this change looks good to me. @yhchiang @fyrz @adamretter can you please evaluate the change? @DerekSchenk feel free to send the PR :)

DerekSchenk added a commit to DerekSchenk/rocksdb that referenced this issue May 20, 2015
Includes the LDFLAGS so that the correct libraries will be linked.  This links rt to resolve the issue facebook#606.
@DerekSchenk
Contributor

@igorcanadi I've created a PR (should have probably waited until morning :) ) for this issue. Hope everything is fine.

@jwlent55

jwlent55 commented Oct 9, 2015

Created a pull request to document the changes discussed above.

@yhchiang
Contributor

Hello, I just built and uploaded the 3.13.1 package once again using the fix by @jwlent55, and it seems OSSRH has updated the file. Could you let me know whether the package works now?

https://oss.sonatype.org/#nexus-search;quick~rocksdbjni

@dmittendorf
Contributor

Hi @yhchiang. Thanks for merging in the change. I haven't tried out the build, but I think you will also need to back-port the following commit to the 3.13.1 branch in order for snappy compression to work with the java build.

a52888e

@jwlent55

Just tested the build and @dmittendorf is correct. The dependencies look much better, but, because the Snappy header files were not available at compile time the Snappy code was not linked in.

voldemort.store.StorageInitializationException: org.rocksdb.RocksDBException: Invalid argument: Compression type Snappy is not linked with the binary.

$ ldd ~/Downloads/librocksdbjni-linux64.so
linux-vdso.so.1 => (0x00007fff80768000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f6430334000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f643012c000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f642fdae000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f642fab2000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f642f89b000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f642f4dc000)
/lib64/ld-linux-x86-64.so.2 (0x00007f6430ac9000)

@yhchiang
Contributor

Thanks @jwlent55 and @dmittendorf, I will cherry-pick the change and build again.

@yhchiang
Contributor

I have rebuilt and republished once again. Could you let me know whether the new build solves the issue?
https://oss.sonatype.org/#nexus-search;quick~rocksdbjni

@jwlent55

Thanks. Ran one quick test on a linux-64 box and it worked for me. Some output that verifies it:

2015/10/10-23:31:30.443170 7f3ffc553700 Compression algorithms supported:
2015/10/10-23:31:30.443172 7f3ffc553700 Snappy supported: 1
2015/10/10-23:31:30.443173 7f3ffc553700 Zlib supported: 1
2015/10/10-23:31:30.443173 7f3ffc553700 Bzip supported: 1
2015/10/10-23:31:30.443174 7f3ffc553700 LZ4 supported: 0

Note that LZ4 has the same issue. It could be solved the same way, but, this approach has issues:

  1. Forces the build to download and build the Snappy and LZ4 libraries twice each.
  2. Zlib and Bzip don't have this issue, but, instead they compile against the headers installed on the build machine, but, then link against the hand built versions of the libraries. This may be a bit fragile.

I don't think this needs to be addressed in 3.13.1, but, I have coded up a Makefile change that I think eliminates both these issues. I am no Makefile expert so there may be better ways to handle it and/or issues with the approach. I will send you another pull request just to document the approach and get feedback. Again I have only tested it on a linux-64 box.
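
The "supported" lines in the LOG excerpt above give a quick way to verify which compression libraries were linked into a build. A minimal sketch, assuming the `<name> supported: <0|1>` line format shown in that excerpt; the function name is an illustrative choice.

```python
import re

# Matches e.g. "Snappy supported: 1" at the end of a log line.
SUPPORT_RE = re.compile(r"(\w+) supported: ([01])\s*$")


def supported_compression(log_text):
    """Return {algorithm: bool} from 'X supported: N' lines in a RocksDB LOG."""
    result = {}
    for line in log_text.splitlines():
        match = SUPPORT_RE.search(line)
        if match:
            result[match.group(1)] = match.group(2) == "1"
    return result


LOG = """\
2015/10/10-23:31:30.443170 7f3ffc553700 Compression algorithms supported:
2015/10/10-23:31:30.443172 7f3ffc553700 Snappy supported: 1
2015/10/10-23:31:30.443173 7f3ffc553700 Zlib supported: 1
2015/10/10-23:31:30.443173 7f3ffc553700 Bzip supported: 1
2015/10/10-23:31:30.443174 7f3ffc553700 LZ4 supported: 0
"""

print(supported_compression(LOG))
```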

@yhchiang
Contributor

Thanks for verifying the build, @jwlent55! We are looking forward to your pull request!

@jwlent55

Having OS X (portability?) issues with the latest 3.13.1 build (not sure about earlier builds). When @dmittendorf tested it (via Voldemort) on his Mac we get:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGILL (0x4) at pc=0x000000013898c7e6, pid=23522, tid=21763
#
# JRE version: Java(TM) SE Runtime Environment (8.0_25-b17) (build 1.8.0_25-b17)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.25-b02 mixed mode bsd-amd64 compressed oops)
# Problematic frame:
# C  [librocksdbjni685384818265826878..jnilib+0x11b7e6]  rocksdb::NewLRUCache(unsigned long, int)+0xb6
#

@dmittendorf rebuilt the 3.13.1 on his Mac under the following conditions:

  • Latest v3.13 branch - e44957c
  • Patched to include my Makefile pull request - 4a7970d
    • He does not have Snappy installed locally.

That build worked on his Mac.

Here is some shared library dependency info:

RocksDB library from Nexus:
$ otool -L librocksdbjni-osx.jnilib
librocksdbjni-osx.jnilib:
    ./java/target/librocksdbjni-osx.jnilib (compatibility version 0.0.0, current version 0.0.0)
    /usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 120.0.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1213.0.0)

RocksDB library built locally:
$ otool -L ~/src/rocksdb/java/target/librocksdbjni-osx.jnilib
/Users/dmittendorf/src/rocksdb/java/target/librocksdbjni-osx.jnilib:
    ./java/target/librocksdbjni-osx.jnilib (compatibility version 0.0.0, current version 0.0.0)
    /usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 120.1.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1225.1.1)

Here is an abbreviated stack trace:

Stack: [0x00000001381e5000,0x00000001382e5000],  sp=0x00000001382e1920,  free space=1010k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [librocksdbjni685384818265826878..jnilib+0x11b7e6]  rocksdb::NewLRUCache(unsigned long, int)+0xb6
C  [librocksdbjni685384818265826878..jnilib+0x11b723]  rocksdb::NewLRUCache(unsigned long)+0x13
C  [librocksdbjni685384818265826878..jnilib+0xe76e1]  rocksdb::BlockBasedTableFactory::BlockBasedTableFactory(rocksdb::BlockBasedTableOptions const&)+0x141
C  [librocksdbjni685384818265826878..jnilib+0x1a7c40]  rocksdb::ColumnFamilyOptions::ColumnFamilyOptions()+0x2a0
C  [librocksdbjni685384818265826878..jnilib+0x7a8f]  Java_org_rocksdb_ColumnFamilyOptions_getColumnFamilyOptionsFromProps+0x2f
j  org.rocksdb.ColumnFamilyOptions.getColumnFamilyOptionsFromProps(Ljava/lang/String;)J+0

Having problems attaching a full core dump, but, I can provide one.

@dmittendorf is busy on other more pressing issues and I don't have access to a Mac so we are not able to make a lot more progress on our end. Right now I am just searching the web for issues like this. Searches on "mixed mode bsd-amd64 compressed oops" have turned up some interesting reading.

Are there any known OS X portability issues with RocksDB?

@jwlent55

I suspect that the Snappy compression library has not been linked into the linux-64 library included in the static jar for a long time now. What appears to have changed is that selecting a compression algorithm that is not supported is now an error in 3.13.1.

I ran some tests where I wrote 500,000 1024 byte identical records to a RocksDB database configured as follows:

compaction_style=kCompactionStyleUniversal
num_levels=1
max_write_buffer_number=3
write_buffer_size=67108864

I then tested Snappy vs. no compression on both 3.10.1 and 3.13.1 (both downloaded from Maven Central). Only in the 3.13.1 case did I see a significant difference in DB size.

./1444689431859-0/data/rocksdb/benchmark_db/LOG:
2015/10/12-18:37:12.040974 7f0ca5775700 [WARN] RocksDB version: 3.13.1
2015/10/12-18:37:12.161440 7f0ca5775700 [WARN]          Options.compression: Snappy

du ./1444689431859-0/data/rocksdb/benchmark_db
349852  ./1444689431859-0/data/rocksdb/benchmark_db
-----
./1444689452173-0/data/rocksdb/benchmark_db/LOG:
2015/10/12-18:37:32.352984 7f48eba9c700 [WARN] RocksDB version: 3.13.1
2015/10/12-18:37:32.562454 7f48eba9c700 [WARN]          Options.compression: NoCompression

du ./1444689452173-0/data/rocksdb/benchmark_db
919464  ./1444689452173-0/data/rocksdb/benchmark_db
-----
./1444689708020-0/data/rocksdb/benchmark_db/LOG:
2015/10/12-18:41:48.194970 7f71a9a27700 RocksDB version: 3.10.0
2015/10/12-18:41:48.362297 7f71a9a27700          Options.compression: 1

du ./1444689708020-0/data/rocksdb/benchmark_db
945076  ./1444689708020-0/data/rocksdb/benchmark_db
-----
./1444689745193-0/data/rocksdb/benchmark_db/LOG:
2015/10/12-18:42:25.369485 7f46701be700 RocksDB version: 3.10.0
2015/10/12-18:42:25.647092 7f46701be700          Options.compression: 0

du ./1444689745193-0/data/rocksdb/benchmark_db
930596  ./1444689745193-0/data/rocksdb/benchmark_db
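
To put numbers on that difference, the short Java sketch below divides the Snappy size by the no-compression size for each version, using the du figures above. The class and method names are hypothetical, and reading "Options.compression: 1" and "0" in the 3.10.0 logs as Snappy and no compression is an assumption based on the values of RocksDB's CompressionType enum.

```java
final class DbSizeRatio {
    // du figures copied from the four runs above (units as reported by du).
    static final long SNAPPY_3_13 = 349852L;
    static final long NONE_3_13 = 919464L;
    static final long SNAPPY_3_10 = 945076L; // assumed: "Options.compression: 1"
    static final long NONE_3_10 = 930596L;   // assumed: "Options.compression: 0"

    // Size with compression requested, relative to the size without it.
    static double ratio(long withCompression, long withoutCompression) {
        return (double) withCompression / withoutCompression;
    }

    public static void main(String[] args) {
        System.out.printf("3.13.1 Snappy/NoCompression: %.3f%n",
                ratio(SNAPPY_3_13, NONE_3_13));
        System.out.printf("3.10.x Snappy/NoCompression: %.3f%n",
                ratio(SNAPPY_3_10, NONE_3_10));
    }
}
```

The ratio is roughly 0.38 for 3.13.1 and roughly 1.02 for the 3.10 jar, so requesting Snappy only shrank the database in the 3.13.1 runs.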

I then extracted the dynamic library from 5 different RocksDB jar files (including an old 3.13.1 I still had) and looked at the symbol table. Only the 3.13.1 library seemed to contain Snappy code:

$ nm -C ./3.5.1/librocksdbjni-linux64.so | grep snappy:: | wc
      0       0       0
$ nm -C ./3.6.2/librocksdbjni-linux64.so | grep snappy:: | wc
      0       0       0
$ nm -C ./3.10.1/librocksdbjni-linux64.so | grep snappy:: | wc
      0       0       0
$ nm -C ./3.13.1-old/librocksdbjni-linux64.so | grep snappy:: | wc
      0       0       0
$ nm -C ./3.13.1/librocksdbjni-linux64.so | grep snappy:: | wc
     46     227    3068

Some of the Snappy symbols in 3.13.1:

00000000002d2470 T snappy::Uncompress(char const*, unsigned long, std::string*)
00000000002d27d0 T snappy::RawCompress(char const*, unsigned long, char*, unsigned long*)
00000000002d2410 T snappy::RawUncompress(char const*, unsigned long, char*)
00000000002d1f70 T snappy::RawUncompress(snappy::Source*, char*)

I have never used the "nm" tool before, so I may be misinterpreting the results, and there may be another explanation for the DB sizes (RocksDB is new to me). Take this all with a grain of salt.
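
The nm check above can also be approximated without binutils. The sketch below is a rough stand-in for nm, not an equivalent: it opens a jar, reads each .so or .jnilib entry, and counts occurrences of the byte string _ZN6snappy, the Itanium C++ ABI mangling prefix for names in the snappy namespace. The class name SnappySymbolScan and its methods are hypothetical, and the code assumes Java 9 or later for InputStream.readAllBytes. A stripped library would report zero even if Snappy were linked, and the counts will not match nm's line counts because a name can appear in more than one string table.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

final class SnappySymbolScan {
    // Mangled-name prefix for symbols declared in namespace "snappy".
    static final byte[] NEEDLE = "_ZN6snappy".getBytes(StandardCharsets.US_ASCII);

    // Naive byte search; fine for a one-off check on a library of a few MB.
    static int countOccurrences(byte[] haystack, byte[] needle) {
        int count = 0;
        outer:
        for (int i = 0; i + needle.length <= haystack.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) {
                    continue outer;
                }
            }
            count++;
        }
        return count;
    }

    // Returns one "entryName: count" line per native library found in the jar.
    static List<String> scanJar(String jarPath) throws IOException {
        List<String> report = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                String name = entry.getName();
                if (!(name.endsWith(".so") || name.endsWith(".jnilib"))) {
                    continue;
                }
                try (InputStream in = jar.getInputStream(entry)) {
                    byte[] bytes = in.readAllBytes(); // Java 9+
                    report.add(name + ": " + countOccurrences(bytes, NEEDLE));
                }
            }
        }
        return report;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java SnappySymbolScan path/to/rocksdbjni.jar
        for (String line : scanJar(args[0])) {
            System.out.println(line);
        }
    }
}
```

A count of zero for a library entry would be consistent with the empty nm results above, while a non-zero count suggests Snappy code is present.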

@dmittendorf
Contributor

Summarizing here since there has been a lot of churn from @jwlent55 and me...

  1. @jwlent55 found that Snappy support has likely never worked with the cross-platform java jar. It used to fail silently, but now an error is returned if you attempt to use Snappy compression with the cross-platform jar.
  2. Commit a52888e attempted to fix Snappy support for linux-32 and linux-64, but still didn't address build portability issues.
  3. @jwlent55 created pull request Ensure that the compression libraries are statically linked into dyna… #759 to address portability issues by forcing all compression libraries to be linked statically. This PR has been committed to master.
  4. @jwlent55 created another pull request (Modify the way java static builds are done so that: #760), which reworks how the compression libraries are installed during the java static builds. This prevents duplicate downloads, ensures consistency between compiling and linking, and restores support for Snappy and LZ4 compression within the OSX binaries (prior to this commit the OSX binaries would only support Snappy and LZ4 if they were previously installed on the build machine).
  5. @yhchiang cherry picked 2. and 3. above and built/deployed a new 3.13.1 jar to maven central.
  6. When I tried the build locally on my Mac, I received a SIGILL error similar to the one seen in issue RocksDB jni error (rocksdb::MergeOperators::CreateFromStringId) #658.
  7. I realized that if I do a clean clone/checkout of master and try to run make jclean clean rocksdbjavastaticrelease, I get the following error at the end of the build process when the final cross-platform jar is being packaged. I have submitted PR Fix crossbuild jar packaging #764 to address this.
cd java;jar -uf target/rocksdbjni-4.1.0.jar librocksdbjni-*.so librocksdbjni-*.jnilib
librocksdbjni-*.jnilib : no such file or directory
make: *** [rocksdbjavastaticrelease] Error 1

My suspicion is that when @yhchiang created the latest 3.13.1 build, there was an old librocksdbjni-*.jnilib lying around in the /java directory on the build machine which masked number 7. and caused 6.

My recommendation is to do the following:
a. Commit PR #760 to master.
b. Commit PR #764 to master.
c. Cherry pick the commits into the 3.13.fb branch
d. Perform fresh clone/checkout of 3.13.fb branch on build machine.
e. Rebuild/redeploy 3.13.1 jar.

@yhchiang
Contributor

Hey @dmittendorf, thanks for the summarization and the fix! I will rebuild the package from clean 3.13.fb with the two fixes.

@yhchiang
Contributor

Hello @dmittendorf, I've republished the package using the steps you suggested in https://oss.sonatype.org/#nexus-search;quick~rocksdb. Could you let me know whether the package works better now?

@dmittendorf
Contributor

Thanks @yhchiang! Based on initial testing the new JAR looks good. I tested locally on my Mac with some ad-hoc testing as well as running the Samza test suite which uses Snappy compression, and all tests passed.

@dmittendorf
Contributor

Also forgot to mention...the new JAR still hasn't synced to maven central, but I'm assuming that happens automatically, right?
http://search.maven.org/#artifactdetails%7Corg.rocksdb%7Crocksdbjni%7C3.13.1%7Cjar

@dmittendorf
Contributor

Never mind...it must just be the search results...the mod date within maven central is up-to-date.
http://repo1.maven.org/maven2/org/rocksdb/rocksdbjni/3.13.1/

@jwlent55

Just tested linux-64 with all 4 compression algorithms and with no compression (using Voldemort). All ran OK, and the compressed DB sizes were significantly smaller than the uncompressed DB.

@yhchiang
Contributor

That sounds great! Thank you so much for the great help, @jwlent55 and @dmittendorf!

@navina

navina commented Oct 15, 2015

@yhchiang I am still seeing the same exception on my Linux box. In fact, I tried it yesterday and got the "Snappy not linked" exception even on my Mac :( Could I be doing something wrong? The 3.13.1 rocksdbjni in Maven Central is the latest build with the changes suggested above, right?

@dmittendorf I noticed you tested the new rocksdb version with the samza test suite. I am curious if you changed anything in samza other than the rocksdb version in gradle dependency?

@dmittendorf
Contributor

Hi Navina,

The only change I made to Samza was to bump the version of RocksDB to 3.13.1.

I saw the same thing initially when running the Samza tests against the latest build, and realized that there was a cached version of the JAR in my local Gradle cache. You can run the following to ensure that the dependencies are refreshed:

./gradlew clean test --refresh-dependencies
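
A stale cached jar can be hard to spot from the build output alone. One way to confirm which file a class actually comes from is to ask the JVM for its code source, as in the sketch below. WhichJar and locationOf are hypothetical names, and org.rocksdb.RocksDB is assumed to be on the classpath when the program is run without arguments.

```java
import java.security.CodeSource;

final class WhichJar {
    // Returns the URL the class was loaded from, or a placeholder when the
    // JVM does not record one (for example, for bootstrap classes).
    static String locationOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        String name = args.length > 0 ? args[0] : "org.rocksdb.RocksDB";
        // initialize=false: look the class up without running its static
        // initializers, so no native library is loaded by this check.
        Class<?> cls = Class.forName(name, false, WhichJar.class.getClassLoader());
        System.out.println(name + " loaded from " + locationOf(cls));
    }
}
```

Running this with the same classpath as the tests prints the jar path, which makes it easy to see whether it points into a local cache directory and to compare that file's timestamp with the republished artifact.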

@navina

navina commented Oct 15, 2015

refresh! doh! I have been clearing out my maven cache, instead of gradle.. #facepalm
It works now. Thanks a lot, @dmittendorf !

@deepeshj

I am getting the error below when opening the DB from C.
Invalid argument: Compression type Snappy is not linked with the binary. Open fail
I am opening DB with these options.
rocksdb_options_optimize_level_style_compaction(options, 0);
rocksdb_options_set_create_if_missing(options, 1);
rocksdb_options_set_compression(options, rocksdb_no_compression);
rocksdb_options_set_max_open_files(options, 1000);

@igorcanadi
Collaborator

@deepeshj it's weird that the open fails when you try with no compression. In any case, this should work if you install the Snappy library before compiling RocksDB (we automatically detect the presence of Snappy).

@zzdever

zzdever commented May 25, 2020

Just for reference: similar issue in my project, fixed by adding -lstdc++ to the linker.
