Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scenegraph/Serializer.h: load of misaligned address #5687

Closed
hexagonrecursion opened this issue Dec 22, 2023 · 5 comments · Fixed by #5692
Closed

scenegraph/Serializer.h: load of misaligned address #5687

hexagonrecursion opened this issue Dec 22, 2023 · 5 comments · Fixed by #5692

Comments

@hexagonrecursion
Copy link

hexagonrecursion commented Dec 22, 2023

Observed behaviour

UndefinedBehaviorSanitizer complains about "load of misaligned address" during game starttup. I see about a dozen messages like below.

/home/cdda/git/pioneer/src/scenegraph/Serializer.h:152:8: runtime error: load of misaligned address 0x7f768187841d for type 'const float', which requires 4 byte alignment
0x7f768187841d: note: pointer points here
 00 00 00 01 00 00 00  00 d4 bb 4b b5 00 00 80  3f 20 0e 04 c1 6f ea 9d  3e 26 ea af 41 0f 00 00  00
             ^ 
    #0 0x10d6257 in void Serializer::Reader::readObject<float>(float&) /home/cdda/git/pioneer/src/scenegraph/Serializer.h:152
    #1 0x10d6257 in Serializer::Reader& Serializer::Reader::operator>><float>(float&) /home/cdda/git/pioneer/src/scenegraph/Serializer.h:176
    #2 0x10d6257 in Serializer::Reader::readObject(vector3<float>&) /home/cdda/git/pioneer/src/scenegraph/Serializer.h:166
    #3 0x10d6257 in vector3<float> Serializer::Reader::obj<vector3<float> >() /home/cdda/git/pioneer/src/scenegraph/Serializer.h:115
    #4 0x10d6257 in Serializer::Reader::Vector3f() /home/cdda/git/pioneer/src/scenegraph/Serializer.h:206
    #5 0x10d6257 in SceneGraph::Thruster::Load(SceneGraph::NodeDatabase&) /home/cdda/git/pioneer/src/scenegraph/Thruster.cpp:133
    #6 0x21555d1 in std::function<SceneGraph::Node* (SceneGraph::NodeDatabase&)>::operator()(SceneGraph::NodeDatabase&) const /usr/include/c++/13/bits/std_function.h:591
    #7 0x21555d1 in SceneGraph::BinaryConverter::LoadNode(Serializer::Reader&) /home/cdda/git/pioneer/src/scenegraph/BinaryConverter.cpp:445
    #8 0x2155dec in SceneGraph::BinaryConverter::LoadNode(Serializer::Reader&) /home/cdda/git/pioneer/src/scenegraph/BinaryConverter.cpp:461
    #9 0x2155dec in SceneGraph::BinaryConverter::LoadNode(Serializer::Reader&) /home/cdda/git/pioneer/src/scenegraph/BinaryConverter.cpp:461
    #10 0x2155dec in SceneGraph::BinaryConverter::LoadNode(Serializer::Reader&) /home/cdda/git/pioneer/src/scenegraph/BinaryConverter.cpp:461
    #11 0x216046e in SceneGraph::BinaryConverter::CreateModel(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, Serializer::Reader&) /home/cdda/git/pioneer/src/scenegraph/BinaryConverter.cpp:244
    #12 0x216329a in SceneGraph::BinaryConverter::Load(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, RefCountedPtr<FileSystem::FileData>) /home/cdda/git/pioneer/src/scenegraph/BinaryConverter.cpp:165
    #13 0x2165304 in SceneGraph::BinaryConverter::Load(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /home/cdda/git/pioneer/src/scenegraph/BinaryConverter.cpp:212
    #14 0x2166584 in SceneGraph::BinaryConverter::Load(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /home/cdda/git/pioneer/src/scenegraph/BinaryConverter.cpp:151
    #15 0x1ceb013 in SceneGraph::Loader::LoadModel(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /home/cdda/git/pioneer/src/scenegraph/Loader.cpp:166
    #16 0x1cee284 in SceneGraph::Loader::LoadModel(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /home/cdda/git/pioneer/src/scenegraph/Loader.cpp:136
    #17 0x188cebe in ModelCache::FindModel(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /home/cdda/git/pioneer/src/ModelCache.cpp:25
    #18 0x6d6879 in Pi::FindModel(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool) /home/cdda/git/pioneer/src/Pi.cpp:1174
    #19 0x1719e3f in Intro::Intro(Graphics::Renderer*, int, int) /home/cdda/git/pioneer/src/Intro.cpp:50
    #20 0x6cc315 in MainMenu::Start() /home/cdda/git/pioneer/src/Pi.cpp:662
    #21 0x4ddc99 in Application::StartLifecycle() /home/cdda/git/pioneer/src/core/Application.cpp:118
    #22 0x4e0d0b in Application::Run() /home/cdda/git/pioneer/src/core/Application.cpp:180
    #23 0x4947f6 in main /home/cdda/git/pioneer/src/main.cpp:180
    #24 0x7f76adf61149 in __libc_start_call_main (/lib64/libc.so.6+0x28149) (BuildId: 788cdd41a15985bf8e0a48d213a46e07d58822df)
    #25 0x7f76adf6120a in __libc_start_main_impl (/lib64/libc.so.6+0x2820a) (BuildId: 788cdd41a15985bf8e0a48d213a46e07d58822df)
    #26 0x4c0364 in _start (/home/cdda/git/pioneer/build/pioneer+0x4c0364) (BuildId: 0b596787b6ec3e0de0d5dbbe9321ec7a8c870f56)

Expected behaviour

No misaligned reads

Note: the C++ standard allows compilers to assume misaligned reads (just like any other undefined behaviour) never happen. The compiler is than allowed to make an arbitrarily long and complex chain of logic deductions from that and use this information to mangle your code until it is completely unrecognisable. We should probably fix this.

Steps to reproduce

  1. Install libasan, libasan-static, libubsan, libubsan-static Note: I am not sure which one is required so I just installed all of them.
  2. Make sure the build/ directory does not exist. This is needed because cmake does not update CXXFLAGS and LDFLAGS for existing build directories.
  3. CXXFLAGS="-fsanitize=address,pointer-compare,pointer-subtract,undefined" LDFLAGS="$CXXFLAGS" cmake -S . -B build/ Note: this command assumes you are in the git root dirrectory rather than in build/
  4. make -C build/
  5. make -C build/ build-data
  6. UBSAN_OPTIONS=print_stacktrace=1 ./pioneer

My complier: g++ (GCC) 13.2.1 20231205 (Red Hat 13.2.1-6)

My pioneer version (and OS): 8b02b62 built from source (and Fedora release 39)

My output.txt (required) and game save (optional, but recommended) stdout&stderr.txt (note: there was no output.txt so I just captured stdout&stderr). No save necessary - the errors happen before the game shows the main menu.

Observations

I have found two places where we forgot to insert padding in the SGM format, but it would surprise me if there are not many more

  1. Thruster::Save & Thruster::Load - need 3 bytes of padding after linear because float wants to be 4-byte aligned. Plus: we are missing 4 bytes of padding at the end - just in case something later in the file contains a double or int64.
    Thruster *Thruster::Load(NodeDatabase &db)
    {
    const bool linear = db.rd->Bool();
    const vector3f dir = db.rd->Vector3f();
    const vector3f pos = db.rd->Vector3f();
    Thruster *t = new Thruster(db.loader->GetRenderer(), linear, pos, dir);
    return t;
    }
  2. Writer::Blob & Reader::Blob - need to pad to the 8 byte boundary because the next field might need 8 byte alignment and there is no convenient way for the caller to determine the necessary amount of padding
    ByteRange Blob()
    {
    auto len = Int32();
    if (len == 0) return ByteRange();
    if (len > (m_data.end - m_at))
    throw std::out_of_range("Serializer::Reader encountered truncated stream.");
    auto range = ByteRange(m_at, m_at + len);
    m_at += len;
    return range;
    }

Possible fixes

  1. Use memcpy() instead of reinterpret_cast()

     #include <cstring>
     float unalignedload(char const *packed)
     {
       float buffer;
       std::memcpy(&buffer, packed, sizeof(float));
       return buffer;
     }
    
  2. Make the public API of Serializer::Writer and Serializer::Reader always insert padding to the 8 byte boundary after writing/reading. This might be somewhat inefficient in terms of file size, but is relatively easy to implement.

  3. Make the public API of Serializer::Writer and Serializer::Reader calculate the minimum necessary padding for the next field and insert the padding before writing/reading

  4. Fix Thruster::Save&Thruster::Load, Writer::Blob & Reader::Blob and many other places to insert&skip the necessary padding. This is fragile - eventually someone will make a similar mistake again.

  5. Stop reinventing the wheel and replace Serializer with https://capnproto.org, https://protobuf.dev or https://flatbuffers.dev

Note: every option other than 1 will require bumping SGM_VERSION to a new version.

@hexagonrecursion
Copy link
Author

hexagonrecursion commented Dec 22, 2023

Found one more place. We are missing 1 byte of padding after use_pattern in BinaryConverter::LoadMaterials:

void BinaryConverter::LoadMaterials(Serializer::Reader &rd)
{
PROFILE_SCOPED()
for (Uint32 numMats = rd.Int32(); numMats > 0; numMats--) {
MaterialDefinition m("");
m.name = rd.String();
m.tex_diff = rd.String();
m.tex_spec = rd.String();
m.tex_glow = rd.String();
m.tex_ambi = rd.String();
m.tex_norm = rd.String();
m.diffuse = rd.Color4UB();
m.specular = rd.Color4UB();
m.ambient = rd.Color4UB();
m.emissive = rd.Color4UB();
m.shininess = rd.Int16();
m.opacity = rd.Int16();
m.alpha_test = rd.Bool();
m.unlit = rd.Bool();
m.use_pattern = rd.Bool();
if (m.use_pattern) m_patternsUsed = true;
ConvertMaterialDefinition(m);
}
}

Also: we have to insert 4 bytes of padding somewhere in here (either at the end or after reading numMats) because otherwise we are no longer 8 byte aligned what this function returns.

@hexagonrecursion
Copy link
Author

And one more example. After reading CollisionGeometry the next thing we read will not be 8 byte aligned. The bool at the end makes it start an an odd address. In the file I am looking at the next field we read happens to be a double, which wants to be 8 byte aligned. I am done. Don't think more examples are necessary. I think I have proven my point: the current API of the serializer is a maintenance nightmare - even if we fix all the bugs it is too easy to add new ones.

void CollisionGeometry::Save(NodeDatabase &db)
{
PROFILE_SCOPED()
Node::Save(db);
db.wr->Int32(m_vertices.size());
for (const auto &pos : m_vertices)
db.wr->Vector3f(pos);
db.wr->Int32(m_indices.size());
for (const auto idx : m_indices)
db.wr->Int32(idx);
db.wr->Int32(m_triFlag);
db.wr->Bool(m_dynamic);
}

@hexagonrecursion
Copy link
Author

The serializer API should encapsulate the alignment and padding concerns instead of making it the responsibility of the caller to call the right methods in the right order and carefully count the bytes.

@sturnclaw
Copy link
Member

This is a very well-documented issue, and I highly appreciate the time you took to delve deeply into the problem and propose solutions.

There are several in-progress efforts to modernize other parts of the codebase (mostly rendering related) which will necessitate a higher-level redesign/reconsideration of how certain parts of the SGM format operate, and for that reason (among others) a rewrite to use Protobuf et. al. is not a candidate solution at this time.

In the interest of completeness, I should point out that 8-byte alignment is not a hardware or software requirement for any operations of the serialization facilities except the very few sections of double-typed data being loaded to fill axis-aligned bounding box variables. The remaining used overloads are all 4-byte or 1-byte aligned. Additionally, on the only hardware platform Pioneer officially supports (x86/x86_64), unaligned loads are not an issue at the hardware level.

That being said, some simple benchmarking shows that invoking memcpy instead of reinterpret_cast doesn't result in any significant performance loss, and it is the cleanest and conceptually simplest way to resolve this issue without necessitating another SGM version bump.


note: there was no output.txt so I just captured stdout&stderr

This is possibly because the Pioneer user directory is now $XDG_DATA_HOME/pioneer instead of ~/.pioneer, for better compatibility with the XDG standards. ~/.pioneer is used if present from an older version of Pioneer however.

@hexagonrecursion
Copy link
Author

there was no output.txt

I indeed was searching in the wrong place - the root of the git repo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants