Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deserializing a complex struct fails #51

Closed
zittix opened this issue Feb 13, 2014 · 8 comments
Closed

Deserializing a complex struct fails #51

zittix opened this issue Feb 13, 2014 · 8 comments
Labels

Comments

@zittix
Copy link

zittix commented Feb 13, 2014

We have the following struct (with serialization method):

typedef struct ModelEntry {
    uint32_t modelEntryID;

    std::vector<uint32_t> hashEntriesIDs[4];

    template<class Archive>
    void serialize(Archive & archive)
    {
        archive( modelEntryID, hashEntriesIDs );
    }

} ModelEntry;

Serialization works fine but when unserializing, it tries to read way too much elements and crashes due to memory allocation problem (I tried Binary and JSON format, all fails):

std::ifstream f("file");
cereal::BinaryInputArchive input(fmodeljournal);
f.close();

Any idea?

Compiler is clang:

Apple LLVM version 5.0 (clang-500.2.79) (based on LLVM 3.3svn)
Target: x86_64-apple-darwin13.0.2
Thread model: posix
@zittix
Copy link
Author

zittix commented Feb 13, 2014

Update:

Modifying the serialize method to the following one fixes the problem:

template<class Archive>
void serialize(Archive & archive)
{
    archive( modelEntryID );
    for(int i=0;i<Hash::NB_HASH_FUNCTIONS;i++) {
        archive( hashEntriesIDs[i] );
    }
}

@randvoorhies
Copy link
Contributor

The original example works just fine in g++-4.9. Here is a complete code example, could you verify that it doesn't work in clang?

#include <cereal/cereal.hpp>
#include <cereal/archives/binary.hpp>
#include <cereal/archives/json.hpp>

#include <cereal/types/vector.hpp>

#include <iostream>
#include <fstream>

typedef struct ModelEntry {
    uint32_t modelEntryID;

    std::vector<uint32_t> hashEntriesIDs[4];

    template<class Archive>
    void serialize(Archive & archive)
    {
      archive( CEREAL_NVP(modelEntryID), CEREAL_NVP(hashEntriesIDs) );
    }

} ModelEntry;

int main()
{
  {
    ModelEntry entry;
    entry.modelEntryID = 666;

    entry.hashEntriesIDs[0].push_back(1);
    entry.hashEntriesIDs[0].push_back(2);
    entry.hashEntriesIDs[0].push_back(3);

    entry.hashEntriesIDs[1].push_back(10);
    entry.hashEntriesIDs[1].push_back(20);
    entry.hashEntriesIDs[1].push_back(30);

    entry.hashEntriesIDs[2].push_back(100);
    entry.hashEntriesIDs[2].push_back(200);
    entry.hashEntriesIDs[2].push_back(300);

    entry.hashEntriesIDs[3].push_back(1000);
    entry.hashEntriesIDs[3].push_back(2000);
    entry.hashEntriesIDs[3].push_back(3000);

    std::ofstream f("output.cereal");
    cereal::BinaryOutputArchive oarchive(f);

    oarchive(entry);
  }

  {
    std::ifstream f("output.cereal");
    cereal::BinaryInputArchive iarchive(f);

    ModelEntry entry;
    iarchive(entry);

    cereal::JSONOutputArchive pretty_print(std::cout);
    pretty_print(entry);
  }

  return 0;
}

@AzothAmmo
Copy link
Contributor

I can't reproduce this using g++ 4.8, 4.7 or clang++ 3.3 using libstdc++ on Linux.

I'm assuming you are using libc++?

@AzothAmmo
Copy link
Contributor

Also can't reproduce with libc++ and libc++abi on Linux:

clang++ -std=c++11 -g -o test test.cpp -I workspace/cereal/include/ --stdlib=libc++ -lc++abi

@zittix
Copy link
Author

zittix commented Feb 14, 2014

Update:

Here is a sample program which isolates the problem. It crashes every time the serialized output is loaded back:

Assertion failed: (IsObject()), function MemberBegin, file cereal/external/rapidjson/document.h, line 245.
#include <stdio.h>
#include <stdlib.h>

#include <cereal/types/vector.hpp>
#include <cereal/types/unordered_map.hpp>
#include <cereal/archives/json.hpp>
#include <fstream>


int main(int argc, char *argv[])
{

    struct ModelEntry {
    public:
        uint32_t modelEntryID;

        std::vector<uint32_t> hashEntriesIDs[3];

        template<class Archive>
        void serialize(Archive & archive)
        {
            archive( modelEntryID,hashEntriesIDs );
        }

    };


    std::unordered_map<uint32_t, ModelEntry  > modelJournal; //Relation between model ID and offset in descr memory

    ModelEntry m;
    m.modelEntryID = 1;
    for(int i=0;i<3;i++) {
        for(int j=0;j<1200;j++) {
            m.hashEntriesIDs[i].push_back(j);
        }
    }

    modelJournal[1] = m;

    std::ofstream fmodeljournal("out");
    cereal::JSONOutputArchive output(fmodeljournal);
    output( CEREAL_NVP(modelJournal) );
    fmodeljournal.close();

    modelJournal.clear();

    std::ifstream ifmodeljournal("out");
    cereal::JSONInputArchive input(ifmodeljournal);
    input( CEREAL_NVP(modelJournal) ); //Crash here.
    ifmodeljournal.close();


    return 0;
}

@zittix
Copy link
Author

zittix commented Feb 14, 2014

From my above comment (sorry for the spam), it seems the JSON writer forgets to put a closing } at the end, hence the bug.

I traced down the bug and it is actually due to the fact the writer writes the end } when the destructor is called. In my code, the stream is already closed and therefore nothing is written.

In json.hpp:95 we have:

    //! Destructor, flushes the JSON
      ~JSONOutputArchive()
      {
        itsWriter.EndObject();
      }

The EndObject should be called at the end of the serialization instead at destruction time.

@zittix
Copy link
Author

zittix commented Feb 14, 2014

To follow up on the above problem, it seems the JSON parsing library fails with libc++. Indeed the peek function might return an EOF character (which is -1) . The character is casted to the Ch class of the JSON parser which translates to 0xFF and therefore the end condition (i.e. file cursor should be at the end) is not satisfied. I propose the following fix:

Index: cereal/external/rapidjson/reader.h
===================================================================
--- cereal/external/rapidjson/reader.h  (revision 1359)
+++ cereal/external/rapidjson/reader.h  (working copy)
@@ -245,8 +245,8 @@
                default: RAPIDJSON_PARSE_ERROR("Expect either an object or array at root", stream.Tell());
            }
            SkipWhitespace(stream);
-
-           if (stream.Peek() != '\0')
+            Ch endc = stream.Peek();
+           if (endc != '\0' && endc!=static_cast<Ch>(std::char_traits<Ch>::eof()))
                RAPIDJSON_PARSE_ERROR("Nothing should follow the root object or array.", stream.Tell());
        }

@AzothAmmo
Copy link
Contributor

Regarding the trailing } output by the JSON archive - this is intentional that it only gets output when the destructor is executed. There's no good way for the archive to know you are done with your serialization without it being explicitly told, or in our case, destroyed. All archives are designed to be used in an RAII fashion.

This is also the reason your example is not working properly; you are closing the file before the destructor of the cereal archive has executed, preventing the closing } from being written. Your code should look like this instead:

#include <stdio.h>
#include <stdlib.h>

#include <cereal/types/vector.hpp>
#include <cereal/types/unordered_map.hpp>
#include <cereal/archives/json.hpp>
#include <fstream>

struct ModelEntry
{
  public:
    uint32_t modelEntryID;

    std::vector<uint32_t> hashEntriesIDs[3];

    template<class Archive>
    void serialize(Archive & archive)
    {
      archive( modelEntryID,hashEntriesIDs );
    }
};

int main(int argc, char *argv[])
{
  std::unordered_map<uint32_t, ModelEntry> modelJournal;

  ModelEntry m;
  m.modelEntryID = 1;
  for(int i=0;i<3;i++) {
    for(int j=0;j<1200;j++) {
      m.hashEntriesIDs[i].push_back(j);
    }
  }

  modelJournal[1] = m;

  {
    std::ofstream fmodeljournal("out");
    cereal::JSONOutputArchive output(fmodeljournal);
    output( CEREAL_NVP(modelJournal) );
  }

  modelJournal.clear();

  {
    std::ifstream ifmodeljournal("out");
    cereal::JSONInputArchive input(ifmodeljournal);
    input( CEREAL_NVP(modelJournal) ); //Crash here.
  }
  return 0;
}

You can let RAII take care of your file handles as well, since it will properly close them.

randvoorhies added a commit that referenced this issue Feb 14, 2014
The credit to this fix is due to @zittix. This closes issue #51.

This is due to a bug in the mac version of libc++ and should be fixed in
HEAD. See here for a related stackoverflow post:
http://stackoverflow.com/questions/14147667/clang-and-libc-istreampeek-does-not-set-eof-flag
@AzothAmmo AzothAmmo added the bug label Feb 15, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants