Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CR] Seamless loading of gzipped world/maps/x.y.z/x.y.z.map files #43986

Closed
wants to merge 43 commits into from

Conversation

int-ua
Copy link
Contributor

@int-ua int-ua commented Sep 6, 2020

Superseded

by #44218

Summary

SUMMARY: Performance "Seamless loading of gzipped world/maps/x.y.z/x.y.z.map files"

Purpose of change

Ability to drastically reduce save folder sizes with (hopefully) only a marginal performance impact.

Describe the solution

Checking first two bytes of files before loading JSON and if they are compressed use gzstream instead of std::ifstream to read them.
Reference: kevingranade/Cataclysm@e751f0b

Testing

#!/bin/bash
WORLD_NAME="test"
./cataclysm-tiles --world "$WORLD_NAME"
cp -r "save/${WORLD_NAME}" "save/${WORLD_NAME}_gz"
find "save/${WORLD_NAME}_gz/maps/" -type f -exec gzip "{}" \; -exec mv "{}.gz" "{}" \;
./cataclysm-tiles --world "${WORLD_NAME}_gz"
du -sh "save/${WORLD_NAME}/maps/"
du -sh "save/${WORLD_NAME}_gz/maps/"

Additional context

#10784 #14307 #19192 #24916 #32530

@int-ua int-ua changed the title [CR] Support for seamless loading of gzipped map files [CR] Draft for support for seamless loading of gzipped map files Sep 6, 2020
@anothersimulacrum anothersimulacrum added the [C++] Changes (can be) made in C++. Previously named `Code` label Sep 6, 2020
@int-ua
Copy link
Contributor Author

int-ua commented Sep 6, 2020

Not sure I will have time in the coming days to work more on this PR unfortunately. But I definitely want this feature and no one else seems to be available and interested simultaneosly.

@int-ua
Copy link
Contributor Author

int-ua commented Sep 13, 2020

libgzstream-dev package is not available for Ubuntu Xenial. Is there anything that can be done about that?

@int-ua
Copy link
Contributor Author

int-ua commented Sep 13, 2020

Is it ok if I add gzstream as a git submodule? It's easier to just go with a folder I guess but would you (maintainers) mind if try submodule?

@jbytheway
Copy link
Contributor

Of course it is up to @kevingranade but I'd strongly recommend against using submodules. In my experience they always cause more issues than they solve. For example, new people checking out the codebase have extra steps they need to take or things behave strangely.

You already ran into a related issue in needing to alter the actions to checkout the submodules. There are so many little places where similar issues will arise; it's just not worth the trouble.

@int-ua
Copy link
Contributor Author

int-ua commented Sep 14, 2020

In my opinion the issue is highly outweighed by the benefits of not forcing everyone to download the history of everything. Although it's quite late at this point I still think that this project could benefit from git submodules and I'm trying to establish a precedent. It's just an additional command or two to run that can save you ghiblibytes of space in the future. I'm genuinely interested, what are these little places?

@kevingranade
Copy link
Member

kevingranade commented Sep 14, 2020

If you're trying to set a precedent for more extensive use for git submodules, stop now, that's not happening.
I was interested in seeing the outcome of using submodules for an optional dependency, but as I'm positively allergic to taking dependencies in the first place, I don't anticipate there being many opportunities for using them in the future.

I'm not clear what the mechanism would be for having people not download the entire project history. We can possibly shift out the tilesets, but even that would be a massive undertaking of rebasing the entire project history to remove them from the main repository, and nothing else is coming to mind that is optional for download.

@jbytheway
Copy link
Contributor

Since you asked, here are some of the annoyances from submodules I'm aware of:

  • New users/developers need to do the extra commands to get everything. This is yet another obstacle to the already-large issues with learning git that new contributors have, especially on a project like this where many contributors are not previously git users.
  • If you clone the repo without the submodule and try to work somewhere without Internet access then it's too late to download the submodule.
  • GitHub zip downloads of repos don't include submodules.
  • git archive doesn't include submodules (which is the reason for the above point).
  • Every system that automatically obtains and builds the code needs to be updated to take account of this. This is not just every CI system we set up for the project, but also each one of the many Linux distros that ship a CDDA package.
  • If the module you reference ceases to exist or rewrites its history then historical versions of CDDA become impossible to build, because the referenced commits cannot be found. Even if the referenced repo merely moves to a new host, then you must use little-known git configuration options to redirect to the new location, and you have to change that configuration depending on the point in history you're trying to build, which is messy at best.

@int-ua
Copy link
Contributor Author

int-ua commented Sep 14, 2020

stop now, that's not happening.

ok, I'll just copy the files.

optional for download

It's becoming off-topic and irrelevant here but just to explain my opinion, in the order of being suitable for offloading into submodules (more to less):

  • gfx/*: I forgot to include these at first, added by editing.
  • data/mods/*: as a person who mostly plays default/vanilla
  • android and maybe other platform-specific folders
  • lang/po: as a person who plays without l10n. Although I've tried Transifex for my lang but it's just too slow. This folder is larger than everything else combined.

Copy link
Contributor

@olanti-p olanti-p left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a bunch of changes that should make it compile / be a bit cleaner

.github/workflows/CBA.yml Outdated Show resolved Hide resolved
src/gzstream.cpp Outdated Show resolved Hide resolved
src/gzstream.cpp Outdated Show resolved Hide resolved
src/cata_utility.cpp Outdated Show resolved Hide resolved
src/gzstream.h Outdated Show resolved Hide resolved
src/gzstream.h Show resolved Hide resolved
src/gzstream.h Outdated Show resolved Hide resolved
src/gzstream.h Outdated Show resolved Hide resolved
src/gzstream.h Outdated Show resolved Hide resolved
int-ua and others added 2 commits September 15, 2020 22:34
Co-authored-by: Oleg Antipin <60584843+olanti-p@users.noreply.github.com>
Co-authored-by: Oleg Antipin <60584843+olanti-p@users.noreply.github.com>
@int-ua
Copy link
Contributor Author

int-ua commented Sep 15, 2020

Thank you! Let's see if I missed anything...

@int-ua
Copy link
Contributor Author

int-ua commented Sep 15, 2020

fatal error: SDL_version.h: No such file or directory

Is it really a problem that was introduced by this PR? Is g++-7 build different in that it needs something like one of the configs in CMakeModules/?

CMakeLists.txt Outdated
@@ -369,6 +369,8 @@ IF(USE_HOME_DIR)
ENDIF(USE_HOME_DIR)

add_subdirectory(src)
add_executable(main src/main.cpp)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't want to add a new executable like this; it needs to be made a dependency of the existing executables.

zlib is already a dependency of the binaries in TILES mode, so you can see how this is done (in src/CMakeLists.txt, not this file) and mimic that. Will need something like

target_include_directories(cataclysm-common PUBLIC ${ZLIB_INCLUDE_DIRS})
target_link_libraries(cataclysm-common ${ZLIB_LIBRARIES})

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not realizing there is a src/CMakeLists.txt was my major problem until now, thank you!

@int-ua
Copy link
Contributor Author

int-ua commented Sep 16, 2020

Still ld: ../src/CMakeFiles/cataclysm-tiles-common.dir/gzstream.cpp.o: undefined reference to symbol 'gzread'

@int-ua
Copy link
Contributor Author

int-ua commented Sep 16, 2020

//lib/x86_64-linux-gnu/libz.so.1: error adding symbols: DSO missing from command line
Ok, looks like g++-7 is still missing -lz somewhere.

@int-ua
Copy link
Contributor Author

int-ua commented Sep 16, 2020

/usr/local/android-sdk-25.2.3/ndk-bundle/build/core/build-binary.mk:651: Android NDK: Module main depends on undefined modules: z

@int-ua int-ua changed the title [CR] Draft for support for seamless loading of gzipped map files [CR] Support for seamless loading of gzipped save/map/x.y.z/x.y.z.map files Sep 16, 2020
@int-ua int-ua changed the title [CR] Support for seamless loading of gzipped save/map/x.y.z/x.y.z.map files [CR] Support for seamless loading of gzipped world/maps/x.y.z/x.y.z.map files Sep 16, 2020
@int-ua int-ua changed the title [CR] Support for seamless loading of gzipped world/maps/x.y.z/x.y.z.map files [CR] Seamless loading of gzipped world/maps/x.y.z/x.y.z.map files Sep 16, 2020
@jbytheway
Copy link
Contributor

I looked at the src/CMakeLists.txt a little more closely. It appears that zlib is only being made a dependency of cataclysm-tiles-common when DYNAMIC_LINKING is OFF (which is not the default). That dependency needs to be made unconditional; you can move the ZLIB parts down to the target_include_directories and target_link_libraries after ENDIF (NOT DYNAMIC_LINKING).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[C++] Changes (can be) made in C++. Previously named `Code`
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants