Cached file management #434
Write a metadata file for each cached HTTP file once the file has been downloaded completely. This metadata file contains basic information that helps us manage cached files on the system.
The added `duckdb.cache_info()` and `duckdb.cache_delete(cache_key TEXT)` functions can be used to manage cached files.
sql/pg_duckdb--0.1.0--0.2.0.sql
Outdated
CREATE TYPE duckdb.cache_info AS (
    cache_key TEXT,
    remote_path TEXT,
    file_size TEXT
Could we add the file's `mtime` here? It would be useful for clearing out older versions of files, basic LRU functions, etc.
Added a timestamp for the cache creation time.
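To illustrate how a creation timestamp could support the cleanup use case raised here, the following is a minimal sketch only. It assumes the metadata line gains a fourth comma-separated field holding seconds since the epoch; that layout, the `CacheEntry` struct, and the helpers `ParseMetadataLine` and `KeysOlderThan` are hypothetical and are not taken from this PR.

```cpp
#include <cstdint>
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical in-memory view of one metadata record.
struct CacheEntry {
    std::string cache_key;
    std::string remote_path;
    std::uint64_t file_size;
    std::int64_t created_at; // assumed: seconds since the Unix epoch
};

// Parse "cache_key,remote_path,file_size,created_at".
// Returns false if the field count is not 4 or a numeric field fails to parse.
// Note: a remote path that itself contains a comma would break this simple split.
bool ParseMetadataLine(const std::string &line, CacheEntry &out) {
    std::vector<std::string> fields;
    std::stringstream ss(line);
    std::string field;
    while (std::getline(ss, field, ',')) {
        fields.push_back(field);
    }
    if (fields.size() != 4) {
        return false;
    }
    try {
        out.file_size = std::stoull(fields[2]);
        out.created_at = std::stoll(fields[3]);
    } catch (const std::exception &) {
        return false;
    }
    out.cache_key = fields[0];
    out.remote_path = fields[1];
    return true;
}

// Collect the keys of entries created strictly before `cutoff`.
std::vector<std::string> KeysOlderThan(const std::vector<CacheEntry> &entries, std::int64_t cutoff) {
    std::vector<std::string> keys;
    for (const auto &entry : entries) {
        if (entry.created_at < cutoff) {
            keys.push_back(entry.cache_key);
        }
    }
    return keys;
}
```

The keys returned by `KeysOlderThan` could then be passed one at a time to `duckdb.cache_delete(cache_key TEXT)`. A true LRU policy would need a last-access time rather than a creation time, so this only covers the "clear out older versions" case.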
string metadata_info = cache_key + "," + remote_path + "," + std::to_string(total_size);
handle->Write((void *)metadata_info.c_str(), metadata_info.length(), 0);
Doesn't `Write` take a `const char*`?
- string metadata_info = cache_key + "," + remote_path + "," + std::to_string(total_size);
- handle->Write((void *)metadata_info.c_str(), metadata_info.length(), 0);
+ handle->Write(cache_key.c_str(), cache_key.length(), 0);
+ handle->Write(",", 1, 0);
+ handle->Write(remote_path.c_str(), remote_path.length(), 0);
+ handle->Write(",", 1, 0);
+ auto s = std::to_string(total_size);
+ handle->Write(s.c_str(), s.length(), 0);
I'm also surprised we don't need a newline after it?
`Write` complains unless the buffer is a `void *`, so we need the cast. Writing the commas that way will also fail.
IMO, constructing a single metadata line is simpler.
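As a rough illustration of the trade-off discussed in this thread, here is a self-contained sketch. `FakeFileHandle` is a hypothetical stand-in whose `Write` takes a mutable `void *` buffer plus a byte count and an offset, mirroring the shape of the call in the diff; it is not the real file handle class, and `BuildMetadataLine` and `WriteMetadata` are illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a file handle with a void * based Write.
struct FakeFileHandle {
    std::vector<char> data;

    // Copy nr_bytes from buffer into the file contents starting at `location`.
    void Write(void *buffer, std::size_t nr_bytes, std::size_t location) {
        assert(buffer != nullptr || nr_bytes == 0);
        if (data.size() < location + nr_bytes) {
            data.resize(location + nr_bytes);
        }
        if (nr_bytes > 0) {
            std::memcpy(data.data() + location, buffer, nr_bytes);
        }
    }

    std::string Contents() const {
        return std::string(data.begin(), data.end());
    }
};

// Build the whole record up front so it can be written with a single call.
std::string BuildMetadataLine(const std::string &cache_key, const std::string &remote_path,
                              std::size_t total_size) {
    return cache_key + "," + remote_path + "," + std::to_string(total_size);
}

void WriteMetadata(FakeFileHandle &handle, const std::string &cache_key,
                   const std::string &remote_path, std::size_t total_size) {
    std::string metadata_info = BuildMetadataLine(cache_key, remote_path, total_size);
    // c_str() returns const char *, which does not convert implicitly to void *,
    // so a cast is required. A string literal such as "," has the same problem.
    handle.Write(const_cast<char *>(metadata_info.c_str()), metadata_info.length(), 0);
}
```

In this stand-in, the piecewise variant would have a second problem besides the cast: every call passes offset `0`, so each write would overwrite the previous one unless the offset were advanced. Building one string avoids both issues.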
Ok, fair enough
LGTM, thanks for the extra iteration!
For each cached file, we now create an additional metadata file that contains simple metadata. Added functions `duckdb.cache_info()` and `duckdb.cache_delete(cache_key TEXT)`, which are used for managing cached files.