Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support parsing tz timezone data #10456

Merged
merged 5 commits into from
Jan 2, 2022
Merged

Support parsing tz timezone data #10456

merged 5 commits into from
Jan 2, 2022

Conversation

Aransentin
Copy link
Contributor

@Aransentin Aransentin commented Dec 30, 2021

I added support for parsing tz timezone database files. This will be useful for future date support in the standard library.

Some caveats:
It does not support the legacy pre-2005 format. That one stored timestamps as 32-bit signed integers, and as such will stop working completely in 2038. It does support "thick" tz files that have both legacy and modern data in the same file, by just skipping over the legacy cruft.

I prepared the leap second data structure to fit into 8 bytes (instead of 16, with padding), which strictly speaking violates the spec but will never matter in practice; it can still represent leap seconds nine million years in the future.

The parser rejects all "MUST NOT" from rfc8536, and one SHOULD NOT - timezone type names (e.g. EST, CET...) cannot be more than 6 characters, which is banned by POSIX anyway.

@Aransentin Aransentin marked this pull request as ready for review December 30, 2021 22:10
Copy link
Contributor

@daurnimator daurnimator left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure you add .tzif to exclude_extensions in build.zig

FYI I wrote a tzfile parser in lua over at https://github.com/daurnimator/luatz/blob/master/luatz/tzfile.lua
I'd like to think its pretty easy to read.

lib/std/tz.zig Outdated Show resolved Hide resolved
lib/std/tz.zig Outdated Show resolved Hide resolved
lib/std/tz.zig Outdated
const version = data2[4];
if (!std.mem.eql(u8, magic, "TZif")) return error.BadHeader;
if (version != '2' and version != '3') return error.BadVersion;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should verify that it is followed by 15 nulls

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RFC8536 doesn't actually say that they should be null, just that they are reserved.

It's even implied that you shouldn't care about them - the spec depends on that legacy parsers that doesn't understand the "version" field not bailing when it sees a '2' or '3' in that byte, otherwise the "fat" version of the files wouldn't work.

lib/std/tz.zig Show resolved Hide resolved
@daurnimator daurnimator added the standard library This issue involves writing Zig code for the standard library. label Dec 31, 2021
lib/std/tz.zig Outdated Show resolved Hide resolved
lib/std/tz.zig Outdated Show resolved Hide resolved

pub const Timetype = struct {
offset: i32,
flags: u8,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why have this as flags instead of bool fields for each?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes Timetype pack better; 12 bytes instead of 16. I can change it if the optimization isn't worth the complexity of having getters for each value.

lib/std/tz.zig Outdated Show resolved Hide resolved
@Aransentin Aransentin requested a review from daurnimator January 1, 2022 15:19
Copy link
Contributor

@daurnimator daurnimator left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Though without algorithms that use the data it's all a bit useless right?
I assume you aim to have a followup PR in mind to make use of this code?

@Aransentin
Copy link
Contributor Author

Though without algorithms that use the data it's all a bit useless right? I assume you aim to have a followup PR in mind to make use of this code?

I want to write some Date/Calendar utilities, yeah. Even if that doesn't go anywhere splitting up the task into smaller subtasks (like the timezone parsing here) means some of the work will already be done for the next person who picks up the torch.

@andrewrk andrewrk merged commit 36b0699 into ziglang:master Jan 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
standard library This issue involves writing Zig code for the standard library.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants