Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CBL-5438: DateTime standard format parser #213

Closed
wants to merge 8 commits into from
Closed

Conversation

callumbirks
Copy link
Contributor

@callumbirks callumbirks commented Mar 11, 2024

A proper parser for DateTime formats, rather than just parsing format string as a DateTime.
This has benefits such as additional flexibility in future, but also allows us to keep consistent with the strange "almost-ISO" date format we shipped with CBL 3.1, because keeping format as a DateTime does not allow us to optionally use colon or no-colon for Timezone offset (i.e. +0500 / +05:00).
I added in parsing for a "subset" of https://howardhinnant.github.io/date/date.html#to_stream_formatting, but it's not necessary and if anyone thinks we don't need it, happy to take it out. Think it's a lot better than the weird "1111-11-11T11:11......` format we copied from Server though.

@callumbirks
Copy link
Contributor Author

See Core PR couchbase/couchbase-lite-core#1977

@callumbirks
Copy link
Contributor Author

First issue: Unsure whether CBL3.1 included milliseconds by default or not. I thought it didn't, so I removed milliseconds from the default format (found at DateFormat::kISO8601). However, this has broken some Fleece tests that don't expect to lose the milliseconds precision.

@borrrden
Copy link
Member

Are there tests for this somewhere? Could they be added to the Fleece tests?

Also I'm kind of unsure if this qualifies as feature creep or not since this is getting out of the scope of a binary encoding. @snej Do you have an opinion on this? This functionality needs to be somewhere accessible to Fleece obviously but is it ok to live next to fleece itself?

@callumbirks
Copy link
Contributor Author

I added a couple tests in Core. Probably does need more.

And I'm not sure. Previously we parse date format strings as DateTime, which restricts us severely. For example, we must use no-colon by default in Timezone specifier, but using DateTime as format we can't offer colon delimited Timezone. So as far as the parser goes, I don't see that the scope is any bigger than it already was (although it may have already had scope creep). However, the %-format is not strictly necessary (although much nicer).

@borrrden
Copy link
Member

Can those Core tests be moved here in some fashion and use the underlying classes directly?

@callumbirks
Copy link
Contributor Author

That would be a good idea... Having to run a Core PR pointing here to make sure nothing's broken over here isn't the best way to write software....

@callumbirks
Copy link
Contributor Author

@borrrden I've added some DateFormat tests. See SupportTests.cc.

Copy link
Contributor

@snej snej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Top-level comment is that this is functionality only needed by CBL N1QL functions, not Fleece itself, so I'd rather the code were put in LiteCore.

static slice format(char buf[], int64_t timestamp, std::chrono::minutes tzoffset,
std::optional<DateFormat> fmt);

static DateFormat kISO8601;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be const?


bool operator==(const YMD& other) const;

static YMD kISO8601;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be const?

};

// 1111-11-11T11:11:11(Z)
DateFormat(YMD ymd, Separator separator, HMS hms, const std::optional<Timezone> tz = {})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The const here doesn't do anything, since tz is not a pointer or reference

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But, it is essentially a variable declaration. Which (IMO) should be const if it isn't modified in the function.

explicit DateFormat(YMD ymd) : ymd{ymd} {}

// 11:11:11(Z)
explicit DateFormat(HMS hms, const std::optional<Timezone> tz = {}) : hms{hms}, tz{tz} {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another const that doesn't do anything

// %F == %Y-%m-%d
if ( formatString[1] == 'F' ) {
// ISO YMD, so we don't need to change `ymd`
formatString.setStart(formatString.begin() + 2);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

slice_istream would be pretty useful here, making the code simpler & safer

enum class Timezone : uint8_t { NoColon, Colon };
enum class Separator : char { Space = ' ', T = 'T' };

struct YMD {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks as though the only parts of YMD and HMS that are used by this PR are the separators. I understand wanting room to grow, but all the work of defining types and fields and structs that aren't used yet seems too much ... It seems like each of these structs could be reduced to the Separator enum for now, and it wouldn't be difficult to refactor them back to structs in the future if/when we need it.

Copy link
Contributor Author

@callumbirks callumbirks Mar 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah... I laid out the types before I'd written the functionality, and before I realized how restrictive we have to be. It started out with a decent number of variants in a few of these types, but they were removed as I progressed along building up the format we actually allow. I've stripped these out now, in the Core PR.

@borrrden
Copy link
Member

I believe the counterpoint to "this is only used in LiteCore" is "we were using some of this for a specific function in here which writes a datetime to a stream". In that case at least some of this needs to be accessible here which makes breaking it up somewhat of a pain.

Copy link
Member

@borrrden borrrden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jens points out some nice feedback so I'll +1 the code parts that he mentioned.

@callumbirks
Copy link
Contributor Author

Closing. Moved DateFormat to Core. See couchbase/couchbase-lite-core#1977

@callumbirks
Copy link
Contributor Author

callumbirks commented Mar 21, 2024

@borrrden Regarding

some of this needs to be accessible here

For now I'm just going along with the duplicated code. Jens said he expected some duplicated code around this, and would rather this not be in Fleece. I definitely agree it doesn't really belong in Fleece, but I think the only solution which doesn't involve duplicated code would be adding a whole new dependency. Which I feel would be more pain than it's worth. It would involve ripping out basically all Date-related functionality from Fleece, and requiring it depend on this dependency.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants