Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix millisecond fraction being handled with wrong scale #65

Merged
merged 7 commits into from
Jun 26, 2024
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 11 additions & 7 deletions src/date.rs
Original file line number Diff line number Diff line change
Expand Up @@ -56,7 +56,7 @@ impl FromStr for Date {

// 2e10 if greater than this, the number is in ms, if less than or equal, it's in seconds
// (in seconds this is 11th October 2603, in ms it's 20th August 1970)
const MS_WATERSHED: i64 = 20_000_000_000;
pub(crate) const MS_WATERSHED: i64 = 20_000_000_000;
// 1600-01-01 as a unix timestamp used for from_timestamp below
const UNIX_1600: i64 = -11_676_096_000;
// 9999-12-31T23:59:59 as a unix timestamp, used as max allowed value below
Expand Down Expand Up @@ -206,9 +206,13 @@ impl Date {
/// assert_eq!(d.to_string(), "2022-06-07");
/// ```
pub fn from_timestamp(timestamp: i64, require_exact: bool) -> Result<Self, ParseError> {
let (timestamp_second, _) = Self::timestamp_watershed(timestamp)?;
let (timestamp_second, millis) = Self::timestamp_watershed(timestamp)?;
let d = Self::from_timestamp_calc(timestamp_second)?;
if require_exact {
if millis != 0 {
return Err(ParseError::DateNotExact);
}

davidhewitt marked this conversation as resolved.
Show resolved Hide resolved
let time_second = timestamp_second.rem_euclid(86_400);
if time_second != 0 {
return Err(ParseError::DateNotExact);
Expand Down Expand Up @@ -275,11 +279,11 @@ impl Date {

pub(crate) fn timestamp_watershed(timestamp: i64) -> Result<(i64, u32), ParseError> {
let ts_abs = timestamp.checked_abs().ok_or(ParseError::DateTooSmall)?;
let (mut seconds, mut microseconds) = if ts_abs > MS_WATERSHED {
(timestamp / 1_000, timestamp % 1_000 * 1000)
} else {
(timestamp, 0)
};
if ts_abs <= MS_WATERSHED {
return Ok((timestamp, 0));
}
let mut seconds = timestamp / 1_000;
let mut microseconds = ((timestamp % 1_000) * 1000) as i32;
if microseconds < 0 {
seconds -= 1;
microseconds += 1_000_000;
Expand Down
40 changes: 31 additions & 9 deletions src/datetime.rs
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
use crate::numbers::{float_parse_bytes, IntFloat};
use crate::TimeConfigBuilder;
use crate::date::MS_WATERSHED;
use crate::{int_parse_bytes, TimeConfigBuilder};
use crate::{time::TimeConfig, Date, ParseError, Time};
use std::cmp::Ordering;
use std::fmt;
Expand Down Expand Up @@ -339,14 +339,36 @@ impl DateTime {
pub fn parse_bytes_with_config(bytes: &[u8], config: &TimeConfig) -> Result<Self, ParseError> {
match Self::parse_bytes_rfc3339_with_config(bytes, config) {
Ok(d) => Ok(d),
Err(e) => match float_parse_bytes(bytes) {
IntFloat::Int(int) => Self::from_timestamp_with_config(int, 0, config),
IntFloat::Float(float) => {
let micro = (float.fract() * 1_000_000_f64).round() as u32;
Self::from_timestamp_with_config(float.floor() as i64, micro, config)
Err(e) => {
let mut split = bytes.splitn(2, |&b| b == b'.');
let Some(timestamp) =
int_parse_bytes(split.next().expect("splitn always returns at least one element"))
else {
return Err(e);
};
let float_fraction = split.next();
debug_assert!(split.next().is_none()); // at most two elements
match float_fraction {
Some(fract) => {
// fraction is either:
// - up to 3 digits of millisecond fractions, i.e. microseconds
// - or up to 6 digits of second fractions, i.e. milliseconds
let max_digits = if timestamp > MS_WATERSHED { 3 } else { 6 };
let Some(fract_integers) = int_parse_bytes(fract) else {
return Err(e);
};
let multiple = 10f64.powf(max_digits as f64 - fract.len() as f64);
Comment on lines +360 to +362
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for fixing my issue (#61). But I wonder about this strategy here, reusing an integer parsing function for a fractional part:

  • does this handle leading zeros correctly? (I think yes, thanks to the $10^d$ scaling trick.)
  • does this handle other aspects of integer notation correctly, e.g. signs? (I think not.)
  • does this handle long fractions if the "precision overflow" check isn't used? (I think yes, thanks to the $10^d$ scaling trick.)

As an example, consider an input like 1711445175.+07186 which should probably be invalid, not parsed as if the + were a zero.

Perhaps the i = int_parse_bytes(fract) function could be copied as a (i, magnitude) = fract_parse_bytes(fract) function, where the magnitude would be safer than relying on fract.len()?

Self::from_timestamp_with_config(
timestamp,
// FIXME should we error if the fraction is too long?
// We have TimeConfig truncate / error option.
(fract_integers as f64 * multiple).round() as u32,
config,
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say if we use a max_digits pattern above, it seems intuitive to then error if the number of decimal digits violates that standard.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a behavioural change but I think correct given the spirit of the config.

}
None => Self::from_timestamp_with_config(timestamp, 0, config),
}
IntFloat::Err => Err(e),
},
}
}
}

Expand Down
15 changes: 14 additions & 1 deletion tests/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -230,7 +230,7 @@ fn date_comparison() {
}

#[test]
fn date_timestamp() {
fn date_timestamp_exact() {
let d = Date::from_timestamp(1_654_560_000, true).unwrap();
assert_eq!(d.to_string(), "2022-06-07");
assert_eq!(d.timestamp(), 1_654_560_000);
Expand All @@ -239,6 +239,16 @@ fn date_timestamp() {
Ok(d) => panic!("unexpectedly valid, {d}"),
Err(e) => assert_eq!(e, ParseError::DateNotExact),
}

// milliseconds
let d = Date::from_timestamp(1_654_560_000_000, true).unwrap();
assert_eq!(d.to_string(), "2022-06-07");
assert_eq!(d.timestamp(), 1_654_560_000);

match Date::from_timestamp(1_654_560_000_001, true) {
Ok(d) => panic!("unexpectedly valid, {d}"),
Err(e) => assert_eq!(e, ParseError::DateNotExact),
}
}

macro_rules! date_from_timestamp {
Expand Down Expand Up @@ -854,6 +864,9 @@ param_tests! {
dt_unix1: ok => "1654646400", "2022-06-08T00:00:00";
dt_unix2: ok => "1654646404", "2022-06-08T00:00:04";
dt_unix_float: ok => "1654646404.5", "2022-06-08T00:00:04.500000";
dt_unix_float_limit: ok => "1654646404.123456", "2022-06-08T00:00:04.123456";
dt_unix_float_ms: ok => "1654646404000.5", "2022-06-08T00:00:04.000500";
dt_unix_float_ms_limit: ok => "1654646404123.456", "2022-06-08T00:00:04.123456";
dt_short_date: err => "xxx", TooShort;
dt_short_time: err => "2020-01-01T12:0", TooShort;
dt: err => "202x-01-01", InvalidCharYear;
Expand Down
Loading