Update the documentation for parsing integers.
Alexhuszagh committed Jan 5, 2025
1 parent c836054 commit 54835ce
Showing 12 changed files with 520 additions and 405 deletions.
534 changes: 237 additions & 297 deletions lexical-parse-float/src/options.rs

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions lexical-parse-float/tests/options_tests.rs
@@ -111,6 +111,7 @@ fn builder_test() {
}

#[test]
#[allow(deprecated)]
fn options_test() {
let mut opts = Options::new();

3 changes: 0 additions & 3 deletions lexical-parse-integer/Cargo.toml
@@ -18,9 +18,6 @@ exclude = [
"cargo-timing*.html"
]

[dependencies]
static_assertions = "1"

[dependencies.lexical-util]
version = "1.0.5"
path = "../lexical-util"
148 changes: 129 additions & 19 deletions lexical-parse-integer/src/lib.rs
@@ -1,5 +1,111 @@
//! Fast lexical string-to-integer conversion routines.
//!
//! This contains high-performance methods to parse integers from bytes.
//! Using [`from_lexical`] is analogous to [`parse`][`core-parse`],
//! but it parses from bytes as well as [`str`].
//!
//! [`from_lexical`]: FromLexical::from_lexical
//! [`core-parse`]: core::str::FromStr
//!
//! # Getting Started
//!
//! To parse a number from bytes, use [`from_lexical`]:
//!
//! ```rust
//! # #[no_std]
//! # use core::str;
//! use lexical_parse_integer::FromLexical;
//!
//! let value = u64::from_lexical("1234".as_bytes());
//! assert_eq!(value, Ok(1234));
//! ```
//!
//! # Features
//!
//! * `std` - Disable for use in a [`no_std`] environment.
//! * `power-of-two` - Add support for parsing power-of-two integer strings.
//! * `radix` - Add support for strings of any radix.
//! * `format` - Add support for parsing custom integer formats.
//! * `compact` - Reduce code size at the cost of performance.
//!
//! [`no_std`]: https://docs.rust-embedded.org/book/intro/no-std.html
//!
//! #### power-of-two
//!
//! Enable parsing numbers in radixes that are powers of two, that is, `2`,
//! `4`, `8`, `16`, and `32`.
//!
//! ```rust
//! # #[no_std]
//! # #[cfg(feature = "power-of-two")] {
//! # use core::str;
//! use lexical_parse_integer::{FromLexicalWithOptions, NumberFormatBuilder, Options};
//!
//! const BINARY: u128 = NumberFormatBuilder::binary();
//! let options = Options::new();
//! let value = u64::from_lexical_with_options::<BINARY>("10011010010".as_bytes(), &options);
//! assert_eq!(value, Ok(1234));
//! # }
//! ```
//!
//! #### radix
//!
//! Enable parsing numbers using all radixes from 2-36. This requires more
//! static storage than [`power-of-two`][crate#power-of-two], and increases
//! compile times, but can be quite useful for esoteric programming languages
//! which use duodecimal integers.
//!
//! ```rust
//! # #[no_std]
//! # #[cfg(feature = "radix")] {
//! # use core::str;
//! use lexical_parse_integer::{FromLexicalWithOptions, NumberFormatBuilder, Options};
//!
//! const DUODECIMAL: u128 = NumberFormatBuilder::from_radix(12);
//! let options = Options::new();
//! let value = u64::from_lexical_with_options::<DUODECIMAL>("86A".as_bytes(), &options);
//! assert_eq!(value, Ok(1234));
//! # }
//! ```
//!
//! #### compact
//!
//! Reduce the generated code size at the cost of performance. This minimizes
//! the number of static tables, inlining, and generics used, drastically
//! reducing the size of the generated binaries. However, the resulting
//! performance of the generated code is much lower.
//!
//! #### format
//!
//! Add support for custom integer formatting specifications. This should be used in
//! conjunction with [`Options`] for extensible integer parsing. This allows
//! changing the use of digit separators, requiring or not allowing signs, and
//! more. For a list of all supported fields, see [Parse Integer
//! Fields][NumberFormatBuilder#parse-integer-fields].
//!
//! ```rust
//! # #[cfg(feature = "format")] {
//! # use core::{num, str};
//! use lexical_parse_integer::{NumberFormatBuilder, Options, FromLexicalWithOptions};
//!
//! const FORMAT: u128 = NumberFormatBuilder::new()
//!     .required_mantissa_sign(true)
//!     .integer_internal_digit_separator(true)
//!     .digit_separator(num::NonZeroU8::new(b'_'))
//!     .base_prefix(num::NonZeroU8::new(b'd'))
//!     .build_strict();
//! let options = Options::new();
//!
//! let value = u64::from_lexical_with_options::<FORMAT>("+12_3_4".as_bytes(), &options);
//! assert_eq!(value, Ok(1234));
//!
//! let value = u64::from_lexical_with_options::<FORMAT>("+0d12_3_4".as_bytes(), &options);
//! assert_eq!(value, Ok(1234));
//! # }
//! ```
//!
//! # Algorithm
//!
//! The default implementations are highly optimized both for simple
//! strings, as well as input with large numbers of digits. In order to
//! keep performance optimal for simple strings, we avoid overly branching
@@ -18,30 +124,34 @@
//! unnecessary branching and produces smaller binaries, but comes
//! at a significant performance penalty for integers with more digits.
//!
//! # Features
//! To optimize for smaller integers at the expense of performance of larger
//! ones, you can use [`OptionsBuilder::no_multi_digit`].
//!
//! * `std` - Disable for use in a [`no_std`] environment.
//! * `power-of-two` - Add support for parsing power-of-two integer strings.
//! * `radix` - Add support for strings of any radix.
//! * `format` - Add support for parsing custom integer formats.
//! * `compact` - Reduce code size at the cost of performance.
//! ```rust
//! # use core::{num, str};
//! use lexical_parse_integer::{NumberFormatBuilder, Options, FromLexicalWithOptions};
//!
//! [`no_std`]: https://docs.rust-embedded.org/book/intro/no-std.html
//! const FORMAT: u128 = NumberFormatBuilder::new().build_strict();
//! let options = Options::builder()
//!     .no_multi_digit(true)
//!     .build_strict();
//!
//! // a bit faster
//! let value = u64::from_lexical_with_options::<FORMAT>(b"12", &options);
//! assert_eq!(value, Ok(12));
//!
//! // a lot slower
//! let value = u64::from_lexical_with_options::<FORMAT>(b"18446744073709551615", &options);
//! assert_eq!(value, Ok(0xffffffffffffffff));
//! ```
//!
//! # Note
//! # Higher-Level APIs
//!
//! Only documented functionality is considered part of the public API:
//! any of the modules, internal functions, or structs may change
//! release-to-release without major or minor version changes. Use
//! internal implementation details at your own risk.
//! If you would like an API that supports multiple numeric conversions rather
//! than just parsing integers, use [`lexical`] or [`lexical-core`] instead.
//!
//! lexical-parse-integer mainly exists as an implementation detail for
//! lexical-core, although its API is stable. If you would like to use
//! a high-level API that writes to and parses from `String` and `&str`,
//! respectively, please look at [lexical](https://crates.io/crates/lexical)
//! instead. If you would like an API that supports multiple numeric
//! conversions, please look at [lexical-core](https://crates.io/crates/lexical-core)
//! instead.
//! [`lexical`]: https://crates.io/crates/lexical
//! [`lexical-core`]: https://crates.io/crates/lexical-core
//!
//! # Version Support
//!
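The `# Algorithm` notes in `lexical-parse-integer/src/lib.rs` contrast multi-digit optimizations with a simpler loop that handles one digit at a time. As a rough, std-only sketch of that simpler strategy (not lexical's actual implementation), the hypothetical helper `parse_radix` below accumulates one digit per iteration with overflow checks. The name, signature, and `Option` return type are assumptions made for illustration.

```rust
/// Hypothetical helper, not part of lexical: parse an unsigned integer
/// from ASCII bytes one digit at a time in the given radix (2 to 36).
pub fn parse_radix(bytes: &[u8], radix: u32) -> Option<u64> {
    // Reject unsupported radixes and empty input before touching any digit.
    if !(2..=36).contains(&radix) || bytes.is_empty() {
        return None;
    }
    let mut value: u64 = 0;
    for &byte in bytes {
        // Map the byte to its digit value; `None` means an invalid digit.
        let digit = (byte as char).to_digit(radix)?;
        // Shift the accumulator by one digit, failing on overflow.
        value = value.checked_mul(u64::from(radix))?;
        // Add the new digit, again failing on overflow.
        value = value.checked_add(u64::from(digit))?;
    }
    Some(value)
}
```

Under these assumptions, `parse_radix(b"86A", 12)` and `parse_radix(b"10011010010", 2)` both yield `Some(1234)`, matching the values used in the doc examples above. Every digit costs one checked multiply and one checked add, which is the per-digit overhead that multi-digit strategies aim to reduce for long inputs.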
46 changes: 29 additions & 17 deletions lexical-parse-integer/src/options.rs
@@ -1,8 +1,19 @@
//! Configuration options for parsing integers.
//!
//! # Examples
//!
//! ```rust
//! use lexical_parse_integer::options::Options;
//!
//! # pub fn main() {
//! let options = Options::builder()
//! .no_multi_digit(true)
//! .build_strict();
//! # }
//! ```
use lexical_util::options::ParseOptions;
use lexical_util::result::Result;
use static_assertions::const_assert;

/// Builder for [`Options`].
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
@@ -45,7 +56,7 @@ impl OptionsBuilder {

// BUILDERS

/// Check if the builder state is valid.
/// Check if the builder state is valid (always [`true`]).
#[inline(always)]
pub const fn is_valid(&self) -> bool {
true
@@ -59,11 +70,7 @@ impl OptionsBuilder {
}
}

/// Build the [`Options`] struct, panicking if the builder is invalid.
///
/// # Panics
///
/// If the built options are not valid.
/// Build the [`Options`] struct. This can never panic.
#[inline(always)]
pub const fn build_strict(&self) -> Options {
match self.build() {
@@ -72,7 +79,7 @@ impl OptionsBuilder {
}
}

/// Build the [`Options`] struct.
/// Build the [`Options`] struct. Always [`Ok`].
#[inline(always)]
pub const fn build(&self) -> Result<Options> {
Ok(self.build_unchecked())
@@ -86,7 +93,7 @@ impl Default for OptionsBuilder {
}
}

/// Immutable options to customize writing integers.
/// Immutable options to customize parsing integers.
///
/// # Examples
///
@@ -120,7 +127,7 @@ impl Options {

// GETTERS

/// Check if the options state is valid.
/// Check if the options state is valid (always [`true`]).
#[inline(always)]
pub const fn is_valid(&self) -> bool {
self.rebuild().is_valid()
@@ -135,11 +142,19 @@ impl Options {
// SETTERS

/// Set if we disable the use of multi-digit optimizations.
#[deprecated = "Setters should have a `set_` prefix. Use `set_no_multi_digit` instead. Will be removed in 2.0."]
#[inline(always)]
pub fn no_multi_digit(&mut self, no_multi_digit: bool) {
self.no_multi_digit = no_multi_digit;
}

/// Set if we disable the use of multi-digit optimizations.
#[deprecated = "Options should be treated as immutable, use `OptionsBuilder` instead. Will be removed in 2.0."]
#[inline(always)]
pub fn set_no_multi_digit(&mut self, no_multi_digit: bool) {
self.no_multi_digit = no_multi_digit;
}

// BUILDERS

/// Get [`OptionsBuilder`] as a static function.
@@ -177,18 +192,15 @@ impl ParseOptions for Options {
/// Standard number format.
#[rustfmt::skip]
pub const STANDARD: Options = Options::new();
const_assert!(STANDARD.is_valid());

/// Options optimized for small numbers.
#[rustfmt::skip]
pub const SMALL_NUMBERS: Options = Options::builder()
.no_multi_digit(true)
.build_unchecked();
const_assert!(SMALL_NUMBERS.is_valid());
.no_multi_digit(true)
.build_strict();

/// Options optimized for large numbers and long strings.
#[rustfmt::skip]
pub const LARGE_NUMBERS: Options = Options::builder()
.no_multi_digit(false)
.build_unchecked();
const_assert!(LARGE_NUMBERS.is_valid());
.no_multi_digit(false)
.build_strict();
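One way to read the move from `const_assert!` plus `build_unchecked` to `build_strict` in the constants above: a panic during constant evaluation is a compile-time error, so a `const` built through a panicking builder is validated without a separate assertion. A minimal std-only sketch of that pattern follows. `DemoOptions`, `DemoOptionsBuilder`, and the error string are hypothetical stand-ins, not lexical's types.

```rust
/// Hypothetical options struct with a single flag (not lexical's `Options`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DemoOptions {
    no_multi_digit: bool,
}

impl DemoOptions {
    /// Read back the flag stored in the options.
    pub const fn get_no_multi_digit(&self) -> bool {
        self.no_multi_digit
    }
}

/// Hypothetical builder mirroring the shape of the builder in the diff.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DemoOptionsBuilder {
    no_multi_digit: bool,
}

impl DemoOptionsBuilder {
    /// Start from the default configuration.
    pub const fn new() -> Self {
        Self { no_multi_digit: false }
    }

    /// Set the flag and hand the builder back for chaining.
    pub const fn no_multi_digit(mut self, value: bool) -> Self {
        self.no_multi_digit = value;
        self
    }

    /// Every state of this builder is valid, so this is always `true`.
    pub const fn is_valid(&self) -> bool {
        true
    }

    /// Fallible build; with the check above this is always `Ok`.
    pub const fn build(&self) -> Result<DemoOptions, &'static str> {
        if self.is_valid() {
            Ok(DemoOptions { no_multi_digit: self.no_multi_digit })
        } else {
            Err("invalid builder state")
        }
    }

    /// Panicking build; in a `const` context a panic becomes a compile error.
    pub const fn build_strict(&self) -> DemoOptions {
        match self.build() {
            Ok(options) => options,
            Err(_) => panic!("invalid builder state"),
        }
    }
}

/// Evaluated at compile time, so an invalid builder would fail the build.
pub const SMALL_NUMBERS: DemoOptions = DemoOptionsBuilder::new()
    .no_multi_digit(true)
    .build_strict();
```

In this sketch the validity check cannot fail, which matches the updated doc comments stating that `build` is always `Ok` and `build_strict` can never panic.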
16 changes: 16 additions & 0 deletions lexical-util/src/format_builder.rs
@@ -518,12 +518,20 @@ impl NumberFormatBuilder {
}

/// Get the optional character for the base prefix.
///
/// This character will come after a leading zero, so for example
/// setting the base prefix to `x` means that a leading `0x` will
/// be ignored, if present.
#[inline(always)]
pub const fn get_base_prefix(&self) -> OptionU8 {
self.base_prefix
}

/// Get the optional character for the base suffix.
///
/// This character will be at the end of the buffer, so for example
/// setting the base suffix to `x` means that a trailing `x` will
/// be ignored, if present.
#[inline(always)]
pub const fn get_base_suffix(&self) -> OptionU8 {
self.base_suffix
@@ -792,6 +800,10 @@ impl NumberFormatBuilder {
}

/// Set the optional character for the base prefix.
///
/// This character will come after a leading zero, so for example
/// setting the base prefix to `x` means that a leading `0x` will
/// be ignored, if present.
#[inline(always)]
#[cfg(all(feature = "power-of-two", feature = "format"))]
#[cfg_attr(docsrs, doc(cfg(all(feature = "power-of-two", feature = "format"))))]
@@ -801,6 +813,10 @@ impl NumberFormatBuilder {
}

/// Set the optional character for the base suffix.
///
/// This character will be at the end of the buffer, so for example
/// setting the base suffix to `x` means that a trailing `x` will
/// be ignored, if present.
#[inline(always)]
#[cfg(all(feature = "power-of-two", feature = "format"))]
#[cfg_attr(docsrs, doc(cfg(all(feature = "power-of-two", feature = "format"))))]
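The base prefix and suffix doc comments above describe markers that are skipped when present. A small std-only sketch of that behavior is shown below, assuming case-sensitive matching and that the markers are optional. The helpers `strip_base_prefix`, `strip_base_suffix`, and `parse_hex` are hypothetical and do not reflect how lexical implements this internally.

```rust
/// Hypothetical helper: drop a leading `0<prefix>` marker (for example `0x`)
/// if present, otherwise return the input unchanged.
pub fn strip_base_prefix(bytes: &[u8], prefix: u8) -> &[u8] {
    match bytes {
        // The prefix character only counts when it follows a leading zero.
        [b'0', p, rest @ ..] if *p == prefix => rest,
        _ => bytes,
    }
}

/// Hypothetical helper: drop a trailing suffix character if present,
/// otherwise return the input unchanged.
pub fn strip_base_suffix(bytes: &[u8], suffix: u8) -> &[u8] {
    match bytes {
        // The suffix character only counts at the very end of the buffer.
        [rest @ .., s] if *s == suffix => rest,
        _ => bytes,
    }
}

/// Parse hexadecimal digits with an optional `0x` prefix using only `std`.
pub fn parse_hex(bytes: &[u8]) -> Option<u64> {
    let digits = strip_base_prefix(bytes, b'x');
    // `from_str_radix` works on `str`, so validate the bytes as UTF-8 first.
    let text = std::str::from_utf8(digits).ok()?;
    u64::from_str_radix(text, 16).ok()
}
```

With these assumptions, `parse_hex(b"0x4D2")` and `parse_hex(b"4D2")` both produce `Some(1234)`, since the prefix is ignored only when it is actually present.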
1 change: 1 addition & 0 deletions lexical-util/src/options.rs
@@ -49,6 +49,7 @@ pub trait WriteOptions: Default {
///
/// Using `buffer_size_const` lets you create static arrays at compile time,
/// rather than dynamically-allocate memory or know the value ahead of time.
#[deprecated = "Use `buffer_size_const` instead. Will be removed in 2.0."]
fn buffer_size<T: FormattedSize, const FORMAT: u128>(&self) -> usize;
}

17 changes: 2 additions & 15 deletions lexical-write-float/src/lib.rs
@@ -15,7 +15,7 @@
//!
//! # Getting Started
//!
//! To serialize a number to string, use [`to_lexical`]:
//! To write a number to bytes, use [`to_lexical`]:
//!
//! [`to_lexical`]: ToLexical::to_lexical
//!
@@ -165,7 +165,6 @@
//! more. For a list of all supported fields, see [Write Float
//! Fields][NumberFormatBuilder#write-float-fields].
//!
//!
//! ```rust
//! # #[cfg(feature = "radix")] {
//! # use core::str;
@@ -225,19 +224,7 @@
//! The radix algorithm is adapted from the V8 codebase, and may be found
//! [here](https://github.com/v8/v8).
//!
//! # Public API
//!
//! <div class="warning">
//!
//! Only documented functionality is considered part of the public API:
//! any of the modules, internal functions, or structs may change
//! release-to-release without major or minor version changes. Use
//! internal implementation details at your own risk.
//!
//! The documented API, however, follows semantic versioning and is stable,
//! particularly the [`ToLexical`] and [`ToLexicalWithOptions`] traits,
//! as well as [`NumberFormatBuilder`] and [`Options`].
//! </div>
//! # Higher-Level APIs
//!
//! If you would like to use a high-level API that writes to [`String`],
//! please look at [`lexical`] instead. If you would like