Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(strings): make strings compatible with compact list #2002

Merged
merged 1 commit into from
Feb 14, 2025

Conversation

tmontaigu
Copy link
Contributor

@tmontaigu tmontaigu commented Jan 24, 2025

closes: please link all relevant issues

PR content/description

Check-list:

  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been added / updated (for bug fixes / features)
  • Relevant issues are marked as resolved/closed, related issues are linked in the description
  • Check for breaking changes (including serialization changes) and add them to commit message following the conventional commit specification

This change is Reviewable

@cla-bot cla-bot bot added the cla-signed label Jan 24, 2025
@tmontaigu tmontaigu force-pushed the tm/strings-in-compact branch 3 times, most recently from deef0a6 to fe88827 Compare January 27, 2025 16:19
@tmontaigu tmontaigu requested a review from mayeul-zama January 29, 2025 10:51
@tmontaigu tmontaigu force-pushed the tm/strings-in-compact branch from fe88827 to 56ebefa Compare February 6, 2025 13:51
Copy link
Contributor

@mayeul-zama mayeul-zama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 12 of 13 files at r1, 3 of 4 files at r2, all commit messages.
Reviewable status: 11 of 13 files reviewed, 9 unresolved discussions (waiting on @tmontaigu)


tfhe/src/strings/ciphertext.rs line 78 at r1 (raw file):

        let n_chars = num_blocks.map_or(self.str.len(), |n_blocks| {
            n_blocks / blocks_per_char as usize
        });

This function could take padded_size: Option<usize> instead of num_blocks: Option<usize> (with n_chars = padded_size.unwrap_or(self.str.len()) here instead)
It would be more ergonomic IMO

Code quote:

        let n_chars = num_blocks.map_or(self.str.len(), |n_blocks| {
            n_blocks / blocks_per_char as usize
        });

tfhe/src/strings/ciphertext.rs line 96 at r1 (raw file):

impl crate::integer::ciphertext::CompactCiphertextListBuilder {
    pub fn push_with_padding(

Could be named push_string_with_padding


tfhe/src/strings/ciphertext.rs line 99 at r1 (raw file):

        &mut self,
        clear_string: &ClearString,
        padding_count: u32,

It seems more ergonomic to me to have a size_after_padding instead of padding_count
That could prevent users from using a single padding_count which would make the size predicatble


tfhe/src/strings/ciphertext.rs line 103 at r1 (raw file):

        let message_modulus = self.pk.key.message_modulus();
        let blocks_per_char = 7u32.div_ceil(message_modulus.0.ilog2());
        let _unpadded_kind = clear_string.compact_into(&mut self.messages, message_modulus, None);

compact_into is always called with None, is that correct?
Shouldn't we let this function add the padding data?


tfhe/src/error.rs line 37 at r1 (raw file):

#[cfg(feature = "integer")]
pub(crate) use error;

Could be added in a separate commit

Code quote:

#[cfg(feature = "integer")]
macro_rules! error{
    ($($arg:tt)*) => {
        $crate::error::Error::new(::std::format!($($arg)*))
    }
}

#[cfg(feature = "integer")]
pub(crate) use error;

tfhe/src/integer/ciphertext/compact_list.rs line 32 at r1 (raw file):

use tfhe_versionable::Versionize;

/// Unpack message and carries and additionally sanitizes blocks

The comment could explain what happens in boolean and string cases


tfhe/src/integer/ciphertext/compact_list.rs line 492 at r1 (raw file):

                    for _ in 0..*n_chars {
                        push_functions(
                            blocks_per_char.saturating_sub(1) as usize,

It seems to me that blocks_per_char can never be zero
Which means the saturating is unnecessary

Applies to other places


tfhe/src/integer/ciphertext/compact_list.rs line 520 at r1 (raw file):

            .map(|x| x.num_blocks(self.message_modulus))
            .sum();
        let mut functions = Vec::with_capacity(total_block_count);

Using the same pattern (Vec of empty Vecs which we individually push to in the loop) as done in function generate_sanitize_without_unpacking_functions would help understanding
Maybe it's possible to share some code between the 2 functions

Also applies to generate_sanitize_without_unpacking_luts and generate_unpacked_and_sanitize_luts

@tmontaigu
Copy link
Contributor Author

tfhe/src/strings/ciphertext.rs line 99 at r1 (raw file):

Previously, mayeul-zama wrote…

It seems more ergonomic to me to have a size_after_padding instead of padding_count
That could prevent users from using a single padding_count which would make the size predicatble

It also seems better to me (which is why is hlapi, there's en encrypt_with_fixed_size), but in string API, the encrypt function accepts a padding count, so to stay consistent,push_string_with_padding should have the same behaviour

We could maybe have a push_string_with_fixed_size

@tmontaigu tmontaigu force-pushed the tm/strings-in-compact branch from 56ebefa to 3ed4f96 Compare February 13, 2025 11:59
Copy link
Contributor Author

@tmontaigu tmontaigu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 11 of 13 files reviewed, 8 unresolved discussions (waiting on @mayeul-zama)


tfhe/src/integer/ciphertext/compact_list.rs line 32 at r1 (raw file):

Previously, mayeul-zama wrote…

The comment could explain what happens in boolean and string cases

Done.


tfhe/src/integer/ciphertext/compact_list.rs line 492 at r1 (raw file):

Previously, mayeul-zama wrote…

It seems to me that blocks_per_char can never be zero
Which means the saturating is unnecessary

Applies to other places

Done.


tfhe/src/strings/ciphertext.rs line 78 at r1 (raw file):

Previously, mayeul-zama wrote…

This function could take padded_size: Option<usize> instead of num_blocks: Option<usize> (with n_chars = padded_size.unwrap_or(self.str.len()) here instead)
It would be more ergonomic IMO

Hum we can't really do that as it's a trait we are implementing, and the trait passes a number of blocks

I updated this so that the function pads or truncates depending on the requested number of blocks


tfhe/src/strings/ciphertext.rs line 96 at r1 (raw file):

Previously, mayeul-zama wrote…

Could be named push_string_with_padding

Done.


tfhe/src/strings/ciphertext.rs line 103 at r1 (raw file):

Previously, mayeul-zama wrote…

compact_into is always called with None, is that correct?
Shouldn't we let this function add the padding data?

Yes because the optional number of blocks is not related to padding with the traits was originaly designed

Copy link
Contributor

@mayeul-zama mayeul-zama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Thanks!

Reviewed 1 of 4 files at r2, 3 of 5 files at r3, all commit messages.
Reviewable status: 11 of 13 files reviewed, 2 unresolved discussions (waiting on @tmontaigu)


tfhe/src/strings/ciphertext.rs line 99 at r1 (raw file):

Previously, tmontaigu (tmontaigu) wrote…

It also seems better to me (which is why is hlapi, there's en encrypt_with_fixed_size), but in string API, the encrypt function accepts a padding count, so to stay consistent,push_string_with_padding should have the same behaviour

We could maybe have a push_string_with_fixed_size

Created issue https://github.com/zama-ai/tfhe-rs-internal/issues/919

@tmontaigu tmontaigu merged commit cda43fd into main Feb 14, 2025
103 of 106 checks passed
@tmontaigu tmontaigu deleted the tm/strings-in-compact branch February 14, 2025 09:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants