aes: autodetection support for AES-NI #208

Merged
merged 1 commit into from
Dec 2, 2020


8 changes: 6 additions & 2 deletions .github/workflows/aes.yml
@@ -39,7 +39,8 @@ jobs:
- run: cargo build --release --target ${{ matrix.target }}
- run: cargo build --release --target ${{ matrix.target }} --features compact
- run: cargo build --release --target ${{ matrix.target }} --features ctr
- run: cargo build --release --target ${{ matrix.target }} --features compact,ctr
- run: cargo build --release --target ${{ matrix.target }} --features force-soft
- run: cargo build --release --target ${{ matrix.target }} --all-features

# Tests for the portable software backend
soft:
@@ -73,6 +74,7 @@ jobs:
- run: cargo test --release --target ${{ matrix.target }}
- run: cargo test --release --target ${{ matrix.target }} --features compact
- run: cargo test --release --target ${{ matrix.target }} --features ctr
- run: cargo test --release --target ${{ matrix.target }} --features force-soft
- run: cargo test --release --target ${{ matrix.target }} --all-features

# Tests for the AES-NI backend
@@ -111,6 +113,7 @@ jobs:
- run: cargo test --release --target ${{ matrix.target }}
- run: cargo test --release --target ${{ matrix.target }} --features compact
- run: cargo test --release --target ${{ matrix.target }} --features ctr
- run: cargo test --release --target ${{ matrix.target }} --features force-soft
- run: cargo test --release --target ${{ matrix.target }} --all-features

# Cross-compiled tests
@@ -144,4 +147,5 @@ jobs:
- run: cross test --release --target ${{ matrix.target }}
- run: cross test --release --target ${{ matrix.target }} --features compact
- run: cross test --release --target ${{ matrix.target }} --features ctr
- run: cross test --release --target ${{ matrix.target }} --features compact,ctr
- run: cross test --release --target ${{ matrix.target }} --features force-soft
- run: cross test --release --target ${{ matrix.target }} --all-features
7 changes: 7 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 6 additions & 2 deletions aes/Cargo.toml
@@ -23,9 +23,13 @@ opaque-debug = "0.3"
[dev-dependencies]
cipher = { version = "=0.3.0-pre", features = ["dev"] }

[target.'cfg(any(target_arch = "x86_64", target_arch = "x86"))'.dependencies]
cpuid-bool = "0.2"

[features]
compact = [] # Reduce code size at the cost of performance
compact = [] # Reduce code size at the cost of slower performance
force-soft = [] # Disable support for AES hardware intrinsics

[package.metadata.docs.rs]
all-features = true
features = ["ctr"]
rustdoc-args = ["--cfg", "docsrs"]
201 changes: 201 additions & 0 deletions aes/src/autodetect.rs
@@ -0,0 +1,201 @@
//! Autodetection support for hardware accelerated AES backends with fallback
//! to the fixsliced "soft" implementation.

use crate::{Block, ParBlocks};
use cipher::{
    consts::{U16, U24, U32, U8},
    generic_array::GenericArray,
    BlockCipher, BlockDecrypt, BlockEncrypt, NewBlockCipher,
};

cpuid_bool::new!(aes_cpuid, "aes");
Member

We either need to add ssse3 to this list or use a separate aes_ssse3_cpuid module as you did before. Also, it may be worth adding a comment that SSE2 is implied by AES-NI, so we don't need to check for it separately. Or we could simply add sse2 to the list to be completely thorough. :)

Member Author

@tarcieri Dec 2, 2020

Per my comments above, as far as I can tell every CPU with AES-NI has both SSE2 and SSSE3.

AES-NI was introduced in the Westmere architecture, which also has SSSE3. Unless there's some strange AMD/other CPU that has AES-NI but not SSSE3, I don't think it will be a problem.

If you're worried though, I can add back aes_ssse3_cpuid (perhaps as a separate PR).

Member

@newpavlov Dec 2, 2020

Per the reference, we can omit the sse2 check if we have verified that aes or ssse3 is enabled, but it does not indicate any implicit dependency between aes and ssse3. So even though in practice there is probably no such CPU, for our code to be correct we have to check both of those features. The check is cheap enough and will be executed only once, so I think it's worth being extra safe.

Member Author

Hmm, OK. Sorry, I just merged, but I can follow up on this.
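
The point raised in this thread, that `aes` and `ssse3` should each be verified at runtime rather than one being inferred from the other, can be sketched with the standard library's `is_x86_feature_detected!` macro. This is only an illustration: the PR itself uses `cpuid-bool`, and the helper name `hw_aes_usable` is hypothetical and not part of the crate.

```rust
/// Hypothetical helper: returns true only when every CPU feature the
/// hardware backend relies on is reported by the running CPU.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn hw_aes_usable() -> bool {
    // Check both features explicitly instead of assuming AES-NI implies SSSE3.
    // Per the thread, a separate SSE2 check is left out here because it is
    // covered once `aes` or `ssse3` has been verified.
    is_x86_feature_detected!("aes") && is_x86_feature_detected!("ssse3")
}

/// On other architectures there is no AES-NI, so always report false and
/// let the caller fall back to the portable implementation.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
pub fn hw_aes_usable() -> bool {
    false
}
```

A caller would evaluate `hw_aes_usable()` once at key setup and store the answer, which is roughly the role the init token plays in the diff.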


macro_rules! define_aes_impl {
    (
        $name:tt,
        $module:tt,
        $key_size:ty,
        $doc:expr
    ) => {
        #[doc=$doc]
        #[derive(Clone)]
        pub struct $name {
            inner: $module::Inner,
            token: aes_cpuid::InitToken
        }

        mod $module {
            #[derive(Copy, Clone)]
            pub(super) union Inner {
                pub(super) ni: crate::ni::$name,
                pub(super) soft: crate::soft::$name,
            }
        }

        impl NewBlockCipher for $name {
            type KeySize = $key_size;

            #[inline]
            fn new(key: &GenericArray<u8, $key_size>) -> Self {
                let (token, aesni_present) = aes_cpuid::init_get();

                let inner = if aesni_present {
                    $module::Inner { ni: crate::ni::$name::new(key) }
                } else {
                    $module::Inner { soft: crate::soft::$name::new(key) }
                };

                Self { inner, token }
            }
        }

        impl BlockCipher for $name {
            type BlockSize = U16;
            type ParBlocks = U8;
        }

        impl BlockEncrypt for $name {
            #[inline]
            fn encrypt_block(&self, block: &mut Block) {
                if self.token.get() {
                    unsafe { self.inner.ni.encrypt_block(block) }
                } else {
                    unsafe { self.inner.soft.encrypt_block(block) }
                }
            }

            #[inline]
            fn encrypt_par_blocks(&self, blocks: &mut ParBlocks) {
                if self.token.get() {
                    unsafe { self.inner.ni.encrypt_par_blocks(blocks) }
                } else {
                    unsafe { self.inner.soft.encrypt_par_blocks(blocks) }
                }
            }
        }

        impl BlockDecrypt for $name {
            #[inline]
            fn decrypt_block(&self, block: &mut Block) {
                if self.token.get() {
                    unsafe { self.inner.ni.decrypt_block(block) }
                } else {
                    unsafe { self.inner.soft.decrypt_block(block) }
                }
            }

            #[inline]
            fn decrypt_par_blocks(&self, blocks: &mut ParBlocks) {
                if self.token.get() {
                    unsafe { self.inner.ni.decrypt_par_blocks(blocks) }
                } else {
                    unsafe { self.inner.soft.decrypt_par_blocks(blocks) }
                }
            }
        }

        opaque_debug::implement!($name);
    }
}

define_aes_impl!(Aes128, aes128, U16, "AES-128 block cipher instance");
define_aes_impl!(Aes192, aes192, U24, "AES-192 block cipher instance");
define_aes_impl!(Aes256, aes256, U32, "AES-256 block cipher instance");

#[cfg(feature = "ctr")]
pub(crate) mod ctr {
    use super::{aes_cpuid, Aes128, Aes192, Aes256};
    use cipher::{
        block::BlockCipher,
        generic_array::GenericArray,
        stream::{
            FromBlockCipher, LoopError, OverflowError, SeekNum, SyncStreamCipher,
            SyncStreamCipherSeek,
        },
    };

    macro_rules! define_aes_ctr_impl {
        (
            $name:tt,
            $cipher:ident,
            $module:tt,
            $doc:expr
        ) => {
            #[doc=$doc]
            #[cfg_attr(docsrs, doc(cfg(feature = "ctr")))]
            pub struct $name {
                inner: $module::Inner,
            }

            mod $module {
                #[allow(clippy::large_enum_variant)]
                pub(super) enum Inner {
                    Ni(crate::ni::$name),
                    Soft(crate::soft::$name),
                }
            }

            impl FromBlockCipher for $name {
                type BlockCipher = $cipher;
                type NonceSize = <$cipher as BlockCipher>::BlockSize;

                fn from_block_cipher(
                    cipher: $cipher,
                    nonce: &GenericArray<u8, Self::NonceSize>,
                ) -> Self {
                    let inner = if aes_cpuid::get() {
                        $module::Inner::Ni(
                            crate::ni::$name::from_block_cipher(
                                unsafe { cipher.inner.ni },
                                nonce
                            )
                        )
                    } else {
                        $module::Inner::Soft(
                            crate::soft::$name::from_block_cipher(
                                unsafe { cipher.inner.soft },
                                nonce
                            )
                        )
                    };

                    Self { inner }
                }
            }

            impl SyncStreamCipher for $name {
                #[inline]
                fn try_apply_keystream(&mut self, data: &mut [u8]) -> Result<(), LoopError> {
                    match &mut self.inner {
                        $module::Inner::Ni(aes) => aes.try_apply_keystream(data),
                        $module::Inner::Soft(aes) => aes.try_apply_keystream(data)
                    }
                }
            }

            impl SyncStreamCipherSeek for $name {
                #[inline]
                fn try_current_pos<T: SeekNum>(&self) -> Result<T, OverflowError> {
                    match &self.inner {
                        $module::Inner::Ni(aes) => aes.try_current_pos(),
                        $module::Inner::Soft(aes) => aes.try_current_pos()
                    }
                }

                #[inline]
                fn try_seek<T: SeekNum>(&mut self, pos: T) -> Result<(), LoopError> {
                    match &mut self.inner {
                        $module::Inner::Ni(aes) => aes.try_seek(pos),
                        $module::Inner::Soft(aes) => aes.try_seek(pos)
                    }
                }
            }

            opaque_debug::implement!($name);
        }
    }

    define_aes_ctr_impl!(Aes128Ctr, Aes128, aes128ctr, "AES-128 in CTR mode");
    define_aes_ctr_impl!(Aes192Ctr, Aes192, aes192ctr, "AES-192 in CTR mode");
    define_aes_ctr_impl!(Aes256Ctr, Aes256, aes256ctr, "AES-256 in CTR mode");
}
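
The dispatch pattern used by `define_aes_impl!` (a `union` holding one of two backends, plus a flag recorded at construction that says which field is live) can be shown in isolation. The sketch below is a toy under stated assumptions: `FastBackend`, `SoftBackend`, and `Cipher` are hypothetical names, both backends are a plain XOR rather than AES, and a `bool` stands in for the `cpuid-bool` init token.

```rust
/// Stand-in for the hardware backend (hypothetical; XOR only, not AES).
#[derive(Copy, Clone)]
struct FastBackend {
    key: u8,
}

/// Stand-in for the portable backend (hypothetical; XOR only, not AES).
#[derive(Copy, Clone)]
struct SoftBackend {
    key: u8,
}

impl FastBackend {
    fn apply(&self, block: &mut [u8; 16]) {
        block.iter_mut().for_each(|b| *b ^= self.key);
    }
}

impl SoftBackend {
    fn apply(&self, block: &mut [u8; 16]) {
        block.iter_mut().for_each(|b| *b ^= self.key);
    }
}

/// Exactly one field is initialized; the owner must remember which one.
#[derive(Copy, Clone)]
union Inner {
    fast: FastBackend,
    soft: SoftBackend,
}

#[derive(Clone)]
pub struct Cipher {
    inner: Inner,
    /// Records which union field `new` wrote (the token's job in the diff).
    use_fast: bool,
}

impl Cipher {
    pub fn new(key: u8, fast_available: bool) -> Self {
        let inner = if fast_available {
            Inner { fast: FastBackend { key } }
        } else {
            Inner { soft: SoftBackend { key } }
        };
        Self { inner, use_fast: fast_available }
    }

    pub fn apply(&self, block: &mut [u8; 16]) {
        if self.use_fast {
            // Safety: `use_fast` is true only when `new` initialized `fast`.
            unsafe { self.inner.fast.apply(block) }
        } else {
            // Safety: `use_fast` is false only when `new` initialized `soft`.
            unsafe { self.inner.soft.apply(block) }
        }
    }
}
```

Storing the detection result next to the union keeps every later call to a single branch on a field, and it is what makes the `unsafe` field reads sound: the flag and the initialized field are always set together.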
18 changes: 7 additions & 11 deletions aes/src/lib.rs
@@ -63,21 +63,17 @@ use cfg_if::cfg_if;

cfg_if! {
if #[cfg(all(
target_feature = "aes",
target_feature = "sse2",
any(target_arch = "x86_64", target_arch = "x86"),
not(feature = "force-soft")
))] {
mod autodetect;
mod ni;
pub use ni::{Aes128, Aes192, Aes256};
mod soft;

pub use autodetect::{Aes128, Aes192, Aes256};

#[cfg(feature = "ctr")]
cfg_if! {
if #[cfg(target_feature = "ssse3")] {
pub use ni::{Aes128Ctr, Aes192Ctr, Aes256Ctr};
} else {
compile_error!("Please enable the +ssse3 target feature to use `ctr` with AES-NI")
}
}
pub use autodetect::ctr::{Aes128Ctr, Aes192Ctr, Aes256Ctr};
} else {
mod soft;
pub use soft::{Aes128, Aes192, Aes256};
@@ -87,7 +83,7 @@ cfg_if! {
}
}

pub use cipher::{self, BlockCipher, NewBlockCipher};
pub use cipher::{self, BlockCipher, BlockDecrypt, BlockEncrypt, NewBlockCipher};

/// 128-bit AES block
pub type Block = cipher::generic_array::GenericArray<u8, cipher::consts::U16>;
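
The new `cfg_if!` condition in this file selects the autodetecting backend on x86/x86_64 unless `force-soft` is enabled, and the portable backend everywhere else. A minimal sketch of that decision as a plain function follows; `selected_backend` and `backend_for_this_build` are hypothetical names used only for illustration, and outside a Cargo build that defines the feature, `cfg!(feature = "force-soft")` simply evaluates to false.

```rust
/// Hypothetical helper mirroring the `cfg_if!` condition in the diff:
/// autodetection needs an x86 family target and no `force-soft` override.
pub fn selected_backend(is_x86: bool, force_soft: bool) -> &'static str {
    if is_x86 && !force_soft {
        "autodetect"
    } else {
        "soft"
    }
}

/// Evaluates the same condition for the configuration being compiled.
pub fn backend_for_this_build() -> &'static str {
    selected_backend(
        cfg!(any(target_arch = "x86_64", target_arch = "x86")),
        cfg!(feature = "force-soft"),
    )
}
```

Unlike the removed condition, nothing here depends on `target_feature = "aes"` being set at compile time; whether AES-NI is actually used is decided at runtime.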
4 changes: 4 additions & 0 deletions aes/src/ni/aes128.rs
@@ -112,6 +112,7 @@ impl BlockDecrypt for Aes128 {
// Safety: `loadu` and `storeu` support unaligned access
#[allow(clippy::cast_ptr_alignment)]
let mut b = _mm_loadu_si128(block.as_ptr() as *const __m128i);

b = _mm_xor_si128(b, keys[10]);
b = _mm_aesdec_si128(b, keys[9]);
b = _mm_aesdec_si128(b, keys[8]);
@@ -123,6 +124,9 @@ impl BlockDecrypt for Aes128 {
b = _mm_aesdec_si128(b, keys[2]);
b = _mm_aesdec_si128(b, keys[1]);
b = _mm_aesdeclast_si128(b, keys[0]);

// Safety: `loadu` and `storeu` support unaligned access
#[allow(clippy::cast_ptr_alignment)]
_mm_storeu_si128(block.as_mut_ptr() as *mut __m128i, b);
}

4 changes: 4 additions & 0 deletions aes/src/ni/aes192.rs
@@ -114,6 +114,7 @@ impl BlockDecrypt for Aes192 {
// Safety: `loadu` and `storeu` support unaligned access
#[allow(clippy::cast_ptr_alignment)]
let mut b = _mm_loadu_si128(block.as_ptr() as *const __m128i);

b = _mm_xor_si128(b, keys[12]);
b = _mm_aesdec_si128(b, keys[11]);
b = _mm_aesdec_si128(b, keys[10]);
@@ -127,6 +128,9 @@ impl BlockDecrypt for Aes192 {
b = _mm_aesdec_si128(b, keys[2]);
b = _mm_aesdec_si128(b, keys[1]);
b = _mm_aesdeclast_si128(b, keys[0]);

// Safety: `loadu` and `storeu` support unaligned access
#[allow(clippy::cast_ptr_alignment)]
_mm_storeu_si128(block.as_mut_ptr() as *mut __m128i, b);
}

4 changes: 4 additions & 0 deletions aes/src/ni/aes256.rs
@@ -118,6 +118,7 @@ impl BlockDecrypt for Aes256 {
// Safety: `loadu` and `storeu` support unaligned access
#[allow(clippy::cast_ptr_alignment)]
let mut b = _mm_loadu_si128(block.as_ptr() as *const __m128i);

b = _mm_xor_si128(b, keys[14]);
b = _mm_aesdec_si128(b, keys[13]);
b = _mm_aesdec_si128(b, keys[12]);
@@ -133,6 +134,9 @@ impl BlockDecrypt for Aes256 {
b = _mm_aesdec_si128(b, keys[2]);
b = _mm_aesdec_si128(b, keys[1]);
b = _mm_aesdeclast_si128(b, keys[0]);

// Safety: `loadu` and `storeu` support unaligned access
#[allow(clippy::cast_ptr_alignment)]
_mm_storeu_si128(block.as_mut_ptr() as *mut __m128i, b);
}
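
The safety comments added in these three files rest on `_mm_loadu_si128` and `_mm_storeu_si128` accepting unaligned pointers, which is why a byte array can be cast to `*const __m128i` without an alignment guarantee. A small sketch of that load, operate, store shape follows, using a plain XOR instead of AES rounds; `xor_block` is a hypothetical function, and the non-x86_64 branch is a portable fallback added so the example builds on other targets.

```rust
/// Hypothetical example: XORs a 16-byte block with a 16-byte mask in place.
pub fn xor_block(block: &mut [u8; 16], mask: &[u8; 16]) {
    #[cfg(target_arch = "x86_64")]
    {
        use std::arch::x86_64::{__m128i, _mm_loadu_si128, _mm_storeu_si128, _mm_xor_si128};

        // Safety: SSE2 is part of the x86_64 baseline, both pointers cover
        // 16 readable (and, for `block`, writable) bytes, and `loadu`/`storeu`
        // place no alignment requirement on them.
        unsafe {
            let b = _mm_loadu_si128(block.as_ptr() as *const __m128i);
            let m = _mm_loadu_si128(mask.as_ptr() as *const __m128i);
            _mm_storeu_si128(block.as_mut_ptr() as *mut __m128i, _mm_xor_si128(b, m));
        }
    }

    #[cfg(not(target_arch = "x86_64"))]
    {
        // Portable fallback with the same result.
        for (b, m) in block.iter_mut().zip(mask.iter()) {
            *b ^= *m;
        }
    }
}
```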
