Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug in the Arc<E> ref count? #497

Open
rhishikeshj opened this issue Feb 12, 2025 · 1 comment
Open

Bug in the Arc<E> ref count? #497

rhishikeshj opened this issue Feb 12, 2025 · 1 comment

Comments

@rhishikeshj
Copy link

Hi,
As documented and explained in #328 the return from try_* fns in the library is a Result<T, Arc<E>>
And the idiomatic way to extract the error to pass it along, is to call Arc::into_inner
But from my observations, it seems there is a mismatch in the behaviour this code snippet is showing

    use std::sync::Arc;
    #[tokio::test]
    async fn moka_test() {
        let cache: Cache<String, String> = Cache::builder()
            .max_capacity(10)
            .eviction_policy(EvictionPolicy::tiny_lfu())
            .build();

        let r = cache
            .try_get_with("abc".to_string(), async { Err(42) })
            .await;

        match r {
            Ok(_) => panic!("not expected"),
            Err(ae) => {
                let c = Arc::strong_count(&ae);
                assert_eq!(c, 1); // is actually 2
                let e = Arc::into_inner(ae); // so this is None
                assert!(e.is_some());
            }
        }
    }

The strong_count actually shows up as 2 making the into_inner method unusable to extract the inner Error.
Is there something missing in my understanding?
Or is this a bug?

@tatsuya6502
Copy link
Member

tatsuya6502 commented Feb 13, 2025

Hi. Technically, it is not a bug, but in the future, I would change the behavior of the method and the return type to Result<V, E> instead of Result<V, Arc<E>>.

If we add crossbeam_epoch::pin().flush(); to your code, it will work as expected.

// Cargo.toml
// [dependencies]
//
// crossbeam-epoch = "0.9.18"
// moka = { version = "0.12.10", features = ["future"] }
// tokio = { version = "1.43.0", features = ["rt-multi-thread", "macros"] }

    ...
    let r = cache
        .try_get_with("abc".to_string(), async { Err(42) })
        .await;

    crossbeam_epoch::pin().flush(); // Add this.

    match r {
        ...

But flush() is an expensive operation, and I believe it is not guaranteed to work in all cases, so it will not be a good solution.

For now, you can use moka's entry and_try_compute method, which returns raw E type instead of Arc<E> (doc).

use moka::{future::Cache, ops::compute::Op, policy::EvictionPolicy};

#[tokio::main]
async fn main() {
    let cache: Cache<String, String> = Cache::builder()
        .max_capacity(10)
        .eviction_policy(EvictionPolicy::tiny_lfu())
        .build();

    let r = cache
        .entry("abc".to_string())
        .and_try_compute_with(|v| async move {
            if v.is_none() {
                Err(42)
            } else {
                Ok(Op::Nop)
            }
        })
        .await;

    assert!(matches!(r, Err(42)));
}

Note that try_get_with and entry and_try_compute_with work a bit different when they called from two callers at the (almost) same time for the same key.

  • try_get_with:
    1. Only the first caller's init future should be resolved to an instance of E
    2. Both callers should get the same instance of the Arc<E>.
  • entry and_try_compute_with:
    1. The first caller's init future should be resolved to an instance of E.
    2. Only the first caller should get the E.
    3. Then, the second caller's init future should be resolved (either to E or succeeds).
    4. Only the second caller should get the result.

In the future, I would change the behavior of try_get_with to be the same as entry and_try_compute_with and return raw E type instead of Arc<E>. This will allow you to avoid flush(). However, this is a breaking change, so it will be in v0.13.0, instead of v0.12.x.

The reason that the Arc<E> returned from the current try_get_with has the strong count 2 is that the resolved Arc<E> is stored in a concurrent hash table (cht) in the cache, so that the second caller can get the same instance. The cht uses a concurrent garbage collector called crossbeam-epoch to remove drop the entry when the epoch is advanced enough to ensure that no thread is holding the reference to the entry. It is necessary to call flush twice after removing the value Arc<E> from the cht to drop it (so the strong count becomes 1).

I said twice, the one in the first code snippet above is the second one. The first one is in the try_get_with method: https://github.com/moka-rs/moka/blob/v0.12.10/src/future/cache.rs#L1816-L1818

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants