fix: fix the ghost queue bug with s3fifo #432

xiaguan · 2024-04-26T11:23:10Z

What's changed and what's your intention?

Fixed up the ghost queue issue with s3fifo and also added some real workload tests. Another thing I wanna point out is that the algorithms that perform really well on zipf usually have lower robustness, meaning they're a bit more limited in where they can be applied.

Checklist

I have written the necessary rustdoc comments
I have added the necessary unit tests and integration tests
I have passed make all (or make fast instead if the old tests are not modified) in my local environment.

Related issues or PRs (optional)

xiaguan · 2024-04-26T11:26:24Z

Master

  cache_size                  fifo            lru             lfu             s3fifo (0g)     s3fifo (1g)     moka            
50000                     67.50%          70.51%          74.99%          70.88%          71.28%          64.70%
zif_exp, cache_size           fifo            lru             lfu             s3fifo (0g)     s3fifo (1g)     moka            
  0.90,  0.005                16.24%          19.21%          32.34%          32.14%          20.93%          33.43%
  0.90,   0.01                22.56%          26.23%          38.56%          39.24%          27.90%          37.91%
  0.90,   0.05                41.10%          45.61%          55.42%          56.68%          47.55%          55.31%
  0.90,    0.1                51.06%          55.70%          63.82%          65.24%          57.67%          64.17%
  0.90,   0.25                66.81%          71.18%          76.23%          77.62%          73.27%          77.15%
  1.00,  0.005                26.61%          31.06%          44.13%          44.49%          33.33%          45.64%
  1.00,   0.01                34.41%          39.18%          50.67%          51.26%          41.14%          50.79%
  1.00,   0.05                54.05%          58.77%          66.81%          67.96%          60.56%          66.75%
  1.00,    0.1                63.16%          67.63%          73.94%          75.00%          69.41%          74.42%
  1.00,   0.25                76.15%          79.92%          83.59%          84.53%          81.50%          84.27%
  1.05,  0.005                32.65%          37.71%          50.27%          50.25%          39.80%          51.78%
  1.05,   0.01                40.97%          46.10%          56.76%          57.48%          47.91%          57.13%
  1.05,   0.05                60.45%          65.07%          72.04%          73.07%          66.82%          72.36%
  1.05,    0.1                68.95%          73.13%          78.51%          79.45%          74.61%          78.98%
  1.05,   0.25                80.43%          83.79%          86.84%          87.53%          85.09%          87.48%
  1.10,  0.005                38.98%          44.46%          56.24%          56.38%          46.54%          57.86%
  1.10,   0.01                47.61%          52.93%          62.62%          63.55%          54.66%          63.11%
  1.10,   0.05                66.56%          70.91%          76.91%          77.83%          72.42%          77.18%
  1.10,    0.1                74.23%          78.05%          82.52%          83.24%          79.35%          82.99%
  1.10,   0.25                84.17%          87.08%          89.56%          90.09%          88.16%          90.08%
  1.50,  0.005                81.18%          85.29%          88.90%          89.30%          86.02%          89.92%
  1.50,   0.01                86.93%          89.89%          92.26%          92.69%          90.42%          92.79%
  1.50,   0.05                94.77%          96.05%          96.96%          97.08%          96.30%          97.10%
  1.50,    0.1                96.66%          97.52%          98.07%          98.09%          97.69%          98.16%
  1.50,   0.25                98.36%          98.82%          99.04%          99.02%          98.91%          99.09%

After fix

  cache_size                  fifo            lru             lfu             s3fifo (0g)     s3fifo (1g)     moka            
50000                     67.50%          70.51%          74.99%          70.88%          72.33%          64.70%
zif_exp, cache_size           fifo            lru             lfu             s3fifo (0g)     s3fifo (1g)     moka            
  0.90,  0.005                16.24%          19.20%          32.38%          32.06%          31.94%          33.44%
  0.90,   0.01                22.55%          26.21%          38.56%          39.27%          38.46%          37.86%
  0.90,   0.05                41.10%          45.61%          55.41%          56.64%          55.37%          55.19%
  0.90,    0.1                51.05%          55.69%          63.81%          65.27%          63.61%          64.16%
  0.90,   0.25                66.76%          71.15%          76.17%          77.53%          75.68%          77.11%
  1.00,  0.005                26.59%          31.05%          44.11%          44.37%          43.54%          45.54%
  1.00,   0.01                34.36%          39.13%          50.64%          51.40%          50.59%          50.69%
  1.00,   0.05                54.03%          58.75%          66.80%          67.94%          66.81%          66.91%
  1.00,    0.1                63.16%          67.62%          73.93%          75.01%          73.83%          74.38%
  1.00,   0.25                76.17%          79.93%          83.61%          84.53%          83.27%          84.36%
  1.05,  0.005                32.64%          37.67%          50.23%          50.31%          49.63%          51.83%
  1.05,   0.01                40.90%          46.02%          56.71%          57.59%          56.67%          56.99%
  1.05,   0.05                60.44%          65.03%          72.05%          73.02%          72.09%          72.31%
  1.05,    0.1                68.90%          73.10%          78.50%          79.36%          78.39%          78.98%
  1.05,   0.25                80.34%          83.71%          86.77%          87.47%          86.49%          87.40%
  1.10,  0.005                38.98%          44.45%          56.25%          56.62%          55.62%          57.89%
  1.10,   0.01                47.66%          52.98%          62.65%          63.55%          62.62%          63.22%
  1.10,   0.05                66.55%          70.91%          76.90%          77.77%          76.96%          77.22%
  1.10,    0.1                74.27%          78.07%          82.56%          83.38%          82.51%          82.97%
  1.10,   0.25                84.19%          87.10%          89.56%          90.08%          89.34%          90.08%
  1.50,  0.005                81.20%          85.30%          88.90%          89.32%          88.79%          89.92%
  1.50,   0.01                86.91%          89.87%          92.25%          92.66%          92.29%          92.76%
  1.50,   0.05                94.77%          96.05%          96.96%          97.08%          96.96%          97.10%
  1.50,    0.1                96.64%          97.50%          98.05%          98.10%          98.04%          98.12%
  1.50,   0.25                98.37%          98.82%          99.05%          99.03%          99.03%          99.10%

codecov · 2024-04-26T11:27:51Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 80.82%. Comparing base (e612b5a) to head (3a561d6).

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #432      +/-   ##
==========================================
+ Coverage   80.63%   80.82%   +0.19%     
==========================================
  Files          53       53              
  Lines        7173     7166       -7     
==========================================
+ Hits         5784     5792       +8     
+ Misses       1389     1374      -15

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

MrCroxx

Thanks for the efforts. 🥰

This PR generally LGTM. The following comments should be resolved.

Could you please provide where the idea comes from?

I rechecked the original paper of S3Fifo, and found its idea is not like either of our implements. 🥲

foyer-memory/Cargo.toml

MrCroxx · 2024-04-27T04:16:33Z

foyer-memory/benches/bench_hit_ratio.rs

+    // Try to read the csv file path by environment variable.
+    let path = std::env::var("TWITTER_TRACE_PATH").ok();
+    if let Some(path) = path {
+        // Limit the number of keys to read.
+        // MAX means read all keys, which may take a really long time.


Add an interactive command to download the dataset?

Download this is a little bit complex.

Download this is a little bit complex.

Hi @xiaguan , where can I find the dataset? Let me handle the download part.

https://github.com/twitter/cache-trace, I think I used miss ratio related cluster's data.
I use SNIA to download. In this way, I could just download part of it(6GB).

https://github.com/twitter/cache-trace this one?

Yep, it is really big. 🤣

foyer-memory/src/eviction/s3fifo.rs

MrCroxx · 2024-04-27T04:28:13Z

Just attach some discussion of foyer with S3FIFO.

#306

MrCroxx · 2024-04-27T04:37:32Z

Master

  cache_size                  fifo            lru             lfu             s3fifo (0g)     s3fifo (1g)     moka            
50000                     67.50%          70.51%          74.99%          70.88%          71.28%          64.70%
zif_exp, cache_size           fifo            lru             lfu             s3fifo (0g)     s3fifo (1g)     moka            
  0.90,  0.005                16.24%          19.21%          32.34%          32.14%          20.93%          33.43%
  0.90,   0.01                22.56%          26.23%          38.56%          39.24%          27.90%          37.91%
  0.90,   0.05                41.10%          45.61%          55.42%          56.68%          47.55%          55.31%
  0.90,    0.1                51.06%          55.70%          63.82%          65.24%          57.67%          64.17%
  0.90,   0.25                66.81%          71.18%          76.23%          77.62%          73.27%          77.15%
  1.00,  0.005                26.61%          31.06%          44.13%          44.49%          33.33%          45.64%
  1.00,   0.01                34.41%          39.18%          50.67%          51.26%          41.14%          50.79%
  1.00,   0.05                54.05%          58.77%          66.81%          67.96%          60.56%          66.75%
  1.00,    0.1                63.16%          67.63%          73.94%          75.00%          69.41%          74.42%
  1.00,   0.25                76.15%          79.92%          83.59%          84.53%          81.50%          84.27%
  1.05,  0.005                32.65%          37.71%          50.27%          50.25%          39.80%          51.78%
  1.05,   0.01                40.97%          46.10%          56.76%          57.48%          47.91%          57.13%
  1.05,   0.05                60.45%          65.07%          72.04%          73.07%          66.82%          72.36%
  1.05,    0.1                68.95%          73.13%          78.51%          79.45%          74.61%          78.98%
  1.05,   0.25                80.43%          83.79%          86.84%          87.53%          85.09%          87.48%
  1.10,  0.005                38.98%          44.46%          56.24%          56.38%          46.54%          57.86%
  1.10,   0.01                47.61%          52.93%          62.62%          63.55%          54.66%          63.11%
  1.10,   0.05                66.56%          70.91%          76.91%          77.83%          72.42%          77.18%
  1.10,    0.1                74.23%          78.05%          82.52%          83.24%          79.35%          82.99%
  1.10,   0.25                84.17%          87.08%          89.56%          90.09%          88.16%          90.08%
  1.50,  0.005                81.18%          85.29%          88.90%          89.30%          86.02%          89.92%
  1.50,   0.01                86.93%          89.89%          92.26%          92.69%          90.42%          92.79%
  1.50,   0.05                94.77%          96.05%          96.96%          97.08%          96.30%          97.10%
  1.50,    0.1                96.66%          97.52%          98.07%          98.09%          97.69%          98.16%
  1.50,   0.25                98.36%          98.82%          99.04%          99.02%          98.91%          99.09%

After fix

  cache_size                  fifo            lru             lfu             s3fifo (0g)     s3fifo (1g)     moka            
50000                     67.50%          70.51%          74.99%          70.88%          72.33%          64.70%
zif_exp, cache_size           fifo            lru             lfu             s3fifo (0g)     s3fifo (1g)     moka            
  0.90,  0.005                16.24%          19.20%          32.38%          32.06%          31.94%          33.44%
  0.90,   0.01                22.55%          26.21%          38.56%          39.27%          38.46%          37.86%
  0.90,   0.05                41.10%          45.61%          55.41%          56.64%          55.37%          55.19%
  0.90,    0.1                51.05%          55.69%          63.81%          65.27%          63.61%          64.16%
  0.90,   0.25                66.76%          71.15%          76.17%          77.53%          75.68%          77.11%
  1.00,  0.005                26.59%          31.05%          44.11%          44.37%          43.54%          45.54%
  1.00,   0.01                34.36%          39.13%          50.64%          51.40%          50.59%          50.69%
  1.00,   0.05                54.03%          58.75%          66.80%          67.94%          66.81%          66.91%
  1.00,    0.1                63.16%          67.62%          73.93%          75.01%          73.83%          74.38%
  1.00,   0.25                76.17%          79.93%          83.61%          84.53%          83.27%          84.36%
  1.05,  0.005                32.64%          37.67%          50.23%          50.31%          49.63%          51.83%
  1.05,   0.01                40.90%          46.02%          56.71%          57.59%          56.67%          56.99%
  1.05,   0.05                60.44%          65.03%          72.05%          73.02%          72.09%          72.31%
  1.05,    0.1                68.90%          73.10%          78.50%          79.36%          78.39%          78.98%
  1.05,   0.25                80.34%          83.71%          86.77%          87.47%          86.49%          87.40%
  1.10,  0.005                38.98%          44.45%          56.25%          56.62%          55.62%          57.89%
  1.10,   0.01                47.66%          52.98%          62.65%          63.55%          62.62%          63.22%
  1.10,   0.05                66.55%          70.91%          76.90%          77.77%          76.96%          77.22%
  1.10,    0.1                74.27%          78.07%          82.56%          83.38%          82.51%          82.97%
  1.10,   0.25                84.19%          87.10%          89.56%          90.08%          89.34%          90.08%
  1.50,  0.005                81.20%          85.30%          88.90%          89.32%          88.79%          89.92%
  1.50,   0.01                86.91%          89.87%          92.25%          92.66%          92.29%          92.76%
  1.50,   0.05                94.77%          96.05%          96.96%          97.08%          96.96%          97.10%
  1.50,    0.1                96.64%          97.50%          98.05%          98.10%          98.04%          98.12%
  1.50,   0.25                98.37%          98.82%          99.05%          99.03%          99.03%          99.10%

And could you please sync this result to the comments in the bench_hit_ratio.rs file? 🥰

xiaguan · 2024-04-27T05:47:16Z

Could you please provide where the idea comes from?

The implementation on the master branch doesn't evict objects from the ghost queue. I mainly focused on fixing this bug.

MrCroxx · 2024-04-27T06:00:52Z

Could you please provide where the idea comes from?

The implementation on the master branch doesn't evict objects from the ghost queue. I mainly focused on fixing this bug.

Oh, hack. You are right. 🥲 I just decreased the reference count. BTW, how do you like the reference count? I see you removed it.

xiaguan · 2024-04-27T06:05:08Z

BTW, how do you like the reference count? I see you removed it.

I just follow the paper. The paper doesn't maintain the reference count in ghost queue?

MrCroxx · 2024-04-27T06:07:09Z

Could you please provide where the idea comes from?

The implementation on the master branch doesn't evict objects from the ghost queue. I mainly focused on fixing this bug.

Okay. Let me check then. I will update this branch directly if necessary and merge it.

MrCroxx · 2024-04-27T07:56:08Z

Could you please provide where the idea comes from?

The implementation on the master branch doesn't evict objects from the ghost queue. I mainly focused on fixing this bug.

~~Emmm, wait again. The implement on the main branch does evict objects from the ghost queue. It evicts only when the reference count hits 0.~~

~~This PR, on the other hand, removes all keys with the same no matter if there is a hash collision. This behavior will make the admission less aggressive.~~

I got you.

MrCroxx · 2024-04-27T07:56:50Z

Hi @PsiACE , do you have any idea about the reference count? Glad to hear from you as another reference.

PsiACE · 2024-04-27T09:59:27Z

Hi @PsiACE , do you have any idea about the reference count? Glad to hear from you as another reference.

s3fifo does not need to keep reference counting for ghost, as reference counting will cause Quick Demotion failure, that is, incorrect estimation of popularity.

After I noticed this and made the corresponding modifications, my algorithm also approached the state-of-the-art level, but it still needs further optimization.

MrCroxx · 2024-04-27T11:23:07Z

s3fifo does not need to keep reference counting for ghost, as reference counting will cause Quick Demotion failure, that is, incorrect estimation of popularity.

Even when the ghost queue uses key hash not key itself? 🤔 This looks a bit counterintuitive.

MrCroxx · 2024-04-27T12:45:14Z

I plan to rewrite a cache benchmark tool like mokabench or enhance it to make benchmarking more automatic.

This PR fixed that the ghost queue doesn't add weights when pushing, but removes the hash reference count. I'm kind of concerned about removing the hash reference count, but for most workloads, I think it is okay. So let's merge it first.

MrCroxx · 2024-04-27T12:45:56Z

Thanks again for your efforts, @xiaguan and @PsiACE . I've learnt a lot in the journey.

PsiACE · 2024-04-27T12:51:23Z

I plan to rewrite a cache benchmark tool like mokabench or enhance it to make benchmarking more automatic.

It can be almost directly reused.

#[derive(Clone)]
pub struct S3FIFOCache {
    config: Arc<Config>,
    cache: Arc<foyer::Cache<Key, Value>>,
}

impl S3FIFOCache {
    pub fn new(config: &Config, capacity: usize) -> Self {
        if let Some(_ttl) = config.ttl {
            todo!()
        }
        if let Some(_tti) = config.tti {
            todo!()
        }
        if config.size_aware {
            todo!()
        }

        // TinyUFO does not support custom hasher. Use its default hasher.
        Self {
            config: Arc::new(config.clone()),
            cache: Arc::new(foyer::CacheBuilder::new(capacity)
            .with_shards(1)
            .with_eviction_config(foyer::S3FifoConfig {
                small_queue_capacity_ratio: 0.1,
                ghost_queue_capacity_ratio: 0.9,
                small_to_main_freq_threshold: 2,
            })
            .with_object_pool_capacity(16)
            .build()),
        }
    }

    fn get(&self, key: &usize) -> bool {
        self.cache.get(key).is_some()
    }

    fn insert(&self, key: usize, req_id: usize) {
        let value = super::make_value(&self.config, key, req_id);
        super::sleep_thread_for_insertion(&self.config);
        self.cache.insert(key, value);
    }
}

MrCroxx · 2024-04-27T12:54:27Z

I plan to rewrite a cache benchmark tool like mokabench or enhance it to make benchmarking more automatic.

It can be almost directly reused.

#[derive(Clone)]
pub struct S3FIFOCache {
    config: Arc<Config>,
    cache: Arc<foyer::Cache<Key, Value>>,
}

impl S3FIFOCache {
    pub fn new(config: &Config, capacity: usize) -> Self {
        if let Some(_ttl) = config.ttl {
            todo!()
        }
        if let Some(_tti) = config.tti {
            todo!()
        }
        if config.size_aware {
            todo!()
        }

        // TinyUFO does not support custom hasher. Use its default hasher.
        Self {
            config: Arc::new(config.clone()),
            cache: Arc::new(foyer::CacheBuilder::new(capacity)
            .with_shards(1)
            .with_eviction_config(foyer::S3FifoConfig {
                small_queue_capacity_ratio: 0.1,
                ghost_queue_capacity_ratio: 0.9,
                small_to_main_freq_threshold: 2,
            })
            .with_object_pool_capacity(16)
            .build()),
        }
    }

    fn get(&self, key: &usize) -> bool {
        self.cache.get(key).is_some()
    }

    fn insert(&self, key: usize, req_id: usize) {
        let value = super::make_value(&self.config, key, req_id);
        super::sleep_thread_for_insertion(&self.config);
        self.cache.insert(key, value);
    }
}

Yep. I've already sent a PR 🥰

xiaguan changed the title ~~Fix s3fifo~~ fix: fix the ghost queue bug with s3fifo Apr 26, 2024

xiaguan mentioned this pull request Apr 26, 2024

Ghost queue #306

Draft

3 tasks

PsiACE approved these changes Apr 26, 2024

View reviewed changes

xiaguan added 4 commits April 27, 2024 02:54

add real workload bench

a86af83

fix s3fifo impl

e095e17

make fast

387953f

Merge branch 'main' into fix_s3fifo

03073d0

MrCroxx approved these changes Apr 27, 2024

View reviewed changes

MrCroxx added the bug Something isn't working label Apr 27, 2024

MrCroxx added this to the v0.8 milestone Apr 27, 2024

MrCroxx merged commit 4033b00 into foyer-rs:main Apr 27, 2024
16 checks passed

fix pull request comments

3a561d6

xiaguan deleted the fix_s3fifo branch April 29, 2024 11:16

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: fix the ghost queue bug with s3fifo #432

fix: fix the ghost queue bug with s3fifo #432

xiaguan commented Apr 26, 2024

xiaguan commented Apr 26, 2024

codecov bot commented Apr 26, 2024 •

edited

Loading

MrCroxx left a comment •

edited

Loading

MrCroxx Apr 27, 2024

xiaguan Apr 27, 2024

MrCroxx Apr 27, 2024

xiaguan Apr 27, 2024

MrCroxx Apr 27, 2024

MrCroxx Apr 27, 2024

MrCroxx commented Apr 27, 2024

MrCroxx commented Apr 27, 2024

xiaguan commented Apr 27, 2024

MrCroxx commented Apr 27, 2024

xiaguan commented Apr 27, 2024

MrCroxx commented Apr 27, 2024

MrCroxx commented Apr 27, 2024 •

edited

Loading

MrCroxx commented Apr 27, 2024

PsiACE commented Apr 27, 2024 •

edited

Loading

MrCroxx commented Apr 27, 2024

MrCroxx commented Apr 27, 2024

MrCroxx commented Apr 27, 2024

PsiACE commented Apr 27, 2024 •

edited

Loading

MrCroxx commented Apr 27, 2024

fix: fix the ghost queue bug with s3fifo #432

fix: fix the ghost queue bug with s3fifo #432

Conversation

xiaguan commented Apr 26, 2024

What's changed and what's your intention?

Checklist

Related issues or PRs (optional)

xiaguan commented Apr 26, 2024

codecov bot commented Apr 26, 2024 • edited Loading

Codecov Report

MrCroxx left a comment • edited Loading

Choose a reason for hiding this comment

MrCroxx Apr 27, 2024

Choose a reason for hiding this comment

xiaguan Apr 27, 2024

Choose a reason for hiding this comment

MrCroxx Apr 27, 2024

Choose a reason for hiding this comment

xiaguan Apr 27, 2024

Choose a reason for hiding this comment

MrCroxx Apr 27, 2024

Choose a reason for hiding this comment

MrCroxx Apr 27, 2024

Choose a reason for hiding this comment

MrCroxx commented Apr 27, 2024

MrCroxx commented Apr 27, 2024

xiaguan commented Apr 27, 2024

MrCroxx commented Apr 27, 2024

xiaguan commented Apr 27, 2024

MrCroxx commented Apr 27, 2024

MrCroxx commented Apr 27, 2024 • edited Loading

MrCroxx commented Apr 27, 2024

PsiACE commented Apr 27, 2024 • edited Loading

MrCroxx commented Apr 27, 2024

MrCroxx commented Apr 27, 2024

MrCroxx commented Apr 27, 2024

PsiACE commented Apr 27, 2024 • edited Loading

MrCroxx commented Apr 27, 2024

codecov bot commented Apr 26, 2024 •

edited

Loading

MrCroxx left a comment •

edited

Loading

MrCroxx commented Apr 27, 2024 •

edited

Loading

PsiACE commented Apr 27, 2024 •

edited

Loading

PsiACE commented Apr 27, 2024 •

edited

Loading