A few implementations of a pool of locks. `MemoryCache` has an issue with reentrancy (see this). To tackle this for async, coupled with the fact that `SemaphoreSlim` is `IDisposable`, a lock pool of `SemaphoreSlim`s can be used.
```
# key-specific lock with async
memoryCache = new
lockPool = new semaphoreSlimPool(1, 1)

task<T> getAsync(key, valueFactoryAsync):
    if memoryCache.tryGetValue(key, out value):
        return value
    lockForKey = lockPool.get(key)
    await lockForKey.waitAsync()
    try:
        if memoryCache.tryGetValue(key, out value):
            return value
        value = await valueFactoryAsync()
        memoryCache.set(key, value, ttl)
        return value
    finally:
        lockForKey.release()
```
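The `semaphoreSlimPool` itself isn't shown above; a minimal sketch (shape assumed, consistent with the hashing and `poolSize` described later) is a fixed array of `SemaphoreSlim`s that a key hashes into:

```
# sketch of a semaphoreSlimPool: a fixed array of semaphores; a key hashes to a slot
class semaphoreSlimPool(initialCount, maxCount) : iDisposable
    poolSize = 1024    # e.g. processorCount * factor
    pool = array of poolSize semaphoreSlim(initialCount, maxCount)

    semaphoreSlim get(key) =>
        pool[(key.hashCode() & 0x7fffffff) % poolSize]

    dispose():
        for each semaphore in pool: semaphore.dispose()
```

Since the pool owns a fixed set of semaphores, disposal is straightforward, and no per-key semaphores are created or leaked; the trade-off is that multiple keys can share a semaphore.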
A lock pool of `Object`s for sync.
```
# key-specific lock with sync (instead, can also use a concurrent map)
memoryCache = new
lockPool = new lockPool()

T get(key, valueFactory):
    if memoryCache.tryGetValue(key, out value):
        return value
    lockForKey = lockPool.get(key)
    lock lockForKey:
        if memoryCache.tryGetValue(key, out value):
            return value
        value = valueFactory()
        memoryCache.set(key, value, ttl)
        return value
```
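The sync `lockPool` follows the same shape, but with plain objects to `lock` (monitor) on, which need no disposal — a sketch (names assumed):

```
# sketch of a lockPool: a fixed array of plain objects to lock on
class lockPool
    poolSize = 1024
    pool = array of poolSize new object()

    object get(key) =>
        pool[(key.hashCode() & 0x7fffffff) % poolSize]
```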
Below are different cache impls to compare against.

- `CacheWithBlockingThread`, `CacheWithLock` (uber lock), and `CacheWithNoLock` are not really good impls. `CacheWithBlockingThread` uses the sync-over-async anti-pattern; `CacheWithLock` uses an uber lock, blocking factory operations on other keys while performing one on a key; `CacheWithNoLock` has no critical section and blasts the downstream dependency with multiple calls, potentially causing a cache stampede.
- `CacheWithTextbookPattern` is how the current guidance reads, but it's reentrant.
- `CacheWithLazyCache` uses LazyCache, which solves the reentrancy issue using `Lazy<T>`, `AsyncLazy<T>`, and a lock pool.
- `CacheWithFusionCache` uses FusionCache. It has many different concurrency implementations.
- `CacheWithTask` caches the task itself instead of the value, and the task is then awaited to get the value. The technique is quite clever, but tricky to follow.
- `CacheWithLockPool` uses a lock pool where a key always maps to the same lock via hashing, and multiple keys can map to the same lock. It loses some parallelism when the `valueFactory[Async]` calls for a key error out, as they get synchronized. `poolSize` is `1024` (`processorCount` of `16` * `factor` of `64`), which is a bit high so as to not cause a bottleneck.
- `CacheWithTaskAndLockPool` reaps the best of both impls: it caches the task itself instead of the value, and also uses a lock pool to create the task in a critical section. The task is awaited outside of the `lock` (`monitor`) to get the value.
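To make the cache-the-task technique concrete, here is a rough sketch of `CacheWithTask` in the same pseudocode conventions as above (shape assumed):

```
# sketch of CacheWithTask: cache the task<T> itself, not T
task<T> getAsync(key, valueFactoryAsync):
    valueTask = cache.getOrCreate(key, (cacheEntry) =>
    {
        cacheEntry.ttl = ttl
        task = valueFactoryAsync()
        # evict later if the task faults/cancels, so errors aren't cached
        cacheEntry.addExpirationToken(new evictErroredCachedTaskChangeToken(task))
        return task
    })
    return await valueTask
```

Concurrent callers for a key get the same in-flight task to await, so the factory typically runs once per miss; but since `getOrCreate` runs the factory outside a critical section, two racing misses can still both invoke it — hence "slightly reentrant".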
`Verify_Perf` for 200 batches where each batch consists of `Environment.ProcessorCount` tasks (16,000 tasks total), 1,000 keys (cache items), a 25ms cache ttl, and a 10ms valueFactory delay. The result shows that when the cache is write-heavy, `CacheWithLockPool` is slower than `CacheWithLazyCache`, `CacheWithTask`, and `CacheWithTaskAndLockPool`.
Impl (net6.0) | Time (secs) | Fault % | Tx/s | Issues |
---|---|---|---|---|
CacheWithBlockingThread | 50.4 | 0 | 317 | reentrant, slow, sync-over-async |
CacheWithLock | 50.4 | 0 | 317 | slow |
CacheWithNoLock | 3.3 | 0 | 4848 | highly reentrant |
CacheWithTextbookPattern | 3.4 | 0 | 4705 | highly reentrant |
CacheWithLazyCache | 3.4 | 0 | 4705 | |
CacheWithFusionCache | 3.4 | 0 | 4705 | |
CacheWithTask | 3.3 | 0 | 4848 | slightly reentrant |
CacheWithLockPool | 3.7 | 0 | 4324 | slightly slow |
CacheWithTaskAndLockPool | 3.3 | 0 | 4848 | |
`Verify_Perf` for 1,000 batches where each batch consists of `Environment.ProcessorCount` tasks (16,000 tasks total), 1,000 keys (cache items), a high cache ttl, and a 10ms valueFactory delay. In contrast to the above test, this result showcases that all impls perform the same when the cache is read-heavy, which is the typical case.
Impl (net6.0) | Time (secs) | Fault % | Tx/s | Issues |
---|---|---|---|---|
CacheWithLazyCache | 3.5 | 0 | 4571 | |
CacheWithFusionCache | 3.5 | 0 | 4571 | |
CacheWithTask | 3.5 | 0 | 4571 | slightly reentrant |
CacheWithLockPool | 3.5 | 0 | 4571 | slightly slow |
CacheWithTaskAndLockPool | 3.5 | 0 | 4571 | |
`Verify_Perf` for 1,000 batches where each batch consists of `Environment.ProcessorCount` tasks (16,000 tasks total), 1,000 keys (cache items), a high cache ttl, a 10ms valueFactory delay, and faults. With faults, `CacheWithLockPool` is slower than the others (`CacheWithLazyCache`, `CacheWithTask`, and `CacheWithTaskAndLockPool`). This is because the others hand out the same task for concurrent calls on a key, whereas `CacheWithLockPool` synchronizes them, taking more time due to backpressure. `CacheWithTask` and `CacheWithTaskAndLockPool` do evict the task if it eventually faults/cancels, using an `IChangeToken`, so as to not cache errors; same with `CacheWithLazyCache`. `CacheWithLazyCache` and `CacheWithTaskAndLockPool` perform the same as `CacheWithTask`, and have neither `CacheWithTask`'s reentrancy nor `CacheWithLockPool`'s backpressure issues. `CacheWithFusionCache` gets negligibly slower as the faults increase, but performs similarly.
Impl (net6.0) | Time (secs) | Fault % | Tx/s | Issues |
---|---|---|---|---|
CacheWithLazyCache | 3.5 | 0 | 4571 | |
CacheWithFusionCache | 3.6 | 0 | 4444 | |
CacheWithTask | 3.5 | 0 | 4571 | slightly reentrant |
CacheWithLockPool | 3.5 | 0 | 4571 | slightly slow |
CacheWithTaskAndLockPool | 3.5 | 0 | 4571 | |
CacheWithLazyCache | 3.9 | 10 | 4102 | |
CacheWithFusionCache | 4.0 | 10 | 4000 | |
CacheWithTask | 3.9 | 10 | 4102 | slightly reentrant |
CacheWithLockPool | 4.0 | 10 | 4000 | slightly slow |
CacheWithTaskAndLockPool | 3.9 | 10 | 4102 | |
CacheWithLazyCache | 7.0 | 50 | 2285 | |
CacheWithFusionCache | 7.1 | 50 | 2254 | |
CacheWithTask | 7.0 | 50 | 2285 | slightly reentrant |
CacheWithLockPool | 7.2 | 50 | 2222 | slightly slow |
CacheWithTaskAndLockPool | 7.0 | 50 | 2285 | |
CacheWithLazyCache | 16.5 | 100 | 969 | |
CacheWithFusionCache | 17.1 | 100 | 936 | |
CacheWithTask | 16.5 | 100 | 969 | slightly reentrant |
CacheWithLockPool | 19.9 | 100 | 804 | slightly slow |
CacheWithTaskAndLockPool | 16.5 | 100 | 969 | |
Use `CacheWithLazyCache` (uses LazyCache), `CacheWithFusionCache` (uses FusionCache), or `CacheWithTaskAndLockPool`.
```
# key-specific lock with async
lockForKeyPool = new lockPool()

task<T> getAsync(key, valueFactoryAsync):
    if !cache.tryGetValue(key, out valueTask):
        lockForKey = lockForKeyPool.get(key)
        lock lockForKey:
            if !cache.tryGetValue(key, out valueTask):
                valueTask = cache.getOrAdd(key, (cacheEntry) =>
                {
                    cacheEntry.ttl = ttl
                    var task = valueFactoryAsync<T>(...)
                    cacheEntry.addExpirationToken(new evictErroredCachedTaskChangeToken(task))
                    return task
                })
    return await valueTask
```
```
# evict task that errors out
class evictErroredCachedTaskChangeToken : iChangeToken
    changed = false

    ctor(task):
        changed = isEvictable(task)
        task.continueWith(t =>
            changed = isEvictable(t))

    bool isEvictable(task) => task.canceled || task.faulted

    hasChanged => changed
    # ...
```