add Cache Manager #39
Regarding 3, here is a proposed solution.
This allows the user to specify how each cache will be partitioned, so that the same map isn't being used by the entire application. The user will have access to the address of each cache. The alternative to caching by resource in this solution is eliminating the indirection caused by the second map lookup.

Example

A common use case among developers is the need to cache users in a guild. In the following example, a bot provides services (commands) for multiple guilds. Current caching solutions (by other API Wrappers) solve this problem by storing every object in a single shared map.

Proposed

The proposed solution solves this by allowing the user to create any number of cache objects. One could direct the Cache Manager to store a map of Guild IDs to caches.

```go
cache := CacheManager.Get("GuildID")         // gets Cache by Guild ID, Client ID, etc. returns *Cache
cacheUsers := cache.Get("user")              // gets User Cache by predefined application constant. returns *sync.Map
cachedUser, _ := cacheUsers.Load("12345678") // gets User by Snowflake. returns `any`
user, ok := cachedUser.(*disgo.User)         // user from sync.Map requires type assertion. returns *User

// equivalent to the following.
cache := CacheManager.Get("GuildID") // gets Cache by Guild ID, Client ID, etc. returns *Cache
cache.GetUser("12345678")            // only possible because cache structure is pre-defined with resources. returns *User
```

This allows the user to partition a cache in any manner (sharding, etc.), but limits each cache to accessing resources through the same map (without storing a cache manager in a cache manager).

Alternative

The alternative solution solves this in the same way the proposed solution does, but does so by eliminating the second indirection. This means the same map will be used to access every resource: resources pertaining to each other are stored in the same map.

```go
cache := CacheManager.Get("GuildID")        // gets Cache by Guild ID, Client ID, etc. returns *Cache
cachedUser, _ := cache.Load("user12345678") // gets User by predefined application constant + UserID. returns `any`
user, ok := cachedUser.(*disgo.User)        // user from sync.Map requires type assertion. returns *User

// equivalent to the following.
cache := CacheManager.Get("GuildID") // gets Cache by Guild ID, Client ID, etc. returns *Cache
cache.GetUser("12345678")            // only possible because constants are pre-defined. returns *User
```

In the alternative solution, all resources within the cache of a given ID share a single map.

Comparison

Upon closer inspection, there is not much difference between the proposed and alternative solutions. Both involve the use of pre-defined application constants, and both can return an object within two function calls. However, the proposed solution limits map access to the resource type (all users within a guild use the same map), while the alternative solution limits map access to the caching identifier (all resources under the same identifier use the same map).

Requirements

Both solutions are flexible: the user can cache in any manner and in multiple ways (since the CacheManager can contain two entries that point to the same sync.Map address). Caches do not experience data race issues through the use of sync.Map: the cache manager itself is a sync.Map abstraction, while caches themselves are sync.Maps with functions. Each solution reduces complexity for the developer. Thus, the only difference between the solutions is map access. The decision as to how to implement a solution (proposed, alternative, or both) lies in how many levels of indirection are allowed (such that caches are/aren't stored in other caches).
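The proposed two-level layout, including the "two entries pointing at the same address" flexibility claim, can be sketched as follows. This is a minimal illustration, not disgo's actual API: the `Resource`, `Alias`, and `User` names are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// User is a stand-in for a Discord user model (hypothetical).
type User struct{ Username string }

// Cache maps a resource hash (e.g. "user") to a *sync.Map of that resource,
// mirroring the "proposed" two-level layout.
type Cache struct{ resources sync.Map }

// Resource returns the map for a resource hash, creating it on first use.
func (c *Cache) Resource(hash string) *sync.Map {
	m, _ := c.resources.LoadOrStore(hash, &sync.Map{})
	return m.(*sync.Map)
}

// GetUser is the pre-defined accessor made possible by the fixed structure.
func (c *Cache) GetUser(id string) (*User, bool) {
	v, ok := c.Resource("user").Load(id)
	if !ok {
		return nil, false
	}
	return v.(*User), true
}

// CacheManager maps a caching identifier (Guild ID, Client ID, ...) to a *Cache.
type CacheManager struct{ caches sync.Map }

func (cm *CacheManager) Get(id string) *Cache {
	c, _ := cm.caches.LoadOrStore(id, &Cache{})
	return c.(*Cache)
}

// Alias registers an existing cache under a second identifier, so two manager
// entries point at the same underlying maps (the flexibility noted above).
func (cm *CacheManager) Alias(id string, c *Cache) { cm.caches.Store(id, c) }

func main() {
	var cm CacheManager
	cache := cm.Get("guild-1")
	cache.Resource("user").Store("12345678", &User{Username: "alice"})

	cm.Alias("shard-0", cache) // same cache, second identifier
	if u, ok := cm.Get("shard-0").GetUser("12345678"); ok {
		fmt.Println(u.Username)
	}
}
```

Note that `sync.Map.LoadOrStore` gives the manager race-free lazy creation of caches and resource maps without an explicit mutex.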
Following #39 (comment), the cache manager can be limited to three levels of indirection ("CacheManager(ID) to Cache", "Cache(HASH) to sync.Map", "sync.Map(ResourceID) to Resource") by allowing the user to specify the hashing mechanism for each resource.

```go
cache := CacheManager.Get("GuildID") // returns *Cache or *sync.Map (return type differs per solution)
cache.StoreUser("12345678", u)

// The following pseudocode refers to the code used INTERNALLY.

// When caching by resource (default):
// CacheManager.Get returns a *Cache full of *sync.Map.
userCache := cache.Get(HASH)   // where HASH = "user"; returns *sync.Map full of users
userCache.Store("12345678", u) // stores the user by ID
cache.GetUser("12345678")      // returns the user from the "user" sync.Map; all other user calls on this cache use the same map

// When caching per identifier:
// CacheManager.Get returns a *Cache full of `any`.
cache.Store("user12345678", u) // stores the user by constant + ID
cache.GetUser("12345678")      // returns the user from the cache's single map; all resources within the cache access the same map
```

Thus, both implementations can be supported by having the user specify whether the cache should store objects directly (i.e a field) or within per-resource maps.
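The "per identifier" mode above (one map per cache, keyed by constant + ID) can be sketched concretely. This is an illustration, not a finalized design; the `hashUser` constant and method names are assumptions for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// User is a stand-in for a Discord user model (hypothetical).
type User struct{ Username string }

// Cache stores every resource in ONE sync.Map, keyed by a pre-defined
// resource constant + ID, as in the "per identifier" mode sketched above.
type Cache struct {
	entries sync.Map
}

const hashUser = "user" // pre-defined application constant (assumed name)

// StoreUser stores a user under the constant + ID key in the shared map.
func (c *Cache) StoreUser(id string, u *User) { c.entries.Store(hashUser+id, u) }

// GetUser retrieves a user; only possible because the constant is pre-defined.
func (c *Cache) GetUser(id string) (*User, bool) {
	v, ok := c.entries.Load(hashUser + id)
	if !ok {
		return nil, false
	}
	return v.(*User), true
}

func main() {
	var c Cache
	c.StoreUser("12345678", &User{Username: "alice"})
	if u, ok := c.GetUser("12345678"); ok {
		fmt.Println(u.Username)
	}
}
```

Because every resource shares one map, a second resource type (e.g. a `hashChannel` constant) would coexist in `entries` without an additional level of indirection.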
An example applicable to 4 (cache setup) has been created: #35 (comment).
Problem
Users (developers) want an easy way to fetch information from the Discord Environment that the bot has access to.
The Disgo Cache Manager aims to solve this.
Caching
The difference between a "cache" and a "cache manager" is that a "cache manager" manages other "caches". The entire point of a cache is to minimize load on the application and network. However, the optimal way to do this depends on the user's (developer's) application and code, so implementing a standard method of caching that every bot adheres to is an anti-pattern. Other Go Discord API Wrappers are either based on a cache (i.e Disgord) or implement a mandatory cache (i.e DiscordGo's State). At minimum, this adds overhead to the program. In the worst case, it adds complexity for the end user. Let's analyze the following code.
Caching Overhead
The following code from Disgord showcases how a cache adds overhead.
The problem here is not necessarily that the user will always have to specify the usage of a cache, but that the cache is always involved. It does not matter if the user creates a program that has no use for the cache: providing an option to ignore the cache implies that requests are always cached. When this is the case, a large amount of memory is spent storing unnecessary entries (especially given the nature of Discord's models). In the case of Disgord, it is stated that "the cache is immutable by default", such that "every incoming and outgoing data of the cache is deep copied". This adds even more overhead for applications which handle millions of requests.

Caching Complexity
The second issue with mandatory caching is the complexity that is added to the developer. In Disgord's case, you are unable to control your cache and unable to prevent data from being stored. In other cases, it can be even more problematic. Let's analyze the following code from DiscordGo.
The following code takes a string ID value and turns it into a channel. Not too bad of an idea. However, the context of this function is that it's called after a user (developer) has received an event (with Application Command Options) from Discord, so the user (developer) may not expect the remaining channel data to come from a cache rather than from Discord itself.
When the object is in the cache (and a session parameter is provided), the program becomes incorrect: The state of the channel from the cache is not guaranteed to match the state of the Discord Channel. When the object is not in the cache, the program adds overhead by creating an additional blocking network call; in a function for "casting" nonetheless. However, the latter behavior is stated.
In a similar manner to other API Wrappers, DiscordGo's cache (State) is structured in a way that does not allow the user (developer) to manage cached resources (due to unexported fields). As a result, the developer is only able to solve the problem of incorrectness by manually calling the network themselves, defeating the purpose of the "typecast" function.
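The cache-then-network fallback being criticized can be sketched with simplified stand-in types. This is NOT DiscordGo's actual API; `Session`, `Channel`, and `ChannelValue` here are hypothetical reductions that only reproduce the behavior described above.

```go
package main

import "fmt"

// Channel and Session are simplified stand-ins for illustration only.
type Channel struct {
	ID   string
	Name string
}

type Session struct {
	state map[string]*Channel // the mandatory cache
}

// fetchChannel simulates a blocking network call to Discord.
func (s *Session) fetchChannel(id string) *Channel {
	return &Channel{ID: id, Name: "from-network"}
}

// ChannelValue mirrors the "typecast" behavior: return the cached channel if
// present (possibly stale), otherwise silently fall back to a network call.
func (s *Session) ChannelValue(id string) *Channel {
	if ch, ok := s.state[id]; ok {
		return ch // may not match Discord's current state
	}
	return s.fetchChannel(id) // hidden blocking request
}

func main() {
	s := &Session{state: map[string]*Channel{
		"42": {ID: "42", Name: "stale-name"}, // renamed on Discord since caching
	}}
	fmt.Println(s.ChannelValue("42").Name) // stale data returned without error
	fmt.Println(s.ChannelValue("7").Name)  // surprise blocking network call
}
```

The caller cannot tell which of the two paths produced the result, which is exactly the incorrectness/overhead trade-off described above.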
Solution
The solution to this problem is to implement a separate Cache Manager module. This cache manager should make it easy for the user to set up caching, but also to operate the cache themselves. In this way, the user can ensure the correctness of their program while minimizing the overhead of the cache. In addition, external caching solutions (such as Redis or Memcache) can be used by making the cache an exported interface.
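The exported-interface idea can be sketched as follows. The method set here is an assumption for illustration, not a finalized disgo API; the point is only that an in-memory backend and an external one (Redis, Memcache) would satisfy the same interface.

```go
package main

import (
	"fmt"
	"sync"
)

// Cache is exported so that in-memory, Redis, or Memcache backends can be
// swapped in by the user. Method names are assumptions for this sketch.
type Cache interface {
	Store(key string, value any)
	Load(key string) (any, bool)
}

// MapCache is a default in-memory backend: a thin sync.Map wrapper.
type MapCache struct{ m sync.Map }

func (c *MapCache) Store(key string, value any) { c.m.Store(key, value) }
func (c *MapCache) Load(key string) (any, bool) { return c.m.Load(key) }

// A Redis- or Memcache-backed type would implement the same interface and be
// handed to the client in place of MapCache.

func main() {
	var c Cache = &MapCache{}
	c.Store("user12345678", "alice")
	if v, ok := c.Load("user12345678"); ok {
		fmt.Println(v)
	}
}
```

Because the client only depends on the interface, the user operates the cache directly and the library never forces a particular storage strategy.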
Implementation
Users (developers) would add the Cache Manager to their application. Then, the cache manager can be used in a manner similar to the rate limiter.
1. Define a CacheManager interface which defines necessary functions used throughout the cache manager.
2. Implement a default CacheManager. During development of the actual cache manager, this in addition to the following steps may be completed prior to 1.
3. Use sync.Map to create a cache struct that stores objects in a given manner, such that they can be retrieved in a given manner, without data race issues. No specifics are provided in this step because a solution that is flexible to the end-user has yet to be designed. The solution must account for caching by the bot and by resource.
4. Add an example to _examples/bot.
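The data-race requirement for the sync.Map-backed cache struct can be sketched and exercised directly. This is a minimal illustration under assumed names, showing only that concurrent writers need no external locking.

```go
package main

import (
	"fmt"
	"sync"
)

// Cache is a minimal sketch: a cache struct backed by sync.Map that can be
// written and read concurrently without data races.
type Cache struct{ m sync.Map }

func (c *Cache) Store(id string, v any)     { c.m.Store(id, v) }
func (c *Cache) Load(id string) (any, bool) { return c.m.Load(id) }

func main() {
	var c Cache
	var wg sync.WaitGroup

	// Concurrent writers: safe because sync.Map synchronizes internally.
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			c.Store(fmt.Sprintf("user%d", i), i)
		}(i)
	}
	wg.Wait()

	if v, ok := c.Load("user42"); ok {
		fmt.Println(v) // 42
	}
}
```

Running such a sketch under `go test -race` (or `go run -race`) is the cheapest way to validate the "without data race issues" requirement as the real struct grows.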