Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache project metadata and tokens #1099

Open
wants to merge 9 commits into
base: main
Choose a base branch
from
Open

Conversation

minwoox
Copy link
Contributor

@minwoox minwoox commented Feb 14, 2025

Motivation:
Currently, only the raw contents of ProjectMetadata and Tokens are cached when fetched from MetadataService. However, deserialization takes a significant amount of time, so caching the deserialized instances would improve performance.

Modifications:

  • Introduced CacheableCall interface.
  • Added Repository.execute(CacheableCall) to cache calls.
  • Cached deserialized instances of ProjectMetadata and Tokens retrieved by MetadataService.
  • Remove stripe locks that were used for CacheableCompareTreesCall.

Result:

  • Improved performance by avoiding redundant deserialization of ProjectMetadata and Tokens.

Motivation:
Currently, only the raw contents of `ProjectMetadata` and `Tokens` are cached when fetched from `MetadataService`.
However, deserialization takes a significant amount of time, so caching the deserialized instances would improve performance.

Modifications:
- Renamed `CacheableCall` to `AbstractCacheableCall`.
- Introduced `CacheableCall` interface.
- Added `Repository.execute(CacheableCall)` to cache calls.
- Cached deserialized instances of `ProjectMetadata` and `Tokens` retrieved by `MetadataService`.

Result:
- Improved performance by avoiding redundant deserialization of `ProjectMetadata` and `Tokens`.
@minwoox minwoox added this to the 0.74.0 milestone Feb 14, 2025
entry -> convertWithJackson(entry, ProjectMetadata.class));
tokenRepo = new RepositorySupport<>(projectManager, executor,
entry -> convertWithJackson(entry, Tokens.class));
metadataRepo = new RepositorySupport<>(projectManager, executor, ProjectMetadata.class);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, ProjectMetadata is already cached in DefaultProject and automatically refreshed when the raw data is updated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think of replacing the implementation of fetchMetadata0() with projectManager.get(projectName).metadata()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't replace it completely because fetching is still used here:


Let me add getMetdata instead. 😉

Comment on lines 251 to 253
return unsafeCast(cache.get(cacheableCall).handleAsync((unused, cause) -> {
throwUnsafelyIfNonNull(cause);
return unused;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem like that unused is a proper name.

Suggested change
return unsafeCast(cache.get(cacheableCall).handleAsync((unused, cause) -> {
throwUnsafelyIfNonNull(cause);
return unused;
return unsafeCast(cache.get(cacheableCall).handleAsync((value, cause) -> {
throwUnsafelyIfNonNull(cause);
return value;

// Cached already.
return existingValue;
}
final CompletableFuture<T> future = new CompletableFuture<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question) Should we check if the cache was populated by a previous call first before creating a new future?

Copy link
Contributor Author

@minwoox minwoox Feb 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Creating a future is cheaper than calling Cache.getIfPresent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach is recommended by the team. (Sorry for not giving you enough information.)
https://github.com/ben-manes/caffeine/wiki/Faq#recursive-computations

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to know what your reasoning is. Could you elaborate on that? Is .getIfPresent more expensive cost than .putIfAbsent?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to check if the cache was popolulated, you need to call getIfPresent first and you need to call put or putIfAbsent if it's not created.
But in this approach, we can just call putIfAbsent only once. Did I miss something?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand your intention. There is a trade-off for that. If the cache hit ratio is high, most calls will be handled by getIfPresent().

As the change is trivial, let's keep it as it. I just wondered the reason. :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is also true.
Creating a future is relatively cheap and the code is shorter so I prefer this way. 😉

entry -> convertWithJackson(entry, ProjectMetadata.class));
tokenRepo = new RepositorySupport<>(projectManager, executor,
entry -> convertWithJackson(entry, Tokens.class));
metadataRepo = new RepositorySupport<>(projectManager, executor, ProjectMetadata.class);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think of replacing the implementation of fetchMetadata0() with projectManager.get(projectName).metadata()?

@ikhoon
Copy link
Contributor

ikhoon commented Feb 18, 2025

Suggestion) It would be worth caching Tokens in a strong reference so we don't lose the cache.
How about caching Tokens using RepositoryListener? We may remove some costs such as creating CompletableFuture, hash lookup, and cache miss.

@minwoox
Copy link
Contributor Author

minwoox commented Feb 18, 2025

Suggestion) It would be worth caching Tokens in a strong reference so we don't lose the cache.

That's a good suggestion. Let me handle it in this PR. 👍

@minwoox
Copy link
Contributor Author

minwoox commented Feb 18, 2025

Suggestion) It would be worth caching Tokens in a strong reference so we don't lose the cache.

@ikhoon MetadataService is created multiple places so I need some refactoring. Let me create a separate PR for that if you don't mind.

@ikhoon
Copy link
Contributor

ikhoon commented Feb 18, 2025

Let me create a separate PR for that if you don't mind.

Let's do that.

Copy link
Contributor

@ikhoon ikhoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once the CI builds pass. 👍


private final Map<String, CompletableFuture<Revision>> reposInAddingMetadata = new ConcurrentHashMap<>();

/**
* Creates a new instance.
*/
public MetadataService(ProjectManager projectManager, CommandExecutor executor) {
public MetadataService(ProjectManager projectManager, CommandExecutor executor,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please reveiew this class with the hiding whitespace mode. 😉

@minwoox
Copy link
Contributor Author

minwoox commented Feb 18, 2025

MetadataService is created multiple places so I need some refactoring. Let me create a separate PR for that if you don't mind.

I put the patches in this PR. 😓
The Tokens are cached and retrieved via InternalProjectInitializer.
Uncached tokens are also fetched when the TokenService needs to retrieve the tokens right away after adding, updating and deleting a token. PTAL.

if (loginUser.isSystemAdmin()) {
return mds.getTokens()
.thenApply(tokens -> tokens.appIds().values());
return mds.getTokens().appIds().values();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can simply use the cached tokens in this case.

@@ -127,12 +123,12 @@ public CompletableFuture<HttpResult<Token>> createToken(@Param String appId,
tokenFuture = mds.createToken(author, appId, isSystemAdminToken);
}
return tokenFuture
.thenCompose(unused -> mds.findTokenByAppId(appId))
.thenCompose(unused -> fetchTokensByAppId(appId))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to fetch the tokens because the tokens are just updated.

@minwoox minwoox requested review from jrhee17 and ikhoon February 20, 2025 09:45
Copy link
Contributor

@jrhee17 jrhee17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looked through the code, and the direction looks good to me.

I didn't check cases where a create directly leads to a get - hopefully the following comment is addressed before RepositoryListener is used more extensively
ref: #1053 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants