[leo_gateway] Allow user to choose strong consistency or performance of inner cache #459

Closed
windkit opened this issue Mar 6, 2016 · 10 comments

Comments

@windkit
Contributor

windkit commented Mar 6, 2016

Description

When the inner cache in leo_gateway is used, the ETag of the cached object is compared with the one in the backend leo_storage on every cache hit.

This round trip is costly for workloads dominated by small, read-intensive requests, so users may want to relax consistency in exchange for better performance.

Proposal

Following the expiry mechanism of the HTTP cache, allow the user to choose anywhere from "strictly checked" to "check once every 5 minutes" by setting cache.cache_expire.

Setting it to 0 means checking on every hit; any other value x means checking at most once every x seconds.
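
The proposed behavior can be sketched roughly as follows. This is only an illustration in Python, not leo_gateway code (which is Erlang): the class `InnerCache`, the callbacks `fetch_etag` and `fetch_object`, and the injectable `clock` are hypothetical names, and only the `cache_expire` setting and its meaning come from the proposal above.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class CacheEntry:
    body: bytes
    etag: str
    checked_at: float  # time of the last fetch or successful ETag check


class InnerCache:
    """Hypothetical gateway-side cache; NOT the actual leo_gateway implementation."""

    def __init__(
        self,
        fetch_etag: Callable[[str], str],                  # assumed cheap metadata call
        fetch_object: Callable[[str], Tuple[bytes, str]],  # assumed full read: (body, etag)
        cache_expire: float = 0,                           # 0 = check the ETag on every hit
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.fetch_etag = fetch_etag
        self.fetch_object = fetch_object
        self.cache_expire = cache_expire
        self.clock = clock
        self.entries: Dict[str, CacheEntry] = {}

    def get(self, key: str) -> bytes:
        now = self.clock()
        entry = self.entries.get(key)
        if entry is not None:
            # Relaxed mode: within the expiry window, skip the storage round trip.
            if self.cache_expire > 0 and now - entry.checked_at < self.cache_expire:
                return entry.body
            # Strict mode, or window elapsed: compare ETags with the backend.
            if self.fetch_etag(key) == entry.etag:
                entry.checked_at = now
                return entry.body
        # Cache miss or stale entry: read the object and (re)populate the cache.
        body, etag = self.fetch_object(key)
        self.entries[key] = CacheEntry(body, etag, now)
        return body
```

With `cache_expire=0` every hit still costs one ETag round trip, as today. With `cache_expire=300` a hit inside the window costs none, at the price of possibly serving an object that was overwritten up to 300 seconds ago.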

Related PRs

leo-project/leo_gateway#36

@windkit
Contributor Author

windkit commented Mar 6, 2016

For 4KB objects, the difference is significant: throughput would increase from ~30000 ops to ~45000 ops.

@yosukehara
Member

Thanks for your proposal.

I understand your request, but there are some issues with it.

  • All objects in all buckets would be affected by this feature and its configuration, and LeoFS is used by multiple users, not one. Making the setting configurable per bucket would be complicated.
  • The purpose of this feature seems to be improving LeoFS benchmark results.

Actually, I've been considering a metadata-cache feature since last week. If we reach a consensus on it, I'll share it.

@windkit
Contributor Author

windkit commented Mar 7, 2016

Sorry, I don't understand the concern. This suggestion does not aim to remove the check completely, but rather to let the user choose.

  1. What bothered me at first is the inconsistency between the HTTP cache and the inner cache: one checks asynchronously (only after expiry), while the other is "semi-sync" (it checks the checksum on every hit, then serves the data from the cache).
  2. Yes, every performance tuning affects benchmarks. But no, it is not only about benchmarks: it would affect many workloads with "write once, read many" characteristics.

To me, one of the strengths of LeoFS should be low latency, so I would like to improve it in this respect.

Once again, I just want to give users this freedom, given that the change to the code and flow is small.

@yosukehara
Member

When we began implementing the cache feature, we considered consistency between the gateway node(s) and the storage cluster. Our view is that users expect strong consistency, so we chose the strong-consistency solution. If the problem is the inconsistency between the HTTP cache and the inner cache, we need to fix the HTTP cache feature so that it behaves like the inner cache, to avoid confusion.

@ZhaoX

ZhaoX commented Mar 16, 2016

@yosukehara Just for your consideration.
In my situation, I store data in LeoFS and never update it. So if the consistency between the leo_gateway cache and the leo_storage data were configurable, I would prefer that.

@yosukehara
Member

@ZhaoX In our company, LeoFS is used by many services. We cannot satisfy both their strong-consistency and weak-consistency needs through the LeoGateway configuration.

In order to improve LeoFS performance, we've implemented concurrent read operations in LeoStorage.

With this feature, LeoFS performance has improved dramatically:

@ZhaoX

ZhaoX commented Mar 16, 2016

@yosukehara Thank you for your reply.
Which LeoFS version was benchmarked? The title is "Benchmark LeoFS v1.2.20-dev", but the environment part says "LeoFS: v1.4.0-pre.3-dev".

@windkit
Contributor Author

windkit commented Mar 16, 2016

@ZhaoX Sorry, I made a mistake in the environment section; the benchmark was done with 1.2.20-dev.

@windkit
Contributor Author

windkit commented Mar 16, 2016

@ZhaoX Thank you for your concern about this issue. On second thought, though, the benefit of this design is not as large as I first suggested.
It is really a rare case for an application to keep getting the same objects within a short period of time. If there is such a need, it would more commonly be solved by a proxy in front of LeoFS.

@yosukehara
Member

@windkit @ZhaoX Thank you for your comments and suggestions. I've closed this issue because we have reached a consensus.

3 participants