Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
docs: Add proposal for instance scoped variables #30558
docs: Add proposal for instance scoped variables #30558
Changes from 11 commits
737e629
81b0905
fa6f9c7
23d3d53
d538a7d
c6eb9b6
253926e
154c98e
ef2eefc
8843d89
66af311
3274f69
9147794
9eda156
2519b6d
e54cff3
a05e8f2
024e1f8
b4c82a4
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you supply a real user scenario that would require this non-persisted instance scoped variables? From my perspective, setting a variable in instance level but not persisting it will lead to confusion in many cases. For example, if I really would like to enable a feature for only one instance, I would be very surprised that the change is reverted at some point, cause TiDB is designed to be stateless and frequent restart is allowed to happen.
More over, considering that in many cases user is using TiDB behind a LB, and there may be even no way to access individual instances (for example, in TiDB Cloud), would it be better to allow remotely setting the instance variables?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is almost more a question of what should be a candidate for instance scope. Eason's opinion is that we should always default to global scope for management simplicity, and actually few things should be instance. I think CRDB gets this right in that the node specific options almost always refer to local resources (max memory, tmpdir path, log file path, port, socket). We probably get a few things wrong currently, but are mostly on the right path.
If we look from these examples: you might need to shuffle around the
tmpdir
on a particular TiDB server instance to keep it up and running (and might not want to restart it). So you would have to log into the TiDB instance to make this change, but it would not affect users running through a LB.As to the semantic of "reverted", if we were to persist the change, it is not really any less confusing because there is still a precedence problem of config file vs. saved in TiDB (i.e. "I set this option in the config file but it's not working!"). The semantic is also how MySQL works (although there is a newer feature called SET PERSIST), TiDB is special in that it persists all global config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added a section on "Documentation Changes", PTAL. I think this is easier to explain
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the supply! Does this mean that, we will have a clear rule that for these settings, an INSTANCE variable can be used to temporarily updating the config (and taking effect immediately), while user should use config file for persistently updating the config (but taking effect after a reboot)?
I do like CRDB's config mechanism a lot. There aren't a lot of ways like TiDB to configure. They only have two, which is very easy for user to learn. User will not be confused about persistence or things like that:
a. CLI Args - Instance level, restart to take effect, persisted
b. Variable - Cluster level, take effect immediately, persisted
Notice that under CRDB's model it is not possible to adjust some instance level things and take effect without a restart.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes exactly. The other limitation is explaining why 2 nodes are behaving differently: in MySQL
SHOW GLOBAL VARIABLES
captures all configuration, but since memory in CRDB is a node-level option, it wouldn't be captured, but it is in this proposal since everything has a sysvar mapping (step 2).Edit: even if TiDB only has 2 sources currently, the split between them is not intuitive to users. We will benefit from a source of truth.
Yes, I propose in documentation we describe the semantic as "Persists to Cluster: Bool" (instead of
INSTANCE
vs.GLOBAL
).Persists to Cluster: No
is identical to how MySQL global variables, so it is easy to explain. We do not need to use the term Instance scope vs. global scope for users - since they are both set withSET GLOBAL
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this confuse the customer that
SET GLOBAL
is used by instance and global sysvars? How could they know if a var is instance or global level?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you will need to read the docs to see if a variable is
INSTANCE
orGLOBAL
scoped. This confusion is existing with the session scope, but actually worse since it's not handled by the framework the docs for manyINSTANCE
variables say they areSESSION
scoped (we will finally be able to auto generate docs and say something is "INSTANCE" scoped versus a manual workaround: https://github.com/pingcap/docs/blob/8600643f2fcd20d70d8ad8f637009995948c48c6/scripts/generate-system-variables.go#L142-L152 ).The semantics for
INSTANCE
are also much closer to global than session, with only two exceptions:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added a section on "Documentation Changes", PTAL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking together with instance semantic: https://github.com/pingcap/tidb/pull/30558/files#diff-af084f8d970428eecaaeb42d9b43dcd9e0155602726dec822bda47e0071a0896R51
I guess when
tidb_enable_legacy_instance_scope
is enabled, the user may face instance-level var with both 'SESSION' and 'GLOBAL' keywords:Both of the two
SET
s are actually working in 'instance' scope. Am I right? if so I think this can be confusing.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right. We should add a warning for the
SET SESSION
version. I agree it is confusing, which is why I suggest we consider later changing the default.Unfortunately it is required to keep compatibility. But it is important to transition
INSTANCE
to using theGLOBAL
keyword, becauseSESSION
semantics are different.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that if I change the value of an system variable, the value in the configuration file will be changed as well ? or those var from config file is readonly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default will be that we map variables as read-only, so we don't need to think about the specific cases (i.e. you can't easily change the socket or port). This helps us achieve completeness first, and then we can then evaluate which ones can be made dynamic.
When you modify an instance scoped system variable, it modifies in memory only (similar to
SET GLOBAL
in MySQL). This does not propose modifying or rewriting the config file (that would make it lose formatting, and it gets complex quickly).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added this clarification to the proposal, so I'm going to resolve this conversation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goal is to (1) have system variables show the superset of all configuration and (2) have unified names between configuration. These do technically improve MySQL compatibility, but I would say these are also good practices.
In the case of instance variables, they do not persist on restart so a configuration file is required to handle this case. In cockroachdb they don't have this though: node (their term for instance) is only supported as startup parameters, everything else is a cluster wide variable. But I think this is also an incompatible change.. by making every config setting available through systerm variables and settable in the config file using the system variable name (under
[instance]
heading) gives us a better transition state to unification.Because instance values do not persist, this behavior is straight forward. The configuration file values will be taken, and if not specified, compiled server defaults. This is also technically a MySQL compatible way for GLOBAL variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't get it: why do we replace
global
withinstance
instead of replacingsession
withinstance
? To users, they are both hard to understand and you need to explain it in the user docs anyway.Are there any cases where a system variable is both session and instance scoped? I don't think so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's correct, there are no cases where a variable is both session and instance.
Session by semantics is supposed to be thread-local-storage. When the initial connection is created to MySQL it will copy all the values of global variables that also have session scope, and any further changes to global will not affect those session changes. It should only be changes that you make in your connection that change the value.
What I am calling
INSTANCE
is actually the identical semantics to MySQLGLOBAL
. They are much closer in behavior: the only difference is you will need to check the docs to see if it is an INSTANCE-GLOBAL or a GLOBAL-GLOBAL if you want to know if your change (1) propagates to the cluster and (2) persists.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is listed as motivation #3 in "Motivation and Background".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know the semantics between
session
andinstance
is different, but the semantics betweeninstance
andglobal
is also different. From users' view:So I don't think
instance
is closer tosession
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not the case since ~TiDB 5.2. On a single-instance basis if you set a
GLOBAL
system variable and then connect from another session it will take effect unless the scope isSESSION + GLOBAL
, and the other session pre-originated the setting. On a multi-instance basis there is an etcd notification sent immediately, but there is a race where you could beat it.I was arguing it is closer to MySQL
GLOBAL
, or rather it is identical. TiDBGLOBAL
does not exist in MySQL.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added a section on "Documentation Changes", PTAL