admin: methods to select replicas for log stream #393

Closed
4 of 6 tasks
ijsong opened this issue Apr 4, 2023 · 0 comments · Fixed by #394, #395, #396, #402 or #399

ijsong commented Apr 4, 2023

When the admin creates a log stream, it selects replicas automatically if the client does not specify the log stream's topology. The admin provides the ReplicaSelector interface so that different methods of choosing replicas can be supported.

// ReplicaSelector selects storage nodes and volumes to store data for replicas of a new log stream.
// This method returns a slice of `varlogpb.ReplicaDescriptor`, and its length should be equal to the
// replication factor.
type ReplicaSelector interface {
	Select(ctx context.Context) ([]*varlogpb.ReplicaDescriptor, error)
}

Currently, the admin provides balancedReplicaSelector, which implements ReplicaSelector; however, it does not work well because it is unnecessarily complex. Therefore, we need to revisit it.

Goals:

  • Pluggable replica selection algorithm
  • Override mechanism to replace global algorithm with a topic-specific algorithm
  • Call-specific algorithm

Implementation plan:

@ijsong ijsong self-assigned this Apr 4, 2023
ijsong added a commit that referenced this issue Apr 5, 2023
It adds a new method `Name() string` to the replica selector interface
`internal/admin.(ReplicaSelector)`. Each replica selector can now have a unique name, so the
required replica selection algorithm can be chosen by specifying that name.

Updates #393
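
A rough sketch of how a `Name()` method makes selectors addressable by name. This is not the code from the commit: `ReplicaDescriptor` here is a simplified stand-in for `varlogpb.ReplicaDescriptor`, and `fixedSelector`, `registry`, `register`, and `lookup` are hypothetical helpers introduced only for illustration.

```go
package main

import (
	"context"
	"fmt"
)

// ReplicaDescriptor is a simplified stand-in for varlogpb.ReplicaDescriptor.
type ReplicaDescriptor struct {
	StorageNodeID int32
	Path          string
}

// ReplicaSelector mirrors the interface from the issue plus the Name method.
type ReplicaSelector interface {
	Name() string
	Select(ctx context.Context) ([]*ReplicaDescriptor, error)
}

// fixedSelector is a hypothetical selector that always returns the same replicas.
type fixedSelector struct{ replicas []*ReplicaDescriptor }

func (fixedSelector) Name() string { return "fixed" }

func (s fixedSelector) Select(context.Context) ([]*ReplicaDescriptor, error) {
	return s.replicas, nil
}

// registry maps selector names to implementations (hypothetical helper).
type registry map[string]ReplicaSelector

// register adds a selector and rejects duplicate names.
func (r registry) register(s ReplicaSelector) error {
	if _, dup := r[s.Name()]; dup {
		return fmt.Errorf("replica selector %q is already registered", s.Name())
	}
	r[s.Name()] = s
	return nil
}

// lookup returns the selector registered under name.
func (r registry) lookup(name string) (ReplicaSelector, error) {
	s, ok := r[name]
	if !ok {
		return nil, fmt.Errorf("unknown replica selector %q", name)
	}
	return s, nil
}

func main() {
	reg := registry{}
	fixed := fixedSelector{replicas: []*ReplicaDescriptor{{StorageNodeID: 1, Path: "/data"}}}
	if err := reg.register(fixed); err != nil {
		panic(err)
	}
	sel, err := reg.lookup("fixed")
	if err != nil {
		panic(err)
	}
	replicas, _ := sel.Select(context.Background())
	fmt.Println(sel.Name(), len(replicas))
}
```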
ijsong added a commit that referenced this issue Apr 5, 2023
It adds a new replica selector, `randomReplicaSelector`. It chooses storage nodes and paths
randomly.

Updates #393
ijsong added a commit that referenced this issue Apr 5, 2023
This PR adds a new replica selector based on the Least Frequently Used (LFU) algorithm.
`lfuReplicaSelector` selects each replica's storage node and data path, giving preference to those
with fewer assigned replicas.

All replica selectors, including `lfuReplicaSelector`, are stateless, meaning they don't keep
the existing log stream topology. Each time `Select` is invoked, `lfuReplicaSelector` recomputes
the usage counters for each storage node and path. Although this seems inefficient, it is simple
and fault-tolerant.

To select the least used storage nodes and paths, `lfuReplicaSelector` runs as follows:

- Fetch cluster metadata from the `ClusterMetadataView`.
- Increment the usage counter of each storage node and path assigned to each log stream.
- Sort the counters.
- Choose the least used storage nodes and paths.

Updates #393
ijsong added a commit that referenced this issue Apr 6, 2023
It adds a new flag, `replica-selector`, to varlogadm. The flag sets the global default replica
selector.

Updates #393
@ijsong ijsong linked a pull request Apr 6, 2023 that will close this issue
@ijsong ijsong reopened this Apr 6, 2023