Added new spqr-balancer (#552)
* Add new balancer

* Add Balancer config

* Added threshold values to balancer config

* WIP

* Changed metrics

* WIP

* WIP

* General logic of generateTasks done

* More additions

* Refactoring

* Refactoring + getShardCurrentState

* Implement getHostStatus

* Update key Ranges info

* Implement getKeyRange

* Refactoring

* WIP on getStatsByKeyRange

* Implemented getStatsByKeyRange

* Implemented getTasks

* Added TaskGroup type

* Initial impl of executeTasks

* Small refactoring

* unificationType -> joinType

* Added TasksService to proto-specs

* Added TasksService to proto servers

* Added internal tasks-related types, implemented Task proto services

* Added Task-related methods to QDB

* Declared Task-related methods to QDBs

* Add Tasks to-from Db methods

* Implemented TaskMgr methods

* Implemented Task methods in MemQDB

* Implemented Task methods in EtcdQDB

* Implemented QDB-related methods in balancer

* Re-written balancer to use internal Task types

* Account for joinType in executeTasks

* Add spqr-balancer app

* Fixes

* Fixed QDBs GetTaskGroup

* Some logging alterations

* Small interface check

* Some renaming & refactoring

* More shit

* Fixed getting pg_is_in_recovery

* Create pg_comment_stats extension on shards

* Fix getHostStatus & add logging

* Fixes

* More small fixes

* Fixes transferring from last key range

* Fixed transfers of JoinNone type

* Small fixes

* Add balancer Makefile targets

* Fix shard image's setup

* Add balancer feature test

* Fix working with mixed-case table names

* fix boundaries in provider

* Added more tests for balancer

* Fixed bound selection in balancer

* Added CPU test to balancer

* Reduced balancer config

* Added timeout to balancer config

* Changed timeout in balancer configs for feature tests

* Lint fix

* Fix getting detailed cpu stats

* Refactoring in NewBalancer

* Ignore unreachable hosts rather than fail

* Refactor MaxRelative & getCriterion funcs

* Optimized getStatsByKeyRange by adding another index

* Fix comment

* Simpler-to-read formulas

* Fixed fitsOnShard logic

* Changed getShardToMoveTo signature to return bool instead of error

* Oh no, my cum count...

* Fix fitsOnShard logic

* Simplify getAdjacentShards

* Removed incorrect condition

* use TaskGroup.Tasks as stack

* Remove redundant comment

* Fix moving to non-adjacent shard & more sophisticated test

* Fix spqr-balancer description

* Add TODO

* Fixed balancer test's name

* Fixed balancer feature test
EinKrebs authored Mar 25, 2024
1 parent 1f360b1 commit be78c32
Showing 37 changed files with 2,897 additions and 107 deletions.
7 changes: 5 additions & 2 deletions Makefile
@@ -21,6 +21,9 @@ deps:

####################### BUILD #######################

+build_balancer:
+	go build -pgo=auto -o spqr-balancer ./cmd/balancer

build_coorctl:
go build -pgo=auto -o coorctl ./cmd/coordctl

@@ -42,7 +45,7 @@ build_workloadreplay:
build_spqrdump:
go build -pgo=auto -o spqrdump ./cmd/spqrdump

-build: build_coordinator build_coorctl build_router build_mover build_worldmock build_workloadreplay build_spqrdump
+build: build_balancer build_coordinator build_coorctl build_router build_mover build_worldmock build_workloadreplay build_spqrdump

build_images:
docker compose build spqr-base-image
@@ -59,7 +62,7 @@ save_shard_image:
docker save ${IMAGE_SHARD} | gzip -c > ${CACHE_FILE_SHARD};\

clean:
-	rm -f spqr-router spqr-coordinator spqr-mover spqr-worldmock
+	rm -f spqr-router spqr-coordinator spqr-mover spqr-worldmock spqr-balancer
make clean_feature_test

######################## RUN ########################
31 changes: 31 additions & 0 deletions balancer/app/app.go
@@ -0,0 +1,31 @@
package app

import (
	"context"
	"github.com/pg-sharding/spqr/balancer"
	"github.com/pg-sharding/spqr/pkg/config"
	"github.com/pg-sharding/spqr/pkg/spqrlog"
	"time"
)

type App struct {
	balancer balancer.Balancer
}

func NewApp(b balancer.Balancer) *App {
	return &App{
		balancer: b,
	}
}

func (app *App) Run() error {
	if err := spqrlog.UpdateZeroLogLevel(config.BalancerConfig().LogLevel); err != nil {
		return err
	}
	spqrlog.Zero.Info().Msg("running balancer")

	ctx, cancel := context.WithTimeout(context.TODO(), time.Duration(config.BalancerConfig().TimeoutSec)*time.Second)
	defer cancel()
	app.balancer.RunBalancer(ctx)
	return nil
}
7 changes: 7 additions & 0 deletions balancer/balancer.go
@@ -0,0 +1,7 @@
package balancer

import "context"

type Balancer interface {
	RunBalancer(ctx context.Context)
}