only async trigger changes #8778

hanshasselberg · 2020-09-30T09:03:25Z

This is a tiny step towards fixing #6616.

This PR stops all agent endpoints from syncing changes synchronously and leaves it to the asynchronous ticker to pick them up. It also changes how syncNodeInfo is called.

The evolution of syncNodeInfo:

original:

updateSyncState, among other things, determines whether the node needs syncing and sets nodeInfoInSync
SyncChanges adds node info to every service/check update if !nodeInfoInSync, and makes a manual syncNodeInfo call in case everything before it failed to update the node info

after #7189:

updateSyncState determines whether the node needs syncing and sets nodeInfoInSync
SyncChanges calls syncNodeInfo upfront instead of appending nodeinfo to every service/check update

proposed:

updateSyncState determines if node needs syncing and calls syncNodeInfo right away
SyncChanges doesn't try to sync node info

This PR potentially makes the tests more flaky because services/checks are no longer synced immediately.

Previously we had a variable that would track whether the node's information was in sync with the servers. This was a shared variable in the local state which required locking to read/write. Rather than guarding access to this variable we now sync node updates when we see that the node info is out of sync. This only happens in one location: updateSyncState(). Additionally, rather than SkippingNodeUpdate based on the shared variable we now let the servers determine whether the node update should be skipped. This is checked in structs.RegisterRequest.ChangesNode().
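The proposed flow can be sketched roughly as follows. This is a simplified, self-contained illustration under stated assumptions, not the code from agent/local/state.go: the types nodeInfo, catalog and localState, and the changesNode helper, are hypothetical stand-ins for the local state, the servers' catalog and structs.RegisterRequest.ChangesNode().

```go
package main

import (
	"fmt"
	"reflect"
)

// nodeInfo is a hypothetical stand-in for the node fields the agent registers.
type nodeInfo struct {
	Address string
	Meta    map[string]string
}

// catalog is a hypothetical stand-in for the servers' view of the node.
type catalog struct {
	node   *nodeInfo
	writes int // number of node updates that were actually applied
}

// changesNode mimics the idea behind structs.RegisterRequest.ChangesNode():
// the receiving side decides whether the update differs from what it stores.
func (c *catalog) changesNode(n nodeInfo) bool {
	return c.node == nil || !reflect.DeepEqual(*c.node, n)
}

// register applies the node update unless the catalog considers it a no-op.
func (c *catalog) register(n nodeInfo) {
	if !c.changesNode(n) {
		return
	}
	c.node = &n
	c.writes++
}

// localState holds the agent's node info. There is no shared
// nodeInfoInSync flag, so nothing here needs a lock to track sync status.
type localState struct {
	node nodeInfo
	cat  *catalog
}

// syncNodeInfo pushes the local node info to the catalog.
func (l *localState) syncNodeInfo() error {
	l.cat.register(l.node)
	return nil
}

// updateSyncState compares local and remote node info and syncs right away
// when they differ, instead of recording the result in a shared variable.
func (l *localState) updateSyncState() error {
	remote := l.cat.node
	if remote == nil || !reflect.DeepEqual(*remote, l.node) {
		return l.syncNodeInfo()
	}
	return nil
}

func main() {
	cat := &catalog{}
	l := &localState{node: nodeInfo{Address: "10.0.0.1"}, cat: cat}

	_ = l.updateSyncState() // node is unknown to the catalog, so it is synced
	_ = l.updateSyncState() // nothing changed, so no update is written
	l.node.Address = "10.0.0.2"
	_ = l.updateSyncState() // address changed, so the node is synced again

	fmt.Println("node writes:", cat.writes)
}
```

In this sketch the only place that triggers a node sync is updateSyncState, and the catalog side has the final say on whether an update is a no-op.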

hanshasselberg · 2020-09-30T11:39:29Z

api/health_test.go

@@ -212,6 +212,16 @@ func TestAPI_HealthChecks(t *testing.T) {
 		t.Fatalf("err: %v", err)
 	}

+	retry.Run(t, func(r *retry.R) {


Wait for service to sync to catalog. testrpc cannot be imported here.

hanshasselberg · 2020-09-30T11:40:06Z

testrpc/wait.go

@@ -64,6 +64,7 @@ func WaitUntilNoLeader(t *testing.T, rpc rpcFn, dc string, options ...waitOption
 type waitOption struct {
 	Token                  string
 	WaitForAntiEntropySync bool
+	WaitForService         string


Add ability to wait for a service because services are no longer synced immediately.
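A minimal sketch of how such an option could drive a polling wait is shown below. Only the waitOption struct fields come from the diff above; the waitForService constructor, the lookupFn type and the waitFor helper are hypothetical names used for illustration and are not the testrpc implementation.

```go
package main

import (
	"fmt"
	"time"
)

// waitOption mirrors the struct shown in the diff above.
type waitOption struct {
	Token                  string
	WaitForAntiEntropySync bool
	WaitForService         string
}

// waitForService is a hypothetical constructor for an option that asks the
// wait helper to block until the named service is visible in the catalog.
func waitForService(name string) waitOption {
	return waitOption{WaitForService: name}
}

// lookupFn is a hypothetical stand-in for a catalog query. It reports
// whether the given service is currently registered.
type lookupFn func(service string) bool

// waitFor polls lookup until every requested service is visible, or returns
// an error once the timeout has elapsed.
func waitFor(lookup lookupFn, timeout, wait time.Duration, options ...waitOption) error {
	deadline := time.Now().Add(timeout)
	for _, opt := range options {
		if opt.WaitForService == "" {
			continue // this option does not ask for a service
		}
		for !lookup(opt.WaitForService) {
			if time.Now().After(deadline) {
				return fmt.Errorf("service %q not in catalog after %s", opt.WaitForService, timeout)
			}
			time.Sleep(wait)
		}
	}
	return nil
}

func main() {
	start := time.Now()
	// Pretend the asynchronous sync makes the service visible after 30ms.
	lookup := func(service string) bool {
		return time.Since(start) > 30*time.Millisecond
	}
	err := waitFor(lookup, time.Second, 10*time.Millisecond, waitForService("web"))
	fmt.Println("wait finished, err:", err)
}
```

Polling with a deadline fits here because, once endpoints stop syncing synchronously, a test cannot know when the asynchronous ticker will have pushed the service to the servers.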

hanshasselberg · 2020-09-30T11:41:25Z

sdk/testutil/server.go

@@ -250,7 +250,6 @@ func NewTestServerConfigT(t TestingTB, cb ServerConfigCallback) (*TestServer, er
 		return nil, errors.Wrap(err, "failed marshaling json")
 	}

-	t.Logf("CONFIG JSON: %s", string(b))


remove debug.

hanshasselberg · 2020-09-30T11:46:09Z

agent/consul/server_test.go

-	// Should lose a peer
-	retry.Run(t, func(r *retry.R) {
+	timer := &retry.Timer{Timeout: 10 * time.Second, Wait: 500 * time.Millisecond}
+	retry.RunWith(timer, t, func(r *retry.R) {


hopefully fixing a flaky test with this.

dnephin

Nice! I am not very familiar with state syncing, but I think this makes sense. Just one question about the change in error return.

dnephin · 2020-09-30T15:29:19Z

agent/local/state.go

-		return nil

 	default:
 		l.logger.Warn("Syncing node info failed.", "error", err)
-		return err


What are the consequences of this function no longer returning an error?

Based on your comment I revisited this part and reverted to returning and handling any errors from syncNodeInfo. I also added a longer explanation for these changes to the PR description.

hashicorp-cla · 2022-03-12T17:23:40Z

All committers have signed the CLA.

jmurret · 2023-05-09T23:08:07Z

Closing as this is over two years old.

freddygv added 2 commits September 30, 2020 10:57

Make agent<->server sync asynchronous in agent endpoints

ef66dd0

hanshasselberg added the backport/1.8 label Sep 30, 2020

hanshasselberg added 3 commits September 30, 2020 12:28

adding WaitForService to fix tests

84560ab

remove debug out

0a1f586

fix potential panic

ac177c6

hanshasselberg force-pushed the only_async_trigger_changes branch from c10e88c to ac177c6 September 30, 2020 10:58

hanshasselberg added 2 commits September 30, 2020 13:26

wait for service to be synced

a970ebd

t->r

eb6b8cc

remove accidental code

b7e3d4c

fix TestServer_Leave by increasing timeout

770139f

dnephin reviewed Sep 30, 2020

return and handle syncNodeInfo error

f6d637f

dnephin mentioned this pull request May 10, 2021

Fix State.SyncChanges mutex Lock/Unlock #6619

Closed

eculver removed the backport/1.8 label Jul 7, 2022

jmurret closed this May 9, 2023
