change placement rule from voter to learner after some store down, find learner count not equal to the setting in placement rule #7358

Closed
mayjiang0203 opened this issue Nov 13, 2023 · 3 comments
Assignees
Labels
severity/moderate type/bug The issue is confirmed as a bug.

Comments

@mayjiang0203

Bug Report

What did you do?

image

Original placement rule:
var PlacementRuleConfigDrAutosync = fmt.Sprintf(PlacementRuleZoneTmp,
	"voter", 1, "zone", "dc1-zone1",
	"voter", 1, "zone", "dc1-zone2",
	"voter", 1, "zone", "dc1-zone3",
	"follower", 1, "zone", "dc2-zone1",
	"follower", 1, "zone", "dc2-zone2",
	"learner", 1, "zone", "dc2-zone3",
)
New placement rule:
var PlacementRuleDowngrade3 = fmt.Sprintf(PlacementRuleZoneTmp,
	"voter", 1, "zone", "dc1-zone1",
	"voter", 1, "zone", "dc1-zone2",
	"voter", 1, "zone", "dc1-zone3",
	"learner", 1, "zone", "dc2-zone1",
	"learner", 1, "zone", "dc2-zone2",
	"learner", 1, "zone", "dc2-zone3",
)

What did you expect to see?

Learner count for each region should equal 3.
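
The expected count follows from summing the `count` field of every rule with the same role. A minimal sketch of that arithmetic, assuming a hypothetical `rule` struct that mirrors the (role, count, label key, label value) tuples above; it is not PD's own rule type:

```go
package main

import "fmt"

// rule is a hypothetical stand-in for one (role, count, key, value)
// tuple passed to the placement rule template in this issue.
type rule struct {
	Role  string
	Count int
	Key   string
	Value string
}

// expectedByRole sums the configured replica count per role.
func expectedByRole(rules []rule) map[string]int {
	out := make(map[string]int)
	for _, r := range rules {
		out[r.Role] += r.Count
	}
	return out
}

func main() {
	// The downgraded rule set from the issue.
	downgraded := []rule{
		{"voter", 1, "zone", "dc1-zone1"},
		{"voter", 1, "zone", "dc1-zone2"},
		{"voter", 1, "zone", "dc1-zone3"},
		{"learner", 1, "zone", "dc2-zone1"},
		{"learner", 1, "zone", "dc2-zone2"},
		{"learner", 1, "zone", "dc2-zone3"},
	}
	fmt.Println(expectedByRole(downgraded)) // map[learner:3 voter:3]
}
```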

What did you see instead?

There are four learners in one region.

sh-4.2# tiup ctl:v6.5.4 pd -u http://pd1-peer:2379 region --jq '[.regions[] | select([.peers[] | select(.role_name == "Learner")] | length != 3)] '|jq .
Starting component `ctl`: /root/.tiup/components/ctl/v6.5.4/ctl /root/.tiup/components/ctl/v6.5.4/ctl pd -u http://pd1-peer:2379 region --jq [.regions[] | select([.peers[] | select(.role_name == "Learner")] | length != 3)]
[
  {
    "id": 2001,
    "start_key": "7480000000000000FF545F728000000000FF1BE1E40000000000FA",
    "end_key": "7480000000000000FF545F728000000000FF27D4CC0000000000FA",
    "epoch": {
      "conf_ver": 59,
      "version": 73
    },
    "peers": [
      {
        "role_name": "Learner",
        "is_learner": true,
        "id": 2274,
        "store_id": 2060,
        "role": 1
      },
      {
        "role_name": "Voter",
        "id": 2803,
        "store_id": 2061
      },
      {
        "role_name": "Voter",
        "id": 2830,
        "store_id": 2057
      },
      {
        "role_name": "Learner",
        "is_learner": true,
        "id": 2853,
        "store_id": 6,
        "role": 1
      },
      {
        "role_name": "Voter",
        "id": 2907,
        "store_id": 9
      },
      {
        "role_name": "Learner",
        "is_learner": true,
        "id": 2962,
        "store_id": 2243,
        "role": 1
      },
      {
        "role_name": "Learner",
        "is_learner": true,
        "id": 2981,
        "store_id": 8,
        "role": 1
      }
    ],
    "leader": {
      "role_name": "Voter",
      "id": 2830,
      "store_id": 2057
    },
    "down_peers": [
      {
        "peer": {
          "role_name": "Learner",
          "is_learner": true,
          "id": 2274,
          "store_id": 2060,
          "role": 1
        },
        "down_seconds": 46926
      },
      {
        "peer": {
          "role_name": "Learner",
          "is_learner": true,
          "id": 2853,
          "store_id": 6,
          "role": 1
        },
        "down_seconds": 46926
      },
      {
        "peer": {
          "role_name": "Learner",
          "is_learner": true,
          "id": 2962,
          "store_id": 2243,
          "role": 1
        },
        "down_seconds": 47744
      },
      {
        "peer": {
          "role_name": "Learner",
          "is_learner": true,
          "id": 2981,
          "store_id": 8,
          "role": 1
        },
        "down_seconds": 47744
      }
    ],
    "pending_peers": [
      {
        "role_name": "Learner",
        "is_learner": true,
        "id": 2981,
        "store_id": 8,
        "role": 1
      }
    ],
    "cpu_usage": 0,
    "written_bytes": 0,
    "read_bytes": 0,
    "written_keys": 0,
    "read_keys": 0,
    "approximate_size": 124,
    "approximate_keys": 1014031
  }
]
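
For anyone who prefers to run the same check outside of `jq`, here is a rough Go equivalent of the filter above. It assumes the region listing is a JSON object with a top-level `regions` array (as the `.regions[]` path in the filter implies) and decodes only the fields it needs; the type and function names are made up for this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// peer and region decode only the fields used by the check.
type peer struct {
	ID       uint64 `json:"id"`
	StoreID  uint64 `json:"store_id"`
	RoleName string `json:"role_name"`
}

type region struct {
	ID    uint64 `json:"id"`
	Peers []peer `json:"peers"`
}

type regionList struct {
	Regions []region `json:"regions"`
}

// regionsWithLearnerMismatch returns the IDs of regions whose number of
// peers with role_name "Learner" differs from want.
func regionsWithLearnerMismatch(data []byte, want int) ([]uint64, error) {
	var list regionList
	if err := json.Unmarshal(data, &list); err != nil {
		return nil, err
	}
	var ids []uint64
	for _, r := range list.Regions {
		learners := 0
		for _, p := range r.Peers {
			if p.RoleName == "Learner" {
				learners++
			}
		}
		if learners != want {
			ids = append(ids, r.ID)
		}
	}
	return ids, nil
}

func main() {
	// Peers of region 2001, trimmed from the output above.
	sample := []byte(`{"regions":[{"id":2001,"peers":[
		{"id":2274,"store_id":2060,"role_name":"Learner"},
		{"id":2803,"store_id":2061,"role_name":"Voter"},
		{"id":2830,"store_id":2057,"role_name":"Voter"},
		{"id":2853,"store_id":6,"role_name":"Learner"},
		{"id":2907,"store_id":9,"role_name":"Voter"},
		{"id":2962,"store_id":2243,"role_name":"Learner"},
		{"id":2981,"store_id":8,"role_name":"Learner"}
	]}]}`)
	ids, err := regionsWithLearnerMismatch(sample, 3)
	if err != nil {
		panic(err)
	}
	fmt.Println(ids) // [2001]
}
```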

What version of PD are you using (pd-server -V)?

20231026-6.5.4-ONCALL-6501

@mayjiang0203 mayjiang0203 added the type/bug The issue is confirmed as a bug. label Nov 13, 2023
@mayjiang0203
Author

/severity moderate
/assign @lhy1024

@lhy1024
Contributor

lhy1024 commented Nov 13, 2023

sh-4.2# curl http://pd1-peer:2379/pd/api/v1/config/rules/region/2001/detail
{
  "rule-fits": [
    {
      "rule": {
        "group_id": "pd",
        "id": "learner1",
        "start_key": "",
        "end_key": "",
        "role": "learner",
        "is_witness": false,
        "count": 1,
        "label_constraints": [
          {
            "key": "zone",
            "op": "in",
            "values": [
              "dc2-zone3"
            ]
          }
        ],
        "location_labels": [
          "dc",
          "zone",
          "rack",
          "host"
        ]
      },
      "peers": [
        {
          "id": 2962,
          "store_id": 2243,
          "role": 1
        }
      ],
      "peers-different-role": null,
      "isolation-score": 0
    },
    {
      "rule": {
        "group_id": "pd",
        "id": "pfollower1",
        "start_key": "",
        "end_key": "",
        "role": "learner",
        "is_witness": false,
        "count": 1,
        "label_constraints": [
          {
            "key": "zone",
            "op": "in",
            "values": [
              "dc2-zone1"
            ]
          }
        ],
        "location_labels": [
          "dc",
          "zone",
          "rack",
          "host"
        ]
      },
      "peers": [
        {
          "id": 2853,
          "store_id": 6,
          "role": 1
        }
      ],
      "peers-different-role": null,
      "isolation-score": 0
    },
    {
      "rule": {
        "group_id": "pd",
        "id": "pfollower2",
        "start_key": "",
        "end_key": "",
        "role": "learner",
        "is_witness": false,
        "count": 1,
        "label_constraints": [
          {
            "key": "zone",
            "op": "in",
            "values": [
              "dc2-zone2"
            ]
          }
        ],
        "location_labels": [
          "dc",
          "zone",
          "rack",
          "host"
        ]
      },
      "peers": [
        {
          "id": 2274,
          "store_id": 2060,
          "role": 1
        }
      ],
      "peers-different-role": null,
      "isolation-score": 0
    },
    {
      "rule": {
        "group_id": "pd",
        "id": "pvoter1",
        "start_key": "",
        "end_key": "",
        "role": "voter",
        "is_witness": false,
        "count": 1,
        "label_constraints": [
          {
            "key": "zone",
            "op": "in",
            "values": [
              "dc1-zone1"
            ]
          }
        ],
        "location_labels": [
          "dc",
          "zone",
          "rack",
          "host"
        ]
      },
      "peers": [
        {
          "id": 2830,
          "store_id": 2057
        }
      ],
      "peers-different-role": null,
      "isolation-score": 0
    },
    {
      "rule": {
        "group_id": "pd",
        "id": "pvoter2",
        "start_key": "",
        "end_key": "",
        "role": "voter",
        "is_witness": false,
        "count": 1,
        "label_constraints": [
          {
            "key": "zone",
            "op": "in",
            "values": [
              "dc1-zone2"
            ]
          }
        ],
        "location_labels": [
          "dc",
          "zone",
          "rack",
          "host"
        ]
      },
      "peers": [
        {
          "id": 2907,
          "store_id": 9
        }
      ],
      "peers-different-role": null,
      "isolation-score": 0
    },
    {
      "rule": {
        "group_id": "pd",
        "id": "pvoter3",
        "start_key": "",
        "end_key": "",
        "role": "voter",
        "is_witness": false,
        "count": 1,
        "label_constraints": [
          {
            "key": "zone",
            "op": "in",
            "values": [
              "dc1-zone3"
            ]
          }
        ],
        "location_labels": [
          "dc",
          "zone",
          "rack",
          "host"
        ]
      },
      "peers": [
        {
          "id": 2803,
          "store_id": 2061
        }
      ],
      "peers-different-role": null,
      "isolation-score": 0
    }
  ],
  "orphan-peers": [
    {
      "id": 2981,
      "store_id": 8,
      "role": 1
    }
  ]
}

The three learner peers in rule fits are all down, and the one orphan peer is a learner that is both pending and down.

In the current code, when a peer in rule fits is pending or down, we try to replace it with an orphan peer. Here the only orphan peer is itself pending and down, so the replacement cannot succeed.
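
To make the situation concrete, the sketch below cross-checks the peer IDs from the two outputs in this thread. It is only an illustration and not PD's implementation: the helper `healthyOrphans` is hypothetical, and it assumes that an orphan peer is a usable replacement only if it is neither down nor pending.

```go
package main

import "fmt"

// healthyOrphans returns the orphan peer IDs that are neither down nor
// pending. Hypothetical helper, assumed for this illustration only.
func healthyOrphans(orphans []uint64, down, pending map[uint64]bool) []uint64 {
	var out []uint64
	for _, id := range orphans {
		if !down[id] && !pending[id] {
			out = append(out, id)
		}
	}
	return out
}

func main() {
	// Peer IDs copied from the region and rule detail outputs for region 2001.
	down := map[uint64]bool{2274: true, 2853: true, 2962: true, 2981: true}
	pending := map[uint64]bool{2981: true}
	fitLearners := []uint64{2962, 2853, 2274} // learners matched by rule fits
	orphans := []uint64{2981}                 // "orphan-peers" in the detail output

	candidates := healthyOrphans(orphans, down, pending)
	for _, id := range fitLearners {
		if down[id] {
			fmt.Printf("peer %d in rule fits is down; healthy orphan candidates: %d\n",
				id, len(candidates))
		}
	}
}
```

With the data from this issue, every learner in rule fits is down and there are zero healthy orphan candidates, so under the stated assumption no swap is possible and the fourth learner stays in place.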

@lhy1024
Contributor

lhy1024 commented Nov 13, 2023

But this seems to be expected behavior; see #4067.

@lhy1024 lhy1024 closed this as completed Nov 14, 2023