[BUG][Opensearch] Opensearch snapshot management: can't select AWS region for configuring s3 bucket. #9265

Closed
hm2thr33 opened this issue Aug 11, 2023 · 9 comments
Labels: bug (Something isn't working), Storage (Issues and PRs relating to data and metadata storage)

Comments

hm2thr33 commented Aug 11, 2023

Describe the bug
After upgrading OpenSearch from 2.8.0 to 2.9.0, snapshot management stopped working.
I can't use the AWS eu-central-1 region when configuring an S3 bucket; only us-east-1 is accepted.

To Reproduce
Steps to reproduce the behavior:

  1. Create a bucket in the eu-central-1 region.
  2. Register the bucket as a snapshot repository via Dev Tools:
PUT /_snapshot/backup.opensearch?pretty
{
  "type": "s3",
  "settings": {
    "bucket": "backup.opensearch",
    "region": "eu-central-1"
  }
}
  3. Get an error:
"reason": "The authorization header is malformed; the region 'eu-central-1' is wrong; expecting 'us-east-1' (Service: S3, Status Code: 400, Request ID: BETMPMA86J7Q7CTH, Extended Request ID: vzSJCEIEwoZRRFkiEsKX7n/tk86XEkUnHFN28M1qxvQlrvHf7/X7NBrLSDlVpyupRoayeSfJUb8=)"

Expected behavior
The ability to use the bucket in the selected region, and not just in one.

Chart Name
Opensearch

Screenshots
N/A

Host/Environment (please complete the following information):

  • Helm Version: [3.12.2]
  • Kubernetes Version: [1.24.16]
  • OpenSearch Helm Chart version [2.14.1]

Additional context
On my side, neither the S3 bucket configuration nor the OpenSearch configuration was changed.
The only change was the OpenSearch upgrade from 2.8.0 to 2.9.0.

As a temporary workaround, I am using a bucket in the us-east-1 region.

hm2thr33 added the bug and untriaged labels Aug 11, 2023
dblock transferred this issue from opensearch-project/helm-charts Aug 11, 2023
dblock (Member) commented Aug 14, 2023

@raghuvanshraj care to take a look?

raghuvanshraj (Contributor) commented

> @raghuvanshraj care to take a look?

Taking a look

raghuvanshraj (Contributor) commented

@hm2thr33 after following the steps outlined by you, I am able to create a bucket in eu-central-1, take a snapshot and restore from it. Are you specifying any custom settings to the s3 client via opensearch.yml?

Pasting my requests here:
PUT snapshot repo:

curl --location --request PUT 'http://localhost:9200/_snapshot/bucket-9265-eu-central-1' \
--header 'Content-Type: application/json' \
--data '{
  "type": "s3",
  "settings": {
    "bucket": "bucket-9265-eu-central-1",
    "region": "eu-central-1"
  }
}'

PUT snapshot:

curl --location --request PUT 'http://localhost:9200/_snapshot/bucket-9265-eu-central-1/snapshot-2'

POST snapshot restore:

curl --location --request POST 'http://localhost:9200/_snapshot/bucket-9265-eu-central-1/snapshot-2/_restore'

hm2thr33 (Author) commented

@raghuvanshraj I don't have any custom settings in opensearch.yml for s3 client.

The repository I use is located in the eu-central-1 region, and the user has not changed. I am currently using the same user with a bucket in the us-east-1 region and everything works, but I need eu-central-1.

I also have a 2.8.0 cluster that has not been upgraded; I connected the same repository to it and everything works. After the upgrade, however, I can no longer connect to the same repository, and the error clearly says that a bucket in the us-east-1 region is expected.

raghuvanshraj (Contributor) commented Aug 21, 2023

@hm2thr33 the requests I pasted in my earlier comment use exactly the same configuration. Sometimes we see 307s for newly created buckets, but based on this forum question, it looks like you are seeing a 301.

  1. Can you confirm that the cluster has no overrides in the static settings?
  2. Can you provide the exact response of the following payload?
PUT /_snapshot/backup.opensearch?pretty
{
  "type": "s3",
  "settings": {
    "bucket": "backup.opensearch",
    "region": "eu-central-1"
  }
}

I'll scope the DEFAULT_S3_ENDPOINT to the region-specific URL to avoid 307s on newly created buckets.

hm2thr33 (Author) commented Aug 22, 2023

Hi @raghuvanshraj,

FYI: I already have 6 clusters with this problem, all running in Kubernetes in the eu-central-1 region. The buckets were also created in this region and are not new: I started using them when I installed the first cluster on version 2.5.0, and every upgrade before 2.9.0 went without problems.
The problems with the S3 bucket and region started on version 2.9.0.

I don't have any custom settings for the S3 bucket in opensearch.yml.

After the upgrade, when I try to add the same bucket I used for this cluster again, I see the following error:

{
  "error": {
    "root_cause": [
      {
        "type": "repository_verification_exception",
        "reason": "[backup.opensearch] path  is not accessible on cluster-manager node"
      }
    ],
    "type": "repository_verification_exception",
    "reason": "[backup.opensearch] path  is not accessible on cluster-manager node",
    "caused_by": {
      "type": "i_o_exception",
      "reason": "Unable to upload object [tests-tH6wHtNtTfWfntrh50S3EA/master.dat] using a single upload",
      "caused_by": {
        "type": "s3_exception",
        "reason": "The authorization header is malformed; the region 'eu-central-1' is wrong; expecting 'us-east-1' (Service: S3, Status Code: 400, Request ID: BETMPMA86J7Q7CTH, Extended Request ID: vzSJCEIEwoZRRFkiEsKX7n/tk86XEkUnHFN28M1qxvQlrvHf7/X7NBrLSDlVpyupRoayeSfJUb8=)"
      }
    }
  },
  "status": 500
}

I'm pretty sure I have access to this S3 bucket from OpenSearch cluster, because this S3 bucket was used before the cluster upgrade, and this S3 bucket is in the same region (EU-CENTRAL-1) as the OpenSearch cluster. The S3 bucket settings and access to them have not been changed.
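
For readers triaging a response like the one above, the useful detail sits in the deepest `caused_by` entry. The following is a minimal Python sketch, assuming the response has been saved as JSON. `innermost_cause` is a hypothetical helper name, not part of OpenSearch, and the sample is a trimmed copy of the response above:

```python
import json


def innermost_cause(error_response):
    """Follow nested "caused_by" objects and return the deepest one."""
    cause = error_response["error"]
    while "caused_by" in cause:
        cause = cause["caused_by"]
    return cause


# Trimmed copy of the response above (request IDs omitted for brevity).
SAMPLE = json.loads("""
{
  "error": {
    "type": "repository_verification_exception",
    "reason": "[backup.opensearch] path  is not accessible on cluster-manager node",
    "caused_by": {
      "type": "i_o_exception",
      "reason": "Unable to upload object [tests-tH6wHtNtTfWfntrh50S3EA/master.dat] using a single upload",
      "caused_by": {
        "type": "s3_exception",
        "reason": "The authorization header is malformed; the region 'eu-central-1' is wrong; expecting 'us-east-1'"
      }
    }
  },
  "status": 500
}
""")

if __name__ == "__main__":
    cause = innermost_cause(SAMPLE)
    print(cause["type"], "-", cause["reason"])
```

Here the deepest entry is the `s3_exception`, which shows that S3 rejected the request; the outer `repository_verification_exception` only reports that verification failed.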

raghuvanshraj (Contributor) commented Aug 22, 2023

I was able to reproduce this issue, but only when the bucket's region and the region supplied in the request differ. For example, if I create a bucket bucket-9265-ap-northeast-1 in ap-northeast-1 and then try to register a repository against it with the region eu-central-1:

curl --location --request PUT 'http://localhost:9200/_snapshot/bucket-9265-eu-central-1' \
--header 'Content-Type: application/json' \
--data '{
  "type": "s3",
  "settings": {
    "bucket": "bucket-9265-ap-northeast-1",
    "region": "eu-central-1"
  }
}'

I get the following error:

{
    "error": {
        "root_cause": [
            {
                "type": "repository_verification_exception",
                "reason": "[bucket-9265-eu-central-1] path  is not accessible on cluster-manager node"
            }
        ],
        "type": "repository_verification_exception",
        "reason": "[bucket-9265-eu-central-1] path  is not accessible on cluster-manager node",
        "caused_by": {
            "type": "i_o_exception",
            "reason": "Unable to upload object [tests-LdGyFSq_RNeY3eWEW467Xg/master.dat] using a single upload",
            "caused_by": {
                "type": "s3_exception",
                "reason": "The authorization header is malformed; the region 'eu-central-1' is wrong; expecting 'ap-northeast-1' (Service: S3, Status Code: 400, Request ID: B8E5HHZFM04XG7FF, Extended Request ID: zqtCpaxyP19GvIGgTDOZiUHgpDMavGAhf/qiTFiadBP2oEVlAkJRUIevjI5HBK1EXZ0ulyal2rg=)"
            }
        }
    },
    "status": 500
}

Similarly, if I don't supply a region in my payload, it defaults to us-east-1 since we don't provide a region override to the client:

curl --location --request PUT 'http://localhost:9200/_snapshot/bucket-9265-eu-central-1' \
--header 'Content-Type: application/json' \
--data '{
  "type": "s3",
  "settings": {
    "bucket": "bucket-9265-ap-northeast-1"
  }
}'

As expected, I get the same error, but instead of eu-central-1, I get us-east-1:

{
    "error": {
        "root_cause": [
            {
                "type": "repository_verification_exception",
                "reason": "[bucket-9265-eu-central-1] path  is not accessible on cluster-manager node"
            }
        ],
        "type": "repository_verification_exception",
        "reason": "[bucket-9265-eu-central-1] path  is not accessible on cluster-manager node",
        "caused_by": {
            "type": "i_o_exception",
            "reason": "Unable to upload object [tests-3W0j4lyLRte8NdwHALAnaQ/master.dat] using a single upload",
            "caused_by": {
                "type": "s3_exception",
                "reason": "The authorization header is malformed; the region 'us-east-1' is wrong; expecting 'ap-northeast-1' (Service: S3, Status Code: 400, Request ID: ET97XVJ1C3PVEPX9, Extended Request ID: Zx0SKdusKV1YWvjHv7rLsCt3M/NIyINTNPaJi5WUtdqPRQkKRxsUIXxjOj/zOekdSIaoinFIisk=)"
            }
        }
    },
    "status": 500
}

Based on this, the only scenario I can see is that S3 is resolving the bucket backup.opensearch to us-east-1 rather than eu-central-1, so when a client scoped to eu-central-1 calls S3 to upload objects to it, S3 returns this error.
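
The S3 messages in this thread share one pattern: the first quoted region is the one the client was configured with, and the second is the one S3 expects for the bucket. The following is a small Python sketch that extracts both from a `reason` string so they can be compared. `parse_region_mismatch` is a hypothetical helper, and the regular expression assumes the message wording shown above:

```python
import re

# Matches the wording seen in the S3 errors quoted in this thread.
_MISMATCH = re.compile(r"the region '([^']+)' is wrong; expecting '([^']+)'")


def parse_region_mismatch(reason):
    """Return (client_region, expected_region), or None if the text does not match."""
    match = _MISMATCH.search(reason)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    reported = (
        "The authorization header is malformed; the region 'eu-central-1' "
        "is wrong; expecting 'us-east-1' (Service: S3, Status Code: 400)"
    )
    print(parse_region_mismatch(reported))
```

For the error reported in this issue, the pair is `eu-central-1` (configured) and `us-east-1` (expected by S3), which is the mismatch described above.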

anasalkouz added the Storage label and removed the untriaged and distributed framework labels Aug 22, 2023
dblock (Member) commented Aug 24, 2023

Thanks for debugging this @raghuvanshraj - @hm2thr33 care to take a look at the details above?

sachinpkale (Member) commented

@hm2thr33 Closing this issue. Feel free to re-open it if the problem still exists.

github-project-automation (bot) moved this from 🆕 New to ✅ Done in Storage Project Board Aug 22, 2024
6 participants