-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enhance S3 Web Identity authentication & security mapping #24645
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@imsayari404 thanks for the enhancement. It mostly looks good. I've added a few minor suggestions for improvement.
return s3WebIdentityEnabled; | ||
} | ||
|
||
@Config("hive.s3.webidentity.enabled") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
@Config("hive.s3.webidentity.enabled") | |
@Config("hive.s3.web.identity.auth.enabled") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Open to suggestions here.
if (webIdentityEnabled) { | ||
config.setBoolean(S3_WEB_IDENTITY_ENABLED, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: We can always set the config irrespective of true/false and set the token file only if webIdentityEnabled
is true.
if (webIdentityEnabled) { | |
config.setBoolean(S3_WEB_IDENTITY_ENABLED, true); | |
config.setBoolean(S3_WEB_IDENTITY_ENABLED, webIdentityEnabled); | |
if (webIdentityEnabled) { |
boolean webIdentityEnabled = conf.getBoolean("presto.hive.s3.webidentity.enabled", false); | ||
String webIdentityTokenFile = conf.get("presto.hive.s3.web-identity-token-file"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: let's make these two class variables and move their initialization to the initialize
method, where other configs are retrieved.
presto/presto-hive/src/main/java/com/facebook/presto/hive/s3/PrestoS3FileSystem.java
Lines 201 to 252 in f6c9750
public void initialize(URI uri, Configuration conf) | |
throws IOException | |
{ | |
requireNonNull(uri, "uri is null"); | |
requireNonNull(conf, "conf is null"); | |
super.initialize(uri, conf); | |
setConf(conf); | |
this.uri = URI.create(uri.getScheme() + "://" + uri.getAuthority()); | |
this.workingDirectory = new Path(PATH_SEPARATOR).makeQualified(this.uri, new Path(PATH_SEPARATOR)); | |
HiveS3Config defaults = new HiveS3Config(); | |
this.stagingDirectory = new File(conf.get(S3_STAGING_DIRECTORY, defaults.getS3StagingDirectory().toString())); | |
this.maxAttempts = conf.getInt(S3_MAX_CLIENT_RETRIES, defaults.getS3MaxClientRetries()) + 1; | |
this.maxBackoffTime = Duration.valueOf(conf.get(S3_MAX_BACKOFF_TIME, defaults.getS3MaxBackoffTime().toString())); | |
this.maxRetryTime = Duration.valueOf(conf.get(S3_MAX_RETRY_TIME, defaults.getS3MaxRetryTime().toString())); | |
int maxErrorRetries = conf.getInt(S3_MAX_ERROR_RETRIES, defaults.getS3MaxErrorRetries()); | |
boolean sslEnabled = conf.getBoolean(S3_SSL_ENABLED, defaults.isS3SslEnabled()); | |
Duration connectTimeout = Duration.valueOf(conf.get(S3_CONNECT_TIMEOUT, defaults.getS3ConnectTimeout().toString())); | |
Duration socketTimeout = Duration.valueOf(conf.get(S3_SOCKET_TIMEOUT, defaults.getS3SocketTimeout().toString())); | |
int maxConnections = conf.getInt(S3_MAX_CONNECTIONS, defaults.getS3MaxConnections()); | |
this.multiPartUploadMinFileSize = conf.getLong(S3_MULTIPART_MIN_FILE_SIZE, defaults.getS3MultipartMinFileSize().toBytes()); | |
this.multiPartUploadMinPartSize = conf.getLong(S3_MULTIPART_MIN_PART_SIZE, defaults.getS3MultipartMinPartSize().toBytes()); | |
this.isPathStyleAccess = conf.getBoolean(S3_PATH_STYLE_ACCESS, defaults.isS3PathStyleAccess()); | |
this.useInstanceCredentials = conf.getBoolean(S3_USE_INSTANCE_CREDENTIALS, defaults.isS3UseInstanceCredentials()); | |
this.pinS3ClientToCurrentRegion = conf.getBoolean(S3_PIN_CLIENT_TO_CURRENT_REGION, defaults.isPinS3ClientToCurrentRegion()); | |
this.s3IamRole = conf.get(S3_IAM_ROLE, defaults.getS3IamRole()); | |
this.s3IamRoleSessionName = conf.get(S3_IAM_ROLE_SESSION_NAME, defaults.getS3IamRoleSessionName()); | |
verify(!(useInstanceCredentials && conf.get(S3_IAM_ROLE) != null), | |
"Invalid configuration: either use instance credentials or specify an iam role"); | |
verify((pinS3ClientToCurrentRegion && conf.get(S3_ENDPOINT) == null) || !pinS3ClientToCurrentRegion, | |
"Invalid configuration: either endpoint can be set or S3 client can be pinned to the current region"); | |
this.sseEnabled = conf.getBoolean(S3_SSE_ENABLED, defaults.isS3SseEnabled()); | |
this.sseType = PrestoS3SseType.valueOf(conf.get(S3_SSE_TYPE, defaults.getS3SseType().name())); | |
this.sseKmsKeyId = conf.get(S3_SSE_KMS_KEY_ID, defaults.getS3SseKmsKeyId()); | |
this.s3AclType = PrestoS3AclType.valueOf(conf.get(S3_ACL_TYPE, defaults.getS3AclType().name())); | |
String userAgentPrefix = conf.get(S3_USER_AGENT_PREFIX, defaults.getS3UserAgentPrefix()); | |
this.skipGlacierObjects = conf.getBoolean(S3_SKIP_GLACIER_OBJECTS, defaults.isSkipGlacierObjects()); | |
this.s3StorageClass = conf.getEnum(S3_STORAGE_CLASS, defaults.getS3StorageClass()); | |
ClientConfiguration configuration = new ClientConfiguration() | |
.withMaxErrorRetry(maxErrorRetries) | |
.withProtocol(sslEnabled ? Protocol.HTTPS : Protocol.HTTP) | |
.withConnectionTimeout(toIntExact(connectTimeout.toMillis())) | |
.withSocketTimeout(toIntExact(socketTimeout.toMillis())) | |
.withMaxConnections(maxConnections) | |
.withUserAgentPrefix(userAgentPrefix) | |
.withUserAgentSuffix(S3_USER_AGENT_SUFFIX); | |
this.credentialsProvider = createAwsCredentialsProvider(uri, conf); | |
this.s3 = createAmazonS3Client(conf, configuration); | |
} |
|
||
if (!isNullOrEmpty(s3IamRole)) { | ||
if (webIdentityEnabled) { | ||
log.info("Using Web Identity Token Credentials Provider."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: We should minimize info logs in production and use them only when absolutely necessary. If logging is required in this case, we can add a debug log.
} | ||
return providerBuilder.build(); | ||
} | ||
log.info("Using STS Assume Role Session Credentials Provider."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: same here
if (webIdentityEnabled) { | ||
log.info("Using Web Identity Token Credentials Provider."); | ||
WebIdentityTokenCredentialsProvider.Builder providerBuilder = WebIdentityTokenCredentialsProvider.builder() | ||
.roleArn(s3IamRole); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: please set roleSessionName
here along with roleArn
.
@@ -28,6 +28,8 @@ public interface S3ConfigurationUpdater | |||
String S3_PIN_CLIENT_TO_CURRENT_REGION = "presto.s3.pin-client-to-current-region"; | |||
String S3_USE_INSTANCE_CREDENTIALS = "presto.s3.use-instance-credentials"; | |||
String S3_IAM_ROLE = "presto.hive.s3.iam-role"; | |||
String S3_WEB_IDENTITY_TOKEN_FILE = "presto.hive.s3.web-identity-token-file"; | |||
String S3_WEB_IDENTITY_ENABLED = "presto.hive.s3.webidentity.enabled"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we can align this with config name change above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot to mention, let's also add a test in TestPrestoS3FileSystem
.
For reference:
presto/presto-hive/src/test/java/com/facebook/presto/hive/s3/TestPrestoS3FileSystem.java
Lines 151 to 162 in f6c9750
public void testAssumeRoleCredentials() | |
throws Exception | |
{ | |
Configuration config = new Configuration(); | |
config.set(S3_IAM_ROLE, "role"); | |
config.setBoolean(S3_USE_INSTANCE_CREDENTIALS, false); | |
try (PrestoS3FileSystem fs = new PrestoS3FileSystem()) { | |
fs.initialize(new URI("s3n://test-bucket/"), config); | |
assertInstanceOf(getAwsCredentialsProvider(fs), STSAssumeRoleSessionCredentialsProvider.class); | |
} | |
} |
3174f42
to
09804ab
Compare
@imjalpreet , I’ve made the suggested changes. Could you review again when you have time? Thanks! |
09804ab
to
6a8273c
Compare
config.setBoolean("presto.hive.s3.web.identity.auth.enabled", true); | ||
config.set("presto.hive.s3.web-identity-token-file", "/path/to/token"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: let's use static variables defined in S3ConfigurationUpdater
for the config names here
6a8273c
to
7004856
Compare
7004856
to
095f6bd
Compare
config.setBoolean(S3_WEB_IDENTITY_ENABLED, webIdentityEnabled); | ||
if (webIdentityEnabled) { | ||
if (webIdentityTokenFile != null) { | ||
config.set(S3_WEB_IDENTITY_TOKEN_FILE, webIdentityTokenFile); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these 2 are dependent config? Then we can set both after the condition check?
config.setBoolean(S3_WEB_IDENTITY_ENABLED, webIdentityEnabled); | |
if (webIdentityEnabled) { | |
if (webIdentityTokenFile != null) { | |
config.set(S3_WEB_IDENTITY_TOKEN_FILE, webIdentityTokenFile); | |
} | |
} | |
if (webIdentityEnabled && webIdentityTokenFile != null) { | |
config.setBoolean(S3_WEB_IDENTITY_ENABLED, webIdentityEnabled); | |
config.set(S3_WEB_IDENTITY_TOKEN_FILE, webIdentityTokenFile); | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two configurations are related but not strictly dependent. S3_WEB_IDENTITY_ENABLED
determines if web identity authentication is used, but the token file is optional. Keeping them separate ensures flexibility, as web identity mode can be enabled without necessarily requiring a token file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok in that case we can modify it to something like -
if (webIdentityEnabled && webIdentityTokenFile != null) {
config.set(S3_WEB_IDENTITY_TOKEN_FILE, webIdentityTokenFile);
}
if (!isNullOrEmpty(webIdentityTokenFile)) { | ||
providerBuilder.webIdentityTokenFile(webIdentityTokenFile); | ||
} | ||
return providerBuilder.build(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can s3 identity auth work w/o token file as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, web identity auth can work without explicitly setting the token file in the config. If the environment variable AWS_WEB_IDENTITY_TOKEN_FILE
is set, AWS SDK will pick it up automatically, and authentication will still function.
095f6bd
to
652de66
Compare
@@ -124,6 +128,10 @@ public void updateConfiguration(Configuration config) | |||
if (sseKmsKeyId != null) { | |||
config.set(S3_SSE_KMS_KEY_ID, sseKmsKeyId); | |||
} | |||
if (webIdentityEnabled && webIdentityTokenFile != null) { | |||
config.setBoolean(S3_WEB_IDENTITY_ENABLED, webIdentityEnabled); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@imsayari404 You had mentioned, We can still use webidentity auth if flag is true but token file is not given right?
If that's the case then you should set config.setBoolean(S3_WEB_IDENTITY_ENABLED, webIdentityEnabled);
outside condition itself? Please verify
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right! Since web identity auth can work without explicitly setting the token file, S3_WEB_IDENTITY_ENABLED
should be set outside the condition. My bad, I'll update it accordingly.
- Updated `createAwsCredentialsProvider` to support Web Identity. - Modified `PrestoS3ConfigurationUpdater`, `S3ConfigurationUpdater`, and `PrestoS3FileSystem` for Web Identity auth. - Web Identity configs added `hive.s3.web-identity-token-file` and `hive.s3.webidentity.enabled` - Updated `TestHiveS3Config` to validate changes.
652de66
to
9e4c43b
Compare
Description
This PR enhances S3 Web Identity authentication by updating Presto’s security mapping to support Web Identity credentials. It introduces new configurations and ensures compatibility with existing S3 authentication mechanisms.
Changes
createAwsCredentialsProvider
to support Web Identity authentication.PrestoS3ConfigurationUpdater
,S3ConfigurationUpdater
, andPrestoS3FileSystem
.hive.s3.web-identity-token-file
hive.s3.webidentity.enabled
TestHiveS3Config
to validate Web Identity authentication.Motivation and Context
Impact
Test Plan
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.