-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Index Level Encryption plugin #12902
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Olasoji Denloye <olasoji.denloye@intel.com>
Signed-off-by: Olasoji Denloye <olasoji.denloye@intel.com>
Signed-off-by: Olasoji Denloye <olasoji.denloye@intel.com>
Signed-off-by: Olasoji Denloye <olasoji.denloye@intel.com>
Signed-off-by: Olasoji Denloye <olasoji.denloye@intel.com>
❌ Gradle check result for ded0f58: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
{"run-benchmark-test": "id_3"} |
The Jenkins job url is https://build.ci.opensearch.org/job/benchmark-pull-request/1810/ . Final results will be published once the job is completed. |
Benchmark ResultsBenchmark Results for Job: https://build.ci.opensearch.org/job/benchmark-pull-request/1810/
|
Benchmark Baseline Comparison ResultsBenchmark Results for Job: https://build.ci.opensearch.org/job/benchmark-compare/25/
|
Listing down my meeting notes with @asonje Tenets
High level items pending in this PR
Todo(@kumargu) |
@asonje checking if you have bandwidth to resume working on this. thanks. |
@kumargu yes i do. I am currently working on some optimizations and will update you once its ready |
} | ||
} | ||
|
||
private void storeWrappedKey(byte[] wrappedKey) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how is this thread-safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only called from within the constructor; the directory is not visible to other threads
* @param dest the new file name | ||
*/ | ||
@Override | ||
public void rename(String source, String dest) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how is this atomic? what happens when a node crashes while a rename is in progress?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is not atomic. if a crash happens during a rename operation, the ivMap will become inconsistent and would need to be fixed up on the next open. I will look into this
try { | ||
Cipher cipher = CipherFactory.getCipher(provider); | ||
String ivEntry = ivMap.get(getDirectory() + "/" + name); | ||
if (ivEntry == null) throw new IOException("failed to open file. " + name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should not be an IOException. This could happen if the ivMap has gone corrupted or there's a bug in populating the ivMap. I would wrap it in a RuntimeException
success = true; | ||
return indexInput; | ||
} finally { | ||
if (success == false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we avoid the usage of the success
flag by using try-with-resources (as below). I am scared that a concurrent thread may change the state of the success flag.
@Override
public IndexInput openInput(String name, IOContext context) throws IOException {
if (name.contains("segments_") || name.endsWith(".si")) {
return super.openInput(name, context);
}
ensureOpen();
ensureCanRead(name);
Path path = getDirectory().resolve(name);
try (FileChannel fc = FileChannel.open(path, StandardOpenOption.READ)) {
Cipher cipher = CipherFactory.getCipher(provider);
String ivEntry = ivMap.get(getDirectory() + "/" + name);
if (ivEntry == null) {
throw new IOException("failed to open file. " + name);
}
byte[] iv = Base64.getDecoder().decode(ivEntry);
CipherFactory.initCipher(cipher, this, Optional.of(iv), Cipher.DECRYPT_MODE, 0);
return new CryptoBufferedIndexInput("CryptoBufferedIndexInput(path=\"" + path + "\")", fc, context, cipher, this);
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
success
is a local variable and is not shared among threads. the try-with-resource pattern however should be ok
} catch (java.nio.file.NoSuchFileException fnfe) { | ||
|
||
} | ||
IndexOutput out = super.createOutput("ivMap", new IOContext()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds weird to me that we are doing a create
Output within a close
.
try { | ||
deleteFile("ivMap"); | ||
} catch (java.nio.file.NoSuchFileException fnfe) { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need to handle the error here.
out.writeMapOfStrings(ivMap); | ||
out.close(); | ||
isOpen = false; | ||
deletePendingFiles(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deletePendingFiles
is also synchronised.. can we get into a deadlock?
we need to use ReentrantLock lock or the lock from the lockFactory
random.nextBytes(iv); | ||
if (dataKey == null) throw new RuntimeException("dataKey is null!"); | ||
CipherFactory.initCipher(cipher, this, Optional.of(iv), Cipher.ENCRYPT_MODE, 0); | ||
ivMap.put(getDirectory() + "/" + name, Base64.getEncoder().encodeToString(iv)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need to make this operation thread-safe while reducing the scope of the lock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we immediately persist ivMap
to minimize the risk of loosing it.
ivMap.put(getDirectory() + "/" + name, Base64.getEncoder().encodeToString(iv));
IndexOutput out = super.createOutput("ivMap", new IOContext());
out.writeMapOfStrings(ivMap);
out.close();
public final class CryptoDirectory extends NIOFSDirectory { | ||
private Path location; | ||
private Key dataKey; | ||
private ConcurrentSkipListMap<String, String> ivMap; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we make this final
?
try { | ||
in = super.openInput("ivMap", new IOContext()); | ||
} catch (java.nio.file.NoSuchFileException nsfe) { | ||
in = null; | ||
} | ||
if (in != null) { | ||
Map<String, String> tmp = in.readMapOfStrings(); | ||
ivMap.putAll(tmp); | ||
in.close(); | ||
dataKey = new SecretKeySpec(keyProvider.decryptKey(getWrappedKey()), "AES"); | ||
} else { | ||
DataKeyPair dataKeyPair = keyProvider.generateDataPair(); | ||
dataKey = new SecretKeySpec(dataKeyPair.getRawKey(), "AES"); | ||
storeWrappedKey(dataKeyPair.getEncryptedKey()); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we move this logic to an init for clearer code?
public static CryptoDirectory create(LockFactory lockFactory, Path location, Provider provider, MasterKeyProvider keyProvider) throws IOException {
CryptoDirectory directory = new CryptoDirectory(lockFactory, location, provider);
directory.init(keyProvider);
return directory;
}
private CryptoDirectory(LockFactory lockFactory, Path location, Provider provider) throws IOException {
super(location, lockFactory);
this.location = location;
this.provider = provider;
this.ivMap = new ConcurrentSkipListMap<>();
}
/** | ||
* Creates a new CryptoDirectoryFactory | ||
*/ | ||
public CryptoDirectoryFactory() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where do we invoke the factory
?
@asonje we'll need to figure out how we can run the plugin against the performance benchmarking tool. Similarly, do you know if there's a way to run the Opensearch core integration tests with this plugin enabled? That would give us a lot of coverage on correctness. @cwperks to share any insights on above. |
} | ||
|
||
private byte[] getWrappedKey() { | ||
try (IndexInput in = super.openInput("keyfile", new IOContext())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IndexInput
is not thread-safe, can we have a contention with write?
*/ | ||
@Override | ||
public IndexInput openInput(String name, IOContext context) throws IOException { | ||
if (name.contains("segments_") || name.endsWith(".si")) return super.openInput(name, context); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's this special handling about? could we please add a comment on the intention?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it means files with this extension are not encrypted. The server reads them on startup
in = null; | ||
} | ||
if (in != null) { | ||
Map<String, String> tmp = in.readMapOfStrings(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we please rename tmp
to maybe loadedIVMap
?
* @opensearch.internal | ||
*/ | ||
public final class CryptoDirectory extends NIOFSDirectory { | ||
private Path location; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am surprised, this has zero usage.
boolean success = false; | ||
try { | ||
Cipher cipher = CipherFactory.getCipher(provider); | ||
String ivEntry = ivMap.get(getDirectory() + "/" + name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wondering ivMap.get(location)
would have same results?
#12472 is still a blocker for this feature |
Do you just want the plugin to be installed? I don't think that would tell much if the plugin is installed, but not configured. Typically for plugin repos they would write their own integration tests to test out core functionality of the plugin. For example with security, their are integ tests that test different types of security configurations whether is be different SSL configurations or more general security settings like testing with different roles mappings that contains permutations on FLS/DLS/FieldMasking. I can route you to some examples of how to setup Github Actions to run integ tests w/ a plugin installed. |
This is a more involved example (which includes setup w/ security), but this is how ISM creates a testcluster to run integ tests against: https://github.com/opensearch-project/index-management/blob/main/build.gradle#L347-L413 When the In addition to the setup in |
One thing to consider when introducing a new plugin is whether to build it directly in the core repo (native plugin) vs creating a separate repository. If a plugin is introduced in a separate repository it can still be included in the default distribution of OpenSearch. Some advantages of a separate repo is having a separate set of maintainers that are specialized in that plugin. The core is a big repo and each maintainer may be specialized in a different area of the core. Another advantage of a separate repo is being able to setup GH actions specific to that plugin's tests. That being said, the drawbacks of that approach may be that there could be a bit of scaffolding to get setup. Any plugin included in the default distribution would need to have an active set of maintainers. |
I think we will separate it into a different repo (it will be easier to maintain and develop). I am fine for now to get it shipped in this state and open a different to move it to a different repo. |
try { | ||
deleteFile("ivMap"); | ||
} catch (java.nio.file.NoSuchFileException fnfe) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, I don't understand the intention here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on close the ivMap is written to disk. This overwrites the existing file with the live version. perhaps the ivMap can be synced more frequently
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps the ivMap can be synced more frequently.
Yeah. we cannot afford loosing the ivMap. We can probably write ivMap to disk, read back, cache and use for enc/dec. That way we ensure durability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
idea: we can use mmap
to directly write ivMap contents to disk for better durability.
* | ||
* @opensearch.internal | ||
*/ | ||
final class CryptoBufferedIndexInput extends BufferedIndexInput { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we please move it to a different class of its own?
res = cipher.update(b, offset, chunk); | ||
if (res != null) out.write(res); | ||
} catch (IllegalStateException e) { | ||
throw new IllegalStateException("count is " + count + " " + e.getMessage()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think count
variable adds much value.
byte[] res; | ||
while (length > 0) { | ||
count++; | ||
final int chunk = Math.min(length, CHUNK_SIZE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Buffering may be ineffective for small writes. Maybe we can maintain an enum of variable CHUNK sizes and buffer smaller writes.
private enum ChunkSize {
LARGE(8192),
MEDIUM(4096),
SMALL(1024);
private final int size;
ChunkSize(int size) {
this.size = size;
}
public int getSize() {
return size;
}
public static ChunkSize bestPossibleBufferLength(int length) {
if (length >= LARGE.size) return LARGE;
if (length >= MEDIUM.size) return MEDIUM;
return SMALL;
}
}
ChunkSize chunkSize = ChunkSize.bestPossibleBufferLength(length);
int bytesToWrite = Math.min(chunkSize.getSize(), length);
final class CryptoBufferedIndexInput extends BufferedIndexInput { | ||
/** The maximum chunk size for reads of 16384 bytes. */ | ||
private static final int CHUNK_SIZE = 16384; | ||
ByteBuffer tmpBuffer = ByteBuffer.allocate(CHUNK_SIZE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we can use directBuffers
here and that would result in sequential IO.
https://stackoverflow.com/questions/5670862/bytebuffer-allocate-vs-bytebuffer-allocatedirect
this.directBuffer = ByteBuffer.allocateDirect(CHUNK_SIZE);
(note to self: should we make the CHUNK_SIZE configurable?)
} | ||
|
||
@SuppressForbidden(reason = "FileChannel#read is a faster alternative to synchronized block") | ||
private int read(ByteBuffer dst, long position) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with directBuffers, read would look slightly different.
private int read(ByteBuffer dst, long position) throws IOException {
directBuffer.clear().limit(dst.remaining());
int bytesRead = channel.read(directBuffer, position);
if (bytesRead > 0) {
directBuffer.flip();
try {
int decrypted = cipher.update(directBuffer, dst);
// handle logic, when we have reached the end-of-file.
if (position + bytesRead == end) {
...
}
return decrypted;
} catch (..) {
}
}
return bytesRead;
}
super.rename(source, dest); | ||
if (!(source.contains("segments_") || source.endsWith(".si"))) ivMap.put( | ||
getDirectory() + "/" + dest, | ||
ivMap.remove(getDirectory() + "/" + source) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should add a null check on ivMap.remove()
* @param name the name of the file to be deleted | ||
*/ | ||
@Override | ||
public void deleteFile(String name) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this will also cover files deleted during segment merges?
*/ | ||
@Override | ||
public IndexOutput createOutput(String name, IOContext context) throws IOException { | ||
if (name.contains("segments_") || name.endsWith(".si")) return super.createOutput(name, context); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should also skip write.lock
and .del
?
https://lucene.apache.org/core/3_0_3/fileformats.html#Lock%20File
https://lucene.apache.org/core/3_0_3/fileformats.html#Deleted%20Documents
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we are also encrypting ivmap
? i think we should be skip encrypting ivMap.
we will need a integration test to test ivMap
is been loaded correctly post a restart.
* @param path the shard file path | ||
*/ | ||
@Override | ||
public Directory newDirectory(IndexSettings indexSettings, ShardPath path) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i guess, no special handling is required on index deletion?
Another thing to consider is adding this feature to the security plugin: https://github.com/opensearch-project/security You would get all of the existing test infrastructure that exists in the security repo and I think it would be a natural feature of the plugin. That being said, you may want this to be a separate plugin if the index-level encryption should be available to clusters regardless if using the full features of the security plugin or not. Just because the security plugin is installed doesn't mean a cluster needs to be full FGAC. You can disable security and have the plugin installed, or you can run in SSL only mode (no FGAC) so we can build this into the security plugin and provide support regardless if a cluster uses all features of the security plugin or not. The security plugin also has a companion dashboards-plugin for configuration through OpenSearch-Dashboards. If there are plans to support configuring index-level encryption through OpenSearch Dashboards, then that could be built into the companion dashboards plugin. |
Yeah, that's my mental model. You can have index-level enc enabled without having security plugin (features) and vice versa. I am leaning towards bundling this as a different plugin; how hard is it to setup the test infra? |
@cwperks let's discuss this with more folks during security-triage-meeting. |
I think you're right. This makes more sense as a standalone plugin. There is a template repo to help get bootstrapped. I will allocate some time after 2.19 code freeze to demonstrate creating a plugin repo and writing a few integ tests w/ different configurations. |
CryptoDirectory(LockFactory lockFactory, Path location, Provider provider, MasterKeyProvider keyProvider) throws IOException { | ||
super(location, lockFactory); | ||
this.location = location; | ||
ivMap = new ConcurrentSkipListMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ivMap will auto shrink on segment merge? asking because I was initially concerned about size of ivMap with risk of collisions and lock contention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
discussed offline: @asonje will run some instrumentation to verify segment merges are happening as expected and ivmap entry for deleted files are removed from on segment merges.
@asonje: I started taking a look on snapshots would work for us. Could you please help with a few clarifications?
(edit: discussed offline: Both |
Description
This pull request adds index level encryption features to OpenSearch based on the issue #3469. Each OpenSearch index is individually encrypted based on user provided encryption keys. A new cryptofs store type index.store.type is introduced which instantiates a CryptoDirectory that encrypts and decrypts files as they are written and read respectively
Related Issues
Resolves #3469
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.