
Index Level Encryption plugin #12902

Open · wants to merge 12 commits into `main`
Conversation

@asonje commented Mar 25, 2024

Description

This pull request adds index-level encryption to OpenSearch, as proposed in issue #3469. Each OpenSearch index is individually encrypted with user-provided encryption keys. A new `cryptofs` value for the `index.store.type` setting is introduced, which instantiates a `CryptoDirectory` that encrypts and decrypts files as they are written and read, respectively.
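The core idea can be sketched in plain JCE terms: a per-index data key plus a per-file IV, with AES in CTR mode so that files written encrypted can later be read back (CTR also permits random access, which Lucene needs). This is an illustrative sketch, not the PR's code; all names here are hypothetical.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;

public class CtrRoundTrip {
    public static void main(String[] args) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(256);
        SecretKey dataKey = kg.generateKey();      // stands in for the per-index data key

        byte[] iv = new byte[16];                  // per-file IV, stored alongside the file
        new SecureRandom().nextBytes(iv);

        byte[] plain = "lucene segment bytes".getBytes(StandardCharsets.UTF_8);

        // write path: encrypt before the bytes hit disk
        Cipher enc = Cipher.getInstance("AES/CTR/NoPadding");
        enc.init(Cipher.ENCRYPT_MODE, dataKey, new IvParameterSpec(iv));
        byte[] ciphertext = enc.doFinal(plain);

        // read path: decrypt with the same key and IV
        Cipher dec = Cipher.getInstance("AES/CTR/NoPadding");
        dec.init(Cipher.DECRYPT_MODE, dataKey, new IvParameterSpec(iv));
        byte[] back = dec.doFinal(ciphertext);

        System.out.println(Arrays.equals(plain, back)); // true
    }
}
```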

Related Issues

Resolves #3469

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

asonje added 4 commits March 25, 2024 09:23
Signed-off-by: Olasoji Denloye <olasoji.denloye@intel.com>
github-actions bot (Contributor) commented Dec 5, 2024

❌ Gradle check result for ded0f58: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@asonje (Author) commented Dec 5, 2024

{"run-benchmark-test": "id_3"}

github-actions bot (Contributor) commented Dec 5, 2024

The Jenkins job URL is https://build.ci.opensearch.org/job/benchmark-pull-request/1810/. Final results will be published once the job is completed.

@opensearch-ci-bot (Collaborator)

Benchmark Results

Benchmark Results for Job: https://build.ci.opensearch.org/job/benchmark-pull-request/1810/

| Metric | Task | Value | Unit |
|---|---|---|---|
| Cumulative indexing time of primary shards | | 0 | min |
| Min cumulative indexing time across primary shards | | 0 | min |
| Median cumulative indexing time across primary shards | | 0 | min |
| Max cumulative indexing time across primary shards | | 0 | min |
| Cumulative indexing throttle time of primary shards | | 0 | min |
| Min cumulative indexing throttle time across primary shards | | 0 | min |
| Median cumulative indexing throttle time across primary shards | | 0 | min |
| Max cumulative indexing throttle time across primary shards | | 0 | min |
| Cumulative merge time of primary shards | | 0 | min |
| Cumulative merge count of primary shards | | 0 | |
| Min cumulative merge time across primary shards | | 0 | min |
| Median cumulative merge time across primary shards | | 0 | min |
| Max cumulative merge time across primary shards | | 0 | min |
| Cumulative merge throttle time of primary shards | | 0 | min |
| Min cumulative merge throttle time across primary shards | | 0 | min |
| Median cumulative merge throttle time across primary shards | | 0 | min |
| Max cumulative merge throttle time across primary shards | | 0 | min |
| Cumulative refresh time of primary shards | | 0 | min |
| Cumulative refresh count of primary shards | | 2 | |
| Min cumulative refresh time across primary shards | | 0 | min |
| Median cumulative refresh time across primary shards | | 0 | min |
| Max cumulative refresh time across primary shards | | 0 | min |
| Cumulative flush time of primary shards | | 0 | min |
| Cumulative flush count of primary shards | | 1 | |
| Min cumulative flush time across primary shards | | 0 | min |
| Median cumulative flush time across primary shards | | 0 | min |
| Max cumulative flush time across primary shards | | 0 | min |
| Total Young Gen GC time | | 0.333 | s |
| Total Young Gen GC count | | 13 | |
| Total Old Gen GC time | | 0 | s |
| Total Old Gen GC count | | 0 | |
| Store size | | 23.6241 | GB |
| Translog size | | 5.12227e-08 | GB |
| Heap used for segments | | 0 | MB |
| Heap used for doc values | | 0 | MB |
| Heap used for terms | | 0 | MB |
| Heap used for norms | | 0 | MB |
| Heap used for points | | 0 | MB |
| Heap used for stored fields | | 0 | MB |
| Segment count | | 32 | |
| Min Throughput | wait-for-snapshot-recovery | 4.18848e+07 | byte/s |
| Mean Throughput | wait-for-snapshot-recovery | 4.18848e+07 | byte/s |
| Median Throughput | wait-for-snapshot-recovery | 4.18848e+07 | byte/s |
| Max Throughput | wait-for-snapshot-recovery | 4.18848e+07 | byte/s |
| 100th percentile latency | wait-for-snapshot-recovery | 600574 | ms |
| 100th percentile service time | wait-for-snapshot-recovery | 600574 | ms |
| error rate | wait-for-snapshot-recovery | 0 | % |
| Min Throughput | default | 3.01 | ops/s |
| Mean Throughput | default | 3.02 | ops/s |
| Median Throughput | default | 3.02 | ops/s |
| Max Throughput | default | 3.04 | ops/s |
| 50th percentile latency | default | 7.2419 | ms |
| 90th percentile latency | default | 8.27506 | ms |
| 99th percentile latency | default | 8.72156 | ms |
| 100th percentile latency | default | 8.79778 | ms |
| 50th percentile service time | default | 6.12717 | ms |
| 90th percentile service time | default | 6.93989 | ms |
| 99th percentile service time | default | 7.81181 | ms |
| 100th percentile service time | default | 7.84552 | ms |
| error rate | default | 0 | % |
| Min Throughput | range | 0.7 | ops/s |
| Mean Throughput | range | 0.7 | ops/s |
| Median Throughput | range | 0.7 | ops/s |
| Max Throughput | range | 0.71 | ops/s |
| 50th percentile latency | range | 159.906 | ms |
| 90th percentile latency | range | 163.337 | ms |
| 99th percentile latency | range | 174.24 | ms |
| 100th percentile latency | range | 182.367 | ms |
| 50th percentile service time | range | 156.87 | ms |
| 90th percentile service time | range | 160.573 | ms |
| 99th percentile service time | range | 170.992 | ms |
| 100th percentile service time | range | 178.964 | ms |
| error rate | range | 0 | % |
| Min Throughput | distance_amount_agg | 0.07 | ops/s |
| Mean Throughput | distance_amount_agg | 0.07 | ops/s |
| Median Throughput | distance_amount_agg | 0.07 | ops/s |
| Max Throughput | distance_amount_agg | 0.07 | ops/s |
| 50th percentile latency | distance_amount_agg | 1.32056e+06 | ms |
| 90th percentile latency | distance_amount_agg | 1.84544e+06 | ms |
| 99th percentile latency | distance_amount_agg | 1.96348e+06 | ms |
| 100th percentile latency | distance_amount_agg | 1.97006e+06 | ms |
| 50th percentile service time | distance_amount_agg | 13592.9 | ms |
| 90th percentile service time | distance_amount_agg | 13707.9 | ms |
| 99th percentile service time | distance_amount_agg | 13806.9 | ms |
| 100th percentile service time | distance_amount_agg | 13817.7 | ms |
| error rate | distance_amount_agg | 0 | % |
| Min Throughput | autohisto_agg | 1.51 | ops/s |
| Mean Throughput | autohisto_agg | 1.51 | ops/s |
| Median Throughput | autohisto_agg | 1.51 | ops/s |
| Max Throughput | autohisto_agg | 1.52 | ops/s |
| 50th percentile latency | autohisto_agg | 13.9108 | ms |
| 90th percentile latency | autohisto_agg | 14.7679 | ms |
| 99th percentile latency | autohisto_agg | 21.0951 | ms |
| 100th percentile latency | autohisto_agg | 23.0766 | ms |
| 50th percentile service time | autohisto_agg | 12.3491 | ms |
| 90th percentile service time | autohisto_agg | 13.0162 | ms |
| 99th percentile service time | autohisto_agg | 19.8559 | ms |
| 100th percentile service time | autohisto_agg | 21.7752 | ms |
| error rate | autohisto_agg | 0 | % |
| Min Throughput | date_histogram_agg | 1.51 | ops/s |
| Mean Throughput | date_histogram_agg | 1.52 | ops/s |
| Median Throughput | date_histogram_agg | 1.51 | ops/s |
| Max Throughput | date_histogram_agg | 1.53 | ops/s |
| 50th percentile latency | date_histogram_agg | 13.3376 | ms |
| 90th percentile latency | date_histogram_agg | 13.855 | ms |
| 99th percentile latency | date_histogram_agg | 28.4331 | ms |
| 100th percentile latency | date_histogram_agg | 37.9943 | ms |
| 50th percentile service time | date_histogram_agg | 11.9196 | ms |
| 90th percentile service time | date_histogram_agg | 12.2359 | ms |
| 99th percentile service time | date_histogram_agg | 27.0351 | ms |
| 100th percentile service time | date_histogram_agg | 36.3977 | ms |
| error rate | date_histogram_agg | 0 | % |
| Min Throughput | desc_sort_tip_amount | 0.5 | ops/s |
| Mean Throughput | desc_sort_tip_amount | 0.5 | ops/s |
| Median Throughput | desc_sort_tip_amount | 0.5 | ops/s |
| Max Throughput | desc_sort_tip_amount | 0.51 | ops/s |
| 50th percentile latency | desc_sort_tip_amount | 35.4273 | ms |
| 90th percentile latency | desc_sort_tip_amount | 36.0668 | ms |
| 99th percentile latency | desc_sort_tip_amount | 41.6973 | ms |
| 100th percentile latency | desc_sort_tip_amount | 42.6475 | ms |
| 50th percentile service time | desc_sort_tip_amount | 32.7031 | ms |
| 90th percentile service time | desc_sort_tip_amount | 33.1555 | ms |
| 99th percentile service time | desc_sort_tip_amount | 38.9371 | ms |
| 100th percentile service time | desc_sort_tip_amount | 39.7858 | ms |
| error rate | desc_sort_tip_amount | 0 | % |
| Min Throughput | asc_sort_tip_amount | 0.5 | ops/s |
| Mean Throughput | asc_sort_tip_amount | 0.51 | ops/s |
| Median Throughput | asc_sort_tip_amount | 0.5 | ops/s |
| Max Throughput | asc_sort_tip_amount | 0.51 | ops/s |
| 50th percentile latency | asc_sort_tip_amount | 8.78618 | ms |
| 90th percentile latency | asc_sort_tip_amount | 9.31272 | ms |
| 99th percentile latency | asc_sort_tip_amount | 10.2617 | ms |
| 100th percentile latency | asc_sort_tip_amount | 10.2739 | ms |
| 50th percentile service time | asc_sort_tip_amount | 5.91895 | ms |
| 90th percentile service time | asc_sort_tip_amount | 6.25749 | ms |
| 99th percentile service time | asc_sort_tip_amount | 7.16934 | ms |
| 100th percentile service time | asc_sort_tip_amount | 7.17124 | ms |
| error rate | asc_sort_tip_amount | 0 | % |

@opensearch-ci-bot (Collaborator)

Benchmark Baseline Comparison Results

Benchmark Results for Job: https://build.ci.opensearch.org/job/benchmark-compare/25/

| Metric | Task | Baseline | Contender | Diff | Unit |
|---|---|---|---|---|---|
| Cumulative indexing time of primary shards | | 0 | 0 | 0 | min |
| Min cumulative indexing time across primary shard | | 0 | 0 | 0 | min |
| Median cumulative indexing time across primary shard | | 0 | 0 | 0 | min |
| Max cumulative indexing time across primary shard | | 0 | 0 | 0 | min |
| Cumulative indexing throttle time of primary shards | | 0 | 0 | 0 | min |
| Min cumulative indexing throttle time across primary shard | | 0 | 0 | 0 | min |
| Median cumulative indexing throttle time across primary shard | | 0 | 0 | 0 | min |
| Max cumulative indexing throttle time across primary shard | | 0 | 0 | 0 | min |
| Cumulative merge time of primary shards | | 0 | 0 | 0 | min |
| Cumulative merge count of primary shards | | 0 | 0 | 0 | |
| Min cumulative merge time across primary shard | | 0 | 0 | 0 | min |
| Median cumulative merge time across primary shard | | 0 | 0 | 0 | min |
| Max cumulative merge time across primary shard | | 0 | 0 | 0 | min |
| Cumulative merge throttle time of primary shards | | 0 | 0 | 0 | min |
| Min cumulative merge throttle time across primary shard | | 0 | 0 | 0 | min |
| Median cumulative merge throttle time across primary shard | | 0 | 0 | 0 | min |
| Max cumulative merge throttle time across primary shard | | 0 | 0 | 0 | min |
| Cumulative refresh time of primary shards | | 0 | 0 | 0 | min |
| Cumulative refresh count of primary shards | | 2 | 2 | 0 | |
| Min cumulative refresh time across primary shard | | 0 | 0 | 0 | min |
| Median cumulative refresh time across primary shard | | 0 | 0 | 0 | min |
| Max cumulative refresh time across primary shard | | 0 | 0 | 0 | min |
| Cumulative flush time of primary shards | | 0 | 0 | 0 | min |
| Cumulative flush count of primary shards | | 1 | 1 | 0 | |
| Min cumulative flush time across primary shard | | 0 | 0 | 0 | min |
| Median cumulative flush time across primary shard | | 0 | 0 | 0 | min |
| Max cumulative flush time across primary shard | | 0 | 0 | 0 | min |
| Total Young Gen GC time | | 0.325 | 0.333 | 0.008 | s |
| Total Young Gen GC count | | 13 | 13 | 0 | |
| Total Old Gen GC time | | 0 | 0 | 0 | s |
| Total Old Gen GC count | | 0 | 0 | 0 | |
| Store size | | 23.6241 | 23.6241 | 0 | GB |
| Translog size | | 5.12227e-08 | 5.12227e-08 | 0 | GB |
| Heap used for segments | | 0 | 0 | 0 | MB |
| Heap used for doc values | | 0 | 0 | 0 | MB |
| Heap used for terms | | 0 | 0 | 0 | MB |
| Heap used for norms | | 0 | 0 | 0 | MB |
| Heap used for points | | 0 | 0 | 0 | MB |
| Heap used for stored fields | | 0 | 0 | 0 | MB |
| Segment count | | 32 | 32 | 0 | |
| Min Throughput | wait-for-snapshot-recovery | 4.18863e+07 | 4.18848e+07 | -1520 | byte/s |
| Mean Throughput | wait-for-snapshot-recovery | 4.18863e+07 | 4.18848e+07 | -1520 | byte/s |
| Median Throughput | wait-for-snapshot-recovery | 4.18863e+07 | 4.18848e+07 | -1520 | byte/s |
| Max Throughput | wait-for-snapshot-recovery | 4.18863e+07 | 4.18848e+07 | -1520 | byte/s |
| 100th percentile latency | wait-for-snapshot-recovery | 600592 | 600574 | -18.3125 | ms |
| 100th percentile service time | wait-for-snapshot-recovery | 600592 | 600574 | -18.3125 | ms |
| error rate | wait-for-snapshot-recovery | 0 | 0 | 0 | % |
| Min Throughput | default | 3.01355 | 3.01317 | -0.00038 | ops/s |
| Mean Throughput | default | 3.02202 | 3.02142 | -0.0006 | ops/s |
| Median Throughput | default | 3.02008 | 3.0195 | -0.00058 | ops/s |
| Max Throughput | default | 3.03885 | 3.03768 | -0.00117 | ops/s |
| 50th percentile latency | default | 6.30176 | 7.2419 | 0.94013 | ms |
| 90th percentile latency | default | 7.23365 | 8.27506 | 1.04142 | ms |
| 99th percentile latency | default | 8.1162 | 8.72156 | 0.60537 | ms |
| 100th percentile latency | default | 8.53889 | 8.79778 | 0.25889 | ms |
| 50th percentile service time | default | 5.29412 | 6.12717 | 0.83305 | ms |
| 90th percentile service time | default | 6.00782 | 6.93989 | 0.93207 | ms |
| 99th percentile service time | default | 6.91444 | 7.81181 | 0.89737 | ms |
| 100th percentile service time | default | 7.18129 | 7.84552 | 0.66423 | ms |
| error rate | default | 0 | 0 | 0 | % |
| Min Throughput | range | 0.702121 | 0.702159 | 4e-05 | ops/s |
| Mean Throughput | range | 0.703487 | 0.703551 | 6e-05 | ops/s |
| Median Throughput | range | 0.703176 | 0.703235 | 6e-05 | ops/s |
| Max Throughput | range | 0.706283 | 0.706399 | 0.00012 | ops/s |
| 50th percentile latency | range | 159.567 | 159.906 | 0.33825 | ms |
| 90th percentile latency | range | 163.17 | 163.337 | 0.16733 | ms |
| 99th percentile latency | range | 168.481 | 174.24 | 5.75862 | ms |
| 100th percentile latency | range | 169.503 | 182.367 | 12.8635 | ms |
| 50th percentile service time | range | 157.297 | 156.87 | -0.42642 | ms |
| 90th percentile service time | range | 160.425 | 160.573 | 0.14827 | ms |
| 99th percentile service time | range | 165.635 | 170.992 | 5.3576 | ms |
| 100th percentile service time | range | 166.422 | 178.964 | 12.5422 | ms |
| error rate | range | 0 | 0 | 0 | % |
| Min Throughput | distance_amount_agg | 0.0745231 | 0.0733222 | -0.0012 | ops/s |
| Mean Throughput | distance_amount_agg | 0.0750837 | 0.0733966 | -0.00169 | ops/s |
| Median Throughput | distance_amount_agg | 0.0749722 | 0.0734084 | -0.00156 | ops/s |
| Max Throughput | distance_amount_agg | 0.0757835 | 0.0734192 | -0.00236 | ops/s |
| 50th percentile latency | distance_amount_agg | 1.29251e+06 | 1.32056e+06 | 28054.8 | ms |
| 90th percentile latency | distance_amount_agg | 1.81601e+06 | 1.84544e+06 | 29427.4 | ms |
| 99th percentile latency | distance_amount_agg | 1.93337e+06 | 1.96348e+06 | 30107.9 | ms |
| 100th percentile latency | distance_amount_agg | 1.9399e+06 | 1.97006e+06 | 30158.4 | ms |
| 50th percentile service time | distance_amount_agg | 13599.7 | 13592.9 | -6.83984 | ms |
| 90th percentile service time | distance_amount_agg | 13728.3 | 13707.9 | -20.3667 | ms |
| 99th percentile service time | distance_amount_agg | 13891.7 | 13806.9 | -84.8272 | ms |
| 100th percentile service time | distance_amount_agg | 13921.2 | 13817.7 | -103.477 | ms |
| error rate | distance_amount_agg | 0 | 0 | 0 | % |
| Min Throughput | autohisto_agg | 1.50711 | 1.50695 | -0.00016 | ops/s |
| Mean Throughput | autohisto_agg | 1.51174 | 1.51147 | -0.00027 | ops/s |
| Median Throughput | autohisto_agg | 1.5107 | 1.51044 | -0.00026 | ops/s |
| Max Throughput | autohisto_agg | 1.52112 | 1.52062 | -0.00049 | ops/s |
| 50th percentile latency | autohisto_agg | 13.8174 | 13.9108 | 0.09342 | ms |
| 90th percentile latency | autohisto_agg | 14.4257 | 14.7679 | 0.34225 | ms |
| 99th percentile latency | autohisto_agg | 20.6309 | 21.0951 | 0.4642 | ms |
| 100th percentile latency | autohisto_agg | 20.8432 | 23.0766 | 2.2334 | ms |
| 50th percentile service time | autohisto_agg | 12.2654 | 12.3491 | 0.08371 | ms |
| 90th percentile service time | autohisto_agg | 12.9628 | 13.0162 | 0.05343 | ms |
| 99th percentile service time | autohisto_agg | 19.0487 | 19.8559 | 0.8072 | ms |
| 100th percentile service time | autohisto_agg | 19.0793 | 21.7752 | 2.6959 | ms |
| error rate | autohisto_agg | 0 | 0 | 0 | % |
| Min Throughput | date_histogram_agg | 1.5097 | 1.50961 | -8e-05 | ops/s |
| Mean Throughput | date_histogram_agg | 1.51602 | 1.51589 | -0.00013 | ops/s |
| Median Throughput | date_histogram_agg | 1.51459 | 1.51446 | -0.00013 | ops/s |
| Max Throughput | date_histogram_agg | 1.52884 | 1.52864 | -0.0002 | ops/s |
| 50th percentile latency | date_histogram_agg | 13.7754 | 13.3376 | -0.43773 | ms |
| 90th percentile latency | date_histogram_agg | 14.3006 | 13.855 | -0.44553 | ms |
| 99th percentile latency | date_histogram_agg | 15.9162 | 28.4331 | 12.5168 | ms |
| 100th percentile latency | date_histogram_agg | 16.4133 | 37.9943 | 21.5809 | ms |
| 50th percentile service time | date_histogram_agg | 12.3155 | 11.9196 | -0.39588 | ms |
| 90th percentile service time | date_histogram_agg | 12.6337 | 12.2359 | -0.39774 | ms |
| 99th percentile service time | date_histogram_agg | 14.3141 | 27.0351 | 12.721 | ms |
| 100th percentile service time | date_histogram_agg | 15.1024 | 36.3977 | 21.2953 | ms |
| error rate | date_histogram_agg | 0 | 0 | 0 | % |
| Min Throughput | desc_sort_tip_amount | 0.502272 | 0.502143 | -0.00013 | ops/s |
| Mean Throughput | desc_sort_tip_amount | 0.503736 | 0.503524 | -0.00021 | ops/s |
| Median Throughput | desc_sort_tip_amount | 0.503399 | 0.503206 | -0.00019 | ops/s |
| Max Throughput | desc_sort_tip_amount | 0.506743 | 0.50636 | -0.00038 | ops/s |
| 50th percentile latency | desc_sort_tip_amount | 35.634 | 35.4273 | -0.20672 | ms |
| 90th percentile latency | desc_sort_tip_amount | 36.4391 | 36.0668 | -0.37232 | ms |
| 99th percentile latency | desc_sort_tip_amount | 42.9616 | 41.6973 | -1.26432 | ms |
| 100th percentile latency | desc_sort_tip_amount | 45.5025 | 42.6475 | -2.85496 | ms |
| 50th percentile service time | desc_sort_tip_amount | 32.9943 | 32.7031 | -0.29126 | ms |
| 90th percentile service time | desc_sort_tip_amount | 33.5623 | 33.1555 | -0.40684 | ms |
| 99th percentile service time | desc_sort_tip_amount | 40.3316 | 38.9371 | -1.39457 | ms |
| 100th percentile service time | desc_sort_tip_amount | 42.2373 | 39.7858 | -2.45145 | ms |
| error rate | desc_sort_tip_amount | 0 | 0 | 0 | % |
| Min Throughput | asc_sort_tip_amount | 0.503111 | 0.503087 | -2e-05 | ops/s |
| Mean Throughput | asc_sort_tip_amount | 0.505123 | 0.50508 | -4e-05 | ops/s |
| Median Throughput | asc_sort_tip_amount | 0.504659 | 0.50462 | -4e-05 | ops/s |
| Max Throughput | asc_sort_tip_amount | 0.509265 | 0.509189 | -8e-05 | ops/s |
| 50th percentile latency | asc_sort_tip_amount | 8.64147 | 8.78618 | 0.14471 | ms |
| 90th percentile latency | asc_sort_tip_amount | 9.20375 | 9.31272 | 0.10897 | ms |
| 99th percentile latency | asc_sort_tip_amount | 21.0795 | 10.2617 | -10.8178 | ms |
| 100th percentile latency | asc_sort_tip_amount | 32.0886 | 10.2739 | -21.8147 | ms |
| 50th percentile service time | asc_sort_tip_amount | 5.80252 | 5.91895 | 0.11642 | ms |
| 90th percentile service time | asc_sort_tip_amount | 6.20851 | 6.25749 | 0.04897 | ms |
| 99th percentile service time | asc_sort_tip_amount | 18.1958 | 7.16934 | -11.0265 | ms |
| 100th percentile service time | asc_sort_tip_amount | 29.2886 | 7.17124 | -22.1174 | ms |
| error rate | asc_sort_tip_amount | 0 | 0 | 0 | % |

@kumargu (Contributor) commented Dec 16, 2024

Notes from my meeting with @asonje:

Tenets

  1. Performance: There should be no more than single-digit-millisecond impact to search queries (let's keep that as a goal and iterate backwards from there). Similarly, there should be no more than 5% impact to indexing throughput.
  2. There should be minimal impact to memory, disk, and CPU usage.
  3. When it comes to security, data at rest in all known storage locations should be encrypted.
  4. All existing features must continue to work with this feature.
  5. Key (master key) rotation may not be a requirement right now, but design decisions must be made with the view that a key rotation feature can be supported, ideally without the need to reindex data (take motivation from the Solr implementation).

High level items pending in this PR

  1. Bring in optimizations to reduce search latencies (which are currently above the expected baselines).
  2. Design ideas on how master key rotation via a customer action would not require reindexing.
  3. Check whether the crypto FS can be deleted if the customer has lost their master key (this is an important security requirement). However, there should be safeguards in place to verify that the caller may delete data for the lost key.
  4. Snapshots in remote stores (S3, Azure Blob, etc.) of an encrypted index must also be encrypted with the same CMK with which the index was encrypted at rest in OpenSearch. This is a must-have requirement; without it we lose encryption at rest at the remote store, since any user could read all the data from the snapshot even when they are not supposed to.
  5. Ability to recover from snapshots (should work by default).
  6. When using remote-backed storage, the indices to be encrypted must also be encrypted at rest in the remote storage.
  7. Support for FIPS (@asonje confirmed it is already supported).
  8. Backoff when KMS is down.

Todo

(@kumargu)
Write a doc describing how fscrypt could be useful here, also listing its limitations. At a high level, I think fscrypt can meet the performance requirement and give us the ability to delete the crypto FS when a key is lost.

@kumargu (Contributor) commented Jan 6, 2025

@asonje, checking if you have bandwidth to resume working on this. Thanks.

@asonje (Author) commented Jan 7, 2025

@kumargu Yes, I do. I am currently working on some optimizations and will update you once they are ready.

```java
    }
}

private void storeWrappedKey(byte[] wrappedKey) {
```
Contributor:

how is this thread-safe?

Author:

This is only called from within the constructor; the directory is not visible to other threads

```java
 * @param dest the new file name
 */
@Override
public void rename(String source, String dest) throws IOException {
```
Contributor:

how is this atomic? what happens when a node crashes while a rename is in progress?

Author:

It is not atomic. If a crash happens during a rename operation, the ivMap will become inconsistent and would need to be fixed up on the next open. I will look into this.
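For the file move itself, `java.nio` can at least make the on-disk rename atomic; the remaining gap is then only the IV bookkeeping around it. A minimal sketch (illustrative names, not the PR's code):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class AtomicRenameSketch {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("crypto");
        Path src = dir.resolve("_0.tmp");
        Files.write(src, "data".getBytes(StandardCharsets.UTF_8));

        Path dst = dir.resolve("_0.cfs");
        // ATOMIC_MOVE is atomic within one filesystem and throws
        // AtomicMoveNotSupportedException otherwise, so a crash leaves
        // either the old name or the new name, never a torn file.
        Files.move(src, dst, StandardCopyOption.ATOMIC_MOVE);

        System.out.println(Files.exists(dst) && !Files.exists(src));
    }
}
```

One recovery strategy is to keep the ivMap entry under the old name until the entry under the new name has been durably written, so a crash between the two steps is detectable on the next open.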

```java
try {
    Cipher cipher = CipherFactory.getCipher(provider);
    String ivEntry = ivMap.get(getDirectory() + "/" + name);
    if (ivEntry == null) throw new IOException("failed to open file. " + name);
```
@kumargu (Contributor), Jan 13, 2025:

This should not be an IOException. It could happen if the ivMap has become corrupted or there's a bug in populating the ivMap; I would wrap it in a RuntimeException.

```java
success = true;
return indexInput;
} finally {
    if (success == false) {
```
Contributor:

Can we avoid the usage of the success flag by using try-with-resources (as below)? I am scared that a concurrent thread may change the state of the success flag.

```java
@Override
public IndexInput openInput(String name, IOContext context) throws IOException {
    if (name.contains("segments_") || name.endsWith(".si")) {
        return super.openInput(name, context);
    }

    ensureOpen();
    ensureCanRead(name);
    Path path = getDirectory().resolve(name);

    try (FileChannel fc = FileChannel.open(path, StandardOpenOption.READ)) {
        Cipher cipher = CipherFactory.getCipher(provider);
        String ivEntry = ivMap.get(getDirectory() + "/" + name);
        if (ivEntry == null) {
            throw new IOException("failed to open file. " + name);
        }
        byte[] iv = Base64.getDecoder().decode(ivEntry);
        CipherFactory.initCipher(cipher, this, Optional.of(iv), Cipher.DECRYPT_MODE, 0);

        return new CryptoBufferedIndexInput("CryptoBufferedIndexInput(path=\"" + path + "\")", fc, context, cipher, this);
    }
}
```

Author:

`success` is a local variable and is not shared among threads. The try-with-resources pattern, however, should be OK.
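The `0` passed as the last argument to the PR's `CipherFactory.initCipher` appears to be a file offset, which hints at why CTR mode fits `IndexInput`: the counter can be advanced to any block without decrypting what came before. A self-contained sketch of that seek arithmetic (all names here are illustrative, not from the PR):

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.math.BigInteger;
import java.util.Arrays;

public class CtrSeek {
    // Compute the CTR counter value for a given 16-byte block index.
    static byte[] counterAt(byte[] iv, long blockIndex) {
        BigInteger ctr = new BigInteger(1, iv).add(BigInteger.valueOf(blockIndex));
        byte[] raw = ctr.toByteArray();
        byte[] out = new byte[16];
        int copy = Math.min(raw.length, 16);
        // right-align into 16 bytes (drops a possible leading zero byte)
        System.arraycopy(raw, raw.length - copy, out, 16 - copy, copy);
        return out;
    }

    public static void main(String[] args) throws Exception {
        SecretKeySpec key = new SecretKeySpec(new byte[16], "AES"); // demo key only
        byte[] iv = new byte[16];
        iv[15] = 1;

        byte[] plain = new byte[64];
        for (int i = 0; i < plain.length; i++) plain[i] = (byte) i;

        Cipher enc = Cipher.getInstance("AES/CTR/NoPadding");
        enc.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        byte[] ct = enc.doFinal(plain);

        // Decrypt only the third 16-byte block by seeking the counter.
        Cipher dec = Cipher.getInstance("AES/CTR/NoPadding");
        dec.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(counterAt(iv, 2)));
        byte[] block = dec.doFinal(Arrays.copyOfRange(ct, 32, 48));

        System.out.println(Arrays.equals(block, Arrays.copyOfRange(plain, 32, 48)));
    }
}
```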

```java
} catch (java.nio.file.NoSuchFileException fnfe) {

}
IndexOutput out = super.createOutput("ivMap", new IOContext());
```
Contributor:

It seems odd to me that we are doing a createOutput within close().

```java
try {
    deleteFile("ivMap");
} catch (java.nio.file.NoSuchFileException fnfe) {
```
Contributor:

we need to handle the error here.

```java
out.writeMapOfStrings(ivMap);
out.close();
isOpen = false;
deletePendingFiles();
```
@kumargu (Contributor), Jan 13, 2025:

deletePendingFiles is also synchronized. Can we get into a deadlock?

We need to use a ReentrantLock or the lock from the lockFactory.

```java
random.nextBytes(iv);
if (dataKey == null) throw new RuntimeException("dataKey is null!");
CipherFactory.initCipher(cipher, this, Optional.of(iv), Cipher.ENCRYPT_MODE, 0);
ivMap.put(getDirectory() + "/" + name, Base64.getEncoder().encodeToString(iv));
```
@kumargu (Contributor), Jan 13, 2025:

We need to make this operation thread-safe while reducing the scope of the lock.

Contributor:

Should we immediately persist the ivMap to minimize the risk of losing it?

```java
ivMap.put(getDirectory() + "/" + name, Base64.getEncoder().encodeToString(iv));

IndexOutput out = super.createOutput("ivMap", new IOContext());
out.writeMapOfStrings(ivMap);
out.close();
```

```java
public final class CryptoDirectory extends NIOFSDirectory {
    private Path location;
    private Key dataKey;
    private ConcurrentSkipListMap<String, String> ivMap;
```
Contributor:

can we make this final?

Comment on lines +99 to +114
```java
try {
    in = super.openInput("ivMap", new IOContext());
} catch (java.nio.file.NoSuchFileException nsfe) {
    in = null;
}
if (in != null) {
    Map<String, String> tmp = in.readMapOfStrings();
    ivMap.putAll(tmp);
    in.close();
    dataKey = new SecretKeySpec(keyProvider.decryptKey(getWrappedKey()), "AES");
} else {
    DataKeyPair dataKeyPair = keyProvider.generateDataPair();
    dataKey = new SecretKeySpec(dataKeyPair.getRawKey(), "AES");
    storeWrappedKey(dataKeyPair.getEncryptedKey());
}
}
```
Contributor:

can we move this logic to an init for clearer code?

```java
public static CryptoDirectory create(LockFactory lockFactory, Path location, Provider provider, MasterKeyProvider keyProvider) throws IOException {
    CryptoDirectory directory = new CryptoDirectory(lockFactory, location, provider);
    directory.init(keyProvider);
    return directory;
}

private CryptoDirectory(LockFactory lockFactory, Path location, Provider provider) throws IOException {
    super(location, lockFactory);
    this.location = location;
    this.provider = provider;
    this.ivMap = new ConcurrentSkipListMap<>();
}
```

```java
/**
 * Creates a new CryptoDirectoryFactory
 */
public CryptoDirectoryFactory() {
```
Contributor:

where do we invoke the factory?

@kumargu (Contributor) commented Jan 17, 2025

@asonje we'll need to figure out how we can run the plugin against the performance benchmarking tool.

Similarly, do you know if there's a way to run the OpenSearch core integration tests with this plugin enabled? That would give us a lot of coverage on correctness.

@cwperks to share any insights on above.

```java
}

private byte[] getWrappedKey() {
    try (IndexInput in = super.openInput("keyfile", new IOContext())) {
```
Contributor:

IndexInput is not thread-safe; could we have contention with a write?

```java
 */
@Override
public IndexInput openInput(String name, IOContext context) throws IOException {
    if (name.contains("segments_") || name.endsWith(".si")) return super.openInput(name, context);
```
Contributor:

what's this special handling about? could we please add a comment on the intention?

Author:

It means files matching these names are not encrypted; the server reads them on startup.

```java
    in = null;
}
if (in != null) {
    Map<String, String> tmp = in.readMapOfStrings();
```
Contributor:

Can we please rename tmp to something like loadedIVMap?

```java
 * @opensearch.internal
 */
public final class CryptoDirectory extends NIOFSDirectory {
    private Path location;
```
Contributor:

I am surprised, this has zero usage.

```java
boolean success = false;
try {
    Cipher cipher = CipherFactory.getCipher(provider);
    String ivEntry = ivMap.get(getDirectory() + "/" + name);
```
Contributor:

Wondering whether ivMap.get(location) would give the same result?

@asonje (Author) commented Jan 17, 2025

> @asonje we'll need to figure out how we can run the plugin against the performance benchmarking tool.
>
> Similarly, do you know if there's a way to run the OpenSearch core integration tests with this plugin enabled? That would give us a lot of coverage on correctness.
>
> @cwperks to share any insights on above.

#12472 is still a blocker for this feature. Feel free to suggest more integration tests.

@cwperks (Member) commented Jan 17, 2025

> @asonje we'll need to figure out how we can run the plugin against the performance benchmarking tool.
>
> Similarly, do you know if there's a way to run the OpenSearch core integration tests with this plugin enabled? That would give us a lot of coverage on correctness.
>
> @cwperks to share any insights on above.

Do you just want the plugin to be installed? I don't think that would tell much if the plugin is installed but not configured.

Typically, plugin repos write their own integration tests to cover the core functionality of the plugin.

For example with security, there are integ tests that test different types of security configurations, whether it be different SSL configurations or more general security settings like testing with different role mappings that contain permutations of FLS/DLS/FieldMasking.

I can route you to some examples of how to set up GitHub Actions to run integ tests w/ a plugin installed.

@cwperks (Member) commented Jan 17, 2025

This is a more involved example (which includes setup w/ security), but this is how ISM creates a testcluster to run integ tests against: https://github.com/opensearch-project/index-management/blob/main/build.gradle#L347-L413

When the integTest task runs, it will run tests against that testcluster.

In addition to the setup in build.gradle, there would also be a GitHub workflow that directly (or indirectly) triggers the ./gradlew integTest task. For example, it would be called in this step in ISM: https://github.com/opensearch-project/index-management/blob/main/.github/workflows/test-and-build-workflow.yml#L56

@cwperks (Member) commented Jan 17, 2025

One thing to consider when introducing a new plugin is whether to build it directly in the core repo (native plugin) vs creating a separate repository.

If a plugin is introduced in a separate repository it can still be included in the default distribution of OpenSearch.

One advantage of a separate repo is having a separate set of maintainers who specialize in that plugin. The core is a big repo, and each maintainer may specialize in a different area of it.

Another advantage of a separate repo is being able to setup GH actions specific to that plugin's tests.

That being said, a drawback of that approach is that there could be a bit of scaffolding to get set up. Any plugin included in the default distribution would need to have an active set of maintainers.

@kumargu
Contributor

kumargu commented Jan 17, 2025

One thing to consider when introducing a new plugin is whether to build it directly in the core repo (native plugin) vs creating a separate repository.

I think we will separate it into a different repo (it will be easier to maintain and develop). I am fine with shipping it in this state for now and opening a separate issue to move it to a different repo.

Comment on lines +229 to +231
try {
deleteFile("ivMap");
} catch (java.nio.file.NoSuchFileException fnfe) {
Contributor

sorry, I don't understand the intention here.

Author

On close, the ivMap is written to disk, overwriting the existing file with the live version. Perhaps the ivMap could be synced more frequently.

Contributor

@kumargu kumargu Jan 18, 2025

perhaps the ivMap can be synced more frequently.

Yeah, we cannot afford losing the ivMap. We can probably write the ivMap to disk, read it back, cache it, and use the cached copy for encryption/decryption. That way we ensure durability.

Contributor

Idea: we could use mmap to write the ivMap contents directly to disk for better durability.
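A JDK-only sketch of the write-to-disk, read-back-and-cache idea, assuming a fsync-then-atomic-rename scheme; the `IvMapStore` name and the serialization format are illustrative assumptions, not part of the PR:

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

public class IvMapStore {
    // Serialize the map and force it to disk before trusting the in-memory copy.
    static void persist(Map<String, byte[]> ivMap, Path file) throws IOException {
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        try (FileOutputStream fos = new FileOutputStream(tmp.toFile());
             ObjectOutputStream oos = new ObjectOutputStream(fos)) {
            oos.writeObject(new HashMap<>(ivMap));
            oos.flush();
            fos.getChannel().force(true); // fsync: data reaches the device, not just the page cache
        }
        // Atomic rename so a crash never leaves a half-written ivMap behind.
        Files.move(tmp, file, StandardCopyOption.ATOMIC_MOVE);
    }

    @SuppressWarnings("unchecked")
    static Map<String, byte[]> load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(Files.newInputStream(file))) {
            return new ConcurrentSkipListMap<>((Map<String, byte[]>) ois.readObject());
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempDirectory("ivmap").resolve("ivMap");
        Map<String, byte[]> ivMap = new ConcurrentSkipListMap<>();
        ivMap.put("_0.cfs", new byte[] {1, 2, 3});
        persist(ivMap, file);
        Map<String, byte[]> reloaded = load(file); // read back to verify durability
        System.out.println(Arrays.equals(reloaded.get("_0.cfs"), new byte[] {1, 2, 3}));
    }
}
```

The read-back step is what gives confidence that a restart will see the same IVs the writer cached.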

*
* @opensearch.internal
*/
final class CryptoBufferedIndexInput extends BufferedIndexInput {
Contributor

Can we please move this into its own class file?

res = cipher.update(b, offset, chunk);
if (res != null) out.write(res);
} catch (IllegalStateException e) {
throw new IllegalStateException("count is " + count + " " + e.getMessage());
Contributor

I don't think the count variable adds much value.
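For context, the chunked `cipher.update` loop from the diff can be exercised standalone. This is a minimal JDK-only sketch; AES/CTR and the all-zero key/IV are assumptions for the demo, since the PR's actual transformation and key material aren't shown in this excerpt:

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

public class ChunkedCipherDemo {
    static final int CHUNK_SIZE = 8192;

    // Push data through the cipher one chunk at a time, mirroring the
    // while-loop in the CryptoBufferedIndexOutput excerpt.
    static byte[] process(Cipher cipher, byte[] data) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int offset = 0, length = data.length;
        while (length > 0) {
            int chunk = Math.min(length, CHUNK_SIZE);
            byte[] res = cipher.update(data, offset, chunk);
            if (res != null) out.write(res);
            offset += chunk;
            length -= chunk;
        }
        byte[] fin = cipher.doFinal();
        if (fin != null) out.write(fin);
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        SecretKeySpec key = new SecretKeySpec(new byte[16], "AES"); // demo key only
        byte[] plain = new byte[20000]; // spans multiple chunks
        Arrays.fill(plain, (byte) 42);

        Cipher enc = Cipher.getInstance("AES/CTR/NoPadding");
        enc.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(new byte[16]));
        byte[] ct = process(enc, plain);

        Cipher dec = Cipher.getInstance("AES/CTR/NoPadding");
        dec.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(new byte[16]));
        System.out.println(Arrays.equals(process(dec, ct), plain)); // round-trip check
    }
}
```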

byte[] res;
while (length > 0) {
count++;
final int chunk = Math.min(length, CHUNK_SIZE);
Contributor

Buffering may be ineffective for small writes. Maybe we can maintain an enum of chunk sizes and buffer smaller writes accordingly.

private enum ChunkSize {
    LARGE(8192),
    MEDIUM(4096),
    SMALL(1024);

    private final int size;

    ChunkSize(int size) {
        this.size = size;
    }

    public int getSize() {
        return size;
    }

    public static ChunkSize bestPossibleBufferLength(int length) {
        if (length >= LARGE.size) return LARGE;
        if (length >= MEDIUM.size) return MEDIUM;
        return SMALL;
    }
}

ChunkSize chunkSize = ChunkSize.bestPossibleBufferLength(length);
int bytesToWrite = Math.min(chunkSize.getSize(), length);

final class CryptoBufferedIndexInput extends BufferedIndexInput {
/** The maximum chunk size for reads of 16384 bytes. */
private static final int CHUNK_SIZE = 16384;
ByteBuffer tmpBuffer = ByteBuffer.allocate(CHUNK_SIZE);
Contributor

I think we can use direct buffers here, which would result in sequential I/O.
https://stackoverflow.com/questions/5670862/bytebuffer-allocate-vs-bytebuffer-allocatedirect

this.directBuffer = ByteBuffer.allocateDirect(CHUNK_SIZE);

(note to self: should we make the CHUNK_SIZE configurable?)

}

@SuppressForbidden(reason = "FileChannel#read is a faster alternative to synchronized block")
private int read(ByteBuffer dst, long position) throws IOException {
Contributor

With direct buffers, read would look slightly different.


    private int read(ByteBuffer dst, long position) throws IOException {
        directBuffer.clear().limit(dst.remaining());
        int bytesRead = channel.read(directBuffer, position);
        if (bytesRead > 0) {
            directBuffer.flip();
            try {
                int decrypted = cipher.update(directBuffer, dst);
                // handle logic when we have reached the end-of-file
                if (position + bytesRead == end) {
                    ...
                }
                return decrypted;
            } catch (...) {
                ...
            }
        }
        return bytesRead;
    }
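The I/O half of that sketch can be demonstrated self-contained; crypto is left out here, and the plain `dst.put` marks where `cipher.update` would go. The class and method names are assumptions for the demo:

```java
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class DirectReadDemo {
    static final int CHUNK_SIZE = 16384;
    // One reusable off-heap buffer avoids the extra copy that heap
    // buffers incur inside FileChannel.read.
    static final ByteBuffer directBuffer = ByteBuffer.allocateDirect(CHUNK_SIZE);

    static int read(FileChannel channel, ByteBuffer dst, long position) throws Exception {
        directBuffer.clear().limit(Math.min(dst.remaining(), CHUNK_SIZE));
        int bytesRead = channel.read(directBuffer, position); // positional read: no channel lock
        if (bytesRead > 0) {
            directBuffer.flip();
            dst.put(directBuffer); // in the real code, cipher.update(directBuffer, dst) goes here
        }
        return bytesRead;
    }

    public static void main(String[] args) throws Exception {
        Path f = Files.createTempFile("demo", ".bin");
        Files.write(f, "hello world".getBytes());
        try (FileChannel ch = FileChannel.open(f, StandardOpenOption.READ)) {
            ByteBuffer dst = ByteBuffer.allocate(5);
            read(ch, dst, 6L); // read 5 bytes starting at offset 6
            System.out.println(new String(dst.array(), 0, dst.position()));
        }
    }
}
```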

super.rename(source, dest);
if (!(source.contains("segments_") || source.endsWith(".si"))) ivMap.put(
getDirectory() + "/" + dest,
ivMap.remove(getDirectory() + "/" + source)
Contributor

We should add a null check on the result of ivMap.remove().
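A minimal sketch of the guarded move; the `renameIv` helper is hypothetical, extracted from the rename logic for illustration:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

public class IvRename {
    // Move the IV entry only if the source actually had one. Without the
    // check, ivMap.put(dest, null) would throw (ConcurrentSkipListMap
    // rejects null values) and hide the real problem: a missing IV.
    static void renameIv(Map<String, byte[]> ivMap, String src, String dst) {
        byte[] iv = ivMap.remove(src);
        if (iv == null) {
            throw new IllegalStateException("no IV recorded for " + src);
        }
        ivMap.put(dst, iv);
    }

    public static void main(String[] args) {
        Map<String, byte[]> ivMap = new ConcurrentSkipListMap<>();
        ivMap.put("dir/_0.tmp", new byte[] {7});
        renameIv(ivMap, "dir/_0.tmp", "dir/_0.cfs");
        System.out.println(ivMap.containsKey("dir/_0.cfs") && !ivMap.containsKey("dir/_0.tmp"));
    }
}
```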

* @param name the name of the file to be deleted
*/
@Override
public void deleteFile(String name) throws IOException {
Contributor

Will this also cover files deleted during segment merges?

*/
@Override
public IndexOutput createOutput(String name, IOContext context) throws IOException {
if (name.contains("segments_") || name.endsWith(".si")) return super.createOutput(name, context);

Contributor

I think we are also encrypting the ivMap? I think we should skip encrypting the ivMap.
We will need an integration test to verify that the ivMap is loaded correctly after a restart.
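A tiny sketch of the resulting skip predicate, assuming the suggestion is adopted; the `isPlaintext` helper and the literal `ivMap` filename are assumptions for illustration, extending the `segments_`/`.si` check already in the diff:

```java
public class PlainFilePredicate {
    // Files that must stay unencrypted so the directory can bootstrap:
    // segment metadata, plus (per the review suggestion) the ivMap itself.
    static boolean isPlaintext(String name) {
        return name.contains("segments_") || name.endsWith(".si") || name.equals("ivMap");
    }

    public static void main(String[] args) {
        System.out.println(isPlaintext("ivMap")
            && isPlaintext("segments_2")
            && !isPlaintext("_0.cfs")); // regular segment files stay encrypted
    }
}
```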

* @param path the shard file path
*/
@Override
public Directory newDirectory(IndexSettings indexSettings, ShardPath path) throws IOException {
Contributor

I guess no special handling is required on index deletion?

@cwperks
Member

cwperks commented Jan 21, 2025

One thing to consider when introducing a new plugin is whether to build it directly in the core repo (native plugin) vs creating a separate repository.

I think we will separate it into a different repo (it will be easier to maintain and develop). I am fine with shipping it in this state for now and opening a separate issue to move it to a different repo.

Another thing to consider is adding this feature to the security plugin: https://github.com/opensearch-project/security

You would get all of the test infrastructure that already exists in the security repo, and I think it would be a natural feature of the plugin.

That being said, you may want this to be a separate plugin if index-level encryption should be available to clusters regardless of whether they use the full features of the security plugin.

Just because the security plugin is installed doesn't mean a cluster needs full FGAC. You can disable security and have the plugin installed, or you can run in SSL-only mode (no FGAC), so we could build this into the security plugin and provide support regardless of whether a cluster uses all features of the security plugin.

The security plugin also has a companion dashboards-plugin for configuration through OpenSearch-Dashboards. If there are plans to support configuring index-level encryption through OpenSearch Dashboards, then that could be built into the companion dashboards plugin.

@kumargu
Contributor

kumargu commented Jan 21, 2025

should be available to clusters regardless of whether they use the full features of the security plugin

Yeah, that's my mental model. You can have index-level encryption enabled without the security plugin (features) and vice versa.


I am leaning towards bundling this as a separate plugin; how hard is it to set up the test infra?

@kumargu
Contributor

kumargu commented Jan 21, 2025

Another thing to consider is adding this feature to the security plugin: https://github.com/opensearch-project/security

@cwperks let's discuss this with more folks during security-triage-meeting.

@cwperks
Member

cwperks commented Jan 21, 2025

should be available to clusters regardless of whether they use the full features of the security plugin

Yeah, that's my mental model. You can have index-level encryption enabled without the security plugin (features) and vice versa.

I am leaning towards bundling this as a separate plugin; how hard is it to set up the test infra?

I think you're right. This makes more sense as a standalone plugin. There is a template repo to help get bootstrapped. I will allocate some time after 2.19 code freeze to demonstrate creating a plugin repo and writing a few integ tests w/ different configurations.

CryptoDirectory(LockFactory lockFactory, Path location, Provider provider, MasterKeyProvider keyProvider) throws IOException {
super(location, lockFactory);
this.location = location;
ivMap = new ConcurrentSkipListMap<>();
Contributor

Will the ivMap auto-shrink on segment merges? Asking because I was initially concerned about the size of the ivMap, with a risk of collisions and lock contention.

Contributor

Discussed offline: @asonje will run some instrumentation to verify that segment merges are happening as expected and that ivMap entries for deleted files are removed on segment merges.

@kumargu
Contributor

kumargu commented Jan 21, 2025

@asonje: I started taking a look at how snapshots would work for us. Could you please help with a few clarifications?

  1. Will the keyfile which stores the data key also be moved to snapshots? We will need the data key to decrypt data when an OpenSearch index is recovered from a snapshot.
  2. Similarly, will the ivMap be carried over to snapshots?

(Edit: discussed offline: both the keyfile and the ivMap will be carried over to snapshots.)

Labels
discuss (Issues intended to help drive brainstorming and decision making), feature (New feature or request), RFC (Issues requesting major changes), security (Anything security related)
Successfully merging this pull request may close these issues.

On The Fly Encryption Feature Proposal
9 participants