Skip to content

Conversation

steveloughran
Copy link
Contributor

@steveloughran steveloughran commented Nov 8, 2019

This adds a new s3guard command to audit a s3guard bucket's authoritative
state:

hadoop s3guard authoritative -check-config s3a://landsat-pds

Also adds more diags of what is going on, including a specific bulk operation type
"listing" which is used for listing initiated updates.

No docs for the new command,

In DDB we do auth mode better, with a new test suite for this.

  • renamed dirs are marked as auth
  • when we update parents as part of any Put, we don't overwrite their auth dir status (they were all being cleared)
  • explicit exclusion of directories from prune

@steveloughran steveloughran added enhancement fs/s3 changes related to hadoop-aws; submitter must declare test endpoint work in progress PRs still Work in Progress; reviews not expected but still welcome labels Nov 11, 2019
@steveloughran
Copy link
Contributor Author

tests need to unset auth mode dirs from per bucket settings, else ITestAuthoritativePath can fail

[ERROR] testMultiAuthPath(org.apache.hadoop.fs.s3a.ITestAuthoritativePath)  Time elapsed: 4.132 s  <<< FAILURE!
java.lang.AssertionError: Listing was not supposed to include s3a://hwdev-steve-ireland-new/test/1573571607290/testMultiAuthPath-first/in-path-out-of-band. Actual: s3a://hwdev-steve-ireland-new/test/1573571607290/testMultiAuthPath-first/in-path-out-of-band
	at org.junit.Assert.fail(Assert.java:88)
	at org.junit.Assert.failEquals(Assert.java:185)
	at org.junit.Assert.assertNotEquals(Assert.java:161)
	at org.apache.hadoop.fs.s3a.S3ATestUtils.checkListingDoesNotContainPath(S3ATestUtils.java:1306)
	at org.apache.hadoop.fs.s3a.ITestAuthoritativePath.runTestInsidePath(ITestAuthoritativePath.java:213)
	at org.apache.hadoop.fs.s3a.ITestAuthoritativePath.testMultiAuthPath(ITestAuthoritativePath.java:251)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.lang.Thread.run(Thread.java:748)

[INFO] 

@steveloughran steveloughran force-pushed the s3/HADOOP-16697-authoritative-buckets branch from 94858b3 to aeb09b3 Compare November 15, 2019 14:42
@apache apache deleted a comment from hadoop-yetus Nov 15, 2019
@steveloughran
Copy link
Contributor Author

going to add a S3GuardInstrumentation interface so the internals of the S3A FS aren't so exposed in the S3Guard API. This also lets us assume that the instrumentation is never null, which simplifies code

@steveloughran steveloughran force-pushed the s3/HADOOP-16697-authoritative-buckets branch from 9a951fa to 397a754 Compare November 18, 2019 13:33
@apache apache deleted a comment from hadoop-yetus Nov 18, 2019
@apache apache deleted a comment from hadoop-yetus Nov 18, 2019
@apache apache deleted a comment from hadoop-yetus Nov 18, 2019
@steveloughran
Copy link
Contributor Author


./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/AuthoritativeAudit.java:92:   * @throws NonAuthoritativeDirException if it is non-auth and requireAuth=true.: Line is longer than 80 characters (found 81). [LineLength]
./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/Importer.java:118:  private long importDir(FileStatus status) throws IOException {:37: 'status' hides a field. [HiddenField]
./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/MetastoreInstrumentation.java:21:public interface MetastoreInstrumentation {: Missing a Javadoc comment. [JavadocType]
./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3Guard.java:51:import org.apache.hadoop.fs.s3a.S3AInstrumentation;:8: Unused import - org.apache.hadoop.fs.s3a.S3AInstrumentation. [UnusedImports]
./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardTool.java:1145:    public int run(String[] args, PrintStream out):5: Method length is 175 lines (max allowed is 150). [MethodLength]
./hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestS3GuardAuthMode.java:59: * The main testFS is non-auth; we also create a test FS which runs in auth mode.: Line is longer than 80 characters (found 81). [LineLength]

@apache apache deleted a comment from hadoop-yetus Nov 20, 2019
@apache apache deleted a comment from hadoop-yetus Nov 20, 2019
@apache apache deleted a comment from hadoop-yetus Nov 20, 2019
@steveloughran
Copy link
Contributor Author

For anyone watching, the latest PR has testRenameWithNonEmptySubDir failing as a dest path appears to exist. No idea why, yet: debug time.

Related to that -the S3A contract rename tests are parameterized on auth/non-auth; the non-auth ones skip if !s3guard. It might make sense to do that for more of the tests to better guarantee coverage of the two modes of operation.

@bgaborg bgaborg self-requested a review November 25, 2019 08:19
@steveloughran steveloughran force-pushed the s3/HADOOP-16697-authoritative-buckets branch from 4db6bd5 to f60da56 Compare November 27, 2019 17:09
@steveloughran steveloughran removed the work in progress PRs still Work in Progress; reviews not expected but still welcome label Nov 27, 2019
@steveloughran
Copy link
Contributor Author

testing s3a ireland with & without auth. doing a local db test run too for completeness

Copy link

@bgaborg bgaborg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @steveloughran for this major piece of work.
It would be also great if we could split this into different chunks - adding the new auth tool, metrics, prune, rebasing the import under different jiras, but it's ok to do that if we don't forget to add every new feature and fix to the description so we will remember later.
Added some comments about the code in my first review phase.

-it also adds a lot on committers and tunes layout

Change-Id: I3c048799063a14e1ed13fadc9507c972a9f05259
Change-Id: Ib3313100e335e1a59eaaed631be7c702cb48431c
Change-Id: I7a8a3b36007a99d61b98501e27c231e8a70d970a
This adds a new s3guard command to audit a s3guard bucket's authoritative
state:

hadoop s3guard authoritative -check-config s3a://landsat-pds

Also adds more diags of what is going on, including a specific bulk operation type
"listing" which is used for listing initiated updates.

No tests or docs, yet.

Change-Id: I8949db5315ef005123549079cd73c7567dbc71e8
* new test for ops to verify the operations we expect to mark as auth,
those which unmark and those which leave alone do as expected.
* prune returns a count of #of pruned entries to enable testing
* auditing operation improved for testing

Change-Id: Ia8bffe488d819a0d3005ab9eb8b31d2df5203bd9
* prune doesn't clear the auth bit on a dir when a tombstone is deleted underneath.
* rename sets all new directories under the dest as authoritative.

Also: S3guard instrumentation has an interface and base implementation;
metastores always have a non-null value; its used in the S3Guard API calls
instead of delving into S3AInstrumentation internals.

Note: just realised a problem here: only DDB metastore updates those
counters. Never mind.

Change-Id: I419757c500eba70c301504de9776e3a353b72ddf
Change-Id: I3c18b376776b83d89caba8cdf6e374bcef5efdd2
This will cause the meta store to marks the directory tree as
authoritative once the import has completed its scan.

It is off by default because you have to be 100% confident that
during a potentially slow scan nobody is writing to that directory tree.

This change includes moving the core operations of S3GuardTool.Import into
an Importer class for ease of testing and so it can live alongside the auditor.
They both use the same (new) MetadataStore method
markAuthoritative(path, BulkOperation).

For DDB, the AncestorState is used to determine what to mark
-we don't attempt any listing of the directory tree.

For Local S3Guard, is currently a no-op. I'm not worried about that,
Especially as the auditor only audits DDB Tables.

The documentation is yet to be updated!

Change-Id: Ifc47713d3b70a6bcf2671c7e2ded31a15ee4d9c6
* tests for the CLI, command line parsing etc
* tuning CLI based on manual use (mainly logging)

S3GuardTool is now closeable, and will close its FS and MS in close. Without this, our tests can leak FS and MS instances. We just hadn't noticed.

The CLI entry point of S3GuardTool does the right thing and closes the tool; for our tests we have to explicitly do it. This is done in a mixture of try-with-resources and building up a list of tools to close in test tear down. All tests which patch in an existing metastore need to replace it with a null one before the close operation. Are you

Change-Id: I061eb0ece24e84aaffb4f12a3e7c920011146dd6
Testing: new tests are good, doing more regression testing. Can't run -Dlocal for some reason.
@steveloughran steveloughran force-pushed the s3/HADOOP-16697-authoritative-buckets branch from f60da56 to cf10674 Compare December 4, 2019 19:03
@apache apache deleted a comment from hadoop-yetus Dec 5, 2019
@apache apache deleted a comment from hadoop-yetus Dec 5, 2019
@apache apache deleted a comment from hadoop-yetus Dec 5, 2019
@steveloughran
Copy link
Contributor Author

steveloughran commented Dec 5, 2019

findbugs

Unchecked/unconfirmed cast from org.apache.hadoop.fs.s3a.s3guard.BulkOperationState to org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore$AncestorState in org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.markAsAuthoritative(Path, BulkOperationState) At DynamoDBMetadataStore.java:org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore$AncestorState in org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.markAsAuthoritative(Path, BulkOperationState) At DynamoDBMetadataStore.java:[line 2104]
Switch statement found in org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(String[], PrintStream) where one case falls through to the next case At S3GuardTool.java:PrintStream) where one case falls through to the next case At S3GuardTool.java:[lines 1276-1283]

Change-Id: Id3b4863ddf585f390da95a6c683900c0db690b92
Change-Id: I4d4ca95e990a2b2a3529c38cea9bbb00047e7083
@apache apache deleted a comment from hadoop-yetus Dec 10, 2019
@apache apache deleted a comment from hadoop-yetus Dec 10, 2019
Change-Id: I5a97dee1862162eea0d5f811c70b176b28c67141
@steveloughran
Copy link
Contributor Author

checkstyle revision; tested s3 ireland

-Dparallel-tests -DtestsThreadCount=8 -Ds3guard -Ddynamo -Dauth

no failures

Copy link

@bgaborg bgaborg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are running ok - one intermittent error with org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStore:

java.lang.IllegalArgumentException: Table THIS_IS_THE_TEST_TABLE is not deleted.
	at com.amazonaws.services.dynamodbv2.document.Table.waitForDelete(Table.java:505)
	at org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStore.setUp(ITestDynamoDBMetadataStore.java:169)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)```

My review contains some nits to look at, my +1 pending on that.

Based on Gabor's feedback
-rename a method
-review import logging and extend

the -verbose mode already logs each file/dir; now non-verbose and
verbose both print summaries of the operation: the number of files
and dirs added.

Note, well looking at the logging at debug level I can see that after the import, when we mark everything as authoritative, we end up overwriting all the dir entries just added.

It's not that easy to avoid doing this while keeping the intermediate state of the S3Guard tables consistent; I'd have to modify that marking operation to do a Get on each dir and only update non-auth ones, while tagging the dirs as auth when created.

It's doable; we'd be moving from two writes to one read, one write, or somehow blue together the tracking even more.

I'm not convinced it's worth the extra complexity for an operation which is not expected to be run very often.

Change-Id: Ic2c6378d4c47fe2dc36d8f6d6b7d5e50735ceacb
…istent store.

Essentially, yes -we do need to provide as much diagnostics as we can, which include logging problems at WARN level and are giving a bit more intermediate progress.

* renamed `-require-auth` to `-required` so it is easy to remember.
* Updated the documentation with a examples of this and expanded import operation.

Change-Id: I43051789e105f3cf52e8d8decd61ed7ab6485372
@apache apache deleted a comment from hadoop-yetus Jan 3, 2020
@apache apache deleted a comment from hadoop-yetus Jan 3, 2020
checkstyles

Change-Id: I86e515a71174aa91584a5ab9c3fb7a240752c461
@apache apache deleted a comment from hadoop-yetus Jan 6, 2020
@apache apache deleted a comment from hadoop-yetus Jan 6, 2020
@apache apache deleted a comment from hadoop-yetus Jan 6, 2020
@steveloughran
Copy link
Contributor Author

oops, deleted last yetus review. I'll kick off a new one.

(this kicks off a yetus build, which is what I wanted)

Change-Id: I6116a86723d3268f1507a7e1654f2a4591b1e72b
@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 1m 30s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 markdownlint 0m 1s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 12 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 1m 28s Maven dependency ordering for branch
+1 💚 mvninstall 23m 41s trunk passed
+1 💚 compile 17m 40s trunk passed
+1 💚 checkstyle 2m 51s trunk passed
+1 💚 mvnsite 2m 6s trunk passed
+1 💚 shadedclient 20m 51s branch has no errors when building and testing our client artifacts.
+1 💚 javadoc 2m 3s trunk passed
+0 🆗 spotbugs 1m 8s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 3m 26s trunk passed
-0 ⚠️ patch 1m 29s Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 24s Maven dependency ordering for patch
+1 💚 mvninstall 1m 23s the patch passed
+1 💚 compile 18m 9s the patch passed
+1 💚 javac 18m 9s the patch passed
-0 ⚠️ checkstyle 2m 50s root: The patch generated 3 new + 96 unchanged - 0 fixed = 99 total (was 96)
+1 💚 mvnsite 2m 12s the patch passed
-1 ❌ whitespace 0m 0s The patch has 3 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
+1 💚 xml 0m 1s The patch has no ill-formed XML file.
+1 💚 shadedclient 14m 14s patch has no errors when building and testing our client artifacts.
+1 💚 javadoc 1m 59s the patch passed
+1 💚 findbugs 3m 48s the patch passed
_ Other Tests _
+1 💚 unit 9m 52s hadoop-common in the patch passed.
+1 💚 unit 1m 31s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 48s The patch does not generate ASF License warnings.
132m 43s
Subsystem Report/Notes
Docker Client=19.03.5 Server=19.03.5 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1707/18/artifact/out/Dockerfile
GITHUB PR #1707
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml markdownlint
uname Linux 46978472041a 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 768ee22
Default Java 1.8.0_232
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-1707/18/artifact/out/diff-checkstyle-root.txt
whitespace https://builds.apache.org/job/hadoop-multibranch/job/PR-1707/18/artifact/out/whitespace-eol.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-1707/18/testReport/
Max. process+thread count 1346 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-1707/18/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.11.1 https://yetus.apache.org

This message was automatically generated.

@steveloughran steveloughran changed the title HADOOP-16697. Tune/audit auth mode HADOOP-16697. Tune/audit S3A authoritative mode Jan 7, 2020
Copy link

@bgaborg bgaborg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are passing (only one intermittent failure of ITestDynamoDBMetadataStore.setUp is present)
My comments/change requests are all resolved.
+1; Thanks @steveloughran!

@steveloughran
Copy link
Contributor Author

Thank you for your excellent diligence going through this patch!

@steveloughran
Copy link
Contributor Author

merged after a final retest

@steveloughran steveloughran deleted the s3/HADOOP-16697-authoritative-buckets branch October 15, 2021 19:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement fs/s3 changes related to hadoop-aws; submitter must declare test endpoint

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants