Skip to content

Conversation

steveloughran
Copy link
Contributor

This is a followup to the original HADOOP-18546
patch (#5176); cherry-picks of that should include this
or follow up with it.

Removes risk of race conditions in assertions
of ITestReadBufferManager on the state of the in-progress and completed queues by removing the assertions.

  • Assertions that there are no in progress reads MUST be cut as there may be some and they won't be cancelled.
  • Assertions that the completed list is without buffers of a closed stream are brittle because if there was an in progress stream which completed after stream.close() then it will end up in the list.

How was this patch tested?

modified test run standalone; doing full test suite as well for due diligence

For code changes:

  • Does the title or this PR starts with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

This is a followup to the original HADOOP-18546
patch; cherry-picks of that should include this
or follow up with it.

Removes risk of race conditions in assertions
of ITestReadBufferManager on the state of the in-progress
and completed queues by removing the assertions.

* Assertions that there are no in progress reads MUST be
  cut as there may be some and they won't be cancelled.
* Assertions that the completed list is without buffers
  of a closed stream are brittle because if there was
  an in progress stream which completed after stream.close()
  then it will end up in the list.

Change-Id: Ide2570db36b0f6c23a6569d90446de28cebfc2ac
@steveloughran
Copy link
Contributor Author

azure endpoint is cardiff; parallel -Dscale run in progress. already verified the failing test seems happy now. (given i've deleted assertions, it'd be hard not to

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 39s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 38m 53s trunk passed
+1 💚 compile 0m 41s trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04
+1 💚 compile 0m 36s trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 checkstyle 0m 35s trunk passed
+1 💚 mvnsite 0m 43s trunk passed
-1 ❌ javadoc 0m 41s /branch-javadoc-hadoop-tools_hadoop-azure-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt hadoop-azure in trunk failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.
+1 💚 javadoc 0m 34s trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 spotbugs 1m 14s trunk passed
+1 💚 shadedclient 20m 10s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 31s the patch passed
+1 💚 compile 0m 33s the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04
+1 💚 javac 0m 33s the patch passed
+1 💚 compile 0m 29s the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 javac 0m 29s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 19s the patch passed
+1 💚 mvnsite 0m 31s the patch passed
-1 ❌ javadoc 0m 24s /patch-javadoc-hadoop-tools_hadoop-azure-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt hadoop-azure in the patch failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.
+1 💚 javadoc 0m 22s the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 spotbugs 1m 5s the patch passed
+1 💚 shadedclient 20m 3s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 1m 59s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 38s The patch does not generate ASF License warnings.
92m 51s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5198/1/artifact/out/Dockerfile
GITHUB PR #5198
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 13077f4d7d7b 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / a539323
Default Java Private Build-1.8.0_352-8u352-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_352-8u352-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5198/1/testReport/
Max. process+thread count 555 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5198/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

}
} finally {
executorService.shutdown();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

executorService.shutdown does an orderly shutdown of the task. It does not wait for the tasks to be completed. So, the main thread after executing line 76 will go to line 79, and assertions will happen where the execution of tasks may or may not have got completed.

Requesting you to kindly change to https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadPoolExecutor.html#awaitTermination method which will wait for the tasks to be completed. Thanks.

Ref: https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadPoolExecutor.html#shutdown()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aah. yes, will do that. we still need to cut the assertions about completed list and buffers returned, as any in-progress read will finish and add the buffer which will then stay until the threshold timeout

Change-Id: Icc9940851789d0796bc4ef8467093aa3f597ee4e
@steveloughran
Copy link
Contributor Author

ok, updated this one, which i'm using in my internal cherrypick which I need for e2e testing though spark.

will review your #5199 change which does more than just comment out the failing assertions.

my patch tested against azure cardiff

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 50s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 40m 56s trunk passed
+1 💚 compile 0m 38s trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04
+1 💚 compile 0m 35s trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 checkstyle 0m 30s trunk passed
+1 💚 mvnsite 0m 39s trunk passed
-1 ❌ javadoc 0m 36s /branch-javadoc-hadoop-tools_hadoop-azure-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt hadoop-azure in trunk failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.
+1 💚 javadoc 0m 29s trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 spotbugs 1m 12s trunk passed
+1 💚 shadedclient 23m 21s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 30s the patch passed
+1 💚 compile 0m 32s the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04
+1 💚 javac 0m 32s the patch passed
+1 💚 compile 0m 27s the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 javac 0m 27s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 18s the patch passed
+1 💚 mvnsite 0m 31s the patch passed
-1 ❌ javadoc 0m 22s /patch-javadoc-hadoop-tools_hadoop-azure-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt hadoop-azure in the patch failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.
+1 💚 javadoc 0m 21s the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 spotbugs 1m 7s the patch passed
+1 💚 shadedclient 23m 27s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 1s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 32s The patch does not generate ASF License warnings.
100m 50s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5198/2/artifact/out/Dockerfile
GITHUB PR #5198
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux b42b2f4dbbcd 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / df1a4c3
Default Java Private Build-1.8.0_352-8u352-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_352-8u352-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5198/2/testReport/
Max. process+thread count 536 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5198/2/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@steveloughran
Copy link
Contributor Author

not happy about javadoc failures, but looks unrelated (annotation issues? javadoc changes?).

going to approve this myself because it's only a test change, and the change is just removing some flaky asserts plus the fix from @pranavsaxena-microsoft . If I'd checked out the PR and tested locally before the merge i would have just included them in the original commit.

+1

@steveloughran steveloughran merged commit 0a7dfcc into apache:trunk Dec 9, 2022
asfgit pushed a commit that referenced this pull request Dec 9, 2022
… stream close() (#5176)

This addresses HADOOP-18521, "ABFS ReadBufferManager buffer sharing
across concurrent HTTP requests" by not trying to cancel
in progress reads.

It supercedes HADOOP-18528, which disables the prefetching.
If that patch is applied *after* this one, prefetching
will be disabled.

As well as changing the default value in the code,
core-default.xml is updated to set
fs.azure.enable.readahead = true

As a result, if Configuration.get("fs.azure.enable.readahead")
returns a non-null value, then it can be inferred that
it was set in or core-default.xml (the fix is present)
or in core-site.xml (someone asked for it).

Note: this commit contains the followup commit:
  HADOOP-18546. Followup: ITestReadBufferManager fix (#5198)
That is needed to avoid race conditions in the test.

Contributed by Pranav Saxena.
@saxenapranav
Copy link
Contributor

not happy about javadoc failures, but looks unrelated (annotation issues? javadoc changes?).

going to approve this myself because it's only a test change, and the change is just removing some flaky asserts plus the fix from @pranavsaxena-microsoft . If I'd checked out the PR and tested locally before the merge i would have just included them in the original commit.

+1

Hi @steveloughran , Thanks a lot for the PR. Related to javadocs failure, it seems that it is bit inconsistent. Sometimes it fails and then back-merging the trunk, the test passes.

@steveloughran
Copy link
Contributor Author

yeah, nothing related to this pr. been mentioned elsewhere and i think it's java 11 related. moving to a static import of the specific values will apparently fix, but this should be a standalone "fix javadocs" pr for each module affected

slfan1989 pushed a commit to slfan1989/hadoop that referenced this pull request Dec 20, 2022

This is a followup to the original HADOOP-18546
patch; cherry-picks of that should include this
or follow up with it.

Removes risk of race conditions in assertions
of ITestReadBufferManager on the state of the in-progress
and completed queues by removing assertions brittle
to race conditions in scheduling/network IO

* Waits for all the executor pool shutdown to complete before
  making any assertions
* Assertions that there are no in progress reads MUST be
  cut as there may be some and they won't be cancelled.
* Assertions that the completed list is without buffers
  of a closed stream are brittle because if there was
  an in progress stream which completed after stream.close()
  then it will end up in the list.

Contributed by Steve Loughran
anmolanmol1234 pushed a commit to ABFSDriver/AbfsHadoop that referenced this pull request Mar 6, 2023

This is a followup to the original HADOOP-18546
patch; cherry-picks of that should include this
or follow up with it.

Removes risk of race conditions in assertions
of ITestReadBufferManager on the state of the in-progress
and completed queues by removing assertions brittle
to race conditions in scheduling/network IO

* Waits for all the executor pool shutdown to complete before
  making any assertions
* Assertions that there are no in progress reads MUST be
  cut as there may be some and they won't be cancelled.
* Assertions that the completed list is without buffers
  of a closed stream are brittle because if there was
  an in progress stream which completed after stream.close()
  then it will end up in the list.

Contributed by Steve Loughran
anmolanmol1234 pushed a commit to ABFSDriver/AbfsHadoop that referenced this pull request Apr 3, 2023

This is a followup to the original HADOOP-18546
patch; cherry-picks of that should include this
or follow up with it.

Removes risk of race conditions in assertions
of ITestReadBufferManager on the state of the in-progress
and completed queues by removing assertions brittle
to race conditions in scheduling/network IO

* Waits for all the executor pool shutdown to complete before
  making any assertions
* Assertions that there are no in progress reads MUST be
  cut as there may be some and they won't be cancelled.
* Assertions that the completed list is without buffers
  of a closed stream are brittle because if there was
  an in progress stream which completed after stream.close()
  then it will end up in the list.

Contributed by Steve Loughran
anujmodi2021 pushed a commit to anujmodi2021/hadoop that referenced this pull request Apr 3, 2023

This is a followup to the original HADOOP-18546
patch; cherry-picks of that should include this
or follow up with it.

Removes risk of race conditions in assertions
of ITestReadBufferManager on the state of the in-progress
and completed queues by removing assertions brittle
to race conditions in scheduling/network IO

* Waits for all the executor pool shutdown to complete before
  making any assertions
* Assertions that there are no in progress reads MUST be
  cut as there may be some and they won't be cancelled.
* Assertions that the completed list is without buffers
  of a closed stream are brittle because if there was
  an in progress stream which completed after stream.close()
  then it will end up in the list.

Contributed by Steve Loughran
anujmodi2021 pushed a commit to anujmodi2021/hadoop that referenced this pull request Apr 3, 2023

This is a followup to the original HADOOP-18546
patch; cherry-picks of that should include this
or follow up with it.

Removes risk of race conditions in assertions
of ITestReadBufferManager on the state of the in-progress
and completed queues by removing assertions brittle
to race conditions in scheduling/network IO

* Waits for all the executor pool shutdown to complete before
  making any assertions
* Assertions that there are no in progress reads MUST be
  cut as there may be some and they won't be cancelled.
* Assertions that the completed list is without buffers
  of a closed stream are brittle because if there was
  an in progress stream which completed after stream.close()
  then it will end up in the list.

Contributed by Steve Loughran
anujmodi2021 pushed a commit to anujmodi2021/hadoop that referenced this pull request Apr 4, 2023

This is a followup to the original HADOOP-18546
patch; cherry-picks of that should include this
or follow up with it.

Removes risk of race conditions in assertions
of ITestReadBufferManager on the state of the in-progress
and completed queues by removing assertions brittle
to race conditions in scheduling/network IO

* Waits for all the executor pool shutdown to complete before
  making any assertions
* Assertions that there are no in progress reads MUST be
  cut as there may be some and they won't be cancelled.
* Assertions that the completed list is without buffers
  of a closed stream are brittle because if there was
  an in progress stream which completed after stream.close()
  then it will end up in the list.

Contributed by Steve Loughran
anujmodi2021 pushed a commit to anujmodi2021/hadoop that referenced this pull request Apr 4, 2023

This is a followup to the original HADOOP-18546
patch; cherry-picks of that should include this
or follow up with it.

Removes risk of race conditions in assertions
of ITestReadBufferManager on the state of the in-progress
and completed queues by removing assertions brittle
to race conditions in scheduling/network IO

* Waits for all the executor pool shutdown to complete before
  making any assertions
* Assertions that there are no in progress reads MUST be
  cut as there may be some and they won't be cancelled.
* Assertions that the completed list is without buffers
  of a closed stream are brittle because if there was
  an in progress stream which completed after stream.close()
  then it will end up in the list.

Contributed by Steve Loughran
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants