TEZ-4526: Avoid calling LocationProvider#getPreferredLocations multiple times while generating grouped splits #323

Merged
abstractdog merged 1 commit into apache:master
Jan 9, 2024

Conversation

SourabhBadhya
Contributor

TEZ-4526: Avoid calling LocationProvider#getPreferredLocations multiple times while generating grouped splits
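
The diff itself is not reproduced in this conversation. As a rough illustration of the idea in the title (look up each split's preferred locations once and reuse the result during grouping), a minimal Java sketch might look like the following. The `LocationProvider` interface, the `CachingLocationProvider` class, and the string-based splits are hypothetical stand-ins for illustration, not the Tez classes or the change that was merged.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public class LocationCacheSketch {

  /** Hypothetical stand-in for a split location provider; not the Tez interface. */
  interface LocationProvider {
    String[] getPreferredLocations(String split);
  }

  /** Wraps a provider so that each split's locations are looked up at most once. */
  static final class CachingLocationProvider implements LocationProvider {
    private final LocationProvider delegate;
    private final Map<String, String[]> cache = new HashMap<>();

    CachingLocationProvider(LocationProvider delegate) {
      this.delegate = delegate;
    }

    @Override
    public String[] getPreferredLocations(String split) {
      // computeIfAbsent only calls the delegate when the split is not cached yet.
      // A null result is not cached, so such a split would be looked up again.
      return cache.computeIfAbsent(split, delegate::getPreferredLocations);
    }
  }

  public static void main(String[] args) {
    AtomicInteger lookups = new AtomicInteger();
    // Counting provider so the number of real lookups is visible.
    LocationProvider counting = split -> {
      lookups.incrementAndGet();
      return new String[] {"host-for-" + split};
    };
    LocationProvider cached = new CachingLocationProvider(counting);

    // Simulate a grouping loop that asks for the same splits several times.
    for (int pass = 0; pass < 3; pass++) {
      for (String split : new String[] {"split-0", "split-1"}) {
        cached.getPreferredLocations(split);
      }
    }
    System.out.println("delegate lookups: " + lookups.get());
  }
}
```

In this sketch the delegate is consulted once per distinct split, however many times the grouping loop asks. Whether a map, a field on the split holder, or some other structure is the right place to keep the result depends on the real grouper code.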


@SourabhBadhya
Contributor Author

Requesting @rbalamohan @abstractdog for reviews.

@tez-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Comment |
| --- | --- | --- | --- |
| +0 🆗 | reexec | 22m 9s | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | No case conflicting files found. |
| +1 💚 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 ❌ | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | | | _ master Compile Tests _ |
| +1 💚 | mvninstall | 15m 44s | master passed |
| +1 💚 | compile | 0m 31s | master passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu122.04 |
| +1 💚 | compile | 0m 30s | master passed with JDK Private Build-1.8.0_392-8u392-ga-1~22.04-b08 |
| +1 💚 | checkstyle | 1m 22s | master passed |
| +1 💚 | javadoc | 0m 41s | master passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu122.04 |
| +1 💚 | javadoc | 0m 27s | master passed with JDK Private Build-1.8.0_392-8u392-ga-1~22.04-b08 |
| +0 🆗 | spotbugs | 1m 20s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 💚 | findbugs | 1m 18s | master passed |
| | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 0m 19s | the patch passed |
| +1 💚 | compile | 0m 19s | the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu122.04 |
| +1 💚 | javac | 0m 19s | the patch passed |
| +1 💚 | compile | 0m 18s | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~22.04-b08 |
| +1 💚 | javac | 0m 18s | the patch passed |
| +1 💚 | checkstyle | 0m 10s | tez-mapreduce: The patch generated 0 new + 15 unchanged - 1 fixed = 15 total (was 16) |
| +1 💚 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 💚 | javadoc | 0m 17s | the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu122.04 |
| +1 💚 | javadoc | 0m 15s | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~22.04-b08 |
| +1 💚 | findbugs | 0m 44s | the patch passed |
| | | | _ Other Tests _ |
| +1 💚 | unit | 1m 33s | tez-mapreduce in the patch passed. |
| +1 💚 | asflicense | 0m 16s | The patch does not generate ASF License warnings. |
| | | 47m 49s | |

| Subsystem | Report/Notes |
| --- | --- |
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-323/2/artifact/out/Dockerfile |
| GITHUB PR | #323 |
| JIRA Issue | TEZ-4526 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile |
| uname | Linux 5b3a0ef9aa5f 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/tez.sh |
| git revision | master / 0c5cf68 |
| Default Java | Private Build-1.8.0_392-8u392-ga-1~22.04-b08 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu122.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~22.04-b08 |
| Test Results | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-323/2/testReport/ |
| Max. process+thread count | 243 (vs. ulimit of 5500) |
| modules | C: tez-mapreduce U: tez-mapreduce |
| Console output | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-323/2/console |
| versions | git=2.34.1 maven=3.6.3 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |

This message was automatically generated.

@SourabhBadhya
Contributor Author

Time taken for grouping splits: generated 30000 file splits and grouped them into 1000 grouped splits, each containing 30 file splits.

| Without patch | With patch |
| --- | --- |
| 0.095s | 0.078s |
| 0.08s | 0.081s |
| 0.081s | 0.078s |

I don't see a regression with the patch. I think this addresses the regression comments on TEZ-4069.
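
The harness behind these numbers is not shown in the thread. A minimal sketch of how such a measurement could be set up is below, assuming a naive fixed-size chunking in place of the real grouper; `GroupingTimingSketch` and `groupSplits` are hypothetical names, and the timings it prints are not comparable to the figures above.

```java
import java.util.ArrayList;
import java.util.List;

public class GroupingTimingSketch {

  /** Naive stand-in for split grouping: chunks consecutive splits into fixed-size groups. */
  static List<List<String>> groupSplits(List<String> splits, int groupSize) {
    List<List<String>> groups = new ArrayList<>();
    for (int i = 0; i < splits.size(); i += groupSize) {
      int end = Math.min(i + groupSize, splits.size());
      groups.add(new ArrayList<>(splits.subList(i, end)));
    }
    return groups;
  }

  public static void main(String[] args) {
    // 30000 placeholder splits, matching the count used in the comment above.
    List<String> splits = new ArrayList<>();
    for (int i = 0; i < 30_000; i++) {
      splits.add("split-" + i);
    }

    // Time only the grouping step, in wall-clock seconds.
    long start = System.nanoTime();
    List<List<String>> groups = groupSplits(splits, 30);
    double seconds = (System.nanoTime() - start) / 1e9;

    System.out.printf("%d grouped splits in %.3fs%n", groups.size(), seconds);
  }
}
```

A single run like this is noisy on the JVM, so repeating the measurement several times, as the three rows above do, gives a fairer comparison.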

abstractdog self-requested a review January 9, 2024 15:45
Contributor

abstractdog left a comment


LGTM, thanks for the perf testing!

abstractdog merged commit 2161124 into apache:master Jan 9, 2024
4 checks passed
abstractdog pushed a commit that referenced this pull request Jan 17, 2024
abstractdog added a commit that referenced this pull request Jan 17, 2024
…le times while generating grouped splits (#323) - CHANGES.txt
3 participants