Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Iceberg]Refactor test classes to avoid them growing too large #23466

Merged
merged 1 commit into from
Aug 27, 2024

Conversation

hantangwangd
Copy link
Member

@hantangwangd hantangwangd commented Aug 18, 2024

Description

In Iceberg, constantly expanding IcebergDistributedTestBase whose tests are all sharing the same query runner is somewhat harmful. This may lead in memory pressure because of the histories of expired queries tracked by the query runner, especially considering IcebergDistributedTestBase itself has inherited a huge test set.

This PR re-factored IcebergDistributedTestBase, splitting it into two parts. One part including a newly added class TestIcebergDistributedQueries and it's subclasses, which are used to run all tests inherited from AbstractTestDistributedQueries and its ancestor classes on Iceberg connector. While the other part still uses class IcebergDistributedTestBase and it's subclasses, which are used only for Iceberg's own extended tests.

Motivation and Context

Handle the timeout flaky in testLargeQuery() on Iceberg connector

Impact

N/A

Test Plan

N/A

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

== NO RELEASE NOTE ==

@hantangwangd hantangwangd force-pushed the handle_bigquery_flaky branch 5 times, most recently from 76c0256 to 80167d4 Compare August 19, 2024 06:28
@hantangwangd hantangwangd marked this pull request as ready for review August 19, 2024 10:14
@hantangwangd hantangwangd requested review from ZacBlanco and a team as code owners August 19, 2024 10:14
@Override
public void testDelete()
{
// Test delete all rows
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we further split these into individual test methods or does each depend on the previous?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After check the code, I find the method testDelete() is a special case. When we used this method to override the same method in parent class, Iceberg didn't support MOR row level data deletion yet, so it cannot successfully run the corresponding test method of the parent class. However, now we have supported row level deletion, so we can directly run the testDelete() method of the parent class. So I can move this method to IcebergDistributedTestBase directly, so that it can be used to test v1 COW based deletion.

The other methods either do not support certain functionalities in the corresponding methods in parent class (such as not support char(10)), or modify the timeout compare to the corresponding methods in parent class. It feels that their purpose is not to extend testing methods based on parent class, but is used to demonstrate the comparison with the behavior defined by the parent class test cases. For example, through the override test method, we can intuitively see that Iceberg does not support char (10), but its behavior towards strings is the same as its parent class. Once we move them elsewhere, we actually lose this intuitive comparison. What do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides, after we improve the timeout issue, we may eventually remove the override of methods testLargeIn() and testLargeInWithHistograms().

String longValues = range(0, 5000)
.mapToObj(Integer::toString)
.collect(joining(", "));
Session session = Session.builder(getSession())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason these all need to be in the same session? otherwise these could be independent more unitary test methods

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if we can make sure that the transactionId is not shared, as Session.build(session) has checked already, then we can consider the sessions built on this way as stateless ones. However, I'm not so sure, will check out this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @elharo, sorry for not having much time recently, I will spend some time later to organize in detail whether there could be any issues of mutual influence between the sessions constructed by Session.build(session). Once done, I will still reply here.


assertQuery("SELECT count(*) FROM test_varcharn_filter WHERE shipmode = 'AIR'", "VALUES (8491)");
assertQuery("SELECT count(*) FROM test_varcharn_filter WHERE shipmode = 'AIR '", "VALUES (0)");
assertQuery("SELECT count(*) FROM test_varcharn_filter WHERE shipmode = 'AIR '", "VALUES (0)");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like this duplicates the previous assert/ Is that deliberate? if so, a comment would help

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assertions are the same, I think it's deliberate, will add a comment. The difference is Iceberg do not support char(10), so use varchar(10) instead.

{
nessieContainer = NessieContainer.builder().build();
nessieContainer.start();
super.init();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super.init usually comes first

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to start the container first, so that the super.init() can succeed.

@tdcmeehan
Copy link
Contributor

@elharo can you please get some consensus on testing guidelines? Presto has a large established convention of putting its integration style tests into test cases that test multiple queries within the same test case. This isn't cargo cult programming, it's a desire for consistency in an open source project. I see lots of feedback to break apart test cases, but what I think is missing for everyone receiving this feedback is: why do this now, and what benefits do we get out of doing this incrementally when clearly this is not the convention for this project? Can you get some consensus on our approach to test cases before requesting feedback and asking for changes across PRs?

@tdcmeehan tdcmeehan self-assigned this Aug 19, 2024
@elharo
Copy link
Contributor

elharo commented Aug 19, 2024

We can talk about it in Slack or one of the VCs if you like. This is general best practice for tests going back at least to the early days of JUnit in the 90s, but it goes well beyond any one framework or language. The fundamental goal is that two or more tests pass or fail independently without interfering with each other, and it lets developers run and debug individual tests in isolation.

The more unitary the tests are the healthier a test suite is. And it can be improved one method at a time. The benefit is incremental. If you're adding 10 new tests, it's always better to add 10 new test methods than one method. That's true whether the existing project has 10,000 existing tests in 10,000 methods or 10,000 tests in one method. Our test suite is flaky, but making tests more independent makes them less flaky and makes it easier to pinpoint and debug flakes and failures.

Why do this now? Because these are new tests. It might be laborious (though useful) to split out all the existing tests that tightly couple to each other, but there's no significant cost to doing this for new tests.

@tdcmeehan
Copy link
Contributor

@elharo I agree that splitting tests up makes it marginally simpler to individually troubleshoot one test, but for the tests you're making this feedback on, it's really trivial to just comment the above lines to pinpoint the failure. What I'm worried about is if we take your feedback to its extreme, our unit tests will become vastly more complicated because we'll have dozens of variants of unit tests that all prove out the same functionality in different ways that are hard to differentiate from one another.

I don't understand how asking people to extract a test that had maybe 10 queries in it into 10 individual tests makes the CI less flaky? Can you explain? Perhaps in an issue explaining the rationale for moving in this direction. We need an issue anyway because I don't think an appeal to tradition is strong enough to suddenly start doing things completely differently in a codebase that's already quite large and established. I see any benefit being marginal at best, and I really worry about maintainability of e.g. AbstractTestQueries with these guidelines.

@hantangwangd hantangwangd changed the title [ForTest]Refactor test classes to avoid them growing too large [Iceberg]Refactor test classes to avoid them growing too large Aug 20, 2024
@hantangwangd
Copy link
Member Author

What I'm worried about is if we take your feedback to its extreme, our unit tests will become vastly more complicated because we'll have dozens of variants of unit tests that all prove out the same functionality in different ways that are hard to differentiate from one another.

Besides, put related statements into the same test method to test a specific functionality centrally can sometimes reveal hidden problems, which are difficult to detect if the execution statements are split into separate methods. Because if the statements affect each other, then it would be a real problem that users will encounter in actual use. For example, if some statements execute without errors in their behavior and results, but do not appropriately release their resources or wrongly changed some common data structures, then the execution of the subsequent statements will fail.

@tdcmeehan
Copy link
Contributor

@hantangwangd do you think this PR is still needed with #23257 merged?

@hantangwangd
Copy link
Member Author

@tdcmeehan Yes, the test failure caused by timeout did not occur again after merging PR #23257. It seems that the histograms in tracked expired queries is the most direct cause of testLargeQuery()'s timeout.

However, IMO, I still think it's worth splitting the test class IcebergDistributedTestBase (which seems a bit bulky considering its inherited test cases set). All these test cases are still sharing the same query runner, and as we continue to add more test cases, we may still encounter issues related to memory pressure caused by the tracked expired queries.

For example, for test methods like testLargeIn() and testLargeInWithHistograms(), if we mix our extended test cases with the inherited base test cases set, it would be more harder to figure out the exact reason why optimizer spend more time on Iceberg than on other connectors.

What do you think?

Copy link
Contributor

@tdcmeehan tdcmeehan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some nits

}

@Override
public void testRenameColumn()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the code was previously like this, but could you add comments where necessary explaining that the Iceberg connector doesn't yet support these features?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! After recheck the tests here, I found some of them have been supported by Iceberg connector, so I remove these empty override methods directly. And in the remaining methods, comments have been added where necessary to indicate that the corresponding functionality is not supported by Iceberg connector or differs from the parent method being overridden. Please take a look when available, thanks!

import static com.facebook.presto.spi.statistics.ColumnStatisticType.NUMBER_OF_DISTINCT_VALUES;
import static com.facebook.presto.spi.statistics.ColumnStatisticType.TOTAL_SIZE_IN_BYTES;

public class TestIcebergDistributedQueriesHive
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you adopt the following naming convention? TestIcebergHiveCatalogDistributedQueries? Likewise for the others?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion, done!

@hantangwangd hantangwangd merged commit 961ff8a into prestodb:master Aug 27, 2024
56 checks passed
@hantangwangd hantangwangd deleted the handle_bigquery_flaky branch August 27, 2024 01:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants