Skip to content

Commit

Permalink
ARROW-11319: [Rust] [DataFusion] Improve test comparisons to record b…
Browse files Browse the repository at this point in the history
…atch, remove test::format_batch

The `test::format_batch` function does not have wide range of type support (e.g. it doesn't support dictionaries) and its output makes tests hard to read / update, in my opinion. This PR consolidates the datafusion tests to use `arrow::util::pretty::pretty_format_batches` both to reduce code duplication as well as increase type support

This PR removes the `test::format_batch(&batch);` function and replaces it with `arrow::util::pretty::pretty_format_batches` and some macros. It has no code changes.

This change the following benefits:

1. Better type support (I immediately can compare RecordBatches with `Dictionary` types in tests without having to update `format_batch` and #9233 gets simpler)
2. Better readability and error reporting (at least I find the code and diffs easier to understand)
3. Easier test update / review: it is easier to update the diffs (you can copy/paste the test output into the source code) and to review them

This is a variant of a strategy that I been using with success in IOx [source link](https://github.com/influxdata/influxdb_iox/blob/main/arrow_deps/src/test_util.rs#L15) and I wanted to contribute it back.

An example failure with this PR:

```
---- physical_plan::hash_join::tests::join_left_one stdout ----
thread 'physical_plan::hash_join::tests::join_left_one' panicked at 'assertion failed: `(left == right)`
  left: `["+----+----+----+----+", "| a1 | b2 | c1 | c2 |", "+----+----+----+----+", "| 1  | 1  | 7  | 70 |", "| 2  | 2  | 8  | 80 |", "| 2  | 2  | 9  | 80 |", "+----+----+----+----+"]`,
 right: `["+----+----+----+----+----+", "| a1 | b1 | c1 | a2 | c2 |", "+----+----+----+----+----+", "| 1  | 4  | 7  | 10 | 70 |", "| 2  | 5  | 8  | 20 | 80 |", "| 3  | 7  | 9  |    |    |", "+----+----+----+----+----+"]`:

expected:

[
    "+----+----+----+----+",
    "| a1 | b2 | c1 | c2 |",
    "+----+----+----+----+",
    "| 1  | 1  | 7  | 70 |",
    "| 2  | 2  | 8  | 80 |",
    "| 2  | 2  | 9  | 80 |",
    "+----+----+----+----+",
]
actual:

[
    "+----+----+----+----+----+",
    "| a1 | b1 | c1 | a2 | c2 |",
    "+----+----+----+----+----+",
    "| 1  | 4  | 7  | 10 | 70 |",
    "| 2  | 5  | 8  | 20 | 80 |",
    "| 3  | 7  | 9  |    |    |",
    "+----+----+----+----+----+",
]
```

You can copy/paste the output of `actual` directly into the test code for an update.

Closes #9264 from alamb/remove_test_format_batch

Authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Signed-off-by: Andrew Lamb <andrew@nerdnetworks.org>
  • Loading branch information
alamb committed Jan 23, 2021
1 parent 13e2134 commit 67d0c2e
Show file tree
Hide file tree
Showing 4 changed files with 475 additions and 420 deletions.
Loading

0 comments on commit 67d0c2e

Please sign in to comment.