matt/feat/recursive ctes/config flag #3

Closed
wants to merge 572 commits into from

Conversation

matthewgapp
Owner

Veeupup and others added 30 commits November 28, 2023 10:02
* move array function unit_tests to sqllogictest

Signed-off-by: veeupup <code@tanweime.com>

* add comment for array_expression internal test

---------

Signed-off-by: veeupup <code@tanweime.com>
Co-authored-by: Mehmet Ozan Kabak <ozankabak@gmail.com>
…che#8351)

* Minor: Improve the document format of JoinHashMap

* sql csv_with_quote_escape

* fix
* Minor: restore DataFrame test

* Move test to a better location

* simplify test
…pache#8354)

These utils manipulate `LogicalPlan`s and `Expr`s and may be useful in
projects that only depend on `datafusion-expr`
* Extract parquet statistics to its own module, add tests

* Update datafusion/core/src/datasource/physical_plan/parquet/statistics.rs

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>

* rename enum

* Improve API

* Add test for reading struct array statistics

* Add test for column after statistics

* improve tests

* simplify

* clippy

* Update datafusion/core/src/datasource/physical_plan/parquet/statistics.rs

* Update datafusion/core/src/datasource/physical_plan/parquet/statistics.rs

* Add test showing incorrect statistics

* Rework statistics

* Fix clippy

* Update documentation and make it clear the statistics are not publicly accessible

* Add link to upstream arrow ticket

---------

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
Co-authored-by: Raphael Taylor-Davies <r.taylordavies@googlemail.com>
* feat:implement sql style 'find_in_set' string function

* format code

* modify test case
Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
* Refactor aggregate function handling

* fix ci

* update comment

* fix ci

* simplify the code

* fix fmt

* fix ci

* fix clippy
* Implement Aliases for ScalarUDF

Signed-off-by: veeupup <code@tanweime.com>

* fix comments

Signed-off-by: veeupup <code@tanweime.com>

---------

Signed-off-by: veeupup <code@tanweime.com>
* support LargeList in array_empty

* update err info
* feat: test queries for to_timestamp(float) WIP

* feat: Float64 input for to_timestamp

* cargo fmt

* clippy

* docs: double input type for to_timestamp

* feat: cast floats to timestamp

* style: cargo fmt

* fix: float64 cast for timestamp nanos only
* Support User Defined Table Function

Signed-off-by: veeupup <code@tanweime.com>

* fix comments

Signed-off-by: veeupup <code@tanweime.com>

* add udtf test

Signed-off-by: veeupup <code@tanweime.com>

* add file header

* Simplify table function example, add some comments

* Simplify exprs

* make clippy happy

* Update datafusion/core/tests/user_defined/user_defined_table_functions.rs

---------

Signed-off-by: veeupup <code@tanweime.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* document timestamp input limits

* fix text

* prettier

* remove doc for nanoseconds

* Update datafusion/physical-expr/src/datetime_expressions.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* fix: make ntile work in some corner cases

* fix comments

* minor

* Update datafusion/sqllogictest/test_files/window.slt

Co-authored-by: Mustafa Akur <106137913+mustafasrepo@users.noreply.github.com>

---------

Co-authored-by: Mustafa Akur <106137913+mustafasrepo@users.noreply.github.com>
Given that group keys inherently have few repeated values, especially
when grouping on a single column, the use of dictionary encoding is
unlikely to yield significant returns
* done

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* add more test

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* cleanup

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

---------

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* Minor: Improve the documentation on `ScalarValue`

* Update datafusion/common/src/scalar.rs

Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>

* Update datafusion/common/src/scalar.rs

Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>

---------

Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>
* add benchmark

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* fmt

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* address clippy

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* cleanup

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* fix comment

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

---------

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* minor changes

* PipelineStatePropagator tree refactor

* Remove duplications by children_unbounded()

* Remove on-the-fly tree construction

* Minor changes

---------

Co-authored-by: Mustafa Akur <mustafa.akur@synnada.ai>
…8121)

* feat: support LargeList in make_array and array_length

* chore: add tests

* fix: update tests for nested array

* use usize_as

* add new_large_list

* refactor array_length

* add comment

* update test in sqllogictest

* fix ci

* fix macro

* use usize_as

* update comment

* return based on data_type in make_array
Ted-Jiang and others added 29 commits January 3, 2024 17:02
…apache#8737)

* Add primary key support for row_number window function

* Add comments, minor changes

* Add new test

* Review

---------

Co-authored-by: Mehmet Ozan Kabak <ozankabak@gmail.com>
…pache#8721)

* DistinctCountGroupsAccumulator

* test coverage

* clippy warnings

* count distinct for primitive types

* revert hashset to std

* fixed accumulator size estimation
* support LargeList in cardinality
…nforceDistribution` rule (apache#8731)

* Cleanup

* More

* Restore add_roundrobin_on_top

* Restore test files

* More

* Restore

* More

* More

* Make test stable

* For review

* Add test
* Clean internal implementation of WindowUDF

* fix doc
* support largelist in array_to_string

* reduce code duplication
…, `array_append` and `array_prepend` (apache#8636)

* reuse function for string concat

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* remove casting in string concat

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* add test

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* operator to function rewrite

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* fix explain

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* add more test

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* add column cases

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* cleanup

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* preserve name

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* Update datafusion/optimizer/src/analyzer/rewrite_expr.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* rename

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

---------

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* fix bug

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* fmt

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* add rowsort

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

---------

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
…ingPredicate and cp_solver (apache#8749)

* Minor: Improve library docs to mention TreeNode, ExprSimplifier, PruningPredicate and cp_solver

* fix link
* Add logo source files

* add another file
* Add `schema_err!` error macros with optional backtrace
…pache#8740)

* revert eb8aff7 / Materialize dictionaries in group keys

* Update tests

* Update tests
* fix: struct don't push down to TableScan

* add similar to test and apply comment

* remove catch all in outer_columns_helper

* minor

* fix clippy

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* Fix error messages in array expressions

* fix fmt
* move tests from  to sqllogictests part1

* Update datafusion/sqllogictest/test_files/expr.slt

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* Update datafusion/sqllogictest/test_files/expr.slt

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Development

Successfully merging this pull request may close these issues.

* non-null sub-field on nullable struct-field has wrong nullity
* Parallel NDJSON file reading