Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add sqllogictest for datafusion #1453

Closed
xudong963 opened this issue Dec 15, 2021 · 13 comments
Closed

Add sqllogictest for datafusion #1453

xudong963 opened this issue Dec 15, 2021 · 13 comments
Labels
enhancement New feature or request

Comments

@xudong963
Copy link
Member

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

As we get more and more datafusion users(This is a great thing! 👍), bug reports are also increasing.

Recently I was wondering how to make the user experience more comfortable, reducing bugs is the most straightforward way.

I think Datafusion's test mechanism isn't solid, so I looked into how other databases are tested, SQLite caught my eye.
FYI

As of version 3.33.0 (2020-08-14), the SQLite library consists of approximately 143.4 KSLOC of C code. (KSLOC means thousands of "Source Lines Of Code" or, in other words, lines of code excluding blank lines and comments.) By comparison, the project has 640 times as much test code and test scripts - 91911.0 KSLOC.

https://sqlite.org/testing.html

Describe the solution you'd like
I want to introduce sqllogictest into datafusion, https://www.sqlite.org/sqllogictest/doc/trunk/about.wiki.

BTW, rust version of sqllogictest is already available and I think their developers are active(risinglightdb/sqllogictest-rs#2). So we don't have to reinvent the wheel 👍

Describe alternatives you've considered
No

Additional context
Our integration tests seem to be doing something similar, but I think it's a little fragile.

@xudong963 xudong963 added the enhancement New feature or request label Dec 15, 2021
@xudong963
Copy link
Member Author

Any thoughts? @alamb @houqp @jimexist

@houqp
Copy link
Member

houqp commented Dec 15, 2021

Great idea @xudong963 !

@alamb
Copy link
Contributor

alamb commented Dec 15, 2021

#743 may also be related.

I agree DataFusion's tests could use some reorganizing

Note that a quirk of how sqlite is developed is that the test suite is not open source ;)

I do think that https://www.sqlite.org/sqllogictest/doc/trunk/about.wiki has basically the same aim as our current integration tests (written in python) contributed by @jimexist

I would be supportive of moving to a different test runner that is part of another project. I think it would be a net negative if we ended up with two test runners. Perhaps a good maturity test of sqllogictest would be if it can handle all the queries currently supported by the datafusion integration test;

@xudong963
Copy link
Member Author

Perhaps a good maturity test of sqllogictest would be if it can handle all the queries currently supported by the datafusion integration test;

Sure, I think it can!

@liukun4515
Copy link
Contributor

@xudong963 Does the sqllogical-test is like the mysql-test-framework?

@liukun4515
Copy link
Contributor

In the mysql-test-framework or similar framework, we could add some tests by using the sqls and comparing results with result files.
But the method may be heavy for unit tests if we have more and more sql tests, it is usually used in the integration test.

@hntd187
Copy link
Contributor

hntd187 commented Dec 18, 2021

@xudong963 @alamb I plan to take up the test re-organizing and refactoring as part of the email response I got from you, perhaps I should start a more general issue to talk about at a higher level that refactoring and it's relation to this?

@liukun4515
Copy link
Contributor

@xudong963 @alamb I plan to take up the test re-organizing and refactoring as part of the email response I got from you, perhaps I should start a more general issue to talk about at a higher level that refactoring and it's relation to this?

I'm confused about that why we need to refactor or re-organize the tests?

@hntd187
Copy link
Contributor

hntd187 commented Dec 20, 2021

@xudong963 @alamb I plan to take up the test re-organizing and refactoring as part of the email response I got from you, perhaps I should start a more general issue to talk about at a higher level that refactoring and it's relation to this?

I'm confused about that why we need to refactor or re-organize the tests?

I had emailed @alamb about some bigger things to tackle and I think he mentioned re-organizing the tests into something more extendable but I have not investigated what actually has to be done for it. If it doesn't make sense I can focus elsewhere

@matthewmturner
Copy link
Contributor

@hntd187 @liukun4515 FYI I had been working on #743 which was updating and reorganizing some of the tests. I still have to do the third task but currently working on something on the arrow-rs side which is taking my time. not sure if this is the reorganization of tests that @alamb was thinking of or if it was something else.

@alamb
Copy link
Contributor

alamb commented Dec 21, 2021

I was indeed thinking of #743 (specifically reorganize sql.rs so that it is not just a single giant module, and move code from context.rs there as well - item 4)

Since we originally discussed #743, @jimexist added the postgres integration test which makes things a little more complicated.

Perhaps the best thing to do is to start trying to describe what we have so far / want the tests to look like. I started a PR to do so here #1471

@alamb
Copy link
Contributor

alamb commented Nov 17, 2022

We may be finally ready to revisit this -- see #4248 (and thanks to @xudong963 for originally bringing it up!)

@xudong963
Copy link
Member Author

Let close the duplicate issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants