-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-32335: [C++][Docs] Add design document for Acero #35320
Conversation
|
@github-actions crossbow submit preview-docs |
Revision: 04e8cc0 Submitted crossbow builds: ursacomputing/crossbow @ actions-22cf5a633d
|
04e8cc0
to
62c725d
Compare
@github-actions crossbow submit preview-docs |
Revision: 62c725d Submitted crossbow builds: ursacomputing/crossbow @ actions-838f0ce05e
|
C++ User Guide: http://crossbow.voltrondata.com/pr_docs/35320/cpp/streaming_execution.html |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took a few minutes to read over this great documentation and stumbled upon a few typos.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really great! Very nice and comprehensive docs, this is going to be a super useful resource.
I added a few minor comments (and note for myself: I read up to the Developer's guide)
docs/source/cpp/acero/substrait.rst
Outdated
* The Substrait spec requires that a ``filter`` be completely satisfied by a read | ||
relation. However, Acero only uses a read filter for pushdown projection and | ||
it may not be fully satisfied. Users should generally attach an additional | ||
filter relation with the same filter expression after the read relation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Side question, but wondering while reading this: if that's what the spec of substrait requires, shouldn't the translation of a substrait plan to Acero ExecPlan inject such an additional filter node to ensure we actually fulfill that requirement? (instead of requiring the user to do that explicitly in their Substrait plan)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's a fair point. Let's address in a follow-up.
docs/source/cpp/acero/substrait.rst
Outdated
* Acero does not support the SATURATE option for overflow | ||
* Acero does not support kernels that take more than two arguments | ||
for the functions ``and``, ``or``, ``xor`` | ||
* Acero does not support temporal arithmetic |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering if it is clear enough that it is the Acero Substrait->Acero::ExecPlan conversion that doesn't support this, because Acero itself does support temporal arithmetic, as far as I know? (since we have scalar compute functions for temporal types, you should be able to use those in Acero)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's a fair point. Let's address in a follow-up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ignore that above comment. Replied to the wrong thing. What I meant to say was...
I think we actually might support temporal arithmetic now. Also, some of the things in this list were definitely wrong. This section has always been a bit volatile and I went ahead and removed it for now (there is already a blanket statement that new functions are added regularly and mappings may not be available instantly).
@github-actions crossbow submit preview-docs |
Revision: e54a546 Submitted crossbow builds: ursacomputing/crossbow @ actions-cb2af88888
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to read the user_guide.rst
and part of developer_guide.rst
. As a newcomer to arrow compute and acero, I find this documentation easy to understand. Thanks!
@github-actions crossbow submit preview-docs |
Revision: 8707d92 Submitted crossbow builds: ursacomputing/crossbow @ actions-62d908a2ed
|
…d a developer's guide. Fix up the API reference docs for Acero
Co-authored-by: Benjamin Kietzman <bengilgit@gmail.com>
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
8707d92
to
9eebc5a
Compare
This has been out for a while and I think there is some other ongoing docs work I don't want to conflict with. I'm going to merge this when CI passes. |
Benchmark runs are scheduled for baseline = 3f736ff and contender = 8a856c9. 8a856c9 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Rationale for this change
The documentation for Acero was incomplete. This PR refactors the existing documentation and adds several entirely new sections to form a complete design document for Acero.
What changes are included in this PR?
Some existing documentation is cleaned up. Acero documentation is moved into its own folder and broken into several pages.
Are these changes tested?
The documentation is built as part of the CI but I wouldn't say it is fully tested.
Are there any user-facing changes?
There are not code changes (other than the removal of two legacy methods) but there are many user-facing documentation changes.