More doc
zanmato1984 committed Jun 11, 2024
1 parent 8bb5af3 commit b382a67
Showing 2 changed files with 26 additions and 4 deletions.
20 changes: 20 additions & 0 deletions cpp/src/arrow/compute/special_form.h
@@ -28,8 +28,28 @@
 namespace arrow {
 namespace compute {

+/// The concept "special form" is borrowed from Lisp
+/// (https://courses.cs.northwestern.edu/325/readings/special-forms.html). Velox also uses
+/// the same term. A special form behaves like a function call except that it has special
+/// evaluation rules, mostly for arguments.
+/// For example, the `if_else(cond, expr1, expr2)` special form first evaluates the
+/// argument `cond` and obtains a boolean array:
+/// [true, false, true, false]
+/// then the argument `expr1` should ONLY be evaluated for rows:
+/// [0, 2]
+/// and the argument `expr2` should ONLY be evaluated for rows:
+/// [1, 3]
+/// If `expr1` has an observable side effect (e.g., a division-by-zero error) on rows
+/// [1, 3], or `expr2` has one on rows [0, 2], a regular function call would surface it
+/// even though those rows are never selected, because a regular call always evaluates
+/// all of its arguments eagerly.
+/// Other special forms include `case_when`, `and`, and `or`.
+/// In a vectorized execution engine, a special form normally uses a "selection vector"
+/// to mask the rows of each argument that are to be evaluated.
 class ARROW_EXPORT SpecialForm {
  public:
+  /// A poor man's factory method to create a special form by name.
+  /// TODO: More formal factory, a registry maybe?
   static Result<std::unique_ptr<SpecialForm>> Make(const std::string& name);

   virtual ~SpecialForm() = default;
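
The difference between a regular call and a special form, as described in the doc comment above, can be illustrated with a small standalone sketch. It does not use Arrow at all: `RowExpr`, `EagerIfElse`, `SpecialFormIfElse`, and `CheckedDivide` are hypothetical names invented for this illustration, and a thrown exception stands in for the "observable side effect".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for an argument expression: computes the value of one
// row and may throw (the "observable side effect").
using RowExpr = std::function<int64_t(std::size_t row)>;

// Regular function-call semantics: both branches are evaluated for every row
// before the result is chosen.
std::vector<int64_t> EagerIfElse(const std::vector<bool>& cond, const RowExpr& expr1,
                                 const RowExpr& expr2) {
  std::vector<int64_t> v1(cond.size()), v2(cond.size()), out(cond.size());
  for (std::size_t i = 0; i < cond.size(); ++i) v1[i] = expr1(i);
  for (std::size_t i = 0; i < cond.size(); ++i) v2[i] = expr2(i);
  for (std::size_t i = 0; i < cond.size(); ++i) out[i] = cond[i] ? v1[i] : v2[i];
  return out;
}

// Special-form semantics: each branch only sees the rows that cond selects for it.
std::vector<int64_t> SpecialFormIfElse(const std::vector<bool>& cond,
                                       const RowExpr& expr1, const RowExpr& expr2) {
  std::vector<std::size_t> if_rows, else_rows;  // toy "selection vectors"
  for (std::size_t i = 0; i < cond.size(); ++i) {
    (cond[i] ? if_rows : else_rows).push_back(i);
  }
  // Every row belongs to exactly one branch.
  assert(if_rows.size() + else_rows.size() == cond.size());
  std::vector<int64_t> out(cond.size());
  for (std::size_t row : if_rows) out[row] = expr1(row);
  for (std::size_t row : else_rows) out[row] = expr2(row);
  return out;
}

// Integer division that reports division by zero as an error.
int64_t CheckedDivide(int64_t a, int64_t b) {
  if (b == 0) throw std::domain_error("division by zero");
  return a / b;
}
```

With `cond = [true, false, true, false]` and an `expr1` that divides by a column whose rows 1 and 3 are zero, `EagerIfElse` throws while `SpecialFormIfElse` completes, because rows 1 and 3 are never handed to `expr1`.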
10 changes: 6 additions & 4 deletions cpp/src/arrow/compute/special_forms/if_else_special_form.cc
@@ -29,16 +29,17 @@ namespace compute {
 Result<Datum> IfElseSpecialForm::Execute(const Expression::Call& call,
                                          const ExecBatch& input,
                                          ExecContext* exec_context) {
-  DCHECK(!call.kernel->selection_vector_aware);
-  DCHECK(!input.selection_vector);
+  // The kernel (if_else) is not selection-vector-aware, so the input should not have a
+  // selection vector.
+  DCHECK(!call.kernel->selection_vector_aware && !input.selection_vector);

   std::vector<Datum> arguments(call.arguments.size());
   ARROW_ASSIGN_OR_RAISE(arguments[0],
                         ExecuteScalarExpression(call.arguments[0], input, exec_context));
-  // Use cond as selection vector for IF.
+  // Use cond as selection vector for IF branch.
   // TODO: Consider chunked array for arguments[0].
   auto if_sel = std::make_shared<SelectionVector>(arguments[0].array());
-  // Duplicate and invert cond as selection vector for ELSE.
+  // Duplicate and invert cond as selection vector for ELSE branch.
   ARROW_ASSIGN_OR_RAISE(
       auto else_sel,
       if_sel->Copy(CPUDevice::memory_manager(exec_context->memory_pool())));
@@ -53,6 +54,7 @@ Result<Datum> IfElseSpecialForm::Execute(const Expression::Call& call,
   ARROW_ASSIGN_OR_RAISE(
       arguments[2], ExecuteScalarExpression(call.arguments[2], else_input, exec_context));

+  // Invoke the if_else kernel now that all arguments have been evaluated.
   return ExecuteCallNonRecursive(call, input, arguments, exec_context);
 }
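
The overall flow of `Execute` (use `cond` as the IF selection, duplicate and invert it for ELSE, evaluate each branch under its own selection, then hand all evaluated arguments to the ordinary kernel) can be sketched without Arrow as follows. Everything here is an assumption made for illustration: `ToySelection`, `EvaluateSelected`, `IfElseKernel`, and `ExecuteIfElse` are hypothetical names, the selection is modeled as a plain boolean mask, `cond` is taken as already evaluated, and nulls are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Toy, hypothetical selection vector: a boolean mask with one entry per row.
struct ToySelection {
  std::vector<bool> mask;

  // Return a copy with every entry flipped, mirroring "duplicate and invert".
  ToySelection Inverted() const {
    ToySelection copy{mask};
    copy.mask.flip();
    return copy;
  }
};

using Column = std::vector<int64_t>;
using RowExpr = std::function<int64_t(std::size_t row)>;

// Evaluate expr only on selected rows; unselected rows keep a placeholder 0.
Column EvaluateSelected(const RowExpr& expr, const ToySelection& sel) {
  Column out(sel.mask.size(), 0);
  for (std::size_t i = 0; i < sel.mask.size(); ++i) {
    if (sel.mask[i]) out[i] = expr(i);
  }
  return out;
}

// The plain kernel: not selection-aware, expects fully materialized arguments.
Column IfElseKernel(const std::vector<bool>& cond, const Column& a, const Column& b) {
  assert(cond.size() == a.size() && cond.size() == b.size());
  Column out(cond.size());
  for (std::size_t i = 0; i < cond.size(); ++i) out[i] = cond[i] ? a[i] : b[i];
  return out;
}

// Special-form driver: selective argument evaluation, then one kernel call.
Column ExecuteIfElse(const std::vector<bool>& cond, const RowExpr& expr1,
                     const RowExpr& expr2) {
  ToySelection if_sel{cond};                  // cond doubles as the IF selection
  ToySelection else_sel = if_sel.Inverted();  // inverted copy for ELSE
  Column a = EvaluateSelected(expr1, if_sel);
  Column b = EvaluateSelected(expr2, else_sel);
  return IfElseKernel(cond, a, b);
}
```

The placeholder values written for unselected rows are harmless in this sketch because the kernel only reads `a[i]` where `cond[i]` is true and `b[i]` where it is false.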
