Commit
Merge commit 'b41ef20c5dad7bdd674e3cc5f35a9c99efae676c' into chunchun/update-df-apr-week-4-4
appletreeisyellow committed May 1, 2024
2 parents e3796f7 + b41ef20 commit cc73c49
Showing 3 changed files with 64 additions and 9 deletions.
66 changes: 58 additions & 8 deletions datafusion/expr/src/columnar_value.rs
@@ -15,7 +15,7 @@
// specific language governing permissions and limitations
// under the License.

//! Columnar value module contains a set of types that represent a columnar value.
//! [`ColumnarValue`] represents the result of evaluating an expression.

use arrow::array::ArrayRef;
use arrow::array::NullArray;
@@ -25,15 +25,65 @@ use datafusion_common::format::DEFAULT_CAST_OPTIONS;
use datafusion_common::{internal_err, Result, ScalarValue};
use std::sync::Arc;

/// Represents the result of evaluating an expression: either a single
/// [`ScalarValue`] or an [`ArrayRef`].
/// The result of evaluating an expression.
///
/// While a [`ColumnarValue`] can always be converted into an array
/// for convenience, it is often much more performant to provide an
/// optimized path for scalar values.
/// [`ColumnarValue::Scalar`] represents a single value repeated any number of
/// times. This is an important performance optimization for handling values
/// that do not change across rows.
///
/// See [`ColumnarValue::values_to_arrays`] for a function that converts
/// multiple columnar values into arrays of the same length.
/// [`ColumnarValue::Array`] represents a column of data, stored as an Arrow
/// [`ArrayRef`].
///
/// A slice of `ColumnarValue`s logically represents a table, with each column
/// having the same number of rows. This means that all `Array`s are the same
/// length.
///
/// # Example
///
/// A `ColumnarValue::Array` with an array of 5 elements and a
/// `ColumnarValue::Scalar` with the value 100
///
/// ```text
/// ┌──────────────┐
/// │ ┌──────────┐ │
/// │ │ "A" │ │
/// │ ├──────────┤ │
/// │ │ "B" │ │
/// │ ├──────────┤ │
/// │ │ "C" │ │
/// │ ├──────────┤ │
/// │ │ "D" │ │ ┌──────────────┐
/// │ ├──────────┤ │ │ ┌──────────┐ │
/// │ │ "E" │ │ │ │ 100 │ │
/// │ └──────────┘ │ │ └──────────┘ │
/// └──────────────┘ └──────────────┘
///
/// ColumnarValue:: ColumnarValue::
/// Array Scalar
/// ```
///
/// Logically represents the following table:
///
/// | Column 1 | Column 2 |
/// | ------- | -------- |
/// | A | 100 |
/// | B | 100 |
/// | C | 100 |
/// | D | 100 |
/// | E | 100 |
///
/// # Performance Notes
///
/// When implementing functions or operators, it is important to consider the
/// performance implications of handling scalar values.
///
/// Because all functions must handle [`ArrayRef`], it is
/// convenient to convert [`ColumnarValue::Scalar`]s using
/// [`Self::into_array`]. For example, [`ColumnarValue::values_to_arrays`]
/// converts multiple columnar values into arrays of the same length.
///
/// However, it is often much more performant to provide a different
/// implementation that handles scalar values specially.
#[derive(Clone, Debug)]
pub enum ColumnarValue {
/// Array of values
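
The trade-off described in the "Performance Notes" doc comment above can be sketched with a simplified stand-in. The `Value` enum and the `add_via_arrays` and `add_specialized` functions below are hypothetical names invented for illustration; they are not part of DataFusion, whose real `ColumnarValue` wraps an Arrow `ArrayRef` and a `ScalarValue` rather than a `Vec<i64>` and an `i64`. The sketch assumes, as the doc comment states, that all arrays have the same number of rows.

```rust
// Simplified stand-in for illustration only; NOT the DataFusion type.
// The real `ColumnarValue` holds an Arrow `ArrayRef` or a `ScalarValue`.
#[derive(Clone, Debug, PartialEq)]
pub enum Value {
    /// A column of data, one entry per row.
    Array(Vec<i64>),
    /// A single value that logically repeats for every row.
    Scalar(i64),
}

impl Value {
    /// Materialize the value as a column of `num_rows` entries.
    /// An `Array` is returned as-is; a `Scalar` is repeated `num_rows` times.
    pub fn into_array(self, num_rows: usize) -> Vec<i64> {
        match self {
            Value::Array(values) => values,
            Value::Scalar(v) => vec![v; num_rows],
        }
    }
}

/// Convenient path: expand both arguments to columns, then add row by row.
/// A scalar argument costs an extra allocation of `num_rows` entries.
pub fn add_via_arrays(lhs: Value, rhs: Value, num_rows: usize) -> Vec<i64> {
    let l = lhs.into_array(num_rows);
    let r = rhs.into_array(num_rows);
    l.iter().zip(r.iter()).map(|(a, b)| a + b).collect()
}

/// Specialized path: scalars are never expanded into a repeated column.
pub fn add_specialized(lhs: &Value, rhs: &Value) -> Value {
    match (lhs, rhs) {
        // Two scalars produce a scalar: one addition, no allocation.
        (Value::Scalar(a), Value::Scalar(b)) => Value::Scalar(a + b),
        // Addition is commutative, so both mixed orderings share one arm.
        (Value::Array(values), Value::Scalar(s))
        | (Value::Scalar(s), Value::Array(values)) => {
            Value::Array(values.iter().map(|v| v + s).collect())
        }
        // Two arrays: assumed to be the same length, added element-wise.
        (Value::Array(l), Value::Array(r)) => {
            Value::Array(l.iter().zip(r.iter()).map(|(a, b)| a + b).collect())
        }
    }
}
```

Both functions give the same per-row results for an array plus a scalar, but the specialized version skips building the repeated column and keeps a scalar-plus-scalar result as a single value. In DataFusion itself, the convenient path corresponds to `ColumnarValue::into_array` and `ColumnarValue::values_to_arrays`, which the doc comment names.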
4 changes: 3 additions & 1 deletion datafusion/proto/proto/datafusion.proto
@@ -389,6 +389,7 @@ message LogicalExprNode {
NegativeNode negative = 13;
InListNode in_list = 14;
Wildcard wildcard = 15;
// was ScalarFunctionNode scalar_function = 16;
TryCastNode try_cast = 17;

// window expressions
@@ -1310,12 +1311,13 @@ message PhysicalExprNode {
PhysicalSortExprNode sort = 10;
PhysicalNegativeNode negative = 11;
PhysicalInListNode in_list = 12;
// was PhysicalScalarFunctionNode scalar_function = 13;
PhysicalTryCastNode try_cast = 14;

// window expressions
PhysicalWindowExprNode window_expr = 15;

PhysicalScalarUdfNode scalar_udf = 16;
// was PhysicalDateTimeIntervalExprNode date_time_interval_expr = 17;

PhysicalLikeExprNode like_expr = 18;
}
3 changes: 3 additions & 0 deletions datafusion/proto/src/generated/prost.rs

Some generated files are not rendered by default.