Minor: improve join_on docs
alamb committed Oct 12, 2023
1 parent 9be3a72 commit 399ee21
Showing 1 changed file with 25 additions and 7 deletions.
32 changes: 25 additions & 7 deletions datafusion/core/src/dataframe.rs
@@ -582,12 +582,21 @@ impl DataFrame {
Ok(DataFrame::new(self.session_state, plan))
}

/// Join this DataFrame with another DataFrame using the specified columns as join keys.
/// Join this DataFrame with another DataFrame using explicitly specified
/// columns and an optional filter expression.
///
/// Filter expression expected to contain non-equality predicates that can not be pushed
/// down to any of join inputs.
/// In case of outer join, filter applied to only matched rows.
/// See [`join_on`](Self::join_on) for a more concise way to specify the
/// join condition. Since DataFusion will automatically identify and
/// optimize equality predicates there is no performance difference between
/// this function and `join_on`.
///
/// `left_cols` and `right_cols` are used to form "equijoin" predicates (see
/// example below), which are then combined with the optional `filter`
/// expression.
///
/// Note that in case of outer join, the `filter` is applied to only matched rows.
///
/// # Example
/// ```
/// # use datafusion::prelude::*;
/// # use datafusion::error::Result;
@@ -600,6 +609,8 @@ impl DataFrame {
/// col("a").alias("a2"),
/// col("b").alias("b2"),
/// col("c").alias("c2")])?;
/// // Perform the equivalent of `left INNER JOIN right ON (a = a2 AND b = b2)`
/// // finding all pairs of rows from `left` and `right` where `a = a2` and `b = b2`.
/// let join = left.join(right, JoinType::Inner, &["a", "b"], &["a2", "b2"], None)?;
/// let batches = join.collect().await?;
/// # Ok(())
@@ -624,10 +635,13 @@ impl DataFrame {
Ok(DataFrame::new(self.session_state, plan))
}

/// Join this DataFrame with another DataFrame using the specified expressions.
/// Join this `DataFrame` with another `DataFrame` using the specified
/// expressions and join type.
///
/// Simply a thin wrapper over [`join`](Self::join) where the join keys are not provided,
/// and the provided expressions are AND'ed together to form the filter expression.
/// Note that DataFusion automatically optimizes joins, including
/// identifying and optimizing equality predicates.
///
/// # Example
///
/// ```
/// # use datafusion::prelude::*;
@@ -646,6 +660,10 @@ impl DataFrame {
/// col("b").alias("b2"),
/// col("c").alias("c2"),
/// ])?;
///
/// // Perform the equivalent of `left INNER JOIN right ON (a != a2 AND b != b2)`
/// // finding all pairs of rows from `left` and `right`
/// // where `a != a2` and `b != b2`.
/// let join_on = left.join_on(
/// right,
/// JoinType::Inner,
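
The semantics described in the updated doc comments (key columns forming equijoin predicates, an optional filter AND'ed onto them, and the outer-join note) can be sketched in plain Rust without DataFusion. This is only an illustration under stated assumptions, not DataFusion's implementation: `Row`, `join_rows`, and `left_c_greater` are hypothetical names, the key columns are hard-coded to `a` and `b`, and a nested loop stands in for whatever join algorithm the engine actually chooses.

```rust
/// Toy row with three integer columns, loosely mirroring `a`, `b`, `c`
/// from the doc example. Hypothetical type, not part of DataFusion.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Row {
    a: i64,
    b: i64,
    c: i64,
}

/// Optional non-equality predicate evaluated on a (left, right) pair.
type Filter = fn(&Row, &Row) -> bool;

/// Nested-loop join sketch (assumption: keys are always `a` and `b`).
/// A pair matches when the key columns are equal AND the optional filter holds.
/// With `left_outer = true`, left rows without any match are kept, padded with `None`.
fn join_rows(
    left: &[Row],
    right: &[Row],
    filter: Option<Filter>,
    left_outer: bool,
) -> Vec<(Row, Option<Row>)> {
    let mut out = Vec::new();
    for l in left {
        let mut matched = false;
        for r in right {
            // Equijoin predicate built from the key columns: a = a2 AND b = b2.
            let keys_equal = l.a == r.a && l.b == r.b;
            // The optional filter is combined with the key equality via AND.
            let filter_ok = filter.map_or(true, |f| f(l, r));
            if keys_equal && filter_ok {
                out.push((*l, Some(*r)));
                matched = true;
            }
        }
        // Outer join: the filter only decides which pairs match; a left row
        // with no matching pair is still emitted once, with a NULL-like right side.
        if left_outer && !matched {
            out.push((*l, None));
        }
    }
    out
}

/// Example non-equality filter: left.c > right.c.
fn left_c_greater(l: &Row, r: &Row) -> bool {
    l.c > r.c
}

fn main() {
    let left = [
        Row { a: 1, b: 1, c: 10 },
        Row { a: 2, b: 2, c: 20 },
        Row { a: 3, b: 3, c: 30 },
    ];
    let right = [Row { a: 1, b: 1, c: 5 }, Row { a: 2, b: 2, c: 99 }];

    let inner = join_rows(&left, &right, None, false);
    let inner_filtered = join_rows(&left, &right, Some(left_c_greater), false);
    let outer_filtered = join_rows(&left, &right, Some(left_c_greater), true);

    println!("inner: {}", inner.len());
    println!("inner + filter: {}", inner_filtered.len());
    println!("left outer + filter: {}", outer_filtered.len());
}
```

In this sketch the filter never removes a left row from the outer-join result; it only determines whether a candidate pair counts as a match, which is one reading of the note that the `filter` is applied only to matched rows.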