[SPARK-40154][PYTHON][DOCS] Correct storage level in Dataframe.cache docstring

### What changes were proposed in this pull request?
Corrects the docstring of `DataFrame.cache` so that it states the default storage level in use since it changed in Spark 3.0. It seems that the docstring of `DataFrame.persist` was updated, but the one for `cache` was forgotten.

### Why are the changes needed?
The docstring claims that `cache` uses serialised storage, but it actually uses deserialised storage. I confirmed that this is still the case with Spark 3.5.0 using the example code from the Jira ticket.
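
A minimal sketch of the difference between the two levels named in this change; it is not the example from the Jira ticket. The `Level` tuple is a hypothetical stand-in that mirrors the constructor flags of `pyspark.StorageLevel`. `cache_level` uses the real `DataFrame.cache`, `DataFrame.storageLevel` and `DataFrame.unpersist` calls, but it needs a running `SparkSession`, so it is only defined here and never called.

```python
from typing import NamedTuple


class Level(NamedTuple):
    """Hypothetical stand-in mirroring the flags of pyspark.StorageLevel."""

    useDisk: bool
    useMemory: bool
    useOffHeap: bool
    deserialized: bool
    replication: int = 1


# Flag values of the two levels as defined in pyspark.StorageLevel.
MEMORY_AND_DISK = Level(True, True, False, False)
MEMORY_AND_DISK_DESER = Level(True, True, False, True)


def is_deserialized(level) -> bool:
    """Return True if the level keeps data as deserialized objects."""
    return bool(level.deserialized)


def cache_level(spark):
    """Return the storage level that DataFrame.cache() applies.

    Assumes `spark` is a live SparkSession; not called in this sketch.
    """
    df = spark.range(10).cache()
    level = df.storageLevel
    df.unpersist()
    return level
```

With a session available, `is_deserialized(cache_level(spark))` should return `True` on Spark 3.0 and later if the corrected docstring is accurate.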

### Does this PR introduce _any_ user-facing change?
Yes, the docstring changes.

### How was this patch tested?
The GitHub Actions workflow succeeded.

### Was this patch authored or co-authored using generative AI tooling?
No

Closes apache#43229 from paulstaab/SPARK-40154.

Authored-by: Paul Staab <paulstaab@users.noreply.github.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
(cherry picked from commit 94607dd)
Signed-off-by: Sean Owen <srowen@gmail.com>
paulstaab authored and srowen committed Oct 25, 2023
1 parent 26f6663 commit 9e4411e
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions python/pyspark/sql/dataframe.py
@@ -1485,7 +1485,7 @@ def foreachPartition(self, f: Callable[[Iterator[Row]], None]) -> None:
         self.rdd.foreachPartition(f)  # type: ignore[arg-type]

     def cache(self) -> "DataFrame":
-        """Persists the :class:`DataFrame` with the default storage level (`MEMORY_AND_DISK`).
+        """Persists the :class:`DataFrame` with the default storage level (`MEMORY_AND_DISK_DESER`).
         .. versionadded:: 1.3.0
@@ -1494,7 +1494,7 @@ def cache(self) -> "DataFrame":
         Notes
         -----
-        The default storage level has changed to `MEMORY_AND_DISK` to match Scala in 2.0.
+        The default storage level has changed to `MEMORY_AND_DISK_DESER` to match Scala in 3.0.
         Returns
         -------
