[SPARK-14298][ML][MLLIB] LDA should support disable checkpoint
## What changes were proposed in this pull request?
The doc of [```checkpointInterval```](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala#L241) tells users that they can disable checkpointing by setting ```checkpointInterval = -1```, but LDA did not actually handle this case. This patch fixes that bug.
## How was this patch tested?
Existing tests.

cc jkbradley

Author: Yanbo Liang <ybliang8@gmail.com>

Closes apache#12089 from yanboliang/spark-14298.

(cherry picked from commit 56af8e8)
Signed-off-by: Joseph K. Bradley <joseph@databricks.com>
yanboliang authored and jkbradley committed Apr 11, 2016
1 parent f4110cd commit 05dbc28
Showing 2 changed files with 6 additions and 3 deletions.
@@ -51,7 +51,8 @@ import org.apache.spark.storage.StorageLevel
* - This class removes checkpoint files once later Datasets have been checkpointed.
* However, references to the older Datasets will still return isCheckpointed = true.
*
- * @param checkpointInterval Datasets will be checkpointed at this interval
+ * @param checkpointInterval Datasets will be checkpointed at this interval.
+ * If this interval was set as -1, then checkpointing will be disabled.
* @param sc SparkContext for the Datasets given to this checkpointer
* @tparam T Dataset type, such as RDD[Double]
*/
@@ -88,7 +89,8 @@ private[mllib] abstract class PeriodicCheckpointer[T](
updateCount += 1

// Handle checkpointing (after persisting)
- if ((updateCount % checkpointInterval) == 0 && sc.getCheckpointDir.nonEmpty) {
+ if (checkpointInterval != -1 && (updateCount % checkpointInterval) == 0
+ && sc.getCheckpointDir.nonEmpty) {
// Add new checkpoint before removing old checkpoints.
checkpoint(newData)
checkpointQueue.enqueue(newData)
@@ -69,7 +69,8 @@ import org.apache.spark.storage.StorageLevel
* // checkpointed: graph4
* }}}
*
- * @param checkpointInterval Graphs will be checkpointed at this interval
+ * @param checkpointInterval Graphs will be checkpointed at this interval.
+ * If this interval was set as -1, then checkpointing will be disabled.
* @tparam VD Vertex descriptor type
* @tparam ED Edge descriptor type
*
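
The guard added to `PeriodicCheckpointer` can be read in isolation as a small predicate. The sketch below is not code from this commit: `CheckpointGuardSketch`, `shouldCheckpoint`, and the `checkpointDirSet` flag (standing in for `sc.getCheckpointDir.nonEmpty`) are hypothetical names, chosen so the example runs without Spark. It also shows why the extra clause seems to be needed: on the JVM, `n % -1` is `0` for every integer `n`, so the old condition would have been true on every update when the interval was -1.

```scala
// Hypothetical, Spark-free stand-in for the condition in the hunk above.
object CheckpointGuardSketch {

  /** True when the given update should trigger a checkpoint.
    *
    * @param updateCount        number of updates seen so far (1-based)
    * @param checkpointInterval checkpoint every N updates; -1 disables checkpointing
    * @param checkpointDirSet   assumption: mirrors sc.getCheckpointDir.nonEmpty
    */
  def shouldCheckpoint(
      updateCount: Int,
      checkpointInterval: Int,
      checkpointDirSet: Boolean): Boolean =
    checkpointInterval != -1 &&
      (updateCount % checkpointInterval) == 0 &&
      checkpointDirSet

  def main(args: Array[String]): Unit = {
    // Which of the first six updates would checkpoint under each setting.
    val everySecond = (1 to 6).filter(n => shouldCheckpoint(n, 2, true))
    val disabled = (1 to 6).filter(n => shouldCheckpoint(n, -1, true))
    println(s"interval = 2:  $everySecond")
    println(s"interval = -1: $disabled")
  }
}
```

Putting the `-1` comparison first means the modulo is never evaluated for a disabled interval, which matches the clause order in the patched condition.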