
Merged Apache bug fixes #165

Merged 21 commits into alteryx:csd-1.6 on Apr 18, 2016

Conversation

markhamstra

No description provided.

jayv and others added 10 commits April 4, 2016 13:29
Backport for apache#10370 andrewor14

Author: Jo Voordeckers <jo.voordeckers@gmail.com>

Closes apache#12101 from jayv/mesos_cluster_params_backport.
…case unit.

## What changes were proposed in this pull request?

This fix addresses the issue in PySpark where `spark.python.worker.memory`
could only be configured with a lowercase unit (`k`, `m`, `g`, `t`). This fix
allows the uppercase units (`K`, `M`, `G`, `T`) to be used as well, conforming
to the JVM memory string format as specified in the documentation.
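
As a quick usage sketch (not part of the patch itself), both spellings below are now accepted:

```scala
import org.apache.spark.SparkConf

// Usage sketch: after this fix, both the lowercase and uppercase unit
// spellings are accepted for the PySpark worker memory limit.
val lower = new SparkConf().set("spark.python.worker.memory", "512m")
val upper = new SparkConf().set("spark.python.worker.memory", "512M")  // previously rejected
```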

## How was this patch tested?

This fix adds an additional test to cover the changes.

Author: Yong Tang <yong.tang.github@outlook.com>

Closes apache#12163 from yongtang/SPARK-14368.

(cherry picked from commit 7db5624)
Signed-off-by: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
…locks

## What changes were proposed in this pull request?

This patch updates `updatedBlockStatuses` when removing blocks, making sure `BlockManager` correctly maintains `updatedBlockStatuses`.

## How was this patch tested?

test("updated block statuses") in BlockManagerSuite.scala

Author: jeanlyn <jeanlyn92@gmail.com>

Closes apache#12150 from jeanlyn/updataBlock1.6.
…Optimizer

## What changes were proposed in this pull request?
jira: https://issues.apache.org/jira/browse/SPARK-14322

OnlineLDAOptimizer uses RDD.reduce in two places where it could use treeAggregate. This can cause scalability issues. This should be an easy fix.
This is also a bug since it modifies the first argument to reduce, so we should use aggregate or treeAggregate.
See this line: https://github.com/apache/spark/blob/f12f11e578169b47e3f8b18b299948c0670ba585/mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala#L452
and a few lines below it.
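
As a rough illustration of the suggested pattern (a sketch only, not the actual `OnlineLDAOptimizer` code; the Breeze vector type and the `dim` parameter are assumptions for this example):

```scala
import breeze.linalg.{DenseVector => BDV}
import org.apache.spark.rdd.RDD

// Sketch: sum per-partition statistics without mutating the RDD's own
// elements. reduce may pass a cached element as its first argument, so an
// in-place `a += b` there is unsafe; treeAggregate folds into a fresh zero
// value and also combines partial results in a tree for better scalability.
def sumStats(stats: RDD[BDV[Double]], dim: Int): BDV[Double] = {
  stats.treeAggregate(BDV.zeros[Double](dim))(
    (acc, v) => acc += v, // seqOp: accumulate into the local zero copy
    (a, b) => a += b      // combOp: merge partial sums
  )
}
```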

## How was this patch tested?
unit tests

Author: Yuhao Yang <hhbyyh@gmail.com>

Closes apache#12106 from hhbyyh/ldaTreeReduce.

(cherry picked from commit 8cffcb6)
Signed-off-by: Joseph K. Bradley <joseph@databricks.com>
Docs change to remove the sentence about Mesos not supporting cluster mode.

That statement was inaccurate.

Author: Michael Gummelt <mgummelt@mesosphere.io>

Closes apache#12249 from mgummelt/fix-mesos-cluster-docs.

(cherry picked from commit 30e980a)
Signed-off-by: Andrew Or <andrew@databricks.com>
## What changes were proposed in this pull request?

`OutputCommitCoordinator` was introduced to deal with concurrent task attempts racing to write output, leading to data loss or corruption. For more detail, read the [JIRA description](https://issues.apache.org/jira/browse/SPARK-14468).

Before: `OutputCommitCoordinator` is enabled only if speculation is enabled.
After: `OutputCommitCoordinator` is always enabled.

Users may still disable this through `spark.hadoop.outputCommitCoordination.enabled`, but they really shouldn't...
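
For reference, the escape hatch mentioned above can be set like this (a usage sketch; the property name is taken from the description, and disabling it is not recommended):

```scala
import org.apache.spark.SparkConf

// Not recommended: opt back out of output commit coordination.
val conf = new SparkConf()
  .set("spark.hadoop.outputCommitCoordination.enabled", "false")
```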

## How was this patch tested?

`OutputCommitCoordinator*Suite`

Author: Andrew Or <andrew@databricks.com>

Closes apache#12244 from andrewor14/always-occ.

(cherry picked from commit 3e29e37)
Signed-off-by: Andrew Or <andrew@databricks.com>
…ied exception

## What changes were proposed in this pull request?

When deciding whether a CommitDeniedException caused a task to fail, consider the root cause of the Exception.
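
A minimal sketch of the idea (not Spark's actual helper; the class-name check stands in for the real type test):

```scala
// Walk the cause chain to the root, then check whether the root cause is a
// commit-denied error rather than inspecting only the outermost exception.
def rootCause(t: Throwable): Throwable =
  if (t.getCause == null || (t.getCause eq t)) t else rootCause(t.getCause)

def isCommitDenied(error: Throwable): Boolean =
  rootCause(error).getClass.getSimpleName == "CommitDeniedException"
```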

## How was this patch tested?

Added a test suite for the component that extracts the root cause of the error.
Made a distribution after cherry-picking this commit to branch-1.6 and used it to run our Spark application, which would quite often fail due to the CommitDeniedException.

Author: Jason Moore <jasonmoore2k@outlook.com>

Closes apache#12228 from jasonmoore2k/SPARK-14357.

(cherry picked from commit 22014e6)
Signed-off-by: Andrew Or <andrew@databricks.com>
…emory copy in Netty's tran…

## What changes were proposed in this pull request?
When Netty transfers data that is not a `FileRegion`, the data is in the form of a `ByteBuf`. If the data is large, a significant performance issue occurs because of the memory copy underlying `sun.nio.ch.IOUtil.write`: CPU usage reaches 100% while network throughput stays very low.

In this PR, if the data size is large, we split it into small chunks and call `WritableByteChannel.write()` on each chunk, which avoids the wasteful memory copy. Because the data cannot be written in a single write, `transferTo` is called multiple times.
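
A simplified sketch of the chunking idea (the chunk size and method name are illustrative assumptions; this is not the actual transport code):

```scala
import java.nio.ByteBuffer
import java.nio.channels.WritableByteChannel

// Write a large buffer in bounded slices so the copy made inside
// sun.nio.ch.IOUtil.write stays small on every call.
val ChunkSize: Int = 256 * 1024 // assumed chunk size, for illustration only

def writeChunked(channel: WritableByteChannel, data: ByteBuffer): Unit = {
  while (data.hasRemaining) {
    val slice = data.duplicate()
    slice.limit(math.min(data.position() + ChunkSize, data.limit()))
    channel.write(slice)            // may write fewer bytes than the slice holds
    data.position(slice.position()) // advance the original by what was written
  }
}
```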

## How was this patch tested?
Spark unit test and manual test.
Manual test:
`sc.parallelize(Array(1,2,3),3).mapPartitions(a=>Array(new Array[Double](1024 * 1024 * 50)).iterator).reduce((a,b)=> a).length`

For more details, please refer to [SPARK-14290](https://issues.apache.org/jira/browse/SPARK-14290)

Author: Zhang, Liye <liye.zhang@intel.com>

Closes apache#12296 from liyezhang556520/apache-branch-1.6-spark-14290.
…failed

Backports apache#12234 to 1.6. Original description below:

## What changes were proposed in this pull request?

This patch adds support for better handling of exceptions inside catch blocks if the code within the block throws an exception. For instance, here is the code in a catch block before this change in `WriterContainer.scala`:

```scala
logError("Aborting task.", cause)
// call failure callbacks first, so we could have a chance to cleanup the writer.
TaskContext.get().asInstanceOf[TaskContextImpl].markTaskFailed(cause)
if (currentWriter != null) {
  currentWriter.close()
}
abortTask()
throw new SparkException("Task failed while writing rows.", cause)
```

If `markTaskFailed` or `currentWriter.close` throws an exception, we currently lose the original cause. This PR fixes this problem by implementing a utility function `Utils.tryWithSafeCatch` that suppresses (via `Throwable.addSuppressed`) any exceptions thrown within the catch block and rethrows the original exception.
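
A rough sketch of the pattern (not the exact `Utils.tryWithSafeCatch` implementation; the signature here is illustrative):

```scala
// Run a cleanup block for a failure `cause`; if the cleanup itself throws,
// attach that exception as suppressed and rethrow the original cause.
def withSafeCatch(cause: Throwable)(cleanup: => Unit): Nothing = {
  try {
    cleanup
  } catch {
    case t: Throwable => cause.addSuppressed(t)
  }
  throw cause
}
```

With the `WriterContainer` example above, the calls to `markTaskFailed`, `currentWriter.close()`, and `abortTask()` would run inside the cleanup block, so an exception from any of them no longer masks `cause`.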

## How was this patch tested?

No new functionality added

Author: Sameer Agarwal <sameer@databricks.com>

Closes apache#12272 from sameeragarwal/fix-exception-1.6.
JoshRosen and others added 11 commits April 11, 2016 15:14
…n archive.apache.org

[archive.apache.org](https://archive.apache.org/) is undergoing maintenance, breaking our `build/mvn` script:

> We are in the process of relocating this service. To save on the immense bandwidth that this service outputs, we have put it in maintenance mode, disabling all downloads for the next few days. We expect the maintenance to be complete no later than the morning of Monday the 11th of April, 2016.

This patch fixes this issue by updating the script to use the regular mirror network to download Maven.

(This is a backport of apache#12262 to 1.6)

Author: Josh Rosen <joshrosen@databricks.com>

Closes apache#12307 from JoshRosen/fix-1.6-mvn-download.
## What changes were proposed in this pull request?
In the doc of [```checkpointInterval```](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala#L241), we told users that they can disable checkpointing by setting ```checkpointInterval = -1```, but we did not actually handle this case for LDA. This patch fixes that bug.
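
Conceptually, the missing guard looks like the following sketch (names are illustrative, not the actual LDA code):

```scala
import org.apache.spark.rdd.RDD

// Only checkpoint when the interval is positive; -1 disables checkpointing.
def maybeCheckpoint(data: RDD[_], iteration: Int, checkpointInterval: Int): Unit = {
  if (checkpointInterval > 0 && iteration % checkpointInterval == 0) {
    data.checkpoint()
  }
}
```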
## How was this patch tested?
Existing tests.

cc jkbradley

Author: Yanbo Liang <ybliang8@gmail.com>

Closes apache#12089 from yanboliang/spark-14298.

(cherry picked from commit 56af8e8)
Signed-off-by: Joseph K. Bradley <joseph@databricks.com>
…decoder

## What changes were proposed in this pull request?
In this patch, we set the initial `maxNumComponents` to `Integer.MAX_VALUE` instead of the default size (16) when allocating the `compositeBuffer` in `TransportFrameDecoder`, because the `compositeBuffer` introduces too many underlying memory copies with the default `maxNumComponents` when the frame size is large (which results in many transport messages). For details, please refer to [SPARK-14242](https://issues.apache.org/jira/browse/SPARK-14242).
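
A minimal sketch of the allocation change (not the actual `TransportFrameDecoder` source):

```scala
import io.netty.buffer.{ByteBufAllocator, CompositeByteBuf}

// With a very large maxNumComponents the composite buffer keeps appending
// components instead of consolidating (copying) them once the default limit
// of 16 components is reached.
def newFrameBuffer(alloc: ByteBufAllocator): CompositeByteBuf =
  alloc.compositeBuffer(Integer.MAX_VALUE)
```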

## How was this patch tested?
Spark unit tests and manual tests.
For manual tests, we can reproduce the performance issue with the following code:
`sc.parallelize(Array(1,2,3),3).mapPartitions(a=>Array(new Array[Double](1024 * 1024 * 50)).iterator).reduce((a,b)=> a).length`
It's easy to see the performance gain, both from the running time and CPU usage.

Author: Zhang, Liye <liye.zhang@intel.com>

Closes apache#12038 from liyezhang556520/spark-14242.
…ransformer

## What changes were proposed in this pull request?

Use a random table name instead of `__THIS__` in SQLTransformer (a sketch of the idea follows the list below), and add a test for `transformSchema`. The problems of using `__THIS__` are:

* It doesn't work under HiveContext (in Spark 1.6)
* Race conditions
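
A hypothetical sketch of the approach (not the actual change; the name prefix and the Spark 1.6-era `registerTempTable`/`sql` calls are illustrative):

```scala
import java.util.UUID
import org.apache.spark.sql.{DataFrame, SQLContext}

// Register the input under a unique temp-table name instead of the shared
// literal "__THIS__", so concurrent transformers do not collide.
def applyStatement(sqlContext: SQLContext, df: DataFrame, statement: String): DataFrame = {
  val tableName = "sql_transformer_" + UUID.randomUUID().toString.replace("-", "")
  df.registerTempTable(tableName)
  sqlContext.sql(statement.replace("__THIS__", tableName))
}
```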

## How was this patch tested?

* Manual test with HiveContext.
* Added a unit test for `transformSchema` to improve coverage.

cc: yhuai

Author: Xiangrui Meng <meng@databricks.com>

Closes apache#12330 from mengxr/SPARK-14563.

(cherry picked from commit 1995c2e)
Signed-off-by: Xiangrui Meng <meng@databricks.com>
## What changes were proposed in this pull request?

This PR improve the performance of SQL UI by:

1) Remove the details column on the all-executions page (the first page in the SQL tab). We can check the details by entering the execution page.
2) break-all is super slow in Chrome recently, so switch to break-word.
3) Use "display: none" to hide a block.
4) Use one JS closure for all the executions, not one for each.
5) Remove the height limitation on details; there is no need to scroll it in a tiny window.

## How was this patch tested?

Existing tests.

![ui](https://cloud.githubusercontent.com/assets/40902/14445712/68d7b258-0004-11e6-9b48-5d329b05d165.png)

Author: Davies Liu <davies@databricks.com>

Closes apache#12311 from davies/ui_perf.
Fix a memory leak in the Sorter. When the UnsafeExternalSorter spills data to disk, it does not free up the underlying pointer array. As a result, we see a lot of executor OOMs and also memory underutilization.
This is a regression partially introduced in PR apache#9241.

Tested by running a job and observing around a 30% speedup after this change.

Author: Sital Kedia <skedia@fb.com>

Closes apache#12285 from sitalkedia/executor_oom.

(cherry picked from commit d187e7d)
Signed-off-by: Davies Liu <davies.liu@gmail.com>

Conflicts:
	core/src/main/java/org/apache/spark/shuffle/sort/ShuffleInMemorySorter.java
	core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeInMemorySorter.java
## What changes were proposed in this pull request?

In Spark 1.4, we negated some metrics from RegressionEvaluator since CrossValidator always maximized metrics. This was fixed in 1.5, but the docs were not updated. This PR updates the docs.
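
As an illustrative usage note (a sketch assuming the standard `RegressionEvaluator` API; not part of this docs-only change):

```scala
import org.apache.spark.ml.evaluation.RegressionEvaluator

// Since 1.5 the evaluator reports the raw metric (e.g. RMSE) rather than its
// negation; CrossValidator consults isLargerBetter (false for "rmse") to
// decide whether to minimize or maximize.
val eval = new RegressionEvaluator().setMetricName("rmse")
```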

## How was this patch tested?

no tests

Author: Joseph K. Bradley <joseph@databricks.com>

Closes apache#12377 from jkbradley/regeval-doc.

(cherry picked from commit bf65c87)
Signed-off-by: Joseph K. Bradley <joseph@databricks.com>
…pwords

The default stop words were a Java object; after this change they no longer are.

A unit test that failed before the fix.

Author: Joseph K. Bradley <joseph@databricks.com>

Closes apache#12422 from jkbradley/pyspark-stopwords.

(cherry picked from commit d6ae7d4)
Signed-off-by: Joseph K. Bradley <joseph@databricks.com>

Conflicts:
	python/pyspark/ml/feature.py
	python/pyspark/ml/tests.py
markhamstra merged commit 588bd75 into alteryx:csd-1.6 on Apr 18, 2016