GraphX merge upstream changes and switched vertex id type to long #1

Open
wants to merge 529 commits into
base: graph
c40f0f2
Merge pull request #711 from shivaram/ml-generators
mateiz Jul 19, 2013
cfce9a6
Regression: default webui-port can't be set via command line "--webui…
wandjenkins Jul 19, 2013
81bb5dc
Creates Executors tab for application with RDD block and memory/disk …
karenfeng Jul 19, 2013
865dc63
Changed table format for executors
karenfeng Jul 19, 2013
15fb394
Merge pull request #716 from c0s/webui-port
mateiz Jul 21, 2013
f4d5148
Building spark assembly for further consumption of the Spark project …
wandjenkins Jul 3, 2013
0337d88
Add a public method getCachedRdds to SparkContext
Jul 22, 2013
636b19f
Merge branch 'master' of https://github.com/mesos/spark into ui-808
karenfeng Jul 22, 2013
8901f37
Fixed memory used/remaining/total bug
karenfeng Jul 22, 2013
f649dab
Fix bug: DoubleRDDFunctions.sampleStdev() computed non-sample stdev().
JoshRosen Jul 22, 2013
85c4d7b
Shows number of complete/total/failed tasks (bug: failed tasks assign…
karenfeng Jul 22, 2013
2eea974
Executors UI now calls executor ID from TaskInfo instead of TaskMetrics
karenfeng Jul 22, 2013
8ae1436
Merge pull request #722 from JoshRosen/spark-825
mateiz Jul 22, 2013
8e38e77
Fix a test that was using an outdated config setting
mateiz Jul 22, 2013
c836804
Add JavaAPICompletenessChecker.
JoshRosen Jul 18, 2013
e17e1b3
Remove annotation code that broke build.
JoshRosen Jul 21, 2013
ea1cfab
Merge branch 'master' of github.com:mesos/spark
mateiz Jul 22, 2013
872c97a
Split task columns, memory columns sort by numeric value
karenfeng Jul 22, 2013
401aac8
Merge pull request #719 from karenfeng/ui-808
mateiz Jul 22, 2013
2c2bfbe
Add toMap method to TimeStampedHashMap and use it
Jul 23, 2013
4830e22
Rename method per rxin feedback
Jul 23, 2013
efd6418
Move getPersistentRDDs testing to a new Suite
Jul 23, 2013
87a9dd8
Made RegressionModel serializable and added unit tests to make sure p…
rxin Jul 23, 2013
2210e8c
Use a different validation dataset for Logistic Regression prediction…
rxin Jul 23, 2013
f369e0e
Merge pull request #720 from ooyala/2013-07/persistent-rdds-api
mateiz Jul 23, 2013
0200801
Tracks task start events and shows number of active tasks on Executor UI
karenfeng Jul 23, 2013
5364f64
Merge pull request #723 from rxin/mllib
shivaram Jul 23, 2013
9f2dbb2
Adds/removes active tasks only once
karenfeng Jul 23, 2013
101b8cc
SPARK-829: scheduler shouldn't hang if a task contains unserializable…
rxin Jul 23, 2013
5ed38b4
Scheduler code style cleanup.
rxin Jul 23, 2013
f2422d4
SPARK-829: scheduler shouldn't hang if a task contains unserializable…
rxin Jul 23, 2013
383684d
Replaces Seq with HashSet, removes redundant import
karenfeng Jul 23, 2013
abc78cd
Modifies instead of copies HashSets, fixes comment style
karenfeng Jul 23, 2013
2f1736c
Merge pull request #725 from karenfeng/task-start
mateiz Jul 23, 2013
6a31b71
Small bug fix
mateiz Jul 23, 2013
85ab811
Moved non-serializable closure catching exception from submitStage to…
rxin Jul 24, 2013
d33b8a2
Added comments on task closure serialization.
rxin Jul 24, 2013
3dae1df
Moved non-serializable closure catching exception from submitStage to…
rxin Jul 24, 2013
876125b
Merge pull request #726 from rxin/spark-826
mateiz Jul 24, 2013
b011329
Merge pull request #727 from rxin/scheduler
mateiz Jul 24, 2013
503acd3
Build metrics system framwork
jerryshao Jun 27, 2013
9dec8c7
Add Master and Worker instrumentation support
jerryshao Jun 27, 2013
c3daad3
Update metric source support for instrumentation
jerryshao Jun 27, 2013
03f9871
MetricsSystem refactor
jerryshao Jun 27, 2013
4d6dd67
refactor metrics system
xiajunluan Jun 27, 2013
7fb574b
Code clean and remarshal
jerryshao Jun 28, 2013
576528f
Add dependency of Codahale's metrics library
jerryshao Jun 28, 2013
871bc16
Add Executor instrumentation
jerryshao Jun 28, 2013
5ce5dc9
Add default properties to deal with no configure file situation
jerryshao Jun 28, 2013
e080588
Add metrics system unit test
jerryshao Jul 1, 2013
e9ac887
Remove twice add Source bug and code clean
jerryshao Jul 1, 2013
7d2eada
Add metrics source of DAGScheduler and blockManager
xiajunluan Jul 1, 2013
9cea0c2
Refactor metricsSystem unit test, add resource files.
xiajunluan Jul 1, 2013
5f8802c
Register and init metricsSystem in SparkContext
xiajunluan Jul 2, 2013
1daff54
Change Executor MetricsSystem initialize code to SparkEnv
jerryshao Jul 2, 2013
a79f607
Add Maven metrics library dependency and code changes
jerryshao Jul 2, 2013
5730193
Fix some typos
jerryshao Jul 2, 2013
ed1a3bc
continue to refactor code style and functions
xiajunluan Jul 3, 2013
5b4a2f2
Add metrics config template file
xiajunluan Jul 3, 2013
05637de
Change class xxxInstrumentation to class xxxSource
xiajunluan Jul 3, 2013
8d1ef7f
Code style changes
jerryshao Jul 4, 2013
31ec72b
Code refactor according to comments
jerryshao Jul 16, 2013
a73f3ee
Merge pull request #671 from jerryshao/master
mateiz Jul 24, 2013
93c6015
Shows task status and running tasks on Stage Page: fixes SPARK-804 an…
karenfeng Jul 24, 2013
bd3931c
Changed ifs with returns to if/else
karenfeng Jul 24, 2013
5584ebc
Merge pull request #675 from c0s/assembly
mateiz Jul 24, 2013
4280e17
Removed finished status for task info, changed name of success case
karenfeng Jul 24, 2013
57009ee
Fixed consistency of "success" status string
karenfeng Jul 24, 2013
1d10192
Fix setting of SPARK_EXAMPLES_JAR
jey Jul 22, 2013
20338c2
Merge pull request #729 from karenfeng/ui-811
mateiz Jul 24, 2013
52723b9
Merge pull request #728 from jey/examples-jar-env
mateiz Jul 24, 2013
eef6787
Adding SVM and Lasso, moving LogisticRegression to classification fro…
pxinghao Jul 24, 2013
c258718
Fix Maven build errors after previous commits
mateiz Jul 24, 2013
8e0939f
refactor Kryo serializer support to use chill/chill-java
ryanlecompte Jul 25, 2013
a1c515f
add copyright back in
ryanlecompte Jul 25, 2013
fc4b025
add test
ryanlecompte Jul 25, 2013
30a369a
update pom.xml
ryanlecompte Jul 25, 2013
e56aa75
fix wrapping
ryanlecompte Jul 25, 2013
51c2427
Merge pull request #732 from ryanlecompte/master
mateiz Jul 25, 2013
e2421c1
Update Chill reference in pom.xml too
mateiz Jul 25, 2013
8eb8b52
Fix Chill version in Maven
mateiz Jul 25, 2013
a6de90c
For standalone mode, get JAVA_HOME, SPARK_JAVA_OPTS, SPARK_LIBRARY_PA…
charlesreiss Jul 25, 2013
f3cf094
Merge pull request #734 from woggle/executor-env2
mateiz Jul 25, 2013
d4bbc8b
Shows totals for shuffle data and CPU time in Stage, homepage overvie…
karenfeng Jul 25, 2013
22faeab
Split Shuffle Activity overview column for read/write
karenfeng Jul 26, 2013
3fbe9ea
Displys shuffle read/write only if exists, wraps if statements, trims…
karenfeng Jul 26, 2013
743fc4e
Fix Bug in Partition Pruning, index of Pruned Partitions should inher…
Jul 26, 2013
822aac8
Indentation
Jul 26, 2013
72cf7ec
Indentation
Jul 26, 2013
392d747
Code review
Jul 26, 2013
3fc6408
Added missing scalatest dependency
markhamstra Jul 26, 2013
cb36677
Merge pull request #738 from harsha2010/pruning
rxin Jul 26, 2013
f3d72ff
Merge pull request #739 from markhamstra/toolsPom
rxin Jul 27, 2013
bd4cc52
Made metrics Option instead of Some, fixed NullPointerException
karenfeng Jul 27, 2013
f74a03c
Multiple changes
pxinghao Jul 27, 2013
f0a1f95
Rename LogisticRegression, SVM and Lasso to *_LocalRandomSGD
pxinghao Jul 27, 2013
10fd394
Making ClassificationModel serializable
pxinghao Jul 27, 2013
071afe2
New files from merge with master
pxinghao Jul 27, 2013
b0bbc7f
Resolve conflicts with master, removed regParam for LogisticRegression
pxinghao Jul 27, 2013
0c391fe
Maximum task failures configurable
dlyubimov Jul 22, 2013
6a47cee
style
dlyubimov Jul 22, 2013
1714693
Current time called once with value now
karenfeng Jul 27, 2013
dcc4743
Moved val now to render
karenfeng Jul 27, 2013
5a93e3c
Cleaned up code based on pwendell's suggestions
karenfeng Jul 27, 2013
c2223e6
Improve catch scope and logging for client stop()
pwendell Jul 15, 2013
8177165
Log executor on finish
pwendell Jul 27, 2013
bcafb36
Slight wording change
pwendell Jul 27, 2013
077f2da
Fixed outdated bugs
karenfeng Jul 27, 2013
f11ad72
Some fixes to Python examples (style and package name for LR)
mateiz Jul 28, 2013
f5067ab
changes per comments.
dlyubimov Jul 28, 2013
0862494
typo
dlyubimov Jul 28, 2013
ccfa362
Change *_LocalRandomSGD to *LocalRandomSGD
pxinghao Jul 28, 2013
72ff62a
Two fixes to IPython support:
mateiz Jul 29, 2013
29e0429
Move data generators to util
pxinghao Jul 29, 2013
67de051
SVMSuite and LassoSuite rewritten to follow closely with LogisticRegr…
pxinghao Jul 29, 2013
9398dce
Changed Classification to return Int instead of Double
pxinghao Jul 29, 2013
96e04f4
Fixed SVM and LR train functions to take Int instead of Double for Cl…
pxinghao Jul 29, 2013
c823ee1
Replace map-reduce with dot operator using DoubleMatrix
pxinghao Jul 29, 2013
b9d6783
Optimize Python take() to not compute entire first partition
mateiz Jul 29, 2013
b5ec355
Optimize Python foreach() to not return as many objects
mateiz Jul 29, 2013
96b50e8
Allow python/run-tests to run from any directory
mateiz Jul 29, 2013
d75c308
Use None instead of empty string as it's slightly smaller/faster
mateiz Jul 29, 2013
feba7ee
SPARK-815. Python parallelize() should split lists before batching
mateiz Jul 29, 2013
497f557
Add docs about ipython
mateiz Jul 29, 2013
d8158ce
Merge branch 'master' of github.com:mesos/spark
mateiz Jul 29, 2013
75f3757
Fix rounding error in LogisticRegression.scala
pxinghao Jul 29, 2013
3a8d07d
Deleting extra LogisticRegressionGenerator and RidgeRegressionGenerator
pxinghao Jul 29, 2013
07f1743
Fix validatePrediction functions for Classification models
pxinghao Jul 29, 2013
2b2630b
Style fix
pxinghao Jul 29, 2013
c34c0f6
Merge pull request #731 from pxinghao/master
shivaram Jul 29, 2013
43a2cc1
Use Bootstrap progress bars in web UI
karenfeng Jul 29, 2013
fe7298b
Merge pull request #741 from pwendell/usability
rxin Jul 29, 2013
e04a37a
Merge branch 'master' of https://github.com/mesos/spark into bootstra…
karenfeng Jul 29, 2013
478a288
Added started tasks to progress bar
karenfeng Jul 29, 2013
2d6da91
Alphabetized imports
karenfeng Jul 29, 2013
07da72b
Remove duplicate loss history and clarify why.
shivaram Jul 29, 2013
c99b674
Merge pull request #735 from karenfeng/ui-807
pwendell Jul 29, 2013
c7b2788
Merge branch 'master' of https://github.com/mesos/spark into bootstra…
karenfeng Jul 29, 2013
87b821d
Fixed continuity of executorToTasksActive, changed color of progress …
karenfeng Jul 29, 2013
17e6211
Moved DeployMessage's into its own DeployMessages object.
rxin Jul 30, 2013
207548b
Open up Job UI ports (33000-33010) on EC2 clusters
mateiz Jul 30, 2013
105f4d2
Removed Cache and SoftReferenceCache since they are no longer used.
rxin Jul 30, 2013
23b5da1
Moved block manager messages into BlockManagerMessages object.
rxin Jul 30, 2013
81720e1
Moved all StandaloneClusterMessage's into StandaloneClusterMessages o…
rxin Jul 30, 2013
3ca9faa
Clarify how regVal is computed in Updater docs
shivaram Jul 30, 2013
1e1ffb1
Merge pull request #745 from shivaram/loss-update-fix
atalwalkar Jul 30, 2013
468a36c
Merge pull request #746 from rxin/cleanup
mateiz Jul 30, 2013
614ee16
refactor job ui with pool information
xiajunluan Jul 30, 2013
5406013
refactor codes less than 100 character per line
xiajunluan Jul 30, 2013
b957326
Do not inherit master's PYTHONPATH on workers.
JoshRosen Jul 29, 2013
49be084
Use File.pathSeparator instead of hardcoding ':'.
JoshRosen Jul 29, 2013
e4387dd
made SimpleUpdater consistent with other updaters
atalwalkar Jul 30, 2013
f6f4645
Added property 'spark.executor.uri' for launching on Mesos without
benh Jul 23, 2013
8aee118
Merge pull request #748 from atalwalkar/master
shivaram Jul 30, 2013
f1cab31
Removed intermediate set for activeTasks, removed progress bar margin
karenfeng Jul 30, 2013
218d7c4
Fixed style, lowered height of progress bars
karenfeng Jul 30, 2013
26144c4
Fixed wrap style
karenfeng Jul 30, 2013
e35966a
Renamed Classification.scala to ClassificationModel.scala and Regress…
rxin Jul 30, 2013
47011e6
Use a tigher bound in logistic regression unit test's prediction vali…
rxin Jul 30, 2013
366f773
Minor style cleanup of mllib.
rxin Jul 30, 2013
48851d4
Add bagel, mllib to SBT assembly.
shivaram Jul 30, 2013
ae57020
Merge pull request #752 from rxin/master
shivaram Jul 30, 2013
e87de03
Merge pull request #744 from karenfeng/bootstrap-update
pwendell Jul 30, 2013
368c58e
Merge branch 'lazy_file_open' of github.com:lyogavin/spark into compr…
rxin Jul 30, 2013
7bdafa9
Format cleanup.
benh Jul 31, 2013
ad7e9d0
CompressionCodec cleanup. Moved it to spark.io package.
rxin Jul 31, 2013
5227043
Documentation update for compression codec.
rxin Jul 31, 2013
56774b1
Added unit test for compression codecs.
rxin Jul 31, 2013
3b1ced8
Exclude older version of Snappy in streaming and examples.
rxin Jul 31, 2013
311aae7
Added Snappy dependency to Maven build files.
rxin Jul 31, 2013
dae12fe
Updated the configuration option for Snappy block size to be consiste…
rxin Jul 31, 2013
98024ea
Renamed compressionOutputStream and compressionInputStream to compres…
rxin Jul 31, 2013
15fd0d6
Add mllib, bagel to repl dependencies
shivaram Jul 31, 2013
bf93180
Add Apache license header to metrics system
jerryshao Jul 31, 2013
29b8cd3
Merge pull request #755 from jerryshao/add-apache-header
mateiz Jul 31, 2013
fefb03c
Eliminated code duplication, refactored to pattern-matching style Par…
tkroman Jul 31, 2013
5670c96
Merge branch 'master' into Pool_UI
xiajunluan Jul 31, 2013
d4556f4
Merge pull request #751 from cdshines/master
mateiz Jul 31, 2013
12553e5
Simplified nonNegativeMod to match previous version
mateiz Jul 31, 2013
0c65537
Refactored Vector.apply(length, initializer) replacing excessive code…
tkroman Jul 31, 2013
9a815de
write and read generation in ResultTask
BlackNiuza Jul 31, 2013
89da9d9
Add JSON path to master index page
pwendell Jul 31, 2013
c61843a
Changed other LZF uses to use the compression codec interface.
rxin Jul 31, 2013
0be071a
Merge pull request #756 from cdshines/patch-1
rxin Jul 31, 2013
49e6344
Removed master URL from job UI, reduced heading size of basic spark p…
karenfeng Jul 31, 2013
a386ced
Merge pull request #754 from rxin/compression
mateiz Jul 31, 2013
9a444cf
Use the Char version of split() instead of the String one for efficiency
mateiz Jul 31, 2013
c453967
Reduced size of heading
karenfeng Jul 31, 2013
4692ea4
Used 'uri.split('/').last' instead of 'new File(uri).getName()'.
benh Jul 31, 2013
529ac81
Do not try and use 'scala' in 'run' from within a "release".
benh Jul 31, 2013
4ba4c3f
Merge pull request #759 from mateiz/split-fix
shivaram Jul 31, 2013
14bf2fe
Merge pull request #749 from benh/spark-executor-uri
mateiz Jul 31, 2013
f607ffb
Added data generator for K-means
mateiz Jul 31, 2013
b2b86c2
Merge pull request #753 from shivaram/glm-refactor
mateiz Jul 31, 2013
39c75f3
Merge pull request #757 from BlackNiuza/result_task_generation
mateiz Jul 31, 2013
ef1f22b
Merge branch 'master' of https://github.com/mesos/spark
karenfeng Jul 31, 2013
a6f43a9
SPARK-842. Maven assembly is including examples libs and dependencies
wandjenkins Jul 31, 2013
ecab635
Merge pull request #763 from c0s/assembly
mateiz Aug 1, 2013
3097d75
Merge remote-tracking branch 'dlyubimov/SPARK-827'
mateiz Aug 1, 2013
52dba89
Turn on caching in KMeans.main
mateiz Aug 1, 2013
58756b7
Merge pull request #761 from mateiz/kmeans-generator
shivaram Aug 1, 2013
3b5a11e
change function name "setName" to "setProperties" as "setName" is als…
xiajunluan Aug 1, 2013
d58502a
fix bug of spark "SubmitStage" listener as unit test error
xiajunluan Aug 1, 2013
ffc034e
Import cleanup
pwendell Aug 1, 2013
3e4d5e5
Merge branch 'master' into master-json
pwendell Aug 1, 2013
9177bea
Removing extra imports
pwendell Aug 1, 2013
cb7dd86
Merge pull request #758 from pwendell/master-json
pwendell Aug 1, 2013
0a96493
Merge pull request #760 from karenfeng/heading-update
mateiz Aug 1, 2013
5e7b38f
Merge pull request #695 from xiajunluan/pool_ui
pwendell Aug 1, 2013
5faac7f
Minor style fixes
pwendell Aug 1, 2013
cfcd77b
Increasing inter job arrival
pwendell Aug 1, 2013
b101994
Slight refactoring to SparkContext functions
pwendell Aug 1, 2013
87fd321
Minor refactoring and code cleanup
pwendell Aug 1, 2013
37bc64a
Adding application-level metrics.
pwendell Jul 29, 2013
12d9c82
Small style fix
pwendell Jul 29, 2013
f1d2ad5
under_scores --> camelCase for config options
pwendell Jul 30, 2013
d3c37ff
Improving documentation in config file example
pwendell Jul 30, 2013
e466a55
Revert Mesos version to 0.9 since the 0.12 artifact has target Java 7
mateiz Aug 1, 2013
999eaac
Merge branch 'master' of https://github.com/mesos/spark
karenfeng Aug 1, 2013
6d7afd7
Merge pull request #768 from pwendell/pr-695
pwendell Aug 2, 2013
9d7dfd2
Merge pull request #743 from pwendell/app-metrics
pwendell Aug 2, 2013
b3ae5b2
Shows time the app has been running
karenfeng Aug 2, 2013
5b3784a
Show user-defined job name in UI
pwendell Aug 2, 2013
abfa9e6
Increase Kryo buffer size in ALS since some arrays become big
mateiz Aug 2, 2013
22abbc1
Merge pull request #772 from karenfeng/ui-843
mateiz Aug 2, 2013
4ab4df5
adding matrix factorization data generator
gingsmith Aug 3, 2013
d93d5fc
SPARK-850: Give better error message on the console
cybermaster Aug 5, 2013
8c8947e
fixing formatting
gingsmith Aug 5, 2013
87134b3
SPARK-850: give better console message
cybermaster Aug 5, 2013
9df66cd
Merge branch 'master' of github.com:cybermaster/spark
cybermaster Aug 5, 2013
33b9b15
JBoss repository working now
cybermaster Aug 5, 2013
550b0cf
Merge pull request #780 from cybermaster/master
pwendell Aug 5, 2013
e8bec83
Only reduce the number of cores once when removing an executor
mbautin Jul 31, 2013
cdd1af5
Timeout zombie workers
markhamstra Aug 1, 2013
37ccf93
milliseconds -> seconds in timeOutDeadWorkers logging
markhamstra Aug 1, 2013
35d8f5e
Moved handling of timed out workers within the Master actor
markhamstra Aug 4, 2013
8b27789
Merge pull request #774 from pwendell/job-description
mateiz Aug 6, 2013
bf7033f
fixing formatting, style, and input
gingsmith Aug 6, 2013
828aff7
Merge pull request #776 from gingsmith/master
mateiz Aug 6, 2013
a308664
SHARK-94 Log the files computed by HadoopRDD and NewHadoopRDD
apivovarov Aug 6, 2013
1b63dea
Merge pull request #769 from markhamstra/NegativeCores
mateiz Aug 6, 2013
d031f73
Merge pull request #782 from WANdisco/master
rxin Aug 6, 2013
42942fc
In the process of bringing the GraphLab api back and fixing the analy…
jegonzal Aug 6, 2013
413b0c1
merged with upstream
jegonzal Aug 6, 2013
499a0d8
Merged graphx from @rxin into master
jegonzal Aug 6, 2013
0704d85
Merging local changes to @rxin graph branch.
jegonzal Aug 6, 2013
7ae83f6
Switching to Long vids instead of integers. This required a surprisi…
jegonzal Aug 6, 2013
b454314
Added 2d partitioning
jegonzal Aug 6, 2013
ddf126e
added subgraph
jegonzal Aug 7, 2013
5ccb60d
Working on graph test suite
jegonzal Aug 11, 2013
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -38,3 +38,5 @@ dependency-reduced-pom.xml
.ensime_lucene
checkpoint
derby.log
dist/
spark-*-bin.tar.gz
229 changes: 202 additions & 27 deletions LICENSE
@@ -1,27 +1,202 @@
Copyright (c) 2010, Regents of the University of California.
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
* Neither the name of the University of California, Berkeley nor the
names of its contributors may be used to endorse or promote
products derived from this software without specific prior written
permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/

TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

1. Definitions.

"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.

"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.

"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.

"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.

"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.

"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.

"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).

"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.

"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."

"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.

2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.

3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.

4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:

(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and

(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and

(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and

(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.

You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.

5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.

6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.

7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.

8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.

9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.

END OF TERMS AND CONDITIONS

APPENDIX: How to apply the Apache License to your work.

To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.

Copyright [yyyy] [name of copyright owner]

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
5 changes: 5 additions & 0 deletions NOTICE
@@ -0,0 +1,5 @@
Apache Spark
Copyright 2013 The Apache Software Foundation.

This product includes software developed at
The Apache Software Foundation (http://www.apache.org/).
2 changes: 1 addition & 1 deletion README.md
@@ -12,7 +12,7 @@ This README file only contains basic setup instructions.

## Building

-Spark requires Scala 2.9.2 (Scala 2.10 is not yet supported). The project is
+Spark requires Scala 2.9.3 (Scala 2.10 is not yet supported). The project is
built using Simple Build Tool (SBT), which is packaged with it. To build
Spark and its example programs, run:

13 changes: 13 additions & 0 deletions assembly/README
@@ -0,0 +1,13 @@
This is the assembly module for the Spark project.

It creates a single tar.gz file that includes all of the project's required
dependencies except for the org.apache.hadoop.* jars, which are expected to be
available from the deployed Hadoop cluster.

This module is off by default to avoid spending extra build time on top of the
repl-bin module. To activate it, specify the profile on the command line:
-Passembly

To avoid building the time-expensive repl-bin module, which shades all of the
dependencies into one big flat jar, add the following to the Maven command:
-DnoExpensive
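
The two flags this README describes could be combined roughly as follows. This is only a sketch: the `package` goal and the explicit `hadoop1` profile (one of the profiles listed in assembly/pom.xml) are assumptions, and the `HADOOP_PROFILE`, `MVN_CMD` and `RUN` variable names are hypothetical. The script prints the command and runs it only when `RUN=1` is set.

```shell
#!/bin/sh
# Sketch only: builds the Maven invocation described in assembly/README.
# Assumptions (not stated in the README): the `package` goal, and the
# `hadoop1` profile picked from the profiles listed in assembly/pom.xml.
HADOOP_PROFILE="${HADOOP_PROFILE:-hadoop1}"

# -Passembly     activates the assembly module (off by default)
# -DnoExpensive  skips the time-expensive repl-bin module
MVN_CMD="mvn -Passembly -P${HADOOP_PROFILE} -DnoExpensive package"

# Print the command; set RUN=1 to actually execute it.
echo "$MVN_CMD"
if [ "${RUN:-0}" = "1" ]; then
  $MVN_CMD
fi
```

Leaving `RUN` unset gives a dry run, which is a cheap way to check the flags before starting a long build.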
92 changes: 92 additions & 0 deletions assembly/pom.xml
@@ -0,0 +1,92 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>org.spark-project</groupId>
    <artifactId>spark-parent</artifactId>
    <version>0.8.0-SNAPSHOT</version>
    <relativePath>../pom.xml</relativePath>
  </parent>

  <groupId>org.spark-project</groupId>
  <artifactId>spark-assembly</artifactId>
  <name>Spark Project Assembly</name>
  <url>http://spark-project.org/</url>

  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-assembly-plugin</artifactId>
        <version>2.4</version>
        <executions>
          <execution>
            <id>dist</id>
            <phase>package</phase>
            <goals>
              <goal>single</goal>
            </goals>
            <configuration>
              <descriptors>
                <descriptor>src/main/assembly/assembly.xml</descriptor>
              </descriptors>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>

  <profiles>
    <profile>
      <id>hadoop1</id>
      <properties>
        <classifier.name>hadoop1</classifier.name>
      </properties>
    </profile>
    <profile>
      <id>hadoop2</id>
      <properties>
        <classifier.name>hadoop2</classifier.name>
      </properties>
    </profile>
    <profile>
      <id>hadoop2-yarn</id>
      <properties>
        <classifier.name>hadoop2-yarn</classifier.name>
      </properties>
    </profile>
  </profiles>
  <dependencies>
    <dependency>
      <groupId>org.spark-project</groupId>
      <artifactId>spark-core</artifactId>
      <classifier>${classifier.name}</classifier>
      <version>${project.version}</version>
    </dependency>
    <dependency>
      <groupId>org.spark-project</groupId>
      <artifactId>spark-bagel</artifactId>
      <classifier>${classifier.name}</classifier>
      <version>${project.version}</version>
    </dependency>
    <dependency>
      <groupId>org.spark-project</groupId>
      <artifactId>spark-mllib</artifactId>
      <classifier>${classifier.name}</classifier>
      <version>${project.version}</version>
    </dependency>
    <dependency>
      <groupId>org.spark-project</groupId>
      <artifactId>spark-repl</artifactId>
      <classifier>${classifier.name}</classifier>
      <version>${project.version}</version>
    </dependency>
    <dependency>
      <groupId>org.spark-project</groupId>
      <artifactId>spark-streaming</artifactId>
      <classifier>${classifier.name}</classifier>
      <version>${project.version}</version>
    </dependency>
  </dependencies>
</project>