
Upgrade Spark / Hadoop / Zeppelin (Issue #535) #590

Merged: 13 commits merged into wfau:master on Dec 3, 2021

Conversation

stvoutsin (Collaborator) commented:

Description
This PR upgrades the versions of our components as described below:

  • Zeppelin 0.10.0
  • Hadoop 3.2.1
  • Spark 3.1.2

What Issue is this related to:
#535

What type of PR is it?
Upgrade

Has this been tested:
Yes. This was tested using the Benchmarking suite that was introduced here: #583
The benchmarking changes probably shouldn't have been included in this PR as well; they were included because they were used to test this branch, and the notes are based on a version that includes the benchmarker.

Comment on lines +203 to +207
<property>
  <name>zeppelin.interpreter.exclude</name>
  <value>angular,livy,alluxio,file,psql,flink,ignite,lens,cassandra,geode,kylin,elasticsearch,scalding,jdbc,hbase,bigquery,beam,groovy,flink-cmd,hazelcastjet,influxdb,java,jupyter,kotlin,ksql,mongodb,neo4j,pig,r,sap,spark-submit,sparql,submarine</value>
  <description>All the interpreters that you would like to exclude. You can only specify either 'zeppelin.interpreter.include' or 'zeppelin.interpreter.exclude'. Specifying them together is not allowed.</description>
</property>
Collaborator commented:
Why are these excluded?
It might be better to explicitly list the interpreters we do want in zeppelin.interpreter.include rather than excluding an arbitrary list in zeppelin.interpreter.exclude.

I've created an issue to follow this up #593.
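As a sketch of what the include-based approach might look like (the interpreter names below are illustrative only; the actual set we want to ship is to be decided under #593):

<property>
  <name>zeppelin.interpreter.include</name>
  <value>spark,md,python,sh</value>
  <description>Only the interpreters listed here are loaded. Only one of 'zeppelin.interpreter.include' and 'zeppelin.interpreter.exclude' may be set.</description>
</property>

This keeps the whitelist short and self-documenting, rather than having to chase every new interpreter that upstream adds to the distribution.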

Comment on lines +249 to +250
# After some investigation, it looks like the new Zeppelin runs Spark jobs as the logged in Zeppelin user, and fails because it lacks permission.
# Turn this off for now, so that everything is sent as the main Zeppelin user (After this change Spark notebooks work)
Collaborator commented:
How do we turn this on/off?

Collaborator replied:
We will need to revisit this. I've created a new issue to follow this up: #594.
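For reference, and not something this PR changes: re-enabling per-user execution would normally also require the Zeppelin service account to be allowed to proxy other users on the Hadoop side. A minimal core-site.xml sketch, assuming the service account is called zeppelin (the account name and the open '*' values are assumptions and should be tightened in practice):

<property>
  <name>hadoop.proxyuser.zeppelin.hosts</name>
  <value>*</value>
  <description>Hosts from which the zeppelin account may impersonate other users.</description>
</property>
<property>
  <name>hadoop.proxyuser.zeppelin.groups</name>
  <value>*</value>
  <description>Groups whose members the zeppelin account may impersonate.</description>
</property>

As far as I know, the Zeppelin side of the switch is the 'User Impersonate' option on the interpreter setting, which for Spark ends up passing --proxy-user to spark-submit; #594 can pin down the exact combination we need.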

@Zarquan (Collaborator) left a comment:
Looks good, tests pass, go for it.

@Zarquan merged commit 8c90c72 into wfau:master on Dec 3, 2021
@Zarquan mentioned this pull request on Dec 6, 2021
@stvoutsin deleted the issue-upgrade-spark-3 branch on June 3, 2022