Cascalog and Hadoop Security
Stefan Hübner edited this page Jun 26, 2012 · 1 revision
If your cluster has Hadoop Security enabled, your queries may run into exceptions like this one:
org.apache.hadoop.ipc.RemoteException: token (...) can't be found in cache
That exception fails the second step of any multi-step Cascalog (or Cascading, for that matter) query. The reason is that the job's delegation token gets cancelled as soon as the first step has succeeded, so later steps can no longer use it.
A solution to this is to configure JobConf with `mapreduce.job.complete.cancel.delegation.tokens` set to `false`, like so:
(with-job-conf {"mapreduce.job.complete.cancel.delegation.tokens" false}
...)
Or add it to your `job-conf.clj`.
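As a minimal sketch, assuming Cascalog picks up a `job-conf.clj` resource from the classpath and reads it as a Clojure map of property names to values, the file could contain just this one entry. The file location mentioned in the comment is an assumption about a typical project layout.

```clojure
;; job-conf.clj (assumed to sit at the root of the classpath, e.g. in resources/)
;; Keep delegation tokens alive after a job completes so that the
;; following steps of a multi-step query can still use them.
{"mapreduce.job.complete.cancel.delegation.tokens" false}
```

With the setting in this file it applies to every query in the project, so individual queries no longer need their own `with-job-conf` wrapper for it.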
Also, if you schedule your Cascalog jobs via Oozie, you may want to read up on `HADOOP_TOKEN_FILE_LOCATION` and `mapreduce.job.credentials.binary` and set your JobConf accordingly.
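For the Oozie case, one common approach is to hand the token file that the Oozie launcher exposes through the `HADOOP_TOKEN_FILE_LOCATION` environment variable on to the job as `mapreduce.job.credentials.binary`. This page does not spell that out, so treat the following as a sketch under that assumption. The helper name `oozie-token-conf` is hypothetical.

```clojure
(defn oozie-token-conf
  "Builds a job-conf map from an environment map (hypothetical helper).
  Returns a map that points mapreduce.job.credentials.binary at the token
  file named by HADOOP_TOKEN_FILE_LOCATION, or an empty map when that
  variable is not set (for example, when running outside Oozie)."
  [env]
  (if-let [token-file (get env "HADOOP_TOKEN_FILE_LOCATION")]
    {"mapreduce.job.credentials.binary" token-file}
    {}))

;; Possible usage, assuming with-job-conf from cascalog.api as used above:
;; (with-job-conf (oozie-token-conf (System/getenv))
;;   ...)
```

Taking the environment as an argument keeps the helper easy to try out at the REPL with a plain map instead of the real process environment.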
- [Owen O'Malley: Motivations for Apache Hadoop Security](http://hortonworks.com/blog/motivations-for-apache-hadoop-security/)
- Owen O'Malley: Hadoop Security in Detail (video)
- Cloudera CDH3 Documentation: Introduction to Hadoop Security
- MAPREDUCE-1430
- MAPREDUCE-4324