PSpec-SQL is a privacy-integrated big data analytics system built on Spark-SQL (1.3.0). It automatically enforces user-provided PSpec policies during query processing.
pspec/ contains the language parser and policy analyzer for PSpec, an abstract, high-level privacy specification language.
spark-sql/ contains a modified Spark project that integrates privacy checking into query processing.
The modifications are mainly in spark-catalyst, spark-hive, and spark-sql.
tpc-ds/ contains the TPC-DS benchmark and the SQL queries, transformed so that spark-sql can process them. It is mainly used for the primary evaluation.
More information will be available soon.
Development with Eclipse
Prerequisites:
Eclipse Kepler 4.3
Eclipse Scala Plugin for Scala 2.10
Scala 2.10
sbt
maven
ivy
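As a quick sanity check before starting, the command-line prerequisites can be probed from a shell. This is only a sketch: missing_tools is a hypothetical helper, and the command names passed to it (scala, sbt, mvn, git) are assumptions about how the tools are installed. Eclipse, its Scala plugin, and Ivy are not checked because they do not necessarily provide a command on PATH.

```shell
# Hypothetical helper (not part of this repository):
# print every tool from the argument list that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    # command -v is the POSIX way to test whether a command is available
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}
```

For example, running `missing_tools scala sbt mvn git` prints one line per tool that could not be found and prints nothing when all of them are available.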
Steps:
- clone the project
- cd into the project directory and run "sbt/sbt eclipse -Phive-thriftserver" (this may download a lot of jars)
- discard the changes to the .classpath files of the Eclipse projects (for example with "git checkout -- <file>")
- import the projects into Eclipse, including language v2, spark-core, spark-sql, spark-hive, spark-catalyst, spark-network-common, spark-network-shuffle
- fix any remaining build problems in Eclipse
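The command-line part of the steps above can be sketched as a small shell function. The names setup_eclipse, run, and DRY_RUN are hypothetical, and the repository URL and target directory are arguments you must supply, since this README does not state them. Using git checkout with a '*.classpath' pathspec is one possible way to discard the regenerated files; it assumes the .classpath files are tracked by git. Importing the projects into Eclipse remains a manual step.

```shell
# Sketch only. DRY_RUN=1 (the default) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it in the current shell.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Usage: setup_eclipse <repository-url> <project-dir>
setup_eclipse() {
  repo_url="$1"
  project_dir="$2"
  run git clone "$repo_url" "$project_dir"
  run cd "$project_dir"
  # Generate the Eclipse project files (may download many jars)
  run sbt/sbt eclipse -Phive-thriftserver
  # Discard the regenerated, git-tracked .classpath files
  run git checkout -- '*.classpath'
}
```

Calling `setup_eclipse <repository-url> <project-dir>` with the default DRY_RUN=1 only lists the commands, which makes it easy to review them first; setting DRY_RUN=0 runs them.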