Developer Env
- We have only used IntelliJ; other IDEs should also work, since this is a standard SBT project.
- Import the project into IntelliJ. You should be able to build the project without any changes.
- To run tests, you need to do the following additional steps:
- Grant your demo user additional privileges. For example, to run the unit tests, add these grants:

```sql
grant drop any table to tpcds;
grant create any function to tpcds;
grant create any procedure to tpcds;
```
- In SQL*Plus, connected as the demo user, run the `src/test/resources/ddls.sql` script.
- Set up `src/test/resources/spark-ora-test.properties` locally. This contains the instance connection information, for example:

```properties
# oci is an arbitrary prefix that you reference when you run tests
oci.spark.sql.catalog.oracle.authMethod=ORACLE_WALLET
oci.spark.sql.catalog.oracle.url=jdbc:oracle:thin:@<tnsnames_addressname>
oci.spark.sql.catalog.oracle.user=<your_demo_user>
oci.spark.sql.catalog.oracle.password=<demo_user_password>
oci.spark.sql.catalog.oracle.net.tns_admin=<wallet_location>
```
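Before pointing the tests at an instance, it can help to verify connectivity outside Spark. Below is a minimal sketch using plain JDBC, assuming the Oracle thin driver (and, for wallet authentication, the Oracle PKI jars) is on the classpath; the placeholders are the same values as in the properties file above:

```scala
import java.sql.DriverManager

// Point the thin driver at the wallet / tnsnames.ora directory.
System.setProperty("oracle.net.tns_admin", "<wallet_location>")

val conn = DriverManager.getConnection(
  "jdbc:oracle:thin:@<tnsnames_addressname>",
  "<your_demo_user>",
  "<demo_user_password>")
try {
  val rs = conn.createStatement().executeQuery("select user from dual")
  rs.next()
  println(s"connected as ${rs.getString(1)}")
} finally conn.close()
```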
- Create the `sparktest` Oracle user:

```sql
create user sparktest identified by SparkOnOracle1;
grant connect, resource, unlimited tablespace to sparktest;
```
- Run the `ReadPathTestSetup` main class. Before running, change the properties file to `oci.spark.sql.catalog.oracle.user=sparktest`, and run with the `-Dspark.oracle.test.db_instance=oci` VM option.
- Don't forget to change back to `oci.spark.sql.catalog.oracle.user=<your_demo_user>` after you have run the setup.
- Create the write test tables, connecting as `sparktest` (a usage sketch follows the two DDL blocks below):
```sql
create table unit_test_write(
  c_char_1          char(1),
  c_char_5          char(5),
  c_varchar2_10     varchar2(10),
  c_varchar2_40     varchar2(40),
  c_nchar_1         nchar(1),
  c_nchar_5         nchar(5),
  c_nvarchar2_10    nvarchar2(10),
  c_nvarchar2_40    nvarchar2(40),
  c_byte            number(2),
  c_short           number(4),
  c_int             number(9),
  c_long            number(18),
  c_number          number(25),
  c_decimal_scale_5 number(25,5),
  c_decimal_scale_8 number(25,8),
  -- c_float binary_float,
  -- c_double binary_double,
  -- c_binary long,
  c_date            date,
  c_timestamp       timestamp
  -- c_timestamp_with_tz timestamp with time zone,
  -- c_timestamp_with_local_tz timestamp with local time zone
);

grant all privileges on unit_test_write to public;
```
```sql
create table unit_test_write_partitioned(
  c_varchar2_40 varchar2(40),
  c_int         number(9),
  state         varchar2(2),
  channel       varchar2(1)
)
partition by list (state, channel)
(
  partition yearly_west_direct    values (('OR','D'),('UT','D'),('WA','D')),
  partition yearly_west_indirect  values (('OR','I'),('UT','I'),('WA','I')),
  partition yearly_south_direct   values (('AZ','D'),('TX','D'),('GA','D')),
  partition yearly_south_indirect values (('AZ','I'),('TX','I'),('GA','I')),
  partition yearly_east_direct    values (('PA','D'),('NC','D'),('MA','D')),
  partition yearly_east_indirect  values (('PA','I'),('NC','I'),('MA','I')),
  partition yearly_north_direct   values (('MN','D'),('WI','D'),('MI','D')),
  partition yearly_north_indirect values (('MN','I'),('WI','I'),('MI','I')),
  partition yearly_ny_direct      values ('NY','D'),
  partition yearly_ny_indirect    values ('NY','I'),
  partition yearly_ca_direct      values ('CA','D'),
  partition yearly_ca_indirect    values ('CA','I'),
  partition rest                  values (default)
);

grant all privileges on unit_test_write_partitioned to public;
```
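With both tables in place, you can exercise the write path by hand by appending a few rows through the `oracle` catalog. This is a minimal sketch, assuming an active `SparkSession` configured with the catalog settings shown later on this page; the table and column names come from the DDL above, while the row values are only illustrative:

```scala
import org.apache.spark.sql.SparkSession

// Assumes a session configured with the oracle catalog settings below.
val spark = SparkSession.active
import spark.implicits._

val rows = Seq(
  ("row-1", 1, "OR", "D"), // should land in partition yearly_west_direct
  ("row-2", 2, "NY", "I"), // should land in partition yearly_ny_indirect
  ("row-3", 3, "ZZ", "X")  // should land in the DEFAULT partition
).toDF("c_varchar2_40", "c_int", "state", "channel")

// Spark 3 DataFrameWriterV2 append through the catalog.
rows.writeTo("oracle.sparktest.unit_test_write_partitioned").append()
```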
- Now you should be able to run the test suite. For any test, set the working directory to `<project-root>/sql` and set the VM option `-Dspark.oracle.test.db_instance=oci`.
- Currently, the `UDTTypesTest` tests fail on free-tier ADW.
The codebase is organized into the following sub-projects:

- common: utilities, helpers, and logging code. Examples: `timeIt`, `OraSparkUtils`, ... (see the sketch after this list)
- orastuff: code related to Connection Management, Dataset Splits, ResultSet handling, ...
- sql: code related to SQL translation. Major components are:
  - Oracle Catalog
  - Operator Translation
  - Expression Translation
  - Datatype Handling
  - Task co-location and parallelism
  - Dataframe API extensions
- mllib: any extensions related to MLlib
- packaging projects:
  - spark-extended: creates the `spark-oracle-<version>.zip`, which can be dropped into almost any Spark deployment.
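As an illustration of the kind of helper that lives in common, a `timeIt`-style utility typically wraps a block of code and logs its elapsed time. The sketch below shows only the general shape; the actual signature in the project may differ:

```scala
// Illustrative sketch only; the real timeIt in the common sub-project may differ.
def timeIt[T](tag: String)(body: => T): T = {
  val start = System.nanoTime()
  try body
  finally println(f"$tag took ${(System.nanoTime() - start) / 1e6}%.1f ms")
}

// Example usage: time an arbitrary computation.
val multiplesOf7 = timeIt("count") { (1 to 1000000).count(_ % 7 == 0) }
```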
Build commands:

```sh
# compile and test
sbt clean compile test

# build the spark-extended package
sbt universal:packageBin

# build the spark-extended package with the oracle jars included
sbt -DaddOraJars=true universal:packageBin
```
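If the build follows sbt-native-packager defaults, the resulting zip should appear under the packaging project's `target/universal/` directory.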
A sample Spark configuration for running with the Oracle catalog:

```properties
# Oracle Catalog

# Enable Spark Oracle extensions
spark.sql.extensions=org.apache.spark.sql.oracle.SparkSessionExtensions
spark.kryo.registrator=org.apache.spark.sql.connector.catalog.oracle.OraKryoRegistrator

# Enable the Oracle Catalog integration
spark.sql.catalog.oracle=org.apache.spark.sql.connector.catalog.oracle.OracleCatalog

# oracle sql logging and jdbc fetchsize
spark.sql.catalog.oracle.log_and_time_sql.enabled=true
spark.sql.catalog.oracle.log_and_time_sql.log_level=info
spark.sql.catalog.oracle.fetchSize=5000
spark.sql.oracle.enable.querysplitting=false
spark.sql.oracle.querysplit.target=1Mb

# Configure JDBC connection information
# Example for a non-wallet instance
#spark.sql.catalog.oracle.url=jdbc:oracle:thin:@<your_instance_details>
#spark.sql.catalog.oracle.user=tpcds
#spark.sql.catalog.oracle.password=tpcds

# Example config for an ADW instance
#spark.sql.catalog.oracle.authMethod=ORACLE_WALLET
#spark.sql.catalog.oracle.url=jdbc:oracle:thin:@<tns-address>
#spark.sql.catalog.oracle.user=<your_demo_user>
#spark.sql.catalog.oracle.password=<your_passwd>
#spark.sql.catalog.oracle.net.tns_admin=<wallet_location>
```
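With these settings in place (for example, in spark-defaults.conf), the catalog can be smoke-tested from spark-shell. A minimal sketch: tpcds is the demo schema referenced above, and store_sales is just an illustrative table name, so substitute your own objects:

```scala
// Run from a spark-shell started with the configuration above;
// `oracle` is the catalog name registered via spark.sql.catalog.oracle.
spark.sql("show namespaces in oracle").show()
spark.sql("show tables in oracle.tpcds").show()

// Read a table through the catalog; the extensions handle SQL
// translation and pushdown.
val storeSales = spark.table("oracle.tpcds.store_sales")
storeSales.selectExpr("count(*)").show()
```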