Dockerized Python SAP extractor for analytics engagements.
- Install Docker (including Docker Compose)
- Clone the SAPsucker repository to a working directory on your local machine
- Download a Linux copy of the NWRFCSDK (NetWeaver RFC SDK) from SAP (do not install the SDK)
NOTE: A valid SAP account is required to download these libraries
- Extract nwrfcsdk.zip from the previous step into this repository, in a folder named 'nwrfcsdk'
- Open configuration.env and enter the relevant SAP credentials for the current engagement
- Open PowerShell, navigate to this repository, then run "docker-compose up" (watch the logs; this may take some time on first use)
- Watch terminal logs for "Download complete" followed by the results data from:
"select MANDT, BUKRS, BUTXT from T001 where BUKRS <> 0"
- Celebrate! You now have a Python-based SAP extraction tool!
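
As a rough illustration of how the credentials from configuration.env might reach the Python code, the sketch below reads them from environment variables. It assumes docker-compose passes the file into the container's environment, and the variable names (SAP_ASHOST and so on) are hypothetical; match them to the keys actually used in configuration.env. The target names (ashost, sysnr, client, user, passwd) are the standard RFC connection parameters.

```python
import os

# Hypothetical variable names: adjust to the keys used in configuration.env.
ENV_TO_PARAM = {
    "SAP_ASHOST": "ashost",   # application server host
    "SAP_SYSNR": "sysnr",     # system number, e.g. "00"
    "SAP_CLIENT": "client",   # SAP client (MANDT), e.g. "100"
    "SAP_USER": "user",
    "SAP_PASSWD": "passwd",
}


def load_sap_params(env=None):
    """Map environment variables to RFC connection keyword arguments."""
    env = os.environ if env is None else env
    # Fail early with a readable message if any setting is absent or empty.
    missing = [name for name in ENV_TO_PARAM if not env.get(name)]
    if missing:
        raise KeyError("Missing SAP settings: " + ", ".join(sorted(missing)))
    return {param: env[name] for name, param in ENV_TO_PARAM.items()}
```

Failing early on a missing value gives a clearer message than a connection error surfacing later in the container logs.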
(Additional) For minor in-flight tweaks to the Docker container, use "docker-compose run" in place of 'up' - this will drop you into a terminal inside the container. The connection and SQL query to SAP are currently specified inside query-builder.py.
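
The contents of query-builder.py are not reproduced here. As a minimal sketch of how a query like the T001 example is commonly expressed over RFC, the helper below calls the standard RFC_READ_TABLE function module. The function name read_table and its arguments are illustrative assumptions, not the project's actual code. It accepts any connection object exposing call(function_name, **params), which is the shape of pyrfc's Connection.call.

```python
def read_table(conn, table, fields, where="", delimiter="|", max_rows=0):
    """Call RFC_READ_TABLE through `conn` and return the rows as dicts.

    `conn` is any object with a call(function_name, **params) method,
    for example a pyrfc Connection. max_rows=0 means no row limit.
    """
    result = conn.call(
        "RFC_READ_TABLE",
        QUERY_TABLE=table,
        DELIMITER=delimiter,
        FIELDS=[{"FIELDNAME": name} for name in fields],
        # Each OPTIONS row holds one fragment of the WHERE clause.
        OPTIONS=[{"TEXT": where}] if where else [],
        ROWCOUNT=max_rows,
    )
    rows = []
    for entry in result["DATA"]:
        # Each row arrives as a single delimited string in the WA field.
        values = [value.strip() for value in entry["WA"].split(delimiter)]
        rows.append(dict(zip(fields, values)))
    return rows
```

With a live connection, the example query would look like read_table(conn, "T001", ["MANDT", "BUKRS", "BUTXT"], where="BUKRS <> 0"). Splitting on the delimiter assumes no field value contains that character, so pick a delimiter that cannot occur in the data.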
- Add a feature for defining SAP tables from CSV
- Add automated reconciliation and summary reporting
- Another project is in progress to automate storage and ETL pipelines for the extracted SAP tables using PySpark and AWS
The version 0.1 code for this project is heavily based on the blog posts below:
http://wbarczynski.pl/calling-bapis-with-python-and-pyrfc/
http://www.alexbaker.me/code/python-and-sap-part-1-connecting-to-sap