Securing confidential data in a database using a ZKML-based cryptographic approach with auto-encoder-based encoding
- The confidential data is encoded using a Seq2Seq model.
- The decoder of the model is loaded into the on-premise software.
- ZKML is used to generate a compiled model from the decoder model. The prover uses this compiled model, together with a proving key generated in the setup phase, to generate a zero-knowledge (ZK) proof. Within the proof, the weights of the model are hashed and zk-hashed, and the encoded inputs are zk-hashed.
- When the verifier receives the proof, it challenges the proof with a verification key generated in the setup phase. If the weights of the model are authentic and the proof was generated with the true proving key, the verifier responds with the encoded data from the database. The verifier is hosted as an edge function in the database.
- Finally, the client receives the encoded data and decodes it using the on-premise model. This ensures that the receiver has to use the right model with a true proving key.
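
The gating logic in the steps above can be sketched roughly as follows. This is a minimal illustration and not the repository's implementation: `commit_weights`, `release_encoded_data`, the `weights_commitment` field, and the `verify_proof` callable are hypothetical names, SHA-256 stands in for the zk-friendly hash, and the actual ZK proof check (done by ezkl in this project) is left as a stub passed in by the caller.

```python
import hashlib
import struct
from typing import Callable, List, Optional, Sequence


def commit_weights(weights: Sequence[float]) -> str:
    """Return a SHA-256 hex digest over the packed model weights.

    Assumption: a plain hash stands in for the zk-friendly hash used in the proof.
    """
    packed = struct.pack(f"<{len(weights)}d", *weights)
    return hashlib.sha256(packed).hexdigest()


def release_encoded_data(
    proof: dict,
    expected_commitment: str,
    verify_proof: Callable[[dict], bool],
    encoded_rows: Sequence[bytes],
) -> Optional[List[bytes]]:
    """Return the encoded rows only if the proof passes both checks, else None."""
    # Check 1: the proof must commit to the authentic decoder weights.
    if proof.get("weights_commitment") != expected_commitment:
        return None
    # Check 2: the proof itself must verify (stubbed; ezkl would do this).
    if not verify_proof(proof):
        return None
    return list(encoded_rows)


if __name__ == "__main__":
    decoder_weights = [0.12, -0.5, 0.33]  # toy values for illustration
    commitment = commit_weights(decoder_weights)
    proof = {"weights_commitment": commitment}
    rows = release_encoded_data(proof, commitment, lambda p: True, [b"\x01\x02"])
    print("released" if rows is not None else "rejected")
```

Returning `None` for both failure cases keeps the verifier from revealing which check failed; the client would decode the released rows with its on-premise decoder.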
Clone the repository, install the required packages using Poetry, and follow the steps below:
- Set up ZKML using ezkl:
python zkml/setup/ezkl_setup.py
- Run Client (Streamlit):
streamlit run main.py
- Run Verifier Server (FastAPI):
# Run the verifier server on cloud
uvicorn verifier:app --reload
# Note: For developer mode, run the command below instead
fastapi dev zkml/verifier/verifier.py