This repository contains a simple two-step process for verifying a website's contents before scraping.
- User Authentication: Users can register and log in to the website using Firebase Authentication.
- Website verification: Users are shown an iframe of a random website to verify.
- Feedback: Users can provide feedback on the website's contents through an "accept" or "reject" tab. On accept, they assign the website to a category such as "news", "blog", or "e-commerce"; on reject, they provide a reason for the rejection. The feedback is stored in a Firebase Firestore database along with the metadata.
- Google Cloud Account: The dashboard uses Google Cloud Firestore to store feedback data. You will need to have a Google Cloud account and a Firestore database to run the dashboard.
- Firebase Project: The dashboard uses Firebase Authentication to authenticate users. You will need to have a Firebase project and a service account key to run the dashboard.
- Service Account Key: The dashboard uses a service account key to authenticate with Google Cloud Firestore. You will need to have a service account key to run the dashboard.
- Google Cloud SDK: If you want to deploy the dashboard to the Google Cloud App Engine, you will need to have the Google Cloud SDK installed on your machine.
- Anaconda: The dashboard is built using the Streamlit framework. To run it locally, you will need to have Anaconda installed on your machine to create the Python environment.
- Docker: The dashboard can be run in a Docker container. You will need to have Docker installed on your machine to run the dashboard in a container.
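As a rough illustration of the feedback that gets stored in Firestore, the sketch below builds one record and writes it to a collection. The field names, the `feedback` collection name, and the helper functions are assumptions made for this example and may not match what `app.py` actually does; the category set only contains the three examples named above.

```python
from datetime import datetime, timezone

# Hypothetical category set: only the three examples named in this README.
CATEGORIES = {"news", "blog", "e-commerce"}


def build_feedback_record(url, user_id, decision, category=None, reason=None):
    """Return a dict describing one accept/reject decision (assumed schema)."""
    if decision not in ("accept", "reject"):
        raise ValueError("decision must be 'accept' or 'reject'")
    if decision == "accept" and category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    if decision == "reject" and not reason:
        raise ValueError("a rejection needs a reason")
    return {
        "url": url,
        "user_id": user_id,
        "decision": decision,
        # Only one of category/reason is kept, depending on the decision.
        "category": category if decision == "accept" else None,
        "reason": reason if decision == "reject" else None,
        "submitted_at": datetime.now(timezone.utc).isoformat(),
    }


def save_feedback(db, record, collection="feedback"):
    """Add the record under an auto-generated document ID.

    `db` is expected to behave like a google.cloud.firestore.Client;
    the collection name is an assumption for this sketch.
    """
    return db.collection(collection).add(record)
```

With the `google-cloud-firestore` package installed, a client for `save_feedback` could be created with `firestore.Client.from_service_account_json("serviceAccountKey.json")`.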
To run the application, you will need a Firebase service account key. Store this key in the root directory as serviceAccountKey.json.
docker build -t <target-name> .
docker run <target-name>
If you want to deploy to Google Cloud App Engine, install the gcloud CLI and run the following command:
gcloud app deploy
Alternatively, to run the app locally, run the following commands:
conda create -n url_verification python=3.9
conda activate url_verification
pip install -r requirements.txt
streamlit run app.py
Getting DefaultCredentialsError?
Set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the path of the service account key by running the following command:
export GOOGLE_APPLICATION_CREDENTIALS="path/to/serviceAccountKey.json"
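If the error persists, it can help to check that the variable is visible to Python and points at a real file. The helper below is a minimal sketch for that check; `credentials_problem` is a hypothetical name and is not part of this repository.

```python
import os
from pathlib import Path


def credentials_problem(environ=os.environ):
    """Return a description of the problem, or None if the setup looks fine."""
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return "GOOGLE_APPLICATION_CREDENTIALS is not set"
    if not Path(path).is_file():
        return f"no file found at {path}"
    return None
```

Calling `credentials_problem()` in the same shell session that starts Streamlit shows whether the export took effect. It only checks that the file exists, not that the key inside it is valid.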