This repository implements a complete Named Entity Recognition (NER) pipeline built on a pre-trained BERT model from Hugging Face Transformers. It identifies and classifies named entities (e.g. people, organizations, locations) in text data. The pipeline is deployed on Google Cloud Platform (GCP) for scalability, containerized with Docker for portability, and built and released through CircleCI for continuous integration and continuous delivery (CI/CD).
- Uses a pre-trained BERT model from Hugging Face Transformers for efficient and accurate NER.
- Provides a user-friendly interface to process text data and extract named entities.
- Scales on GCP to handle large text datasets.
- Packaged in Docker containers for easy deployment across environments.
- Automated CI/CD pipeline through CircleCI for streamlined development and deployment.
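
As an illustration of the entity-extraction step, token classifiers such as BERT commonly emit one tag per token in the BIO scheme (`B-` begins an entity, `I-` continues it, `O` is outside any entity). The sketch below assumes that tag format and shows how such tags could be merged into whole entities. The function name `group_entities` and the sample sentence are hypothetical and are not taken from this repository.

```python
from typing import List, Optional, Tuple


def group_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge token-level BIO tags into (entity_text, entity_type) pairs."""
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    def flush() -> None:
        # Store the entity collected so far, if there is one.
        if current_tokens and current_type is not None:
            entities.append((" ".join(current_tokens), current_type))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity begins: close the previous one first.
            flush()
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_type:
            # Continuation of the entity currently being collected.
            current_tokens.append(token)
        else:
            # "O" tag, or an "I-" tag that does not continue the open entity:
            # close the open entity and treat this token as outside.
            flush()
            current_tokens, current_type = [], None

    flush()
    return entities


# Hypothetical model output for one sentence.
tokens = ["Sundar", "Pichai", "leads", "Google", "in", "California"]
tags = ["B-PER", "I-PER", "O", "B-ORG", "O", "B-LOC"]
print(group_entities(tokens, tags))
# [('Sundar Pichai', 'PER'), ('Google', 'ORG'), ('California', 'LOC')]
```

A real model works on sub-word pieces, so an extra step to rejoin word pieces would normally come before this grouping.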
- constants
- config_entity
- artifact_entity
- components
- pipeline
- app.py
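
The order above can be read as a layering: constants feed config entities, components consume a config and return an artifact, and the pipeline chains components together. The following is a minimal sketch of that layering for a single stage, assuming a data-ingestion step. Every name in it (`ARTIFACTS_DIR`, `DataIngestionConfig`, `DataIngestionArtifact`, `DataIngestion`, `run_pipeline`) is hypothetical and may not match the actual modules in this repository.

```python
import os
from dataclasses import dataclass
from typing import List

# constants: fixed values shared across the project (hypothetical)
ARTIFACTS_DIR = "artifacts"
DATA_FILE_NAME = "ner_data.csv"


# config_entity: the inputs a component needs (hypothetical)
@dataclass
class DataIngestionConfig:
    data_path: str = os.path.join(ARTIFACTS_DIR, DATA_FILE_NAME)


# artifact_entity: what a component hands to the next stage (hypothetical)
@dataclass
class DataIngestionArtifact:
    data_path: str
    num_records: int


# components: one unit of work, driven by its config (hypothetical)
class DataIngestion:
    def __init__(self, config: DataIngestionConfig) -> None:
        self.config = config

    def run(self, records: List[str]) -> DataIngestionArtifact:
        # A real component would download and store data here;
        # this sketch only reports what it was given.
        return DataIngestionArtifact(self.config.data_path, len(records))


# pipeline: wires configs and components together (hypothetical)
def run_pipeline(records: List[str]) -> DataIngestionArtifact:
    config = DataIngestionConfig()
    return DataIngestion(config).run(records)


artifact = run_pipeline(["John lives in Paris", "Acme hired Mary"])
print(artifact.num_records)  # 2
```

In this arrangement `app.py` would only need to call the pipeline entry point, which keeps the serving code separate from the training stages.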
git add .
git commit -m "Updated"
git push origin main
# gcloud CLI download link: https://cloud.google.com/sdk/docs/install#windows
gcloud init
conda create -n nerproj python=3.8 -y
conda activate nerproj
pip install -r requirements.txt
python app.py
- Artifact Registry --> create a repository
- Change lines 42, 50, 54, 72, 76 in the CircleCI config
- Open CircleCI --> create a project
GCLOUD_SERVICE_KEY --> service account key (JSON)
GOOGLE_COMPUTE_ZONE = asia-south1
GOOGLE_PROJECT_ID
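
A missing variable usually only surfaces as a failed deployment step, so it can help to check the three names before running anything. The sketch below assumes the variables are exposed as ordinary environment variables. The helper `missing_vars` is hypothetical and not part of this repository.

```python
import os
from typing import List, Mapping

# The three variable names listed above.
REQUIRED_VARS = ["GCLOUD_SERVICE_KEY", "GOOGLE_COMPUTE_ZONE", "GOOGLE_PROJECT_ID"]


def missing_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
sample_env = {"GCLOUD_SERVICE_KEY": "dummy", "GOOGLE_COMPUTE_ZONE": "asia-south1"}
print(missing_vars(sample_env))  # ['GOOGLE_PROJECT_ID']
```

Calling `missing_vars()` with no argument checks the current process environment, which makes it usable as a quick pre-flight check.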