HireIntel

HireIntel is an intelligent recruitment system specifically designed for hiring software engineers. The system acts as an AI-powered recruiter that streamlines the technical hiring process through automated resume processing, candidate research, and interview management.

Table of Contents

  • Overview
  • Features
  • System Requirements
  • Architecture
  • Installation
  • Configuration
  • Integration Systems
  • Pipeline Architecture
  • Email System
  • API Documentation
  • Real-Time Monitoring
  • Security
  • Deployment
  • Troubleshooting
  • Contributing
  • License

Overview

HireIntel automates and enhances the technical recruitment process through:

  • Automated resume parsing and analysis
  • Multi-source candidate research (GitHub, LinkedIn, Google)
  • AI-powered profile creation
  • Automated interview scheduling
  • Real-time pipeline monitoring

Features

Core Capabilities

  • Advanced resume parsing using AI
  • GitHub repository analysis
  • LinkedIn profile integration
  • Google presence analysis
  • Intelligent candidate-job matching
  • Automated email communications
  • Real-time monitoring dashboard
  • Interview scheduling system

Pipeline Features

  • Continuous background processing
  • Status-based candidate progression
  • Multi-stage data enrichment
  • Automated profile creation
  • Real-time status updates

System Requirements

  • Python 3.8 or higher
  • SQLite database
  • Poppler PDF library (for PDF processing)
  • SMTP server access
  • Required API access tokens
  • Minimum 4GB RAM recommended
  • Storage space for document processing

Architecture

Project Structure

HireIntel/
├── src/
│   ├── config/
│   │   ├── AppSettings.py
│   │   ├── Config.yaml
│   │   └── DBModelsConfig.py
│   ├── Controllers/
│   │   ├── AdminController.py
│   │   ├── AuthController.py
│   │   └── ScheduleMonitorController.py
│   ├── Modules/
│   │   ├── Auth/
│   │   ├── Candidate/
│   │   ├── Jobs/
│   │   ├── Interviews/
│   │   └── PipeLineData/
│   ├── PipeLines/
│   │   ├── Integration/
│   │   ├── PipeLineManagement/
│   │   └── Profiling/
│   └── Static/
│       ├── EmailTemplates/
│       └── Resume/
├── instance/
└── email_attachments/

Installation

  1. Clone the repository:
git clone https://github.com/kudzaiprichard/hireIntel.api
cd HireIntel
  2. Create virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt
  4. Install Poppler (verification sketch below):
  • Windows: Download from poppler releases
  • Linux: sudo apt-get install poppler-utils
  • macOS: brew install poppler
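
To confirm Poppler is reachable before running the app, a one-off conversion is enough. This is a minimal sketch assuming the pdf2image package (a common Python wrapper around Poppler; the project's own PDF code may differ):

# Sanity check: convert one PDF page via Poppler.
from pdf2image import convert_from_path

# On Windows, set poppler_path to the release's bin directory (see the
# llm.poppler_path value in the configuration below); on Linux/macOS,
# a system install is usually found automatically.
pages = convert_from_path(
    "sample_resume.pdf",   # any local PDF will do for the check
    poppler_path=None,     # e.g. r"C:\Program Files\poppler-24.08.0\Library\bin"
)
print(f"Converted {len(pages)} page(s); Poppler is working")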

Configuration

API Keys Setup

  1. GitHub Token
  • Visit GitHub Developer Settings
  • Create new token with:
    • repo scope (repository access)
    • user scope (user data access)
    • read:org scope (organization access)
  • Add to config:
profiler:
  github_token: "your_token"
  2. Google AI (Gemini) API
  • Visit Google AI Studio
  • Sign in and enable the API
  • Create new API key
  • Add to config:
llm:
  genai_token: "your_token"
  poppler_path: "C:\\Program Files\\poppler-24.08.0\\Library\\bin"
  3. RapidAPI (LinkedIn)
  • Create account at RapidAPI
  • Subscribe to LinkedIn Profile & Company Data API
  • Copy API key to config:
profiler:
  rapid_api_key: "your_key"
  4. Gmail Configuration (connection check below)
  • Enable 2-Step Verification
  • Generate App Password:
    • Go to Security → App Passwords
    • Select "Mail" and "Other (Custom name)"
    • Name it "HireIntel"
    • Copy 16-character password
  • Add to config:
email:
  from: "hire"
  username: "your.email@gmail.com"
  password: "your_app_password"
  smtp_host: "smtp.gmail.com"
  smtp_port: 465
  imap_host: "imap.gmail.com"
  imap_port: 993
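
To verify the Gmail settings above before starting HireIntel, a standard-library connection check works well. A minimal sketch (host and port values taken from the config; substitute your own credentials):

import imaplib
import smtplib

USERNAME = "your.email@gmail.com"
PASSWORD = "your_app_password"  # the 16-character app password

# SMTP over SSL (matches smtp_host/smtp_port above)
with smtplib.SMTP_SSL("smtp.gmail.com", 465) as smtp:
    smtp.login(USERNAME, PASSWORD)
    print("SMTP login OK")

# IMAP over SSL (matches imap_host/imap_port above)
imap = imaplib.IMAP4_SSL("imap.gmail.com", 993)
imap.login(USERNAME, PASSWORD)
imap.select("INBOX")
imap.logout()
print("IMAP login OK")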

Complete Configuration Example

server:
  ip: "0.0.0.0"
  port: 12345
  debug: true
  ssl: false

database:
  uri: "sqlite:///hire.db"
  track_modifications: false

jwt:
  secret_key: "your_jwt_secret"

assets:
  resume: "./src/Static/Resume/Documents"
  json_resume: "./src/Static/Resume/Json"

profiler:
  github_token: "ghp_xxxxxxxxxxxx"
  google_api_key: "your_google_api_key"
  rapid_api_key: "xxxxxxxxxxxxxxxx"
  batch_size: 5
  intervals:
    linkedin_scraping: 1
    text_extraction: 1
    github_scraping: 1
    google_scraping: 1
    profile_creation: 1
  scoring:
    weights:
      technical: 0.4
      experience: 0.35
      github: 0.25
    min_passing_score: 70.0

watcher:
  watcher_folder: "./src/PipeLines/Integration/FileWatcher/Watcher/watcher_folder"
  failed_folder: "./src/PipeLines/Integration/FileWatcher/Watcher/failed_folder"
  check_interval: 1

email_pipe_line:
  batch_size: 10
  check_interval: 1
  folder: "INBOX"
  allowed_attachments: [".pdf", ".doc", ".docx"]
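
This configuration lives at src/config/Config.yaml (see the project structure above). A minimal loading sketch, assuming PyYAML (the project's AppSettings.py may load it differently), shows how the nested values are read:

import yaml

with open("src/config/Config.yaml", "r", encoding="utf-8") as fh:
    config = yaml.safe_load(fh)

# Values come back as nested dictionaries:
print(config["server"]["port"])                  # 12345
print(config["profiler"]["batch_size"])          # 5
print(config["profiler"]["scoring"]["weights"])  # {'technical': 0.4, ...}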

Integration Systems

File Watcher Integration

XML Structure

Candidate applications must be submitted in XML format:

<?xml version="1.0" encoding="UTF-8"?>
<candidate>
    <email>example@email.com</email>
    <first_name>John</first_name>
    <last_name>Doe</last_name>
    <job_id>93f14b11-da25-4a9a-8bb2-4ac8509ddac0</job_id>
    <phone>+1234567890</phone>
    <current_company>Company Name</current_company>
    <current_position>Current Role</current_position>
    <years_of_experience>5</years_of_experience>
    <documents>
        <document name="resume.pdf" type="resume">resume.pdf</document>
    </documents>
</candidate>

Required fields (see the validation sketch after this list):

  • email
  • first_name
  • last_name
  • job_id (valid UUID)
  • documents (with resume)
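
A minimal validation sketch for these rules, using only the standard library (the project's actual validator lives in the FileWatcher integration and may differ):

import uuid
import xml.etree.ElementTree as ET
from typing import List

REQUIRED_FIELDS = ("email", "first_name", "last_name", "job_id")

def validate_candidate_xml(path: str) -> List[str]:
    """Return a list of validation errors; an empty list means valid."""
    root = ET.parse(path).getroot()
    errors = []
    for field in REQUIRED_FIELDS:
        if not (root.findtext(field) or "").strip():
            errors.append(f"missing required field: {field}")
    try:
        uuid.UUID(root.findtext("job_id") or "")
    except ValueError:
        errors.append("job_id is not a valid UUID")
    if root.find("documents/document[@type='resume']") is None:
        errors.append("no resume document declared")
    return errors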

File Watcher Flow

  1. Input Detection:

    • Monitors watcher_folder for new XML files
    • Validates XML structure and schema
    • Checks for associated resume document
  2. Document Processing:

    • Moves resume to document storage
    • Generates unique document identifiers
    • Maintains document associations
  3. Candidate Creation:

    • Creates new candidate record
    • Sets initial pipeline status to XML
    • Triggers pipeline processing (see the loop sketch below)
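
Put together, the flow can be sketched as a polling loop (paths from the watcher configuration below; validate_candidate_xml is the helper sketched earlier, and create_candidate_from_xml stands in for the real candidate-creation step):

import shutil
import time
from pathlib import Path

WATCHER = Path("./src/PipeLines/Integration/FileWatcher/Watcher/watcher_folder")
FAILED = Path("./src/PipeLines/Integration/FileWatcher/Watcher/failed_folder")
CHECK_INTERVAL_SECONDS = 60  # check_interval: 1 (minute) in the config

def create_candidate_from_xml(xml_file):
    # placeholder for the real handler, which sets pipeline status to XML
    print(f"would create candidate from {xml_file.name}")

while True:
    for xml_file in WATCHER.glob("*.xml"):
        if validate_candidate_xml(str(xml_file)):  # non-empty list = errors
            shutil.move(str(xml_file), str(FAILED / xml_file.name))
        else:
            create_candidate_from_xml(xml_file)
    time.sleep(CHECK_INTERVAL_SECONDS)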

Configuration

watcher:
  watcher_folder: "./src/PipeLines/Integration/FileWatcher/Watcher/watcher_folder"
  failed_folder: "./src/PipeLines/Integration/FileWatcher/Watcher/failed_folder"
  check_interval: 1  # minutes

assets:
  resume: "./src/Static/Resume/Documents"
  json_resume: "./src/Static/Resume/Json"

Pipeline Architecture

Daemon Thread Architecture

Each pipeline operates as a daemon thread, continuously monitoring for candidates in specific states:

Pipeline Threads (All Running Continuously):
├── File Watcher Thread
│   ├── Monitors folder for new XML files
│   └── Creates candidates with XML status
│
├── Email Watcher Thread
│   ├── Monitors email inbox
│   └── Converts to XML and triggers File Watcher
│
├── Text Extraction Thread
│   ├── Watches for status: XML
│   ├── Processes resume text
│   └── Updates to: EXTRACT_TEXT
│
├── Google Scraping Thread
│   ├── Watches for status: EXTRACT_TEXT
│   ├── Gathers web presence
│   └── Updates to: GOOGLE_SCRAPE
│
├── LinkedIn Scraping Thread
│   ├── Watches for status: GOOGLE_SCRAPE
│   ├── Fetches LinkedIn data
│   └── Updates to: LINKEDIN_SCRAPE
│
├── GitHub Scraping Thread
│   ├── Watches for status: LINKEDIN_SCRAPE
│   ├── Analyzes GitHub activity
│   └── Updates to: GITHUB_SCRAPE
│
└── Profile Creation Thread
    ├── Watches for status: GITHUB_SCRAPE
    ├── Creates final profile
    └── Updates to: PROFILE_CREATED

Thread Management

Each pipeline uses an infinite loop for continuous processing:

def _run_pipeline(self):
    with self.app.app_context():
        while not self.stop_flag.is_set():
            try:
                # Get candidates in specific status
                candidates = self.get_input_data()
                
                # Process batch if found
                if candidates:
                    self.process_batch()
                
                # Wait for next interval
                self.stop_flag.wait(self.config.process_interval)
            except Exception as e:
                self.handle_error(e)
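
A minimal harness (illustrative names, not the project's actual classes) shows how such a loop is hosted in a daemon thread and stopped cleanly via the event:

import threading
import time

class PipelineThread:
    def __init__(self, interval_seconds=60):
        self.interval = interval_seconds
        self.stop_flag = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        while not self.stop_flag.is_set():
            # ... poll for candidates and process a batch here ...
            # Event.wait doubles as an interruptible sleep:
            self.stop_flag.wait(self.interval)

    def start(self):
        self._thread.start()

    def stop(self, timeout=5.0):
        self.stop_flag.set()  # exits the loop and wakes any wait()
        self._thread.join(timeout=timeout)

pipeline = PipelineThread(interval_seconds=60)
pipeline.start()
time.sleep(1)
pipeline.stop()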

Batch Processing

  1. Continuous Polling:

    • Each thread continuously polls database
    • Looks for candidates in its input status
    • Processes in configurable batch sizes
  2. Status-Based Processing:

    XML → EXTRACT_TEXT → GOOGLE_SCRAPE → LINKEDIN_SCRAPE → 
    GITHUB_SCRAPE → PROFILE_CREATION → PROFILE_CREATED
    
  3. Thread Safety:

    • Isolated database transactions
    • Atomic status updates
    • Pipeline-specific state management

Configuration Control

profiler:
  batch_size: 5  # Number of candidates per batch
  intervals:     # Polling intervals in minutes
    linkedin_scraping: 1
    text_extraction: 1
    github_scraping: 1
    google_scraping: 1
    profile_creation: 1

Process Flow

# Each pipeline continuously:
while not stop_flag.is_set():
    # Find candidates in input status
    candidates = find_candidates_in_status(INPUT_STATUS)

    if candidates:
        try:
            # Process candidates
            process_candidates(candidates)
            # Update to next status
            update_status(candidates, OUTPUT_STATUS)
        except Exception:
            # Mark as failed
            update_status(candidates, FAILED_STATUS)

    # Wait for next interval
    wait(process_interval)

Status Progression Examples

Candidate A: XML → EXTRACT_TEXT → GOOGLE_SCRAPE → ...
Candidate B: XML → EXTRACT_TEXT → GOOGLE_SCRAPE_FAILED
Candidate C: XML → EXTRACT_TEXT_FAILED

Error Handling

  • Failed states don't block pipeline
  • Detailed error logging
  • Automatic retry mechanism (sketched after this list)
  • Status-based error tracking
  • Error notification system
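
One plausible shape for the retry mechanism (illustrative only; the project's actual retry code may differ) is a bounded retry that falls back to the step's failed status:

import logging
import time

def run_with_retry(step, candidate, failed_status, retries=3, delay=5.0):
    """Attempt a pipeline step up to `retries` times, then mark it failed."""
    for attempt in range(1, retries + 1):
        try:
            return step(candidate)
        except Exception:
            logging.exception("attempt %d/%d failed", attempt, retries)
            time.sleep(delay)
    # all retries exhausted, e.g. GOOGLE_SCRAPE -> GOOGLE_SCRAPE_FAILED
    candidate["pipeline_status"] = failed_status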

Status Management

Pipeline states for each candidate:

class CandidatePipelineStatus(Enum):
    XML = "xml"
    EXTRACT_TEXT = "extract_text"
    GOOGLE_SCRAPE = "google_scrape"
    LINKEDIN_SCRAPE = "linkedin_scrape"
    GITHUB_SCRAPE = "github_scrape"
    PROFILE_CREATION = "profile_creation"
    PROFILE_CREATED = "profile_created"
    
    # Failed states
    XML_FAILED = "xml_failed"
    EXTRACT_TEXT_FAILED = "extract_text_failed"
    GOOGLE_SCRAPE_FAILED = "google_scrape_failed"
    LINKEDIN_SCRAPE_FAILED = "linkedin_scrape_failed"
    GITHUB_SCRAPE_FAILED = "github_scrape_failed"
    PROFILE_CREATION_FAILED = "profile_creation_failed"
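
Because every processing state has a matching *_FAILED member, the failed counterpart of a step can be derived mechanically from the enum above:

def failed_state(status: CandidatePipelineStatus) -> CandidatePipelineStatus:
    # e.g. GOOGLE_SCRAPE -> GOOGLE_SCRAPE_FAILED; raises KeyError for
    # terminal states such as PROFILE_CREATED, which have no failed variant
    return CandidatePipelineStatus[f"{status.name}_FAILED"]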

Email System

Application Format

Email applications must follow this format:

Applying for [position name] position. Please find below attached resume and documents for your reference
First Name: [Required]
Middle Name: [Optional]
Last Name: [Required]
Job Id: [Required UUID]
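
A sketch of extracting these labeled fields from an email body (regex-based; the project's actual parser may differ):

import re

FIELD_PATTERN = re.compile(
    r"^(First Name|Middle Name|Last Name|Job Id):\s*(.+)$",
    re.MULTILINE,
)

def parse_application(body: str) -> dict:
    fields = {label: value.strip() for label, value in FIELD_PATTERN.findall(body)}
    missing = [f for f in ("First Name", "Last Name", "Job Id") if f not in fields]
    if missing:
        # triggers the "Missing Fields" template described below
        raise ValueError(f"missing required fields: {missing}")
    return fields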

Email Templates

  1. Application Received:
Subject: Application Received - [Position]
Dear [First Name],
Your application for [Position] has been received...
  2. Invalid Job ID:
Subject: Application Error - Invalid Job ID
Dear [First Name],
The Job ID [Job ID] is not valid...
  3. Missing Fields:
Subject: Application Error - Missing Information
Dear Applicant,
The following required fields are missing:
[Missing Fields List]

API Documentation

Auth Controller (/api/v1/auth)

Authentication Endpoints:
├── POST /register
├── POST /login
├── POST /logout
├── POST /refresh/tokens
└── GET  /user/fetch
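
For example, logging in against a local instance might look like the following (request and response field names are assumptions, not a documented contract; check AuthController.py for the exact shape):

import requests

BASE = "http://localhost:12345"  # server.port from the configuration

resp = requests.post(
    f"{BASE}/api/v1/auth/login",
    json={"email": "admin@example.com", "password": "secret"},
    timeout=10,
)
resp.raise_for_status()
tokens = resp.json()  # assumed to contain the JWT access/refresh tokens

# Protected admin calls then carry the access token:
headers = {"Authorization": f"Bearer {tokens['access_token']}"}
jobs = requests.get(f"{BASE}/api/v1/admin/jobs", headers=headers, timeout=10)
print(jobs.json())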

Admin Controller (/api/v1/admin)

Protected Endpoints:
├── Jobs Management
│   ├── GET  /jobs
│   ├── POST /jobs
│   └── PUT  /jobs/<id>
└── Interview Management
    ├── POST /interviews/schedule
    └── GET  /interviews/schedules

Monitor Controller

Real-time Endpoints:
├── GET /api/monitor/status
└── GET /api/monitor/status/stream

Real-Time Monitoring

SSE Streams

  1. Pipeline Monitor:
{
    "timestamp": "2025-02-12T10:00:00Z",
    "pipelines": {
        "text_extraction": {
            "status": "PROCESSING",
            "last_updated": "2025-02-12T09:59:55Z"
        }
    }
}
  2. Candidate Monitor:
{
    "data": {
        "candidates": [...],
        "pagination": {
            "total": 100,
            "page": 1,
            "per_page": 10
        }
    }
}
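
SSE is plain line-oriented HTTP, so the stream endpoint can be consumed with a few lines of Python (a sketch using requests; each event line starts with "data: "):

import json
import requests

url = "http://localhost:12345/api/monitor/status/stream"
with requests.get(url, stream=True, timeout=(5, None)) as resp:
    for raw in resp.iter_lines(decode_unicode=True):
        if raw and raw.startswith("data: "):
            event = json.loads(raw[len("data: "):])
            print(event.get("timestamp"), event.get("pipelines"))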

Security

  • JWT-based authentication
  • Role-based access control
  • API rate limiting
  • Secure password storage
  • Email validation
  • Input sanitization

Deployment

  1. Set up environment:
    • Configure API keys
    • Set up email server
    • Configure database
  2. Install dependencies
  3. Initialize database
  4. Start application:
python app.py

Troubleshooting

Common Issues

  1. Pipeline Failures:

    • Check API quotas
    • Verify credentials
    • Check network connectivity
  2. Email Issues:

    • Verify SMTP settings
    • Check email templates
    • Validate email format
  3. Database Issues:

    • Check connections
    • Verify permissions
    • Monitor disk space

Contributing

  1. Fork repository
  2. Create feature branch
  3. Submit pull request

License

MIT License
