This repository has been archived by the owner on Mar 24, 2020. It is now read-only.

Reconstructs bug versions from bugzilla history and stores them in ElasticSearch

mozilla-metrics/bugzilla_etl

Bugzilla ETL

Notice: This ETL is no longer used; active development has moved to https://github.com/klahnakoski/Bugzilla-ETL.

A set of Pentaho DI jobs that extract bug versions from a Bugzilla database and store them in an Elasticsearch index. This ETL drives dashboards for BMO (bugzilla.mozilla.org) used by various teams at Mozilla Corporation.

Requirements

  • an Elasticsearch cluster on which you can create, read, update, and delete the index bugs
  • a working PDI (a.k.a. Kettle) installation; the free community edition should work fine. Tested with PDI CE 4.3.

Minimal instructions

  • Clone this project into a local directory

  • Configure the elasticsearch indexes (put a cluster node in place of localhost):

  • Configure Pentaho DI:

    • create a directory .kettle in your $KETTLE_HOME
    • in that directory, create a file kettle.properties
    • in that file, add the settings bugs_db_host, bugs_db_port, bugs_db_user, bugs_db_pass and bugs_db_name for your Bugzilla database connection
    • in the same file, add the settings ES_NODES, ES_CLUSTER and ES_INDEX
  • If necessary, modify bin/import_bugs.sh, then run it to import the full data set.

  • Afterwards, run bin/update_bugs_incr.sh to pick up incremental modifications from the Bugzilla MySQL database.
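
The Pentaho DI configuration step above can be sketched as a small shell script. This is only an illustration under stated assumptions: the setting names come from the list above, but every value is a placeholder, the expected format of ES_NODES is a guess, and falling back to the home directory when $KETTLE_HOME is unset is an assumption about your setup.

```shell
#!/bin/sh
# Sketch only: writes a kettle.properties with PLACEHOLDER values.
# Every right-hand side below is an assumption; substitute your own.

# Assumption: fall back to the home directory when KETTLE_HOME is unset.
KETTLE_HOME="${KETTLE_HOME:-$HOME}"
props_dir="$KETTLE_HOME/.kettle"
props_file="$props_dir/kettle.properties"

mkdir -p "$props_dir"

if [ -e "$props_file" ]; then
    # Never clobber an existing configuration.
    echo "Keeping existing $props_file" >&2
else
    cat > "$props_file" <<'EOF'
# Bugzilla database connection (placeholder values)
bugs_db_host=localhost
bugs_db_port=3306
bugs_db_user=bugzilla_ro
bugs_db_pass=changeme
bugs_db_name=bugs
# Elasticsearch target (placeholder values; expected formats are assumptions)
ES_NODES=localhost
ES_CLUSTER=elasticsearch
ES_INDEX=bugs
EOF
    chmod 600 "$props_file"   # the file holds a database password
    echo "Wrote $props_file"
fi
```

Once the placeholders are replaced with real values, bin/import_bugs.sh can be run for the full import and bin/update_bugs_incr.sh for later incremental runs, as described above.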

Known issues

  • In some cases a user's Bugzilla ID changes mid-history for a bug. Not all of these cases can be handled automatically; the ones that cannot should be added to configuration/kettle/bugzilla_aliases.txt. Several alias-related scripts and transformations help to detect such changes: see bin/find_aliases.sh, bin/find_all_aliases.sh, transformations/find_aliases.ktr, and transformations/detect_new_aliases.ktr.
  • Mozilla Bug 804946 causes some trouble with the ETL. See Bug 804961 for details.
