This is a monorepo that aims to find bugs in source code automatically,
based on software metrics.
This package contains all programs needed to generate a learning dataset.
It uses partial implementations of the bugFinder-framework, which are located in
the packages directory. These packages are also available on npm.
This package uses bugFinder-framework and implementation packages of bugFinder-framework-interfaces.
Have a look at the thesis of this project. Chapter 4 in particular documents this project and its plugins. The thesis is written in German, but all illustrations are in English, so much of chapter 4 should be understandable without knowing German.
- Description
- Table of contents
- Pre
- Configuration
- Running
- Machine Learning
- Concept
- Pipeline
- Blackboard
- Knowledge sources available
git submodule update --init --recursive
npm install
cd ..
git clone https://github.com/microsoft/TypeScript.git
cd TypeScript && git checkout -f 474cf0d57586ff7e6ea1b09210dd3da642de2030
You won't need SonarQube if you do not want to quantify with SonarQubeQuantifier.
If you do not want to use localityRecorder-commit or localityRecorder-commitPath, you won't need git either.
You won't need to clone the TypeScript repository if you do not want to quantify TypeScript.
Used versions
git v2.33.1
SonarQube v9.0.1.46107
node v14.17.0
MongoDB v4.4.1 2008R2Plus SSL
Read the SonarQube documentation carefully and follow its instructions.
- Add SonarScanner.bat to your PATH environment variable
- Start the SonarQube server
- Configure a project in the SonarQube web interface (see the official SonarQube documentation)
- Adjust quantifying config to your needs: src/01-recording/02a-quantifying/inversify.config.ts
Each script has a configuration file.
See src/.../module_name/inversify.config.ts
Please consider configuring the scripts before running.
You can run the scripts:
npm run recording-01a-localityRecording
npm run recording-01b-localityPreprocessing
npm run recording-02a-quantifying
npm run recording-02b-annotating
npm run preprocessing
npm run training
The training phase is not automated. You can use packages/bugFinder-machineLearning as a template.
For further reading, see bugFinder-machineLearning.
The architecture of this project is based on a pipeline and a blackboard architecture.
The whole process of finding bugs in source code with machine learning is modeled as a pipeline:
Record the localities you want to find bugs in, for example a commit or a path in a commit.
Preprocess the recorded localities. You might want to filter out localities you do not want to consider for now, or inject additional localities.
Quantify the preprocessed localities to generate the features used for machine learning, for example by measuring software metrics of the last changes to a source code file.
Annotate the localities so that a suitable dataset for supervised learning can be generated.
Do the localities contain a bug or do they not?
For example, take the next five changes of a file into account and measure how many bug fixes were made.
To obtain a dataset that can easily be used with scikit-learn, the quantified and annotated localities need to be transformed into a suitable format. You might want to filter out features or samples you do not want to consider.
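The five steps above can be sketched in a few lines of TypeScript. This is only an illustration under simplifying assumptions, not the actual bugFinder implementation: the types `Change` and `Sample`, all function names, the toy metric and the in-memory history are hypothetical.

```typescript
// Hypothetical types for illustration; the real bugFinder interfaces may differ.
interface Change {
  file: string;
  isBugFix: boolean;
}

interface Sample {
  features: number[];
  label: number;
}

// Step 1: record localities (here: the index of every change in the history).
function recordLocalities(history: Change[]): number[] {
  return history.map((_, index) => index);
}

// Step 2: preprocess, e.g. keep only localities that touch TypeScript files.
function preprocess(history: Change[], localities: number[]): number[] {
  return localities.filter((i) => history[i].file.endsWith(".ts"));
}

// Step 3: quantify with a toy metric: number of earlier changes to the same file.
function quantify(history: Change[], locality: number): number[] {
  const file = history[locality].file;
  const earlier = history.slice(0, locality).filter((c) => c.file === file);
  return [earlier.length];
}

// Step 4: annotate by counting bug fixes among the next `lookahead` changes of the same file.
function annotate(history: Change[], locality: number, lookahead = 5): number {
  const file = history[locality].file;
  return history
    .slice(locality + 1)
    .filter((c) => c.file === file)
    .slice(0, lookahead)
    .filter((c) => c.isBugFix).length;
}

// Step 5: transform everything into flat samples (feature vector plus label).
function buildDataset(history: Change[]): Sample[] {
  const localities = preprocess(history, recordLocalities(history));
  return localities.map((l) => ({
    features: quantify(history, l),
    label: annotate(history, l),
  }));
}

const history: Change[] = [
  { file: "a.ts", isBugFix: false },
  { file: "README.md", isBugFix: false },
  { file: "a.ts", isBugFix: true },
  { file: "a.ts", isBugFix: true },
];

const dataset = buildDataset(history);
console.log(JSON.stringify(dataset));
```

Each sample is a plain feature vector with a label, so it could be written out as JSON or CSV and loaded on the Python side for use with scikit-learn.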
For each step of the pipeline there is a controller. Each step uses a knowledge source. The controllers are shown on the left side of the picture. The Recording component consists of the components localityRecording, localityPreprocessing, quantifying and annotating.
The knowledge sources (right part of the picture) can be exchanged. Dependency injection with InversifyJS is used.
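The idea of exchangeable knowledge sources can be sketched as follows. To keep the example self-contained it uses plain constructor injection instead of the InversifyJS API, and every name in it (`Quantifier`, `LineCountQuantifier`, `CharCountQuantifier`, `QuantifyingController`, `bindings`) is hypothetical and not taken from the bugFinder packages.

```typescript
// Hypothetical interface; the actual bugFinder-framework interfaces may look different.
interface Quantifier<L, Q> {
  quantify(localities: L[]): Map<L, Q>;
}

// Two interchangeable knowledge sources for the same pipeline step.
class LineCountQuantifier implements Quantifier<string, number> {
  quantify(localities: string[]): Map<string, number> {
    return new Map(
      localities.map((src): [string, number] => [src, src.split("\n").length])
    );
  }
}

class CharCountQuantifier implements Quantifier<string, number> {
  quantify(localities: string[]): Map<string, number> {
    return new Map(
      localities.map((src): [string, number] => [src, src.length])
    );
  }
}

// The controller depends only on the interface, so the knowledge source can be swapped.
class QuantifyingController<L, Q> {
  private readonly quantifier: Quantifier<L, Q>;

  constructor(quantifier: Quantifier<L, Q>) {
    this.quantifier = quantifier;
  }

  run(localities: L[]): Map<L, Q> {
    return this.quantifier.quantify(localities);
  }
}

// Stand-in for a configuration file: the single place that selects the implementation.
const bindings = { quantifier: new LineCountQuantifier() };

const controller = new QuantifyingController(bindings.quantifier);
const result = controller.run(["a\nb", "c"]);
console.log(result.get("a\nb"), result.get("c"));
```

Swapping `LineCountQuantifier` for `CharCountQuantifier` in `bindings` changes the measured metric without touching the controller. In this project the same role is played by the bindings in each inversify.config.ts.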
You can find different component implementations as open source on GitHub and npm. Search for bugfinder-*
npm search: bugfinder-localityrecorder-*
npm search: bugfinder-$LOCALITY_CLASS-localityPreprocessor-*
npm search: bugfinder-$LOCALITY_CLASS-quantifier-*
npm search: bugfinder-$LOCALITY_CLASS-annotator-*
npm search: bugfinder-$LOCALITY_CLASS-$ANNOTATION_TYPE-$QUANTIFICATION_TYPE-preprocessor-*