This is an implementation of the Apriori algorithm, a data mining and machine learning algorithm. It takes a transaction data set as input and generates association rules.
- Clone this repo and run the generateDatabse.py file.
- This file creates five sample data sources for testing purposes.
- Once you see the .txt data source files in your project folder, you are ready to go.
- Now run the AprioriAlgorithm.py file, which is the actual code for this algorithm.
You need Python 3.6 installed on your machine. Support for other versions will be provided as soon as possible.
- The program takes a data source, a minimum support (in percent), and a minimum confidence (in percent) as input.
- Data Source: Selects where the input comes from. For this test, the data comes from one of the five files created by generateDatabse.py.
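- Minimum Support: The smallest percentage of transactions an itemset must appear in to count as frequent. It is applied to find all frequent itemsets in the database.
- Minimum Confidence: The smallest percentage of transactions containing a rule's left-hand side that must also contain its right-hand side. It is applied to the frequent itemsets in order to form rules.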
- Result: The result shows the association rules found in the given data set for the given minimum support and minimum confidence, if there are any. If no rules meet those conditions, try again with lower values for minimum support and minimum confidence.
- You are welcome to modify the generateDatabse.py file and come up with a new data set.
- You can use any data set that is a text file with comma-separated items and one transaction per line.
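To make the input format and the two thresholds concrete, here is a minimal sketch that parses comma-separated transactions and computes support and confidence in percent. The function names (`parse_transactions`, `load_transactions`, `support`, `confidence`) and the sample items are made up for illustration; they are not taken from AprioriAlgorithm.py, which may do this differently.

```python
def parse_transactions(lines):
    """Turn lines of comma-separated items into a list of frozensets."""
    transactions = []
    for line in lines:
        items = {item.strip() for item in line.split(",") if item.strip()}
        if items:  # skip blank lines
            transactions.append(frozenset(items))
    return transactions


def load_transactions(path):
    """Read a text file with one transaction per line (hypothetical helper)."""
    with open(path, encoding="utf-8") as handle:
        return parse_transactions(handle)


def support(itemset, transactions):
    """Percentage of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return 100.0 * hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Percentage of transactions with the antecedent that also hold the consequent."""
    base = support(antecedent, transactions)
    both = support(frozenset(antecedent) | frozenset(consequent), transactions)
    return 100.0 * both / base if base else 0.0


# Made-up sample data in the same shape as a data source file.
sample = parse_transactions([
    "bread, milk",
    "bread, butter",
    "bread, milk, butter",
    "milk",
])
print(support({"bread", "milk"}, sample))       # share of baskets with both items
print(confidence({"bread"}, {"milk"}, sample))  # share of bread baskets that also have milk
```

A rule such as bread -> milk would be reported only if the itemset {bread, milk} meets the minimum support and the rule itself meets the minimum confidence.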
Fork the repo and try to come up with an optimized version of the algorithm.
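If you want a small baseline to check an optimized version against, the level-wise search at the heart of Apriori can be sketched as follows. This is an unoptimized illustration under the assumption that transactions are sets of items and support is given in percent; the function name `frequent_itemsets` is hypothetical and the code is not the implementation in AprioriAlgorithm.py.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support_pct):
    """Level-wise Apriori search; returns {itemset: support in percent}."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    if n == 0:
        return {}

    def pct(itemset):
        # Percentage of transactions that contain the whole itemset.
        return 100.0 * sum(1 for t in transactions if itemset <= t) / n

    # Level 1: single items that meet the threshold.
    current = {}
    for item in {item for t in transactions for item in t}:
        single = frozenset([item])
        s = pct(single)
        if s >= min_support_pct:
            current[single] = s

    result = dict(current)
    k = 2
    while current:
        # Join step: combine frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        # Count step: keep the candidates that meet the threshold.
        current = {}
        for c in candidates:
            s = pct(c)
            if s >= min_support_pct:
                current[c] = s
        result.update(current)
        k += 1
    return result
```

Each level rescans every transaction for every candidate, so faster counting and candidate generation are natural places to start optimizing.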
It is crucial to stay social ;)