Automatic propositionalization (flattening) of relational data for the purposes of data mining (classification or regression). See Wiki for more details.
- Download the latest release from GitHub.
- Unpack the archive.
- Check that Java 8 or newer is installed (`java -version`). If you are using OpenJDK, install OpenJFX (`sudo apt-get install openjfx`).
- Run PredictorFactory.jar (double click the jar file or enter in the command line: `java -jar PF.jar`).
- Follow the manual.
- Install JDK 1.8 or newer.
- Install Eclipse, IDEA, NetBeans or another IDE.
- Install the Gradle build system.
- Import the project from Git: `https://github.com/janmotl/PredictorFactory.git`.
- Run the command line interface from: `src/run/Launcher.java`.
- Or run the graphical user interface from: `gui/controller/MainApp.java`.
- Deploy with Gradle: other>zip.
- Install ANTLR 4 plugin for IntelliJ IDEA.
- See stackoverflow.com for how to set up the directory with the autogenerated code.
- Set IDEA to use the Javadoc from http://www.antlr.org/api/Java/ (the documentation is not great, but it is better than nothing).
- To test a rule, right click on it and select "Test rule bracket".
- Collect metadata about the database (list of tables, columns and foreign key constraints).
- Create a "base table" from the "target table". The base table is a (sub)sample of the target table with just the essential attributes (id, target, and optionally a timestamp). It gets propagated into all other tables.
- Propagate base table into all tables in the database with joins, as defined by the foreign key constraints.
- Calculate predictors on the propagated tables.
- Join the calculated predictors into the mainsample table.
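
The propagation step above can be pictured as a breadth-first walk over the foreign key graph, starting at the target table. The following is a minimal sketch under stated assumptions, not Predictor Factory's actual implementation: the class `PropagationSketch`, the `ForeignKey` holder and the table names (`loan`, `account`, `trans`, `district`) are hypothetical, and the foreign keys are hard-coded here, whereas in practice they would come from the collected database metadata.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; all names here are hypothetical. */
final class PropagationSketch {

    /** A foreign key constraint: fkTable.fkColumn references pkTable.pkColumn. */
    static final class ForeignKey {
        final String fkTable, fkColumn, pkTable, pkColumn;

        ForeignKey(String fkTable, String fkColumn, String pkTable, String pkColumn) {
            this.fkTable = fkTable;
            this.fkColumn = fkColumn;
            this.pkTable = pkTable;
            this.pkColumn = pkColumn;
        }
    }

    /**
     * Walks the foreign key graph breadth-first from the target table.
     * Returns, for every reachable table, the join condition used to reach it,
     * keyed in the order in which the tables were reached.
     */
    static Map<String, String> propagationPlan(String targetTable, List<ForeignKey> foreignKeys) {
        Map<String, String> plan = new LinkedHashMap<>();
        plan.put(targetTable, "(base table)");
        Deque<String> queue = new ArrayDeque<>();
        queue.add(targetTable);

        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (ForeignKey fk : foreignKeys) {
                // A constraint can be followed in either direction.
                String next = null;
                if (fk.fkTable.equals(current)) {
                    next = fk.pkTable;
                } else if (fk.pkTable.equals(current)) {
                    next = fk.fkTable;
                }
                if (next == null || plan.containsKey(next)) {
                    continue; // unrelated constraint, or table already reached
                }
                plan.put(next, fk.fkTable + "." + fk.fkColumn + " = " + fk.pkTable + "." + fk.pkColumn);
                queue.add(next);
            }
        }
        return plan;
    }

    public static void main(String[] args) {
        List<ForeignKey> fks = new ArrayList<>();
        fks.add(new ForeignKey("loan", "account_id", "account", "account_id"));
        fks.add(new ForeignKey("trans", "account_id", "account", "account_id"));
        fks.add(new ForeignKey("account", "district_id", "district", "district_id"));

        propagationPlan("loan", fks)
                .forEach((table, join) -> System.out.println(table + " <- " + join));
    }
}
```

Each entry of the returned plan names a table and the join condition that carries the base table's id, target and optional timestamp into it. Predictors would then be calculated per propagated table and joined back on the id.
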
- A company firewall can block access to the database port. Connect over a cellphone network to test this hypothesis.
- Connect to the database with your favourite database tool to check that the credentials work.
- Does your IDE complain during the build process? Check that your IDE is using JDK (not JRE) in the right version (1.8 or higher).
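
Two of the checks above can also be scripted. The sketch below is not part of Predictor Factory. It assumes a hypothetical helper class `EnvironmentCheck`, and the default host `localhost` and port `5432` are placeholders to replace with your own database host and port. It tests whether the database port accepts TCP connections and whether the running Java is a JDK rather than a JRE.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import javax.tools.ToolProvider;

/** Illustrative helper only; the class and method names are hypothetical. */
final class EnvironmentCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isPortReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unresolved host: all count as "not reachable" here.
            return false;
        }
    }

    /** Returns true when a system Java compiler is available, which a plain JRE does not provide. */
    static boolean isRunningOnJdk() {
        return ToolProvider.getSystemJavaCompiler() != null;
    }

    public static void main(String[] args) {
        // Placeholder defaults; pass your own host and port as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 5432;

        System.out.println("Java specification version: "
                + System.getProperty("java.specification.version"));
        System.out.println("Running on a JDK: " + isRunningOnJdk());
        System.out.println("Port " + host + ":" + port + " reachable: "
                + isPortReachable(host, port, 3000));
    }
}
```

If the port is reachable from a cellphone connection but not from the company network, a firewall is the likely cause. A reachable port says nothing about the credentials, so the check with a database tool is still needed.
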