Skip to content

Implementing Braverman's 1970 extremal parameter grouping algorithm.

Notifications You must be signed in to change notification settings

brozonoyer/Parameter-Clusterization

Repository files navigation

Extremal grouping program works in 2 steps:
STEP1. Build membership of features to groups and save information necessary for calculation of extremal factors.
STEP2. Calculate value of extremal group factors for new data not used in the first step.

This 2 steps procedure is required because we want to calculate extremal factors for new datasets.
It is not sufficient to calculate membership and value of factors on one dataset only.

INTERFACE:

Input data for STEP1:
1. Input training CSV-file
2. Number of extremal groups
3. Number of eigenvalues to use in calculation of functional's value (c.f. Braverman)

Output data for STEP1:
1. Memberships to groups
2. Mean and StDev values of features in input CSV-file
3. Weights of features in groups
4. Pickled MODEL storing the factors for the groupings

Arguments for program:

algorithm.py   --step
               --input_filename
               --output_filename
               --model_pickle_file
               --num_groups
               --eigenvalues

Example of possible command line to run the program:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
python algorithm.py 1 Input/m255_train.csv output model 3 1
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

where output file output will contain the following information:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ExtremalGroupsInfo
Built on features
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67
nParams,67
nGroups,3
inGroup, group number starts from 1
2,3,3,1,1,1,2,2,3,1,1,1,1,2,2,2,3,3,3,2,2,2,2,3,3,3,2,2,2,2,3,3,3,1,1,1,1,3,1,1,1,1,1,1,1,1,2,3,3,3,3,2,2,2,3,3,3,3,2,2,2,3,3,1,1,1,2
Weights
-0.186458,-0.0816229,-0.064139,0.16274,0.180544,0.163993,-0.128864,-0.086013,-0.0308409,0.0306928,0.134885,0.104327,0.106939,-0.0859965,0.0646137,0.0938427,0.0742317,0.122684,0.121096,0.115043,0.167587,0.20639,0.161277,0.137589,0.226445,0.191443,0.147012,0.17435,0.155488,0.0732077,0.150944,0.20764,0.067144,-0.157152,-0.118166,-0.103356,-0.106509,0.110805,-0.206371,-0.232019,-0.179604,-0.167401,-0.0723795,-0.0851955,-0.0487207,-0.130955,-0.0812782,-0.117166,-0.105671,-0.139035,-0.0974891,-0.146558,-0.185041,-0.152154,-0.152799,-0.13035,-0.137212,-0.0572567,-0.180454,-0.194798,-0.129724,-0.130205,-0.123183,0.132646,0.123184,0.122309,-0.0989264
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Input data for STEP2:
1. Input test CSV-file
2. The (pickled) MODEL from training that stores the factors for extremal groupings
NOTE: the "num_groups" and "eigenvalues" parameters are not provided for step 2

Output data:
1. Value of factors for test CSV-file

Example of possible command line for program:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
algorithm.py 2 test.csv model output.csv
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

where output will be the output.csv file containing the factors for the newly grouped parameters

About

Implementing Braverman's 1970 extremal parameter grouping algorithm.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages