caiks/NIST

MNIST - handwritten digits

This repository contains tests of the AlignmentRepa repository using data from the MNIST dataset. The AlignmentRepa repository is a fast Haskell implementation of some of the practicable inducers described in the paper The Theory and Practice of Induction by Alignment at https://greenlake.co.uk/.

Documentation

There is an analysis of this dataset here.

Installation

The NIST executables require the AlignmentRepa module, which is in the AlignmentRepa repository. The AlignmentRepa module requires the Haskell Platform to be installed. The project is managed using stack.

Download the zip files or use git to get the NIST repository and the underlying Alignment and AlignmentRepa repositories -

cd
git clone https://github.com/caiks/Alignment.git
git clone https://github.com/caiks/AlignmentRepa.git
git clone https://github.com/caiks/NIST.git

Then download the dataset files, for example -

cd ~/NIST
wget http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
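A quick sanity check on the downloads is to look at the magic number at the start of each file once it has been decompressed. The following is a minimal sketch using only base, assuming the IDX layout described on the MNIST page (a big-endian 32-bit magic number, 2051 for image files and 2049 for label files). The names beWord32 and idxKind are hypothetical helpers and are not part of the NIST or AlignmentRepa repositories.

```haskell
import Data.Bits (shiftL, (.|.))
import Data.Word (Word8)

-- Hypothetical helper: combine the first four bytes, most significant
-- first, into an Int (big-endian 32-bit value).
beWord32 :: [Word8] -> Int
beWord32 = foldl (\acc b -> (acc `shiftL` 8) .|. fromIntegral b) 0 . take 4

-- Hypothetical helper: classify decompressed IDX bytes by magic number.
-- Assumes 2051 (0x00000803) for images and 2049 (0x00000801) for labels.
idxKind :: [Word8] -> Maybe String
idxKind bytes
  | length (take 4 bytes) < 4 = Nothing
  | otherwise = case beWord32 bytes of
      2051 -> Just "images"
      2049 -> Just "labels"
      _    -> Nothing
```

For example, idxKind [0,0,8,3] gives Just "images". The input must be the uncompressed bytes, not the raw .gz contents.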

Then build with the following -

cd ~/NIST
stack build --ghc-options -w

Usage

The practicable model induction is described here.

Running NIST_engine3 on Ubuntu 16.04 with an Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz used 1756 MB of memory and took 11505 seconds,

cd ~/NIST
stack exec NIST_engine3.exe +RTS -s >NIST_engine3.log 2>&1 &

tail -f NIST_engine3.log

To experiment with the dataset in the interpreter use stack ghci or stack repl for a read-eval-print loop (REPL) environment,

cd ~/NIST
stack ghci --ghci-options -w

Press return when prompted to choose the main executable. Load NISTDev to import the modules and define various useful abbreviated functions,

:l NISTDev

(uu,hrtr) <- nistTrainBucketedIO 2

let digit = VarStr "digit"
let vv = uvars uu
let vvl = sgl digit
let vvk = vv `minus` vvl

let hr = hrev [i | i <- [0.. hrsize hrtr - 1], i `mod` 8 == 0] hrtr 

hrsize hr

let hrtr = undefined

let (wmax,lmax,xmax,omax,bmax,mmax,umax,pmax,fmax,mult,seed) = (2^10, 8, 2^10, 10, (10*3), 3, 2^8, 1, 15, 1, 5)

Just (uu1,df) <- decomperIO uu vvk hr wmax lmax xmax omax bmax mmax umax pmax fmax mult seed

summation mult seed uu1 df hr
(148378.04791361679,74189.02395680839)

BL.writeFile ("NIST_model1.json") $ decompFudsPersistentsEncode $ decompFudsPersistent df
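Two steps of the session above are plain arithmetic and can be checked outside the repository. The sketch below reproduces the index selection used to build hr (every eighth event) and evaluates the numeric parameters passed to decomperIO. It assumes the standard MNIST training set of 60,000 events. The names everyNth, sampleSize and paramValues are hypothetical and are not part of NISTDev; hrev and hrsize themselves are not reproduced here.

```haskell
-- Hypothetical helper: indices kept when taking every k-th event
-- from a history of n events, as in the comprehension that defines hr.
everyNth :: Int -> Int -> [Int]
everyNth k n = [i | i <- [0 .. n - 1], i `mod` k == 0]

-- Number of events kept by the selection above.
sampleSize :: Int -> Int -> Int
sampleSize k n = length (everyNth k n)

-- The parameter tuple from the session, evaluated and listed in the
-- same order: wmax, lmax, xmax, omax, bmax, mmax, umax, pmax, fmax,
-- mult, seed.
paramValues :: [Integer]
paramValues = [2^10, 8, 2^10, 10, 10*3, 3, 2^8, 1, 15, 1, 5]
```

Under the 60,000-event assumption, sampleSize 8 60000 evaluates to 7500, and paramValues evaluates to [1024,8,1024,10,30,3,256,1,15,1,5].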

If you wish to use compiled code rather than interpreted, you may specify the following before loading NISTDev -

:set -fobject-code

Note that some modules may become unresolved, for example,

rp $ Set.fromList [1,2,3]

<interactive>:9:1: Not in scope: Set.fromList

In this case, re-import the modules explicitly as defined in NISTDev, for example,

import qualified Data.Set as Set
import qualified Data.Map as Map
import Alignment
import AlignmentRepa
import AlignmentDevRepa hiding (aahr)

rp $ Set.fromList [1,2,3]
"{1,2,3}"

rp $ fudEmpty
"{}"
