
Parameter description


This page describes the parameters that you need to specify before running the RAM model. The values we used more or less work, but we never did any grid search or formal tuning, so you might be able to make the model work better by playing with these parameters.
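As a rough illustration, the parameters described below might be collected in a single Python config module, as in the sketch here; the names follow this page, but the values are hypothetical placeholders rather than tuned settings from the code.

```python
# Hypothetical sketch of a config module; values are illustrative
# placeholders, not the settings used in the repository.
load_path = 'saved_models/'   # path to saved models
eval_only = False             # set True to only test an existing model
draw = True                   # turn on visualization
animate = True                # animate glimpses, trajectories, and digits
```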

general parameters

load_path: (str) the path to saved models

eval_only: (bool) test an existing model if eval_only == true

draw: (bool) do visualization

animate: (bool) visualize the glimpses, glimpse trajectory, and the digits

condition parameters

translateMnist: (bool) translate the raw training images on the fly

eyeCentered: (bool) use an eye-centered coordinate system (under development; the max-min thresholding was just fixed and hasn't been tested)

preTraining: (bool) pre-train the model to reconstruct the raw image (under development; currently the looks/glimpses are not tuned with the reconstruction loss, and good reconstruction might require deconvolution)

preTraining_epoch: (int) number of pre-training epochs

drawReconsturction: (int) draw a visualization that compares the true image with the reconstructed image

parameters for the data

MNIST_SIZE: (int) the side length of the training image (the image doesn't have to be MNIST, by the way)

translated_img_size: (int) the side length of the image after translation (probably should be bigger than MNIST_SIZE)

img_size: (int) the actual image side length used when executing the program. You shouldn't need to set this parameter; the code figures it out automatically from the translateMnist parameter, roughly as sketched below
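A minimal sketch of that selection logic (illustrative values, not the exact code):

```python
# Sketch: the effective image size follows from translateMnist.
MNIST_SIZE = 28            # illustrative value
translated_img_size = 60   # illustrative value
translateMnist = True

if translateMnist:
    img_size = translated_img_size  # train on the translated canvas
else:
    img_size = MNIST_SIZE           # train on the raw images
```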

n_classes: (int) number of classes for the training data set (e.g. it is 10 for MNIST)

learning related parameters

initLr: (float) the initial learning rate

lrDecayRate: (float, from 0 to 1) the decay factor for the learning rate; exponential decay is used

lrDecayFreq: (int) the frequency (in training epochs) of the learning rate decay
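Together, initLr, lrDecayRate, and lrDecayFreq define an exponential schedule. A minimal sketch of how such a schedule is typically computed (not necessarily the exact code; TensorFlow's tf.train.exponential_decay implements the same formula with steps in place of epochs):

```python
# Exponential learning-rate decay built from the three parameters above.
def decayed_lr(initLr, lrDecayRate, lrDecayFreq, epoch):
    return initLr * (lrDecayRate ** (epoch / float(lrDecayFreq)))

# e.g. with initLr=1e-3, lrDecayRate=0.99, lrDecayFreq=200 (illustrative
# values), the learning rate shrinks smoothly by a factor of 0.99 per
# 200 epochs.
```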

momentumValue: (float, from 0 to 1) the momentum value for the gradient descent step

batch_size: (int) just the batch size...

model parameters

depth: (int) number of "zooms" or granularities

sensorBandwidth: (int) the side length for the smallest zoom (finest granularity)

minRadius: (int) the radius for the smallest zoom. It will be automatically figured out by the code

channels: (int) 1 = grayscale images; I have no idea if the code works on colored images

totalSensorBandwidth: (int) depth × channels × sensorBandwidth², the dimension of the input space
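A concrete (purely illustrative) instance of that product:

```python
# Illustrative only: 3 zoom levels, grayscale input, 8x8 finest patch.
depth = 3
channels = 1
sensorBandwidth = 8
totalSensorBandwidth = depth * channels * (sensorBandwidth ** 2)  # 192
```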

nGlimpses: (int) number of glimpses allowed

loc_sd: (int) the standard deviation of the noise imposed on the glimpse locations
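A minimal sketch of how such location noise is typically applied in RAM-style models (NumPy, illustrative; the actual code may differ):

```python
import numpy as np

# The location network proposes a mean location; Gaussian noise with
# standard deviation loc_sd is added before the glimpse is taken,
# which makes the glimpse policy stochastic.
def sample_location(mean_loc, loc_sd):
    noisy = mean_loc + np.random.normal(0.0, loc_sd, size=mean_loc.shape)
    return np.clip(noisy, -1.0, 1.0)  # keep locations inside the frame
```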

network parameters

hg_size: (int) the size for the glimpse network

hl_size: (int) the size of the location network

g_size: (int) the size for the glimpse feature

cell_size: (int) the size of the core network (the recurrent part)

cell_out_size: (int) = cell_size
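Roughly, these sizes correspond to the glimpse network and core network of the RAM architecture (Mnih et al., 2014). The NumPy sketch below only illustrates where each size appears; the shapes and wiring are assumptions, not the repository's exact code:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

# Illustrative sizes and random weights, just to show the shapes.
hg_size, hl_size, g_size, cell_size = 128, 128, 256, 256
totalSensorBandwidth = 192

glimpse = np.random.rand(totalSensorBandwidth)  # flattened glimpse vector
loc = np.random.rand(2)                         # (x, y) glimpse location

W_hg = np.random.randn(hg_size, totalSensorBandwidth) * 0.01
W_hl = np.random.randn(hl_size, 2) * 0.01
W_g1 = np.random.randn(g_size, hg_size) * 0.01
W_g2 = np.random.randn(g_size, hl_size) * 0.01

hg = relu(W_hg @ glimpse)        # "what" pathway, hg_size units
hl = relu(W_hl @ loc)            # "where" pathway, hl_size units
g = relu(W_g1 @ hg + W_g2 @ hl)  # combined glimpse feature, g_size units
# g feeds the core recurrent network, whose hidden state has cell_size
# units; its output (cell_out_size = cell_size) drives the classification
# and location heads.
```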

training parameters

start_step: (int) index for the starting epoch

max_iters: (int) max number of epochs allowed

other parameters

SMALL_NUM: (int) just a very small number, used to avoid the underflow issue when taking logs
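For example, a typical use is adding it inside a log so the argument stays strictly positive (the value here is illustrative):

```python
import numpy as np

SMALL_NUM = 1e-10  # illustrative value

p = np.array([0.0, 0.3, 0.7])
log_p = np.log(p + SMALL_NUM)  # no -inf at p == 0
```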
