
Loss and scoring changes #14

Merged (7 commits) on Jun 9, 2017

Conversation

leaprovenzano (Contributor)

Re parts 1 and 6 of #13...

Part 1: loss

I have added the option to optimize for validation loss. devol.run now has a param metric which defaults to 'accuracy' but can be set to 'loss', in which case devol.set_objective will set its properties accordingly to use a minimizing objective in various parts of the code, as follows:

  • when initializing a Population, the min flag is passed and after min/max scaling we flip the values. So we don't need to make changes to the other Population methods or worry too much about operators.
        scores = fitnesses - fitnesses.min()
        scores /= scores.max()
        if obj == 'min':
            scores = 1 - scores
  • devol.bssf is now in the format (model, loss, accuracy). (loss, accuracy) is the order keras returns from model.evaluate(), so this seemed like a logical choice. The demo notebook has been updated to reflect this.

  • Obviously run uses the loss to evaluate the best model, and the tqdm output tells us whether it's measuring loss or accuracy.
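The minimizing-objective handling described in the first bullet can be sketched end to end (a minimal sketch; score_fitnesses is a hypothetical name, and it assumes the population's fitnesses are not all identical):

```python
import numpy as np

def score_fitnesses(fitnesses, obj='max'):
    # Min-max scale raw fitnesses (accuracies or losses) into [0, 1].
    scores = np.asarray(fitnesses, dtype=float)
    scores = scores - scores.min()
    scores /= scores.max()  # assumes fitnesses are not all identical
    # For a minimizing objective (loss), flip so lower raw values score higher.
    if obj == 'min':
        scores = 1 - scores
    return scores
```

With obj='min', the highest-loss member ends up with score 0, so the rest of Population never needs to know which direction it is optimizing.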

The readme has been updated to mention this, and I have tried to add some docstrings to run while I was at it.


Part 2: changes to population scoring

As mentioned in #13, I have removed the default scoring method from the Population class in favor of just min-max scaling the fitness scores (losses or accuracies). As before, devol.run accepts a fitness parameter defaulting to None and passes it to Population.__init__, which will affect the scores if provided. The big change worth mentioning here is that the fitnesses passed to Population.__init__ are now a numpy array. I did this for a couple of reasons:

  1. a user-created scoring function that only looks at a single value may be less useful than one looking at the entire score space.
  2. most basic math stuff can be called on an array just as easily as on a single value.

I have mentioned in the docstring I wrote that any scoring function given to devol.run should expect a numpy array.
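As an illustration of a fitness function that uses the whole score space, here is a hypothetical rank-based scorer (rank_scores is not part of the PR; it just shows the numpy-array contract):

```python
import numpy as np

def rank_scores(fitnesses):
    # Hypothetical fitness function of the kind devol.run's `fitness`
    # parameter expects: it receives the full numpy array at once, so it
    # can score each member by its rank within the whole population.
    ranks = np.argsort(np.argsort(fitnesses))  # rank 0 = worst member
    return ranks / float(len(fitnesses) - 1)   # scale ranks into [0, 1]
```

Because the function sees every member at once, rank-based or percentile-based schemes like this are possible, which a per-value callback could not express.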

Also...

I also updated the notebook, as it was not checking the backend before setting the image shape. I have noticed there are also no backend checks for BatchNormalization in genome_handler.py, which could cause issues for theano people, but I will address that in a separate commit.
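The kind of backend check meant here can be sketched as a small helper (image_shape is a hypothetical name; the notebook inlines the conditional instead):

```python
def image_shape(data_format, rows=28, cols=28, channels=1):
    # Order MNIST-style image dimensions to match the backend setting:
    # tensorflow defaults to channels_last, theano to channels_first.
    if data_format == 'channels_first':
        return (channels, rows, cols)
    return (rows, cols, channels)
```

In the notebook this would be called as image_shape(K.image_data_format()) so the dataset is reshaped correctly on either backend.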

joeddav (Owner) left a comment:

I commented with a few minor revisions on comments, readme, etc., but the code looks good overall.

I'm also going to experiment a little and see how the different scoring method affects results. If you have any experience with that already I'd love to hear it.

readme.md Outdated
@@ -1,4 +1,4 @@
# DEvol - Deep Neural Network Evolution
f# DEvol - Deep Neural Network Evolution
joeddav (Owner):

Looks like you've got a typo here

devol.py Outdated
def run(self, dataset, num_generations, pop_size, epochs, fitness=None):
# Returns best model found in the form of (model, loss, accuracy)
def run(self, dataset, num_generations, pop_size, epochs, fitness=None, metric='accuracy'):
"""run genetic search on dataset given number of generations and population size
joeddav (Owner):

These comments are great - it'd probably be good to get rid of the preexisting ones on lines 43-44 since these take their place.

demo.ipynb Outdated
"outputs": [],
"source": [
"# set image format so that it's compatible with the backend\n",
"image_format = (28, 28, 1) if K.image_data_format() == 'channels_last' else (1, 28, 28)\n",
joeddav (Owner):

I think it might be cleaner here to just go K.set_image_data_format("channels_first") and keep (1, 28, 28) rather than changing the shape of the images based on the current setting.

leaprovenzano (Contributor, Author):

Wow, I was unaware K.set_image_data_format() actually changes what tf or theano expects; I was under the impression it only set the format within keras. But I can see that it works* in tf.

I have to very, very respectfully put my two cents in here and say that generally I think people use one main backend and are used to seeing dimensions ordered that way. I personally think a backend check is more polite to other backend users, and it's also the way it's handled in the keras examples. I think I'd find it a bit annoying if someone changed my expected dim ordering.

But of course it's just a demo, and if theano dims are your preference that's totally fine. Maybe we should just comment that we're changing the dims to theano style if we do that; since it's a one-liner it could be missed, and it could be a faff for someone used to tf who wants to play with the notebook and import some of their own images, etc. Also, for new users this backend dimensions thing can be confusing.

# setting backend dims to theano format because ...
K.set_image_data_format("channels_first")

what do you think?
If you want me to just shut up and do K.set_image_data_format("channels_first"), that's fine.

*I am concerned about batch normalization for theano, though, since genome_handler should be checking the backend and setting axis to 1 for theano users; it's currently using the default, which is -1 for tensorflow. Since you seem to use theano, I would think this would be messing up your results on batch-normalized networks? Have you had any issues?

I can open a separate PR for this... but I'm sure you'll agree that while changing dim ordering is fine in the demo, we should not be changing backend dim ordering in the actual library, since it would really mess people up; we'd need to implement a proper check and a global or something for the BN axis.

joeddav (Owner):

Well, K.set_image_data_format doesn't permanently change your backend configuration in your keras.conf file or anything - it just changes it within the scope you're in, so I don't think it'll be a problem. If people prefer it the other way, they can just swap them. I'm more concerned about the code looking simple and straightforward, and inferring the backend is much less conventional and straightforward-looking than setting it. That said, it may be best to set it to channels_last, as I believe that's tensorflow's default, and therefore more people will be using it that way. A comment is fine if you'd like, but I don't think it's needed.

Until you mentioned it, I've never heard of problems with BN with Theano... Is it not supported or something? It doesn't say anything like that in the Keras docs. I've used Theano and Tensorflow and have never had problems (at least not that I was aware of) with it.

leaprovenzano (Contributor, Author):

Cool, I shall update this to channels_last as you say.

Re BN: keras's default axis for BN is -1 (channels_last), so if you're using Theano and a channels_first config, it should be set to 1 to normalize along the feature axis.

see docstring at BatchNormalization
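A sketch of the backend-aware axis choice being proposed (bn_axis is a hypothetical helper; the real fix would live in genome_handler.py):

```python
def bn_axis(data_format):
    # Keras's BatchNormalization defaults to axis=-1, which is correct for
    # channels_last (tensorflow); channels_first (theano-style) configs
    # should instead normalize along the channel axis, which is axis 1.
    return 1 if data_format == 'channels_first' else -1
```

genome_handler could then build BatchNormalization(axis=bn_axis(K.image_data_format())) so channels_first users normalize along the feature axis as intended.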

joeddav (Owner):

Wow. Great catch. Yeah, we should add that in.

"source": [
"num_generations = 10\n",
"population_size = 20\n",
"num_epochs = 1\n",
"\n",
"devol = DEvol(genome_handler)\n",
"model, accuracy = devol.run(dataset, num_generations, population_size, num_epochs)\n",
"print model.summary()"
"model, accuracy, loss = devol.run(dataset, num_generations, population_size, num_epochs)\n",
joeddav (Owner):

This also needs to be added to demo.py

leaprovenzano (Contributor, Author):

Silly me, I didn't even look in there! I have updated it for the return values of devol.run and also updated the print function to be python3-compatible. What I haven't done is the backend thing... I'll await your response to the above and make whatever changes you think are right.

leaprovenzano (Contributor, Author):

> I'm also going to experiment a little and see how the different scoring method affects results. If you have any experience with that already I'd love to hear it.

I haven't experimented too much with different scoring functions but would be interested to hear your results. I was thinking last night it might be good to be able to access training loss/acc to do some heuristic based on overfitting... but this would require some changes, and I'm not sure it's worth it.

Basically, for me, using the loss made a big difference, especially as I was using a smaller dataset with imbalanced classes and a small number of labels, where accuracy tended not to change very much but loss had much more movement.

For a while I was kind of thinking that maybe the min-max scaling is a bit drastic, as it essentially kills a member of the population, but it's probably for the best... and it certainly speeds up the search, I would think.

After some changes in the GenomeHandler to use the functional API (see the last comment in #13), I have PReLU & LeakyReLU and I've been messing around a lot with network structure. I have added different kernel sizes and factorized convs, and I've found this gives me good results.

I'm wondering if it would be worth trying a version of the net that works with conv blocks in place of layers, particularly since implementing factorized convs has done so well... like a parameter for convs_per_block (like [1-3]). This would also allow sticking more complex stuff like residual blocks or even inception blocks in, as long as they are identified in some parameter mapping, so I may try it out.

joeddav (Owner) commented Jun 7, 2017:

I've messed around with it a little bit, and I think there is some difference in the function I used, but I'm not necessarily convinced that the way we did it before was better. I like your approach (with the option of providing a heuristic scoring function) better.

I'm 100% cool with using the functional API, especially if it does make it easier to add advanced activations. I think it'll be necessary to do that eventually anyway, honestly. That'd probably be good in a separate PR, however - this one's already pretty big.

If you can resolve the dimension ordering stuff, we should be good to merge I believe.

leaprovenzano (Contributor, Author):

Hey, I've sorted the last bits with tf channels etc. I'll do a new PR for the stuff in genome_handler with the functional API, advanced activations, global pooling etc. as soon as I get a chance, but I'm busy for a bit so it might be a couple of days.

@joeddav joeddav merged commit 29328e6 into joeddav:master Jun 9, 2017
@leaprovenzano leaprovenzano deleted the loss_and_scoring_changes branch June 9, 2017 06:36