Faster training in reduced bases (#4)
* working on a more efficient reduced basis training
* updating the fast training code, still an issue in the end when re-stitching the networks
* the fast training is working currently for the as_dense architecture
* testing accuracy in l2
* updating reduced basis training codes
* fixing some last minute issues with the dino training
* more improvements
* updating fast training to accommodate more than as
* adding callbacks to the opt_parameters default dict to accommodate learning rate scheduling
* adding readme and drivers, working on finishing this pull request
* updating some code, getting drivers sorted out in the refactoring
* updating training drivers
* further refactoring
* more refactoring
* updating
* updating, let's see if the dipnet stuff still runs correctly
* massive commit incoming
* running hyperelasticity runs again
* updating
* updating again, more rb_dense typos and such
* needed to move the save weights out of the train_dino, which could be run in a reduced setting
* hyperelasticity evaluation and post-processing working
* updating everything
* hyper and rdiff evaluation have all been checked
* a few debugging leftovers needed to be removed
* all examples are completely documented and functional in the refactoring
* adding INSTALL file
* updating README
* some updates regarding the evaluation suite
Showing 55 changed files with 4,303 additions and 1,183 deletions.
# Derivative Informed Neural Operator

```
      _____                      ___            ___
     /  /::\        ___         /__/\          /  /\
    /  /:/\:\      /  /\         \  \:\       /  /::\
   /  /:/  \:\    /  /:/          \  \:\      /  /:/\:\
  /__/:/ \__\:| /__/::\       _____\__\:\    /  /:/  \:\
  \  \:\ /  /:/ \__\/\:\__  /__/::::::::\  /__/:/ \__\:\
   \  \:\  /:/     \  \:\/\ \  \:\~~\~~\/  \  \:\ /  /:/
    \  \:\/:/        \__\::/ \  \:\  ~~~    \  \:\  /:/
     \  \::/         /__/:/   \  \:\         \  \:\/:/
      \__\/          \__\/    \  \:\          \  \::/
                                \__\/          \__\/
```

An Efficient Framework for High-Dimensional Parametric Derivative Learning
## Installation

* PDE data generation is handled by `FEniCS`, `hIPPYlib`, and `hippyflow`. For this, [`hIPPYlib`](https://github.com/hippylib/hippylib) and [`hippyflow`](https://github.com/hippylib/hippyflow) must be installed.

With conda:

* `conda create -n hippyflow -c uvilla -c conda-forge fenics==2019.1.0 matplotlib scipy tensorflow=2.7.0`

The code assumes that the environment variables `HIPPYLIB_PATH`, `HIPPYFLOW_PATH`, and `DINO_PATH` have been set:

* `export HIPPYLIB_PATH=path/to/hippylib`
* `export HIPPYFLOW_PATH=path/to/hippyflow`
* `export DINO_PATH=path/to/dino`
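A quick sanity check that these paths resolve (this mirrors how the scripts in this repository import the packages):

```python
import os, sys

# The dino drivers append these environment-variable paths before importing.
sys.path.append(os.environ.get('HIPPYLIB_PATH'))
import hippylib
sys.path.append(os.environ.get('HIPPYFLOW_PATH'))
import hippyflow
print('hippylib and hippyflow imported successfully')
```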
## Machine learning in Tensorflow (Beware of version / eager execution)

Neural network training is handled by `keras` within `Tensorflow`. The way the Jacobians are currently extracted requires that some tensorflow v2 behavior be disabled, which creates issues with eager execution in later versions of tensorflow. This library works with tensorflow `2.7.0`. In the future, `dino` may be reworked to handle the eager execution issue in later versions of tensorflow.
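The drivers in this repository disable v2 behavior through the `tensorflow.compat.v1` module; the snippet below shows the pattern (it also appears in the evaluation script later in this commit):

```python
import tensorflow as tf

# Fall back to graph-mode (v1) semantics so that Jacobians can be
# extracted from the computational graph; this disables eager execution.
if int(tf.__version__[0]) > 1:
    import tensorflow.compat.v1 as tf
    tf.disable_v2_behavior()
```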
# Instructions for the convection-reaction-diffusion problem

## 1. Generate the training data

First, to generate the training data, run one of the following commands:

`python confusionProblemSetup.py`

or, with several simultaneous MPI processes:

`python generateConfusion.py`

The command line arguments `-save_jacobian_data` and `-save_as` are set to `True` (`1`) by default. To generate a basis for PCANet (i.e., a KLE of the input parameter), additionally set the argument `-save_kle` to `True` (`1`). The data are initially saved to `./data/`, in a subfolder named for the specifics of the problem. When the data become large, it may be preferable to save them to a different location (e.g., a dedicated storage location), either by modifying the output location in `confusionProblemSetup.py` or by simply moving the data after the process is complete.
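For example, a generation run that also saves the KLE basis might look like the following (a sketch of the flag syntax described above):

`python confusionProblemSetup.py -save_kle 1`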
## 2. Train the neural networks

The neural network scripts are all located in `dino_training/`. To run all of the neural network trainings used in the DINO paper, run

`python training_runs.py`

Note that these runs may take a very long time; they were all run on a cluster with 1TB of RAM. The data are assumed to be loaded from a subfolder of `data/`. If the data were moved elsewhere, symbolic links are a convenient fix (e.g., in bash, `ln -s /path/to/moved/data/ data/`).

When these runs finish, they output trained weights (as pickled dictionaries) to a folder `trained_weights/` within the `dino_training/` directory. The neural networks are not directly [saved and loaded using tensorflow](https://www.tensorflow.org/tutorials/keras/save_and_load) because of the significant computational graph overhead incurred by extracting the Jacobians from the network. A better way to separate DINO training from evaluation and deployment might be to instantiate an identical architecture (without the Jacobian computational graph overhead) at the end of training, copy the weights over from the trained DINO, and save the copy using tensorflow; a sketch of that idea follows.
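The sketch below uses standard `keras` calls; `build_dino_architecture` and `settings` are hypothetical stand-ins for whatever constructs the network, not part of the `dino` package API:

```python
# Hypothetical sketch: persist a trained DINO without the Jacobian graph.
# `build_dino_architecture` is an assumed stand-in, not the dino API.
trained_dino = build_dino_architecture(settings, with_jacobian=True)
# ... training happens here ...

# Instantiate an identical architecture without the Jacobian overhead,
# copy the trained weights over, and save the copy the standard way.
eval_model = build_dino_architecture(settings, with_jacobian=False)
eval_model.set_weights(trained_dino.get_weights())
eval_model.save('trained_models/dino_observable')
```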
## 3. Evaluate the trained neural networks

Once the neural networks are trained and their weights have been saved, the networks can be evaluated using the codes located in `dino_training/evaluation/`.

From within `dino_training/evaluation/`, run

`python evaluation_loop.py -weights_dir ../trained_weights/`

These scripts output dictionaries of evaluated accuracies, gradient errors, and Jacobian errors to `dino_training/evaluation/postproc/`.
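The output dictionaries are ordinary pickles and can be inspected directly. A minimal sketch (the file name here is illustrative, not a fixed output name):

```python
import pickle

# Load one of the accuracy dictionaries written by the evaluation scripts;
# file names are derived from the evaluated weights files.
with open('dino_training/evaluation/postproc/accuracies/example_accuracies.pkl', 'rb') as f:
    results = pickle.load(f)

print(results.keys())
```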
`applications/confusion/dino_paper/evaluation/evaluate_network_accuracies.py` (136 additions, 0 deletions)
```python
# This file is part of the dino package
#
# dino is free software: you can redistribute it and/or modify
# it under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation, either version 2 of the License, or any later version.
#
# dino is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public License
# If not, see <http://www.gnu.org/licenses/>.
#
# Author: Tom O'Leary-Roseberry
# Contact: tom.olearyroseberry@utexas.edu

import os, sys
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
os.environ['KMP_DUPLICATE_LIB_OK'] = 'True'
os.environ['KMP_WARNINGS'] = 'FALSE'
import numpy as np
import tensorflow as tf
# Fall back to v1 (graph-mode) semantics so that Jacobians can be
# extracted from the computational graph (see the note in the README).
if int(tf.__version__[0]) > 1:
    import tensorflow.compat.v1 as tf
    tf.disable_v2_behavior()

import time
import pickle

sys.path.append(os.environ.get('HIPPYLIB_PATH'))
import hippylib

sys.path.append(os.environ.get('HIPPYFLOW_PATH'))
import hippyflow

# Import dino inference module
sys.path.append(os.environ.get('DINO_PATH'))
from dino import *

from dino.evaluation.surrogatePostProcessing import evaluateJacobianNetwork

# Import reaction-diffusion problem specifics
sys.path.append('../../')
from confusionModelSettings import confusion_problem_settings

# Seed the random number generator (the API differs between tf versions)
try:
    tf.random.set_seed(0)
except AttributeError:
    tf.set_random_seed(0)

from argparse import ArgumentParser
# Arguments to be parsed from the command line execution
parser = ArgumentParser(add_help=True)
parser.add_argument("-weights_dir", dest='weights_dir', required=True, help="Weights directory", type=str)
parser.add_argument("-input_dim", dest='input_dim', required=False, default=4225, help="Input dimension", type=int)
parser.add_argument("-logging_dir", dest='logging_dir', required=False, default='postproc/accuracies/', help="Logging directory", type=str)
args = parser.parse_args()

problem_settings = confusion_problem_settings()

weights_dir = args.weights_dir + '/'
weights_files = os.listdir(weights_dir)

# Problem configuration used to locate the corresponding data directory
n_obs = 50
gamma = 0.1
delta = 1.0
nx = 64
data_dir = '../../data/confusion_nobs_'+str(n_obs)+'_g'+str(gamma)+'_d'+str(delta)+'_nx'+str(nx)+'/'
assert os.path.isdir(data_dir), 'Data directory not found: ' + data_dir

for weights_name in weights_files:
    print('weights_name = ', weights_name)
    t0 = time.time()
    # Default network settings; overridden per architecture below
    evaluate_network = False
    settings = jacobian_network_settings(problem_settings)
    settings['nullspace_constraints'] = False
    settings['opt_parameters']['loss_weights'] = [1.0, 1.0]
    settings['depth'] = 6
    settings['fixed_input_rank'] = 50
    settings['full_jacobian'] = True
    settings['full_JVVT'] = False

    if ('as_dense' in weights_name.lower()) or ('dipnet' in weights_name.lower()):
        settings['architecture'] = 'rb_dense'
        # Networks trained with 100 input modes are flagged in the file name
        if ('10050' in weights_name) or ('100-50' in weights_name):
            settings['fixed_input_rank'] = 100
        evaluate_network = True
    elif 'generic_dense' in weights_name:
        settings['architecture'] = 'generic_dense'
        # Is there a better general way to set the input and output dimensions?
        settings['input_dim'] = args.input_dim
        settings['output_dim'] = 50
        evaluate_network = True
    else:
        print('Architecture not implemented, skipping ', weights_name)

    if evaluate_network:
        file_name = weights_dir + weights_name
        jacobian_network = observable_network_loader(settings, file_name=file_name)
        # Print a banner identifying the network being evaluated
        for _ in range(2):
            print(80*'#')
        print('Running for :'.center(80))
        print(weights_name.center(80))
        for _ in range(2):
            print(80*'#')
        results = evaluateJacobianNetwork(settings, jacobian_network=jacobian_network, data_dir=data_dir)
        logging_dir = args.logging_dir
        logger_name = weights_name.split('.pkl')[0] + '_accuracies.pkl'

        os.makedirs(logging_dir, exist_ok=True)
        with open(logging_dir + logger_name, 'wb') as f:
            pickle.dump(results, f, pickle.HIGHEST_PROTOCOL)

        print('Time = ', time.time() - t0, 's')
```
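A typical invocation, from within `applications/confusion/dino_paper/evaluation/` (the weights directory path is illustrative):

`python evaluate_network_accuracies.py -weights_dir ../trained_weights/`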