Commit

updated code to new version
gritzner committed Feb 28, 2024
1 parent 9dd4c8d commit 3692d2e
Showing 85 changed files with 3,818 additions and 3,370 deletions.
2 changes: 2 additions & 0 deletions .cargo/config.toml
@@ -0,0 +1,2 @@
[build]
rustflags = ["-C", "target-cpu=native"]
2 changes: 1 addition & 1 deletion LICENSE
@@ -1,6 +1,6 @@
BSD 3-Clause License

Copyright (c) 2023, Daniel Gritzner
Copyright (c) 2024, Daniel Gritzner

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
155 changes: 66 additions & 89 deletions README.md
@@ -15,116 +15,93 @@ Reference implementation of SegForestNet, a model which predicts binary space pa
```

# Results
Our model delivers state-of-the-art performance while using up to 60% fewer model parameters. The table below shows results comparing our model to various other models, all using MobileNetv2 as a backbone. The paper includes more comparisons, e.g., to U-Net variants and using Xception as a backbone instead.

| | Hannover | Nienburg | Buxtehude | Potsdam | Vaihingen | Toulouse | iSAID |
| :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: |
| FCN | 72.6% | 73.2% | 76.2% | 78.1% | 72.0% | 54.1% | 38.3% |
| __SegForestNet__ | 72.9% | 73.8% | 76.2% | **78.9%** | 72.9% | 52.9% | 44.5% |
| DeepLab v3+ | 72.9% | 73.8% | 76.5% | 77.8% | 72.2% | 52.8% | 34.9% |
| RA-FCN | 71.0% | 71.0% | 74.1% | 74.1% | 70.3% | 49.4% | 35.4% |
| __SegForestNet*__ | **73.6%** | 74.1% | 76.2% | 78.8% | **72.9%** | **54.2%** | 42.8% |
| PFNet | 73.0% | **74.2%** | **76.8%** | **78.9%** | 72.6% | 53.9% | **45.8%** |

![](miou.png)
Our model delivers state-of-the-art performance, even under non-optimal training conditions (see paper for details). While other models, e.g., DeepLab v3+, deliver performance on a similar level, SegForestNet is better at predicting small objects such as cars. It predicts proper rectangles rather than round-ish shapes. Also, car segments which should be disconnected may merge into one larger region when using other models.

Mean $F_1$ scores:

| | Hannover | Buxtehude | Nienburg | Schleswig | Hameln | Vaihingen | Potsdam | Toulouse |
| :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: |
| FCN | 84.9% | 87.7% | 85.5% | 82.6% | 87.8% | 86.6% | 91.3% | 75.8% |
| DeepLab v3+ | __85.7%__ | 88.7% | 86.7% | __83.6%__ | 88.6% | __86.9%__ | __91.5%__ | __77.6%__ |
| __SegForestNet__ | 85.5% | __88.8%__ | 86.2% | 83.0% | __88.7%__ | 86.8% | 91.3% | 74.8% |
| PFNet | 85.4% | 88.4% | 86.3% | 83.2% | 88.4% | 86.8% | __91.5%__ | 75.8% |
| FarSeg | __85.7%__ | 88.5% | __86.8%__ | 82.8% | 88.4% | 86.7% | 91.4% | 75.0% |
| U-Net | 84.3% | 86.7% | 85.5% | 78.5% | 86.8% | 84.2% | 88.6% | 75.2% |
| RA-FCN | 78.5% | 83.1% | 80.0% | 74.6% | 83.9% | 82.6% | 86.6% | 66.9% |

![](samples.png)

# How to run
### Dependencies
The code has been tested on openSUSE Leap 15.4 running the following software:
* cargo 1.54.0
* cudatoolkit 10.2.89
* gdal 3.4.1
* geos 3.8.0
* geotiff 1.7.0
* libgdal 3.4.1
* libtiff 4.2.0
* matplotlib 3.5.1
* numpy 1.21.5
* pillow 9.0.1
* python 3.9.12
* pytorch 1.10.1
* cargo 1.74.1 (1.67.1 may also be sufficient)
* cuda 11.6.1
* libtiff 4.5.0
* matplotlib 3.7.0
* numpy 1.23.5
* opencv 4.6.0
* python 3.10.10
* pytorch 1.13.1
* pyyaml 6.0
* rustc 1.54.0
* scikit-learn 1.0.2
* scipy 1.7.3
* rustc 1.74.1 (1.67.1 may also be sufficient)
* scikit-learn 1.2.1
* scipy 1.10.0
* torchvision 0.14.1

Optional dependencies:
* geotiff 1.7.0
* tifffile 2021.7.2
* torchvision 0.11.2

To run the code on the publicly available datasets, you do not need certain libraries, e.g., GDAL. The following should be sufficient:
* Python
* PyYAML
* PyTorch
* NumPy
* Matplotlib
* Pillow
* tifffile
* Rust
* timm 0.9.2

### Preparing the training environment (optional)
Using pretrained encoder weights requires executing ```utils/preprocess/model_weights.py``` once to download the necessary model weights (for legacy reasons this will also download weights for another encoder which is no longer used in this codebase). Two of the datasets (DLR Multi-Sensor Land-Cover Classification (MSLCC) and SemCity Toulouse) also require executing the appropriate Python script in ```utils/preprocess/``` once. This is necessary to convert some ```.tif``` files into a format that OpenCV can read. The scripts in ```utils/preprocess/``` need the optional dependencies.

### Running the code
Open a terminal in the directory you cloned this repository into and execute the following command:

```sh
python aethon.py semseg potsdam 0 SegForestNet MobileNetv2
```shell
python aethon.py PW potsdam SegForestNet
```

This will use the configuration file ```cfgs/semseg.yaml``` to run our framework. When you are running it for the first time, you also need to add ```--compile``` to the command to compile the code written in Rust. This needs to be done only once. Furthermore, you will need a user configuration file called ```~/.aethon/user.yaml```. An example user configuration can be found below.

The full configuration our framework will parse will be the concatenation of ```core/defaults.yaml``` and ```cfgs/semseg.yaml```. Additionally, all occurrences of ```$N``` in ```cfgs/semseg.yaml``` will be replaced by the parameters given on the command line, e.g., ```$0``` will become ```potsdam``` and ```$1``` will become ```0```. The example above will run our framework to train our model with a MobileNetv2 backbone on the Potsdam dataset using the first random seed from the array in ```core/random_seeds.npy``` for data augmentation. This is the same random seed we used for the experiments in our paper.
This will use the configuration file ```cfgs/PW.yaml``` to run our framework. Furthermore, you will need a user configuration file called ```~/.aethon/user.yaml```. An example user configuration can be found in ```user.yaml```. The full configuration our framework will parse will be the concatenation of ```core/defaults.yaml``` and ```cfgs/PW.yaml```. Additionally, all occurrences of ```$N``` in ```cfgs/PW.yaml``` will be replaced by the parameters given on the command line, e.g., ```$0``` will become ```potsdam``` and ```$1``` will become ```SegForestNet```. The example above will run our framework to train our model on the Potsdam dataset using the first random seed from the array in ```core/random_seeds.npy``` for data augmentation. This is the same random seed we used for the experiments in our paper.
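
The ```$N``` substitution described above can be illustrated with a minimal sketch. The function name ```substitute_parameters``` and the decision to leave placeholders without a matching parameter untouched are assumptions made for illustration only; the actual implementation in ```core``` may behave differently.

```python
import re


def substitute_parameters(config_text, params):
    """Replace every $N in config_text with params[N] (hypothetical helper)."""

    def replace(match):
        index = int(match.group(1))
        # Assumption: a placeholder without a matching parameter stays unchanged.
        return params[index] if index < len(params) else match.group(0)

    return re.sub(r"\$(\d+)", replace, config_text)


# Corresponds to: python aethon.py PW potsdam SegForestNet
params = ["potsdam", "SegForestNet"]
print(substitute_parameters("dataset: $0\nmodel: $1", params))
```

Here the first argument after the configuration name is treated as ```$0```, matching the description above.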

The data loaders in ```datasets/``` use hardcoded paths to the datasets. You need to edit those files directly to point our framework towards the datasets on your system. The files requiring editing are:
* ```ISAIDDatasetLoader.py```
* ```ISPRSDatasetLoader.py```
* ```SemcityToulouseDatasetLoader.py```
Even though we cannot provide some of the datasets used in the paper for legal reasons we still provide their data loaders as a reference. The data loaders can be found in ```datasets/```.

Technically, ```LGNDatasetLoader.py``` needs to be edited as well, but since we cannot publish the datasets Hannover, Nienburg and Buxtehude for legal reasons, this file is included only for reference anyway.
The training results, including an evaluation of the trained model on the validation and test subsets, can be found in the appropriate subfolder in ```tmp/PW/``` once training is complete.

The valid options for the segmentation model to use and the backbone to use can be found in ```models/segmentation/__init__.py``` and ```models/backbone/__init__.py``` respectively.
### Running within a Jupyter notebook
You can run our code in Jupyter by simply copying the content of ```aethon.py``` to a notebook and adding the commandline parameters to the second line. Example for the second line:

The training results, including an evaluation of the trained model on the validation and test subsets, can be found in the appropriate subfolder in ```tmp/segmentation/``` once training is complete.
```python
core.init("PW potsdam SegForestNet")
```

### ~/.aethon/user.yaml example
Example of a user configuration file. This file is only used to submit SLURM jobs from within our framework but it still needs to exist, even if you do not use this feature of our framework.
# Model code
If you are only interested in the code of our model, take a look at ```models/SegForest*.py```. The class ```SegForestNet``` implements our model. It uses several helper classes to give our already complicated code some additional structuring. The constructor of our model has two parameters in addition to ```self```:
* ```params``` is an object with the two attributes ```input_shape``` and ```num_classes``` so that the model knows what kind of data to expect. See line 29 of ```tasks/semanticsegmentation.py``` for an example.
* ```config``` is an object which is a parsed version of the relevant subset of the configuration file used to run our framework, in particular the section ```SegForestNet_params``` in ```cfgs/PW.yaml``` in the example above. The parsing is done by the ```parse_dict``` function in ```core/__init__.py```.

```yaml
conda:
  path: /home/USERNAME/miniconda3
  environment: myEnv
slurm:map:
  - [mail-user, user@example.com]
  - [mail-type, "ALL,ARRAY_TASKS"]
  - [time, "24:00:00"]
  - [partition, gpus]
  - [nodes, 1]
  - [cpus-per-task, 8]
  - [gres, gpu:turing:1]
  - [mem, 44G]
pushover:
  api_token: ...
  user_key: ...
```
The ```trees``` subsection of the configuration is of particular interest. It defines the number of trees to predict per block. Each entry of the list ```trees``` will later become an instance of ```models/SegForestTree.py``` with each tree object consisting of a pair of decoders and representing a different tree. The attribute ```graph``` defines the tree structure in terms of components (found in ```models/SegForestComponents.py```). ```eval``` is used to turn ```graph``` into an actual tree object which is technically a security problem. However, the only use cases our framework is supposed to be used in are use cases in which the person triggering the execution of our framework has full system access anyway or at least enough system access to execute arbitrary Python or Rust code. **Note:** this is not the only instance of insecure code in our framework. Examples of valid tree graphs are:
* ```BSPTree(2, Line)```: for a BSP tree of depth two, i.e., a total of three inner nodes and four leaf nodes, using $f_1$ from our paper as signed distance function
* ```BSPTree(2, Circle)```: same as above but using $f_3$ instead of $f_1$
* ```BSPNode(BSPTree(1, Line), Leaf, Line)```: a BSP tree with two inner nodes (the left child of the root node is a BSP tree of depth one while the right child is a leaf node already) and three leaf nodes, using $f_1$ in all inner nodes
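
The node counts quoted in the examples above can be checked with a small sketch. The names ```BSPTree```, ```BSPNode```, ```Leaf``` and ```Line``` below are simplified stand-ins written for this illustration; they are not the classes from ```models/SegForestComponents.py```, and the tuple representation is an assumption.

```python
class Leaf:
    """Stand-in marker for a leaf node (not the framework's class)."""


class Line:
    """Stand-in marker for the signed distance function f_1."""


def BSPNode(left, right, sdf):
    # An inner node with two children and a signed distance function.
    return ("node", left, right, sdf)


def BSPTree(depth, sdf):
    # A complete binary tree of the given depth; depth 0 is a single leaf.
    if depth == 0:
        return Leaf
    return BSPNode(BSPTree(depth - 1, sdf), BSPTree(depth - 1, sdf), sdf)


def count_nodes(tree):
    """Return (number of inner nodes, number of leaf nodes)."""
    if tree is Leaf:
        return (0, 1)
    _, left, right, _ = tree
    left_inner, left_leaves = count_nodes(left)
    right_inner, right_leaves = count_nodes(right)
    return (1 + left_inner + right_inner, left_leaves + right_leaves)


print(count_nodes(BSPTree(2, Line)))                        # (3, 4)
print(count_nodes(BSPNode(BSPTree(1, Line), Leaf, Line)))   # (2, 3)
```

In general, a complete tree of depth $d$ has $2^d - 1$ inner nodes and $2^d$ leaf nodes.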

All entries in the "slurm" dictionary can be overridden in configuration files. The "pushover" section is optional and only used if you want push notifications once a job is completed. If you want to optimize the amount of memory to reserve for your job, these are the memory usages we measured:
The different signed distance functions are defined in the appendix of the paper.

| Dataset | Memory Usage [GB] |
| :---: | :---: |
| hannover | 4.1 |
| nienburg | 4.1 |
| buxtehude | 4.1 |
| vaihingen | 4.5 |
| potsdam | 15 |
| toulouse | 4 |
| toulouse_multi | 5 |
| isaid | 50 |
The attribute ```one_tree_per_class``` causes the list of ```trees``` to automatically be expanded such that there is exactly one tree for each class. All trees will use the same configuration, e.g., the same ```graph```. In case multiple trees are defined manually, an attribute called ```outputs``` must be defined for each tree. It is a list of integers defining which tree is responsible for predicting the logits of which class. Examples:
* ```[0]``` predict logits for the first class
* ```[1, 2, 4]``` predict logits for classes two, three and five

Our framework loads each dataset into CPU memory in its entirety.
The union of all ```outputs``` must be the set of all classes and the intersection of ```outputs``` of any two different trees must be empty.
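
The constraint on ```outputs``` amounts to requiring that the lists partition the set of class indices. A minimal sketch of such a check follows; the function ```outputs_form_partition``` is hypothetical and not part of the framework, and class indices are assumed to be zero-based as in the examples above.

```python
def outputs_form_partition(outputs_per_tree, num_classes):
    """Return True if the trees' outputs cover every class exactly once."""
    seen = set()
    for outputs in outputs_per_tree:
        # Two trees must not predict logits for the same class.
        if seen.intersection(outputs):
            return False
        seen.update(outputs)
    # Every class index 0..num_classes-1 must be covered, and nothing else.
    return seen == set(range(num_classes))


print(outputs_form_partition([[0], [1, 2, 4], [3]], 5))  # True
print(outputs_form_partition([[0, 1], [1, 2]], 3))       # False: class 1 appears twice
print(outputs_form_partition([[0], [1]], 3))             # False: class 2 is missing
```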

### Model code
If you are only interested in the code of our model, take a look at ```models/segmentation/SegForestNet.py```. The class of the same name as the file implements our model. It uses several helper classes to give our already complicated code some additional structuring. The constructor of our model has two parameters in addition to ```self```:
* ```params``` is an object whose attributes are parameters passed by the generic model training code to the model, e.g., the input shape so that the model knows what kind of data to expect
* ```config``` is an object which is a parsed version of the relevant subset of the configuration file used to run our framework, in particular the section ```SegForestNet_params``` in ```cfgs/semseg.yaml``` in the example above
If you want to use SegForestNet outside our framework you need these files:
* models/SegForestNet.py
* models/SegForestTree.py
* models/SegForestTreeDecoder.py
* models/SegForestComponents.py
* models/Xception.py
* models/xception.json.bz2
* utils/\_\_init\_\_.py
* utils/vectorquantization.py

The ```trees``` subsection of the configuration is of particular interest. It defines the number of trees to predict per block. Each entry of the list ```trees``` will later become an instance of ```models/segmentation/SegForestTree.py``` with each tree object consisting of a pair of decoders and representing a different tree. The list ```outputs``` of each tree specifies for which classes that particular tree will predict logits. The union of all ```outputs``` must be the set of all classes and the intersection of two ```outputs``` of different trees must be empty. The attribute ```graph``` defines the tree structure in terms of components (found in ```models/segmentation/SegForestComponents.py```). ```eval``` is used to turn ```graph``` into an actual tree object which is technically a security problem. However, the only use cases our framework is supposed to be used in are use cases in which the person triggering the execution of our framework has full system access anyway or at least enough system access to execute arbitrary Python or Rust code. **Note:** this is not the only instance of insecure code in our framework. Examples of valid tree graphs are:
* ```BSPTree(2, Line)```: for a BSP tree of depth two, i.e., a total of three inner nodes and four leaf nodes, using $f_1$ from our paper as signed distance function
* ```BSPTree(2, Circle)```: same as above but using $f_3$ instead of $f_1$
* ```BSPNode(BSPTree(1, Line), Leaf, Line)```: a BSP tree with two inner nodes (the left child of the root node is a BSP tree of depth one while the right child is a leaf node already) and three leaf nodes, using $f_1$ in all inner nodes
You need to fix several dependencies. To remove the dependency on the core module, replace all instances of ```core.device``` with the appropriate device, usually ```torch.device("cuda:0")```. Add ```import gzip``` to ```Xception.py``` and in line 142 use ```gzip.open(...)``` instead of ```core.open(...)```. Also, in the same line, change the first argument of ```open``` to the path of your downloaded Xception model weights. In ```utils/__init__.py``` comment out lines one to three as well as line five.
28 changes: 9 additions & 19 deletions aethon.py
@@ -1,23 +1,13 @@
import core
core.init()
# IMPORTANT: Do not add any imports above here otherwise you may break the code in core.init() that limits the number of threads!

import utils
import sys
import atexit
import tasks
import os

if core.args.notify:
    argv = " ".join([arg for arg in sys.argv if arg != "--notify"])
    if hasattr(core, "embedded_parameters"):
        argv = f"{argv}\n\nembedded parameters: {str(core.embedded_parameters)}"
    atexit.register(utils.push_notification, core.args.configuration, argv)

task = core.create_object(tasks, core.task)
task.run()
os.system(f"touch {core.output_path}/done")

if core.args.notify and hasattr(task, "push_notification"):
    atexit.unregister(utils.push_notification)
    utils.push_notification(core.args.configuration, f"{argv}\n\n{task.push_notification}")
if core.args.configuration[0] != "@":
    task = core.create_object(tasks, core.task)
    task.run()
    core.call(f"touch {core.output_path}/done")
else:
    import os
    if "LD_LIBRARY_PATH" in os.environ:
        del os.environ["LD_LIBRARY_PATH"] # fixes an issue with calling ssh after cv2 has been imported
    getattr(tasks, core.args.configuration[1:])()
107 changes: 107 additions & 0 deletions cfgs/PG.yaml
@@ -0,0 +1,107 @@
task: SemanticSegmentation
output_path: tmp/PG
clear_output_path: True

SegmentationDataset_params:
<<: *$0-defaults
random_seed: 0
patch_size: [224, 224]
training_samples: 8000
augmentation:
scaling: .3
x_shear: 6
y_shear: 6
rotation: 30
h_flip: 0
v_flip: 0
contrast: 0
brightness: 0
noise: .1
at_test_time: False

SegForestNet_params:
<<: *SegForestNet-defaults
downsampling: 3
pretrained_encoder: False
trees:
- num_features:
shape: 8
content: 24
shape_to_content: 0
graph: BSPTree(2, Line)
classifier: []
classifier_skip_from: 0
classifier_context: 2
one_tree_per_class: True
decoder:
type: TreeFeatureDecoder
num_blocks: 8
context: 1
intermediate_features: 96
use_residual_blocks: True
vq:
type: [0, 0]
region_map:
accumulation: add
node_weight: 1
softmax_temperature:
parameters: [epoch, epochs]
func: 0
value_range: [1, 1]
loss:
cross_entropy: pixels
ce_constant: 10
distribution_metric: gini
min_region_size: 4
weights: [.8625,.0475,.035,.055,0]

FCN_params:
downsampling: 5
pretrained_encoder: False

RAFCN_params:
downsampling: 5
pretrained_encoder: False

FarSeg_params:
downsampling: 5
pretrained_encoder: False

PFNet_params:
downsampling: 5
pretrained_encoder: False

UNet_params:

DeepLabv3p_params:
downsampling: 3
pretrained_encoder: False
aspp_dilation_rates: [] # default for MobileNetv2 backbone (performs best, even with Xception)
#aspp_dilation_rates: [12, 24, 36] # default for Xception backbone with downsampling == 3 according to DeepLab paper
#aspp_dilation_rates: [6, 12, 18] # default for Xception backbone with downsampling == 4 according to DeepLab paper

SemanticSegmentation_params:
<<: *SemanticSegmentation-defaults
dataset: SegmentationDataset
model: $1
epochs: $2
mini_batch_size: 18
shuffle_seed: -1
num_samples_per_epoch: 8000
unique_iterations: False
optimizer:
type: AdamW
arguments:
betas: [0.9, 0.999]
weight_decay: 0.01
learning_rate:
max_value: $3
min_value: 0
num_cycles: 1
cycle_length_factor: 2
num_iterations_factor: 1
gradient_clipping: 0
class_weights:
ignore_dataset: False
ignored_class_weight: 0
dynamic_exponent: 0