Removing tensorflow wrapper example n.4 & n.5 #12121

Merged
130 changes: 0 additions & 130 deletions wrappers/tensorflow/README.md
@@ -180,122 +180,6 @@ detections = net.forward()

The resulting `detections` array will contain all detections and their associated information.

## Part 4 - Training on Depth data using TensorFlow

#### Problem Statement
In this tutorial we'll show how to train a network for depth denoising and hole filling. Please note that this project is provided for educational purposes and the resulting depth is not claimed to be superior to the camera output. Our goal is to document the end-to-end process of developing a new network that uses depth data as both its input and output.

#### Unet Network Architecture
Unet is a deep learning architecture commonly used in image segmentation, denoising and inpainting applications. For the original paper, please refer to [U-Net: Convolutional Networks for Biomedical Image Segmentation](https://arxiv.org/pdf/1505.04597.pdf).
For depth post-processing, we are looking to solve a combination of denoising and inpainting (hole-filling) problems, which makes the Unet architecture a very good fit.

Additional information on Unet:

- [github.com/zhixuhao/unet](https://github.com/zhixuhao/unet) - Open-source implementation of Unet architecture with Keras
- [Introduction to image segmentation with Unet](https://towardsdatascience.com/understanding-semantic-segmentation-with-unet-6be4f42d4b47)

Unet offers significant advantages compared to the classic autoencoder architecture, improving edge fidelity (see the image below).

![foxdemo](images/Unet.PNG)
###### The image is taken from the article referenced above.

Edge comparison between Unet and a basic convolutional network:

![foxdemo](images/Unet_vs_Basic.PNG)

In the left (contracting) pathway of Unet, the number of filters (features) increases as we go down, which means the network becomes better and better at detecting features. The first few layers of a convolutional network capture only low-level features and very little semantic information; as we go deeper, the features become larger and more abstract, but because spatial information is discarded along the way, the CNN knows only the approximate location of those features. When we upsample, the lost spatial information is recovered (through concatenation with the corresponding contracting-path layer), so the last-layer features can be seen in the perspective of the layers above them.
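
To make the encoder-decoder-with-skip-connections idea concrete, below is a minimal two-level Unet sketch in Keras. The filter counts, network depth and the two-channel 128x128 input/output shape are illustrative assumptions and do not reproduce the exact network used in this tutorial.

```py
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_unet(input_shape=(128, 128, 2), base_filters=32):
    inputs = layers.Input(shape=input_shape)

    # Contracting path: the filter count doubles while the spatial size halves
    c1 = layers.Conv2D(base_filters, 3, activation="relu", padding="same")(inputs)
    p1 = layers.MaxPooling2D(2)(c1)
    c2 = layers.Conv2D(base_filters * 2, 3, activation="relu", padding="same")(p1)
    p2 = layers.MaxPooling2D(2)(c2)

    # Bottleneck
    b = layers.Conv2D(base_filters * 4, 3, activation="relu", padding="same")(p2)

    # Expanding path: upsample and concatenate with the matching contracting layer
    u2 = layers.concatenate([layers.UpSampling2D(2)(b), c2])
    c3 = layers.Conv2D(base_filters * 2, 3, activation="relu", padding="same")(u2)
    u1 = layers.concatenate([layers.UpSampling2D(2)(c3), c1])
    c4 = layers.Conv2D(base_filters, 3, activation="relu", padding="same")(u1)

    # Two output channels, matching the (depth, IR) pairing described below
    return Model(inputs, layers.Conv2D(2, 1, activation="linear")(c4))
```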

#### Training Dataset
Download [part 1](https://librealsense.intel.com/rs-tests/ML/Depth_Learning/Dataset/part1_1_4000.zip) and [part 2](https://librealsense.intel.com/rs-tests/ML/Depth_Learning/Dataset/part2_4001_8375.zip) of the dataset. It contains 4 types of 848x480 images in uncompressed PNG format:

###### 1. Simulated Left Infrared:
- Synthetic view from left infrared sensor of the virtual camera, including infrared projection pattern
- 3-channel grayscale image of 8 bits per channel
- Name Filter: left-*.png
###### 2. Simulated Right Infrared:
- Synthetic view from right infrared sensor of the virtual camera, including infrared projection pattern
- 3-channel grayscale image of 8 bits per channel
- Name Filter: right-*.png
###### 3. Ground Truth Depth:
- Ground truth depth images
- Single channel, 16 bits per pixel, values in units of 1 mm
- Name filter: gt-*.png
###### 4. Generated Depth:
- Depth images generated from Left and Right pairs using the D400 stereo matching algorithm configured with parameters similar to the D435 camera
- Single channel, 16 bits per pixel, values in units of 1 mm
- Name filter: res-*.png

![foxdemo](images/dataset.PNG)
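
As a minimal sketch, one sample can be read with OpenCV using the name filters above; the exact index formatting in the file names is an assumption for illustration.

```py
import cv2

def load_sample(index, folder="."):
    # Ground truth and generated depth: single-channel 16-bit PNG, 1 mm units
    gt    = cv2.imread(f"{folder}/gt-{index}.png",   cv2.IMREAD_UNCHANGED)
    depth = cv2.imread(f"{folder}/res-{index}.png",  cv2.IMREAD_UNCHANGED)
    # Infrared: grayscale stored as 3 channels of 8 bits -- read a single channel
    ir    = cv2.imread(f"{folder}/left-{index}.png", cv2.IMREAD_GRAYSCALE)
    return depth, ir, gt
```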

#### Data Augmentation
To help the neural network learn image features, we decided to crop the input images into tiles of 128x128 pixels.

Each ground truth image has a corresponding depth and infrared image. Given that, the dataset is augmented as follows:

###### 1. Cropping

Each image in the dataset is padded to a size of 896x512 and then cropped into tiles of 128x128. In total, each image yields 28 crops of size 128x128.
Each cropped image is saved under the original image name, with the column and row it was cropped from appended to it. This makes it easy to match each ground-truth crop with the corresponding IR and depth crops (see the sketch below).

![foxdemo](images/cropping.PNG)
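
A minimal cropping sketch for a single-channel image follows; how the padding is filled (edge replication here) and how the row/column information is encoded are assumptions.

```py
import numpy as np

def crop_to_tiles(img, tile=128):
    # Pad the 848x480 image to 896x512 so it divides evenly into 128x128 tiles
    padded = np.pad(img, ((0, 512 - img.shape[0]), (0, 896 - img.shape[1])), mode="edge")
    tiles = {}
    for row in range(padded.shape[0] // tile):       # 4 rows
        for col in range(padded.shape[1] // tile):   # 7 columns -> 28 tiles per image
            tiles[(row, col)] = padded[row * tile:(row + 1) * tile,
                                       col * tile:(col + 1) * tile]
    return tiles
```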

###### 2. Channeling
We expand the left infrared image to 16 bits and attach it as a second channel to the network input and output. This gives the network additional visual features to learn from.

Eventually, the data that is fed to the Unet network contains:
- Noisy images:
  2 channels: the first channel is a depth image and the second channel is the corresponding IR image
- Pure images:
  2 channels: the first channel is a ground truth image and the second channel is the corresponding IR image

![foxdemo](images/channeling.PNG)
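
A sketch of assembling one (noisy, pure) training pair from the cropped tiles; the exact scaling used to promote the 8-bit IR values to 16 bits is an assumption.

```py
import numpy as np

def make_pair(depth_tile, ir_tile, gt_tile):
    # Promote the 8-bit IR tile to 16 bits so both channels share one dtype
    ir16 = ir_tile.astype(np.uint16) << 8

    noisy = np.stack([depth_tile, ir16], axis=-1)   # network input:  (128, 128, 2)
    pure  = np.stack([gt_tile,    ir16], axis=-1)   # network target: (128, 128, 2)
    return noisy, pure
```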

#### Training Process
In order to start a training process, the following is required:
- Unet Network Implementation: choosing the sizes of the convolutions, max pools, filters and strides along the downsampling and upsampling paths.
- Data Augmentation: preparing a dataset that contains noisy and pure images as explained above.
- Old model (optional): there is an option to continue training the network from a previously trained model.
- Epochs: an epoch is one full cycle through the training dataset (forward and backward passes). The default number of epochs is 100; it can be controlled by an argument passed to the application (see the sketch below).
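
Pulling these requirements together, a minimal training sketch might look like the following. It reuses the illustrative `build_unet` from the architecture section; the optimizer, loss, batch size and output path are assumptions, not the application's actual settings.

```py
import tensorflow as tf

def train(noisy_tiles, pure_tiles, epochs=100, old_model=None):
    """noisy_tiles / pure_tiles: float arrays of shape (N, 128, 128, 2)."""
    # Optionally continue from a previously trained model instead of a fresh Unet
    model = tf.keras.models.load_model(old_model) if old_model else build_unet()
    model.compile(optimizer="adam", loss="mean_squared_error")
    model.fit(noisy_tiles, pure_tiles, epochs=epochs, batch_size=16)
    model.save("models/DEPTH_unet.model")   # hypothetical output path
    return model
```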

###### console output
![foxdemo](images/training.PNG)


#### File Tree
The application will automatically create a file tree:
- `images` folder: contains the original and cropped images for training and testing, as well as the predicted images
- `logs` folder: all TensorFlow outputs throughout the training are stored in a txt file with the same name as the created model. It also contains a separate log for testing statistics.
- `models` folder: each time a training process starts, a new model folder is created inside the models folder. If the training starts from an old model,
it will create a folder with the same name as the old model, with the string "_new" appended.

.
├───images
│ ├───denoised
│ ├───test
│ ├───test_cropped
│ ├───train
│ ├───train_cropped
│ └───train_masked
├───logs
└───models
└───DEPTH_20200910-203131.model
├───assets
└───variables

#### Testing Process
The test images should be augmented like the training images, except that the crop size should be as large as possible in order to improve prediction quality. The crop must be square, because Unet is trained on square images, so rectangular images do not give good predictions. The crop also must not exceed the original image limits, which is why each test image is cropped to 480x480 (each image is cropped into 2 images).
For testing there is no need for ground truth data; only depth and IR images are required.
The relevant folders in the file tree are (a prediction sketch follows the list):
- `test`: original images to test of sizes 848x480
- `test_cropped`: cropped testing images, size: 480x480
- `denoised`: the application stores predicted images in this folder.
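
A possible prediction sketch for a single 848x480 test image, using the same two-channel convention as training; the handling of the overlap between the two crops and the IR scaling are assumptions.

```py
import numpy as np

def predict_image(model, depth, ir):
    # Crop the 848x480 test image into two square 480x480 tiles (left and right)
    denoised = np.zeros_like(depth)
    for x in (0, depth.shape[1] - 480):              # column offsets 0 and 368
        tile = np.stack([depth[:, x:x + 480],
                         ir[:, x:x + 480].astype(np.uint16) << 8],
                        axis=-1).astype(np.float32)
        pred = model.predict(tile[np.newaxis, ...])[0, ..., 0]   # keep the depth channel
        # The second crop overwrites the overlapping middle region
        denoised[:, x:x + 480] = pred.astype(depth.dtype)
    return denoised
```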

#### Monitoring with Tensorboard
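
This section is empty in the current revision. As a generic illustration only (the log paths and the `fit` call are assumptions, not the application's actual code), training can be monitored by adding a TensorBoard callback and pointing the `tensorboard` CLI at the log directory:

```py
import tensorflow as tf

# Write TensorFlow summaries under the logs/ folder described above (path is an assumption)
tb_callback = tf.keras.callbacks.TensorBoard(log_dir="logs/DEPTH_unet")

# model.fit(noisy_tiles, pure_tiles, epochs=100, callbacks=[tb_callback])
# Then, from a shell:  tensorboard --logdir logs
```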

# Tools
There are several helper tools located under the `tools` folder:
@@ -326,20 +210,6 @@ The output is a BAG file that can be opened by the RealSense Viewer.
![foxdemo](images/conver_to_bag.PNG)


## Part 5 - Applying the trained network to real data:
[Example 5](https://github.com/nohayassin/librealsense/blob/tensorflow/wrappers/tensorflow/example5%20-%20denoise.py) shows how to use the trained network from Part 4 ([Keras Unet model](https://librealsense.intel.com/rs-tests/ML/Depth_Learning/DEPTH_Keras_Unet.model.zip)) on live data from an Intel RealSense camera. It can be invoked as follows:

```py
python example5-denoised.py <path to the model>
```

For prediction, both IR and depth frames are streamed.

The expected output is the original frame and the model's prediction given that frame as input (a minimal streaming sketch is shown below).
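
For reference, a minimal sketch of feeding one pair of live frames to the model with `pyrealsense2` might look like this; how the actual example5 script crops, scales and normalizes the frames is an assumption here.

```py
import sys
import numpy as np
import pyrealsense2 as rs
import tensorflow as tf

model = tf.keras.models.load_model(sys.argv[1])          # path to the Keras Unet model

# Stream depth and left infrared at the 848x480 resolution used by the dataset
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 848, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.infrared, 1, 848, 480, rs.format.y8, 30)
pipeline.start(config)

try:
    frames = pipeline.wait_for_frames()
    depth = np.asanyarray(frames.get_depth_frame().get_data())        # uint16, 1 mm units
    ir    = np.asanyarray(frames.get_infrared_frame(1).get_data())    # uint8

    # Build the two-channel (depth, IR) input, as in training, and crop a square tile
    noisy = np.stack([depth, ir.astype(np.uint16) << 8], axis=-1).astype(np.float32)
    prediction = model.predict(noisy[np.newaxis, :480, :480, :])[0]
finally:
    pipeline.stop()
```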


![foxdemo](images/camera_simulation_06.PNG)

## Conclusions

This article shows a small number of examples of using deep learning together with Intel RealSense hardware. It is intended to be further extended, and you are welcome to propose enhancements and new code samples. You are also free to use the provided sample code, dataset and model for research or commercial use, in compliance with the Intel RealSense SDK 2.0 [License](https://github.com/IntelRealSense/librealsense/blob/master/LICENSE).