diff --git a/doc/2.0/fate/components/feature_binning.md b/doc/2.0/fate/components/feature_binning.md
index bc56d55c1f..ddea261399 100644
--- a/doc/2.0/fate/components/feature_binning.md
+++ b/doc/2.0/fate/components/feature_binning.md
@@ -14,7 +14,7 @@ As for calculating the federated iv and woe values, the following figure
 can describe the principle properly.
 
 ![Figure 1 (Federated Feature Binning
-Principle)](../images/binning_principle.png)
+Principle)](../../images/binning_principle.png)
 
 As the figure shows, the B party, which holds the data labels, encrypts its
 labels with additive homomorphic encryption and then sends them to A. A
@@ -26,7 +26,7 @@ encrypted label information to all hosts, and each of the hosts
 calculates and sends back the statistic info.
 
 ![Figure 2: Multi-Host Binning
-Principle](../images/multiple_host_binning.png)
+Principle](../../images/multiple_host_binning.png)
 
 ## Features
 
diff --git a/doc/2.0/fate/components/feature_selection.md b/doc/2.0/fate/components/feature_selection.md
index dc40d88f2b..a2895b7b3f 100644
--- a/doc/2.0/fate/components/feature_selection.md
+++ b/doc/2.0/fate/components/feature_selection.md
@@ -54,4 +54,4 @@ whether a feature is left or not. Then the guest sends the filter results
 back to the hosts. During this selection process, the guest will not
 know the real names of the host(s)' features.
 
 ![Figure 4: Multi-Host Selection
-Principle\](../images/multi_host_selection.png)
\ No newline at end of file
+Principle\](../../images/multi_host_selection.png)
\ No newline at end of file
diff --git a/doc/2.0/fate/components/hetero_nn.md b/doc/2.0/fate/components/hetero_nn.md
new file mode 100644
index 0000000000..9566ce1fbd
--- /dev/null
+++ b/doc/2.0/fate/components/hetero_nn.md
@@ -0,0 +1,55 @@
+# Hetero NN
+
+In FATE-2.0, we introduce our new Hetero-NN framework, which allows you to quickly set up a hetero federated NN learning task. Built on PyTorch and transformers, our framework ensures smooth integration of your existing datasets and models. For a quick introduction to Hetero-NN, refer to our [quick start](../ml/hetero_nn_tutorial.md).
+
+The architecture of the Hetero-NN framework is depicted in the figure below. In this structure, all submodels from guests and hosts are encapsulated within the HeteroNNModel, enabling independent forward and backward passes. Both guest and host trainers are developed based on the HuggingFace trainer, allowing for rapid configuration of heterogeneous federated learning tasks with your existing datasets and models. These tasks can be run independently, without the need for FATEFlow. The FATE-pipeline Hetero-NN components are built upon this foundational framework.
+
+![Hetero-NN Framework](../../images/hetero_nn.png)
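+
+As a quick taste of the pipeline-level API built on this framework, below is a condensed sketch distilled from the benchmark example added later in this diff (examples/benchmark_quality/hetero_nn/hetero_nn.py); the layer sizes and parameters are illustrative, and the upstream reader/PSI wiring is omitted:
+
+```python
+from fate_client.pipeline.components.fate.nn.torch import nn, optim
+from fate_client.pipeline.components.fate.nn.torch.base import Sequential
+from fate_client.pipeline.components.fate.hetero_nn import HeteroNN, get_config_of_default_runner
+from fate_client.pipeline.components.fate.nn.algo_params import TrainingArguments, SSHEArgument
+
+# guest-side runner config: a bottom model, a top model, and an SSHE aggregate layer
+guest_conf = get_config_of_default_runner(
+    bottom_model=nn.Linear(10, 8),
+    top_model=Sequential(nn.Linear(8, 1), nn.Sigmoid()),
+    optimizer=optim.Adam(lr=0.01),
+    loss=nn.BCELoss(),
+    training_args=TrainingArguments(num_train_epochs=10, per_device_train_batch_size=256),
+    agglayer_arg=SSHEArgument(guest_in_features=8, host_in_features=8, out_features=8),
+)
+
+# psi_0 stands for an upstream PSI component, as in the benchmark script
+hetero_nn_0 = HeteroNN("hetero_nn_0", train_data=psi_0.outputs["output_data"])
+hetero_nn_0.guest.task_parameters(runner_conf=guest_conf)
+```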
+
+Besides the new framework, we also introduce two new privacy-preserving strategies for federated learning: SSHE and FedPass. These strategies can be configured in the aggregate layer configuration. For more information on these strategies, refer to the [SSHE](#sshe) and [FedPass](#fedpass) sections below.
+
+## SSHE
+
+SSHENN is a privacy-preserving strategy that uses homomorphic encryption and secret sharing to protect the privacy of both the model and the data. The weights of the guest/host aggregate layer are split into two parts, which are shared with the cooperating party. The picture below illustrates the process of SSHE. The design of SSHE is inspired by the paper: [When Homomorphic Encryption Marries Secret Sharing:
+Secure Large-Scale Sparse Logistic Regression and Applications
+in Risk Control](https://arxiv.org/pdf/2008.08753.pdf).
+
+![Figure 1 (SSHE)](../../images/sshe.png)
+
+
+
+## FedPass
+
+FedPass works by embedding private passports into a neural network to enhance privacy through obfuscation. It utilizes the DNN passport technique for adaptive obfuscation, which involves inserting a passport layer into the network. This layer adjusts the scale factor and bias term using model parameters and private passports, followed by an autoencoder and averaging. The picture below illustrates
+the process of FedPass.
+![Figure 2 (FedPass)](../../images/fedpass_0.png)
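+
+To make the passport idea more tangible, here is a schematic PyTorch sketch of such an obfuscation layer. It illustrates the concept only (the autoencoder step is omitted and all names are hypothetical); it is not FATE's actual implementation:
+
+```python
+import torch
+import torch.nn as nn
+
+class PassportLinear(nn.Module):
+    """Toy adaptive-obfuscation layer: the scale and bias are derived from
+    the layer weights and private passports instead of being learned directly."""
+
+    def __init__(self, in_features, out_features):
+        super().__init__()
+        self.fc = nn.Linear(in_features, out_features, bias=False)
+        # private passports, known only to the owning party
+        self.scale_passport = nn.Parameter(torch.randn(1, in_features), requires_grad=False)
+        self.bias_passport = nn.Parameter(torch.randn(1, in_features), requires_grad=False)
+
+    def forward(self, x):
+        # the scale factor and bias term depend on both the model parameters
+        # and the passports, so outputs stay obfuscated for anyone without them
+        gamma = self.fc(self.scale_passport).mean()
+        beta = self.fc(self.bias_passport).mean()
+        return gamma * self.fc(x) + beta
+```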
+
+In FATE-2.0, you can specify the FedPass strategy for the guest top model and the host bottom model. The picture below shows the
+architecture of FedPass when running a Hetero-NN task.
+
+![Figure 3 (FedPass in Hetero-NN)](../../images/fedpass_1.png)
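+
+Continuing the schematic sketch from above, the placement shown in this figure amounts to attaching passport layers to the host bottom model's output and to the guest top model (sizes are again illustrative):
+
+```python
+# reuses PassportLinear and torch.nn (as nn) from the sketch above
+host_bottom = nn.Sequential(nn.Linear(20, 8), nn.ReLU(), PassportLinear(8, 8))
+guest_top = nn.Sequential(PassportLinear(8, 8), nn.ReLU(), nn.Linear(8, 1), nn.Sigmoid())
+```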
+
+For more details of FedPass, please refer to the [paper](https://arxiv.org/pdf/2301.12623.pdf).
+
+
+The features of FedPass are:
+
+- Privacy preserving: without access to the passports, it's extremely difficult for an attacker to infer inputs from outputs.
+- Preserved model performance: the model parameters are optimized through backpropagation, adapting the obfuscation to the model, which offers superior performance compared to fixed obfuscation.
+- Speed comparable to plaintext training: FedPass does not require homomorphic encryption or secret sharing, ensuring that the training speed is nearly equivalent to that of plaintext training.
+
+
+## Features
+
+- A brand-new Hetero-NN framework developed based on PyTorch and transformers, able to integrate existing resources, like models and datasets, into Hetero-NN federated learning. If you are using Hetero-NN in the FATE pipeline, you can configure your customized models and datasets via confs.
+
+- Support SSHE strategy for privacy-preserving training.
+
+- Support FedPass strategy for privacy-preserving training. You can set passports for host bottom models and the guest
+top model.
+
+- Support single-GPU training.
diff --git a/doc/2.0/fate/components/hetero_secureboost.md b/doc/2.0/fate/components/hetero_secureboost.md
new file mode 100644
index 0000000000..8c09981e56
--- /dev/null
+++ b/doc/2.0/fate/components/hetero_secureboost.md
@@ -0,0 +1,106 @@
+# Hetero SecureBoost
+
+Gradient Boosting Decision Tree (GBDT) is a widely used statistical model
+for classification and regression problems. FATE provides a novel
+lossless privacy-preserving tree-boosting system known as
+[SecureBoost: A Lossless Federated Learning Framework.](https://arxiv.org/abs/1901.08755)
+
+This federated learning system allows a learning process to be jointly
+conducted over multiple parties with partially common user samples but
+different feature sets, which corresponds to a vertically partitioned
+data set. An advantage of SecureBoost is that it provides the same level
+of accuracy as the non-privacy-preserving approach while revealing no
+information on private data.
+
+The following figure shows the proposed Federated SecureBoost framework.
+
+![Figure 1: Framework of Federated SecureBoost](../../images/secureboost.png)
+
+  - Active Party
+
+    > We define the active party as the data provider who holds both a data
+    > matrix and the class label. Since the class label information is
+    > indispensable for supervised learning, there must be an active party
+    > with access to the label y. The active party naturally takes the
+    > responsibility as a dominating server in federated learning.
+
+  - Passive Party
+
+    > We define a data provider which has only a data matrix as a passive
+    > party. Passive parties play the role of clients in the federated
+    > learning setting. They are also in need of building a model to predict
+    > the class label y for their prediction purposes. Thus they must
+    > collaborate with the active party to build their model to predict y
+    > for their future users using their own features.
+
+We align the data samples under an encryption scheme by using the
+privacy-preserving protocol for inter-database intersections to find the
+common shared users or data samples across the parties without
+compromising the non-shared parts of the user sets.
+
+To ensure security, passive parties cannot get access to the gradient and
+hessian directly. We use an "XGBoost"-like tree-learning algorithm.
+In order to keep the gradient and hessian confidential, we require that the
+active party encrypt the gradient and hessian before sending them to the
+passive parties. After encrypting the gradient and hessian, the active
+party sends the encrypted [gradient] and [hessian] to the passive
+parties. Each passive party uses [gradient] and [hessian] to
+calculate the encrypted feature histograms, then encodes the (feature,
+split\_bin\_val) pairs and constructs a (feature, split\_bin\_val) lookup
+table; it then sends the encoded values of (feature, split\_bin\_val)
+with the feature histograms to the active party. After receiving the feature
+histograms from the passive parties, the active party decrypts them and
+finds the best gains. If the best-gain feature belongs to a passive
+party, the active party sends the encoded (feature, split\_bin\_val)
+back to the owner party. The following figure shows the process of
+finding the best split in federated tree building.
+
+![Figure 2: Process of Federated Split Finding](../../images/split_finding.png)
+
+The parties continue the split-finding process until tree construction
+finishes. Each party only knows the detailed split information of the
+tree nodes whose split features are provided by that party. The
+following figure shows the final structure of a single decision tree.
+
+![Figure 3: A Single Decision Tree](../../images/tree_structure.png)
+
+To use the learned model to classify a new instance, the active party
+first determines which party the current tree node belongs to. If the
+current node belongs to the active party, it can use its (feature,
+split\_bin\_val) lookup table to decide whether to go to the left or
+right child node; otherwise, the active party sends the node id to the
+designated passive party, and the passive party checks its lookup table
+and sends back which branch the current node should go to. This process
+continues until the current node is a leaf. The following figure shows
+the federated inference process.
+
+![Figure 4: Process of Federated Inference](../../images/federated_inference.png)
+
+By following the SecureBoost framework, multiple parties can jointly
+build a tree ensemble model without leaking privacy in federated learning.
+If you want to learn more about the algorithm, you can read the paper
+linked above.
+
+## HeteroSecureBoost Features
+
+- Support federated machine learning tasks:
+  - binary classification, with objective function binary:bce
+  - multi-class classification, with objective function multi:ce
+  - regression, with objective function regression:l2
+
+- Support multi-host federated machine learning tasks.
+
+- Support Paillier and OU homomorphic encryption schemes.
+
+- Support commonly used XGBoost regularization methods:
+  - L1 & L2 regularization
+  - Min child weight
+  - Min sample split
+
+- Support GOSS sampling
+
+- Support complete secure tree
+
+- Support histogram subtraction, grad and hess optimization
+
+
diff --git a/doc/2.0/fate/components/homo_nn.md b/doc/2.0/fate/components/homo_nn.md
new file mode 100644
index 0000000000..31228a68af
--- /dev/null
+++ b/doc/2.0/fate/components/homo_nn.md
@@ -0,0 +1,21 @@
+# Homo NN
+
+The Homo (horizontal) federated learning in FATE-2.0 allows multiple parties to collaboratively train a neural network model without sharing their actual data. In this arrangement, different parties possess datasets with the same features but different user samples. Each party locally trains the model on its data subset and shares only the model updates, not the data itself.
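+
+As a framework-free illustration of this update rule, the sketch below averages the clients' weights with plain PyTorch; it is conceptual only and does not use the FATE API (which additionally supports secure aggregation):
+
+```python
+import torch
+
+def fedavg(client_states):
+    """Average a list of model state_dicts, one per client (plain FedAVG)."""
+    avg = {}
+    for key in client_states[0]:
+        avg[key] = torch.stack([s[key].float() for s in client_states]).mean(dim=0)
+    return avg
+
+# each round: clients train locally, then only their weights are shared and averaged
+clients = [torch.nn.Linear(4, 1) for _ in range(3)]
+global_state = fedavg([m.state_dict() for m in clients])
+for m in clients:
+    m.load_state_dict(global_state)
+```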
+
+Our neural network (NN) framework in FATE-2.0 is built upon the PyTorch and transformers libraries, easing the integration of existing models, including computer vision (CV) models, pretrained large language models (LLMs), etc., and datasets into federated training. The framework is also compatible with advanced computing resources like GPUs and DeepSpeed for enhanced training efficiency. In the HomoNN module, we support the standard FedAVG algorithm. Using the FedAVGClient and FedAVGServer trainer classes, homo federated learning tasks can be set up quickly and efficiently. The trainers, developed on the transformers trainer, facilitate the consistent setting of training and federation parameters via TrainingArguments and FedAVGArguments.
+
+The figure below shows the architecture of the 2.0 Homo-NN framework.
+
+![Figure 1 (Homo-NN Framework)](../../images/homo_nn.png)
+
+## Features
+
+- A new neural network (NN) framework, developed leveraging PyTorch and transformers. This framework offers easy integration of existing models, including CV models, LLMs, etc., and datasets. It's ready to use right out of the box. If you are using Homo-NN in the FATE pipeline, you can configure your customized models and datasets via confs.
+
+- Provides support for the FedAVG algorithm, featuring secure aggregation.
+
+- The Trainer class includes callback support, allowing for customization of the training process.
+
+- FedAVGClient supports a local model mode for local testing.
+
+- Compatible with single- and multi-GPU training. The framework also allows for easy integration of DeepSpeed.
diff --git a/doc/2.0/fate/components/linear_regression.md b/doc/2.0/fate/components/linear_regression.md
index 9eb35c777c..915be62760 100644
--- a/doc/2.0/fate/components/linear_regression.md
+++ b/doc/2.0/fate/components/linear_regression.md
@@ -24,7 +24,7 @@ keys. The process of HeteroLinR training is shown below:
 
 ![Figure 1 (Federated HeteroLinR
-Principle)](../images/HeteroLinR.png)
+Principle)](../../images/HeteroLinR.png)
 
 A sample alignment process is conducted before training. The sample
 alignment process identifies overlapping samples in the databases of all
diff --git a/doc/2.0/fate/components/logistic_regression.md b/doc/2.0/fate/components/logistic_regression.md
index ad6984ff14..e2249aff0f 100644
--- a/doc/2.0/fate/components/logistic_regression.md
+++ b/doc/2.0/fate/components/logistic_regression.md
@@ -28,7 +28,7 @@ alignment process will **not** leak confidential information (e.g.,
 sample ids) on the two parties since it is conducted in an encrypted
 way.
 
-![Figure 1 (Federated HeteroLR Principle)](../images/HeteroLR.png)
+![Figure 1 (Federated HeteroLR Principle)](../../images/HeteroLR.png)
 
 In the training process, party A and party B compute the elements
 needed for the final gradients. The arbiter aggregates them and computes the
@@ -44,7 +44,7 @@ criterion. Since the arbiter can obtain the complete model weights, the
 convergence decision is made by the arbiter.
 
 ![Figure 2 (Federated Multi-host HeteroLR
-Principle)](../images/hetero_lr_multi_host.png)
+Principle)](../../images/hetero_lr_multi_host.png)
 
 # Heterogeneous SSHE Logistic Regression
 
@@ -57,12 +57,12 @@ We have also made some optimizations, so the code may not be exactly
 the same as in this paper.
 
 The training process can be described as the following forward and
 backward processes.
-![Figure 3 (forward)](../images/sshe-lr_forward.png)
-![Figure 4 (backward)](../images/sshe-lr_backward.png)
+![Figure 3 (forward)](../../images/sshe-lr_forward.png)
+![Figure 4 (backward)](../../images/sshe-lr_backward.png)
 
 The training process is based on the secure matrix multiplication
 protocol (SMM), which combines HE and secret sharing in a hybrid protocol.
 
-![Figure 5 (SMM)](../images/secure_matrix_multiplication.png)
+![Figure 5 (SMM)](../../images/secure_matrix_multiplication.png)
 
 ## Features
 
diff --git a/doc/2.0/fate/components/psi.md b/doc/2.0/fate/components/psi.md
index 53f65a2874..72057299ce 100644
--- a/doc/2.0/fate/components/psi.md
+++ b/doc/2.0/fate/components/psi.md
@@ -8,7 +8,7 @@ which offers 128 bits of security with a key size of 256 bits.
 
 Below is an illustration of ECDH intersection.
 
 ![Figure 1 (ECDH
-PSI)](../images/ecdh_intersection.png)
+PSI)](../../images/ecdh_intersection.png)
 
 For details on how to hash a value to a given curve, please refer
 [here](https://datatracker.ietf.org/doc/html/draft-irtf-cfrg-hash-to-curve-10#section-6.7.1).
diff --git a/doc/2.0/fate/ml/homo_nn_deepspeed_on_eggroll.md b/doc/2.0/fate/ml/homo_nn_deepspeed_on_eggroll.md
new file mode 100644
index 0000000000..e1439cdbf4
--- /dev/null
+++ b/doc/2.0/fate/ml/homo_nn_deepspeed_on_eggroll.md
@@ -0,0 +1,531 @@
+# Running Homo-NN with DeepSpeed on Eggroll
+
+Our latest Homo framework enables accelerated model training using DeepSpeed. In this tutorial, we'll guide you through the process of training a federated GPT-2 model within an Eggroll environment.
+
+
+## Prepare FATE Context
+
+In FATE-2.0, the running environment, including the party settings (guest, host, arbiter, and their party IDs), is configured using a context object. This object can be created with the create_context function. The following Python code illustrates how to set up the context:
+
+```python
+from fate.arch.federation import FederationEngine
+from fate.arch.computing import ComputingEngine
+from fate.arch.context import create_context
+
+def create_ctx(local_party, federation_session_id, computing_session_id):
+    return create_context(
+        local_party=local_party,
+        parties=[guest, host, arbiter],
+        federation_session_id=federation_session_id,
+        federation_engine=FederationEngine.OSX,
+        federation_conf={"federation.osx.host": "xxx", "federation.osx.port": xxx, "federation.osx.mode": "stream"},
+        computing_session_id=computing_session_id,
+        computing_engine=ComputingEngine.EGGROLL,
+        computing_conf={"computing.eggroll.host": "xxx",
+                        "computing.eggroll.port": xxx,
+                        "computing.eggroll.options": {'eggroll.session.processors.per.node': 4, 'nodes': 1},
+                        "computing.eggroll.config": None,
+                        "computing.eggroll.config_options": None,
+                        "computing.eggroll.config_properties_file": None
+                        }
+    )
+```
+
+In this example, creating a FATE context differs from other tutorials because the models are being trained in a distributed environment using multiple GPUs and an Eggroll backend.
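+
+For instance, the guest side of this tutorial would build its context roughly as follows (the party tuple and session-id convention are taken from the full script later in this document; the session id itself is a hypothetical value):
+
+```python
+# all parties must agree on the same federation session id
+federation_session_id = "fed_session_001"
+guest = ("guest", "9999")  # host and arbiter tuples are defined analogously
+
+# each party derives its own computing session id from the shared federation session id
+ctx = create_ctx(guest, federation_session_id, f"{federation_session_id}_{guest[0]}_{guest[1]}")
+```
+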
+You need to set the computing engine to EGGROLL and provide several configurations for the federation and Eggroll ports. The task will run on a machine with several GPUs.
+When you are using this example, remember to replace the hosts and ports with your own values.
+
+## Models and Dataset
+
+Before submitting the training job to Eggroll, where DeepSpeed operates, it's essential to prepare the models and datasets in the same way as for a regular neural network training job. The process begins with importing the necessary packages, including the FedAVGClient and FedAVGServer trainers and the GPT-2 classification model from transformers. Subsequently, a tokenizer dataset is developed to load and tokenize the dataset, as the code below shows.
+
+```python
+import pandas as pd
+import torch as t
+from transformers import AutoTokenizer
+import os
+import numpy as np
+import argparse
+from fate.ml.nn.homo.fedavg import FedAVGClient, FedArguments, TrainingArguments, FedAVGServer
+from fate.ml.nn.dataset.base import Dataset
+from transformers import GPT2ForSequenceClassification
+
+
+class TokenizerDataset(Dataset):
+
+    def __init__(
+            self,
+            truncation=True,
+            text_max_length=128,
+            tokenizer_name_or_path="bert-base-uncased",
+            return_label=True,
+            padding=True,
+            padding_side="right",
+            pad_token=None,
+            return_input_ids=True):
+
+        super(TokenizerDataset, self).__init__()
+        self.text = None
+        self.word_idx = None
+        self.label = None
+        self.tokenizer = None
+        self.sample_ids = None
+        self.padding = padding
+        self.truncation = truncation
+        self.max_length = text_max_length
+        self.with_label = return_label
+        self.tokenizer_name_or_path = tokenizer_name_or_path
+        self.tokenizer = AutoTokenizer.from_pretrained(
+            self.tokenizer_name_or_path)
+        self.tokenizer.padding_side = padding_side
+        self.return_input_ids = return_input_ids
+        if pad_token is not None:
+            self.tokenizer.add_special_tokens({'pad_token': pad_token})
+
+    def load(self, file_path):
+
+        tokenizer = self.tokenizer
+        self.text = pd.read_csv(file_path)
+        text_list = list(self.text.text)
+
+        self.word_idx = tokenizer(
+            text_list,
+            padding=self.padding,
+            return_tensors='pt',
+            truncation=self.truncation,
+            max_length=self.max_length)
+
+        if self.return_input_ids:
+            self.word_idx = self.word_idx['input_ids']
+
+        if self.with_label:
+            self.label = t.Tensor(self.text.label).detach().numpy()
+            self.label = self.label.reshape((len(self.text), -1))
+
+        if 'id' in self.text:
+            self.sample_ids = self.text['id'].values.tolist()
+
+    def get_classes(self):
+        return np.unique(self.label).tolist()
+
+    def get_vocab_size(self):
+        return self.tokenizer.vocab_size
+
+    def get_sample_ids(self):
+        return self.sample_ids
+
+    def __getitem__(self, item):
+
+        if self.return_input_ids:
+            ret = self.word_idx[item]
+        else:
+            ret = {k: v[item] for k, v in self.word_idx.items()}
+
+        if self.with_label:
+            ret.update(dict(labels=self.label[item]))
+            return ret
+
+        return ret
+
+    def __len__(self):
+        return len(self.text)
+
+    def __repr__(self):
+        return self.tokenizer.__repr__()
+```
+
+In this example we adopt the IMDB movie review dataset, which is a binary classification dataset. You can download it from [here](https://webank-ai-1251170195.cos.ap-guangzhou.myqcloud.com/fate/examples/data/IMDB.csv).
+
+To complete the setup for running federated learning tasks, we define the run_server() and run_client() functions, each tailored to a specific role within the federated learning setup.
+
+- Server Setup (the run_server() function):
+
+    The server (arbiter) initializes its context using the create_ctx function, incorporating the federation session ID and computing session ID.
+    A FedAVGServer trainer instance is created and launched into the training process. During this phase, the server automatically collects and aggregates model updates from all participating clients.
+
+- Client Setup (the run_client() function):
+
+    The client's role (guest or host) determines its local party setting.
+    Similar to the server, the client establishes its context with the relevant session IDs.
+    The client loads a pretrained GPT-2 model from the transformers library.
+    A TokenizerDataset instance is initialized and loads 'IMDB.csv'. For simplicity, the same dataset is used for both guest and host in this example.
+    The FedAVGClient trainer is instantiated with the model, federated arguments, training arguments, loss function, and tokenizer. We also define a DeepSpeed config for the trainer; during training, it automatically loads this config and uses DeepSpeed to accelerate the training process. Since our trainer is developed based on the transformers trainer, the trainer configuration code is nearly the same.
+    The training process is then initiated by calling the train function on the FedAVGClient trainer.
+
+
+- Key Considerations:
+
+    Consistent session IDs: it's critical to ensure that both the server and the clients use the same federation_session_id for successful communication and data exchange.
+
+
+```python
+
+deepspeed_config = {
+    "train_micro_batch_size_per_gpu": 16,
+    "train_batch_size": "auto",
+    "optimizer": {
+        "type": "Adam",
+        "params": {
+            "lr": 5e-4
+        }
+    },
+    "scheduler": {
+        "type": "WarmupLR",
+        "params": {
+            "warmup_min_lr": 0
+        }
+    },
+    "fp16": {
+        "enabled": False
+    },
+    "zero_optimization": {
+        "stage": 2,
+        "allgather_partitions": True,
+        "allgather_bucket_size": 5e8,
+        "overlap_comm": False,
+        "reduce_scatter": True,
+        "reduce_bucket_size": 5e8,
+        "contiguous_gradients": True,
+        "stage3_gather_16bit_weights_on_model_save": True,
+    }
+}
+
+def run_server():
+    federation_session_id = args.federation_session_id
+    computing_session_id = f"{federation_session_id}_{arbiter[0]}_{arbiter[1]}"
+    ctx = create_ctx(arbiter, federation_session_id, computing_session_id)
+    trainer = FedAVGServer(ctx)
+    trainer.train()
+
+
+def run_client():
+
+    if args.role == "guest":
+        local_party = guest
+        save_path = './guest_model'
+    else:
+        local_party = host
+        save_path = './host_model'
+    federation_session_id = args.federation_session_id
+    computing_session_id = f"{federation_session_id}_{local_party[0]}_{local_party[1]}"
+    ctx = create_ctx(local_party, federation_session_id, computing_session_id)
+
+    pretrained_path = "gpt2"
+    model = GPT2ForSequenceClassification.from_pretrained(pretrained_path, num_labels=1)
+    model.config.pad_token_id = model.config.eos_token_id
+    ds = TokenizerDataset(
+        tokenizer_name_or_path=pretrained_path,
+        text_max_length=128,
+        padding_side="left",
+        return_input_ids=False,
+        pad_token='<|endoftext|>'
+    )
+    ds.load("./IMDB.csv")
+
+    fed_args = FedArguments(aggregate_strategy="epoch", aggregate_freq=1, aggregator="secure_aggregate")
+    training_args = TrainingArguments(
+        num_train_epochs=5,
+        per_device_train_batch_size=16,
+        learning_rate=5e-4,
+        logging_strategy="steps",
+        logging_steps=5,
+        deepspeed=deepspeed_config,
+    )
+    trainer = FedAVGClient(
+        ctx=ctx,
+        model=model,
+        fed_args=fed_args,
+        training_args=training_args,
+        loss_fn=t.nn.BCELoss(),
train_set=ds, + tokenizer=ds.tokenizer + ) + trainer.train() + trainer.save_model(save_path) +``` + +## Full Script + +Here is the full script of this example: + +```python +import pandas as pd +import torch as t +from transformers import AutoTokenizer +import os +import numpy as np +import argparse +from fate.ml.nn.homo.fedavg import FedAVGClient, FedArguments, TrainingArguments, FedAVGServer +from transformers import GPT2ForSequenceClassification +from fate.arch.federation import FederationEngine +from fate.arch.computing import ComputingEngine +from fate.arch.context import create_context +from fate.ml.nn.dataset.base import Dataset + +# avoid tokenizer parallelism +os.environ["TOKENIZERS_PARALLELISM"] = "false" + + +class TokenizerDataset(Dataset): + + def __init__( + self, + truncation=True, + text_max_length=128, + tokenizer_name_or_path="bert-base-uncased", + return_label=True, + padding=True, + padding_side="right", + pad_token=None, + return_input_ids=True): + + super(TokenizerDataset, self).__init__() + self.text = None + self.word_idx = None + self.label = None + self.tokenizer = None + self.sample_ids = None + self.padding = padding + self.truncation = truncation + self.max_length = text_max_length + self.with_label = return_label + self.tokenizer_name_or_path = tokenizer_name_or_path + self.tokenizer = AutoTokenizer.from_pretrained( + self.tokenizer_name_or_path) + self.tokenizer.padding_side = padding_side + self.return_input_ids = return_input_ids + if pad_token is not None: + self.tokenizer.add_special_tokens({'pad_token': pad_token}) + + def load(self, file_path): + + tokenizer = self.tokenizer + self.text = pd.read_csv(file_path) + text_list = list(self.text.text) + + self.word_idx = tokenizer( + text_list, + padding=self.padding, + return_tensors='pt', + truncation=self.truncation, + max_length=self.max_length) + + if self.return_input_ids: + self.word_idx = self.word_idx['input_ids'] + + if self.with_label: + self.label = t.Tensor(self.text.label).detach().numpy() + self.label = self.label.reshape((len(self.text), -1)) + + if 'id' in self.text: + self.sample_ids = self.text['id'].values.tolist() + + def get_classes(self): + return np.unique(self.label).tolist() + + def get_vocab_size(self): + return self.tokenizer.vocab_size + + def get_sample_ids(self): + return self.sample_ids + + def __getitem__(self, item): + + if self.return_input_ids: + ret = self.word_idx[item] + else: + ret = {k: v[item] for k, v in self.word_idx.items()} + + if self.with_label: + ret.update(dict(labels=self.label[item])) + return ret + + return ret + + def __len__(self): + return len(self.text) + + def __repr__(self): + return self.tokenizer.__repr__() + + +def create_ctx(local_party, federation_session_id, computing_session_id): + return create_context( + local_party=local_party, + parties=[guest, host, arbiter], + federation_session_id=federation_session_id, + federation_engine=FederationEngine.OSX, + federation_conf={"federation.osx.host": "xxx", "federation.osx.port": xxx, "federation.osx.mode": "stream"}, + computing_session_id=computing_session_id, + computing_engine=ComputingEngine.EGGROLL, + computing_conf={"computing.eggroll.host": "xxx", + "computing.eggroll.port": xxx, + "computing.eggroll.options": {'eggroll.session.processors.per.node': 4, 'nodes': 1}, + "computing.eggroll.config": None, + "computing.eggroll.config_options": None, + "computing.eggroll.config_properties_file": None + } + ) + + +def run_server(): + federation_session_id = args.federation_session_id + 
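+    # each party derives a unique computing session id by appending its
+    # party tuple to the shared federation session id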
+    computing_session_id = f"{federation_session_id}_{arbiter[0]}_{arbiter[1]}"
+    ctx = create_ctx(arbiter, federation_session_id, computing_session_id)
+    trainer = FedAVGServer(ctx)
+    trainer.train()
+
+deepspeed_config = {
+    "train_micro_batch_size_per_gpu": 16,
+    "train_batch_size": "auto",
+    "optimizer": {
+        "type": "Adam",
+        "params": {
+            "lr": 5e-4
+        }
+    },
+    "scheduler": {
+        "type": "WarmupLR",
+        "params": {
+            "warmup_min_lr": 0
+        }
+    },
+    "fp16": {
+        "enabled": False
+    },
+    "zero_optimization": {
+        "stage": 2,
+        "allgather_partitions": True,
+        "allgather_bucket_size": 5e8,
+        "overlap_comm": False,
+        "reduce_scatter": True,
+        "reduce_bucket_size": 5e8,
+        "contiguous_gradients": True,
+        "stage3_gather_16bit_weights_on_model_save": True,
+    }
+}
+
+
+def run_client():
+    if args.role == "guest":
+        local_party = guest
+        save_path = './guest_model'
+    else:
+        local_party = host
+        save_path = './host_model'
+    federation_session_id = args.federation_session_id
+    computing_session_id = f"{federation_session_id}_{local_party[0]}_{local_party[1]}"
+    ctx = create_ctx(local_party, federation_session_id, computing_session_id)
+
+    pretrained_path = "gpt2"
+    model = GPT2ForSequenceClassification.from_pretrained(pretrained_path, num_labels=1)
+    model.config.pad_token_id = model.config.eos_token_id
+    ds = TokenizerDataset(
+        tokenizer_name_or_path=pretrained_path,
+        text_max_length=128,
+        padding_side="left",
+        return_input_ids=False,
+        pad_token='<|endoftext|>'
+    )
+    ds.load("./IMDB.csv")
+
+    fed_args = FedArguments(aggregate_strategy="epoch", aggregate_freq=1, aggregator="secure_aggregate")
+    training_args = TrainingArguments(
+        num_train_epochs=5,
+        per_device_train_batch_size=16,
+        learning_rate=5e-4,
+        logging_strategy="steps",
+        logging_steps=5,
+        deepspeed=deepspeed_config,
+    )
+    trainer = FedAVGClient(
+        ctx=ctx,
+        model=model,
+        fed_args=fed_args,
+        training_args=training_args,
+        loss_fn=t.nn.BCELoss(),
+        train_set=ds,
+        tokenizer=ds.tokenizer
+    )
+    trainer.train()
+    trainer.save_model(save_path)
+
+
+arbiter = ("arbiter", '9999')
+guest = ("guest", '9999')
+host = ("host", '9999')
+
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser()
+
+    parser.add_argument("--role", type=str)
+    parser.add_argument("--federation_session_id", type=str)
+
+    args = parser.parse_args()
+    if args.role == "arbiter":
+        run_server()
+    else:
+        run_client()
+```
+
+## Submit the Job
+
+Once the script is ready, we can submit it to Eggroll to start the training process. The following commands submit the job:
+
+```bash
+nohup eggroll task submit --script-path gpt2_ds.py --conf role=guest --conf federation_session_id=${federation_session_id} --num-gpus 1 > guest.log &
+nohup eggroll task submit --script-path gpt2_ds.py --conf role=host --conf federation_session_id=${federation_session_id} --num-gpus 1 > host.log &
+nohup python gpt2_ds.py --role arbiter --federation_session_id ${federation_session_id} > arbiter.log &
+```
+
+Below is the partial output of the guest process:
+
+```bash
+[2023-12-29 13:02:46,862] [INFO] [config.py:976:print]   zero_enabled ................. True
+[2023-12-29 13:02:46,862] [INFO] [config.py:976:print]   zero_force_ds_cpu_optimizer .. True
+[2023-12-29 13:02:46,862] [INFO] [config.py:976:print]   zero_optimization_stage ...... 
2
+[2023-12-29 13:02:46,863] [INFO] [config.py:962:print_user_config]   json = {
+    "train_micro_batch_size_per_gpu": 16,
+    "train_batch_size": 16,
+    "optimizer": {
+        "type": "Adam",
+        "params": {
+            "lr": 0.0005
+        }
+    },
+    "scheduler": {
+        "type": "WarmupLR",
+        "params": {
+            "warmup_min_lr": 0
+        }
+    },
+    "fp16": {
+        "enabled": false
+    },
+    "zero_optimization": {
+        "stage": 2,
+        "allgather_partitions": true,
+        "allgather_bucket_size": 5.000000e+08,
+        "overlap_comm": false,
+        "reduce_scatter": true,
+        "reduce_bucket_size": 5.000000e+08,
+        "contiguous_gradients": true,
+        "stage3_gather_16bit_weights_on_model_save": true
+    }
+}
+{'loss': 2.075, 'learning_rate': 0.0002329900014453396, 'epoch': 0.04}
+[2023-12-29 13:02:49,647] [INFO] [logging.py:96:log_dist] [Rank 0] step=10, skipped=0, lr=[0.0003333333333333334], mom=[(0.9, 0.999)]
+[2023-12-29 13:02:49,654] [INFO] [timer.py:260:stop] epoch=0/micro_step=10/global_step=10, RunningAvgSamplesPerSec=110.86936118523683, CurrSamplesPerSec=111.33670009091573, MemAllocated=1.87GB, MaxMemAllocated=6.12GB
+{'loss': 0.8091, 'learning_rate': 0.0003333333333333334, 'epoch': 0.08}
+{'loss': 0.5003, 'learning_rate': 0.00039203041968522714, 'epoch': 0.12}
+[2023-12-29 13:02:51,108] [INFO] [logging.py:96:log_dist] [Rank 0] step=20, skipped=0, lr=[0.0004336766652213271], mom=[(0.9, 0.999)]
+[2023-12-29 13:02:51,116] [INFO] [timer.py:260:stop] epoch=0/micro_step=20/global_step=20, RunningAvgSamplesPerSec=111.0120624893855, CurrSamplesPerSec=111.13499047776766, MemAllocated=1.87GB, MaxMemAllocated=6.12GB
+{'loss': 0.466, 'learning_rate': 0.0004336766652213271, 'epoch': 0.16}
+{'loss': 0.4338, 'learning_rate': 0.0004659800028906792, 'epoch': 0.2}
+[2023-12-29 13:02:52,576] [INFO] [logging.py:96:log_dist] [Rank 0] step=30, skipped=0, lr=[0.0004923737515732208], mom=[(0.9, 0.999)]
+[2023-12-29 13:02:52,584] [INFO] [timer.py:260:stop] epoch=0/micro_step=30/global_step=30, RunningAvgSamplesPerSec=110.88921855143799, CurrSamplesPerSec=110.59852763526753, MemAllocated=1.87GB, MaxMemAllocated=6.12GB
+{'loss': 0.319, 'learning_rate': 0.0004923737515732208, 'epoch': 0.24}
+{'loss': 0.3094, 'learning_rate': 0.0005146893481167587, 'epoch': 0.28}
+[2023-12-29 13:02:54,048] [INFO] [logging.py:96:log_dist] [Rank 0] step=40, skipped=0, lr=[0.0005340199971093208], mom=[(0.9, 0.999)]
+[2023-12-29 13:02:54,055] [INFO] [timer.py:260:stop] epoch=0/micro_step=40/global_step=40, RunningAvgSamplesPerSec=110.76942622083511, CurrSamplesPerSec=109.53330286609649, MemAllocated=1.87GB, MaxMemAllocated=6.12GB
+```
+
+We can see that the training is running correctly.
diff --git a/doc/2.0/fate/ml/homo_nn_tutorial.md b/doc/2.0/fate/ml/homo_nn_tutorial.md
index 2510d97cb5..767c0dfd4a 100644
--- a/doc/2.0/fate/ml/homo_nn_tutorial.md
+++ b/doc/2.0/fate/ml/homo_nn_tutorial.md
@@ -1,7 +1,5 @@
 # Homo-NN Tutorial
 
-The Homo(horizontal) federated learning allows parties to collaboratively train a neural network model without sharing their actual data. In a horizontally partitioned data setting, multiple parties have the same feature set but different user samples. In this scenario, each party trains the model locally on its own subset of data and only shares the model updates.
-
 The Homo-NN framework is designed for horizontal federated neural network training. In this tutorial, we demonstrate how to run a Homo-NN task under FATE-2.0 locally without using a Pipeline and provide several demos to show you how to integrate linear models, image models, language models in federated learning.
These setting are particularly useful for local experimentation, model/training setting modifications and testing. diff --git a/doc/2.0/images/federated_inference.png b/doc/2.0/images/federated_inference.png new file mode 100644 index 0000000000..cadfc1479d Binary files /dev/null and b/doc/2.0/images/federated_inference.png differ diff --git a/doc/2.0/images/fedpass_0.png b/doc/2.0/images/fedpass_0.png new file mode 100644 index 0000000000..2821eacb34 Binary files /dev/null and b/doc/2.0/images/fedpass_0.png differ diff --git a/doc/2.0/images/fedpass_1.png b/doc/2.0/images/fedpass_1.png new file mode 100644 index 0000000000..fd4b3fb5a0 Binary files /dev/null and b/doc/2.0/images/fedpass_1.png differ diff --git a/doc/2.0/images/hetero_nn.png b/doc/2.0/images/hetero_nn.png new file mode 100644 index 0000000000..ca77821cee Binary files /dev/null and b/doc/2.0/images/hetero_nn.png differ diff --git a/doc/2.0/images/homo_nn.png b/doc/2.0/images/homo_nn.png new file mode 100644 index 0000000000..61931ff1c3 Binary files /dev/null and b/doc/2.0/images/homo_nn.png differ diff --git a/doc/2.0/images/secureboost.png b/doc/2.0/images/secureboost.png new file mode 100644 index 0000000000..b3293dd3f9 Binary files /dev/null and b/doc/2.0/images/secureboost.png differ diff --git a/doc/2.0/images/split_finding.png b/doc/2.0/images/split_finding.png new file mode 100644 index 0000000000..984cc99b1f Binary files /dev/null and b/doc/2.0/images/split_finding.png differ diff --git a/doc/2.0/images/sshe.png b/doc/2.0/images/sshe.png new file mode 100644 index 0000000000..2ee0cc6fe6 Binary files /dev/null and b/doc/2.0/images/sshe.png differ diff --git a/doc/2.0/images/tree_structure.png b/doc/2.0/images/tree_structure.png new file mode 100644 index 0000000000..f53f9cd79d Binary files /dev/null and b/doc/2.0/images/tree_structure.png differ diff --git a/eggroll b/eggroll index 0285c2b9e0..59a4144c4c 160000 --- a/eggroll +++ b/eggroll @@ -1 +1 @@ -Subproject commit 0285c2b9e072056c9d59c713ee4ff1e15f02785a +Subproject commit 59a4144c4c8873caa86e8007b78dec28e44b513a diff --git a/examples/benchmark_quality/hetero_nn/breast_config.yaml b/examples/benchmark_quality/hetero_nn/breast_config.yaml new file mode 100644 index 0000000000..3aab71b40d --- /dev/null +++ b/examples/benchmark_quality/hetero_nn/breast_config.yaml @@ -0,0 +1,18 @@ +guest_model: + bottom: [10, 8] + agg_layer: [8, 8] + top: [8, 1] +host_model: + bottom: [20, 8] + agg_layer: [8, 8] + top: [8, 1] +data_guest: "examples/data/breast_hetero_guest.csv" +data_host: "examples/data/breast_hetero_host.csv" +guest_table: "breast_hetero_guest" +host_table: "breast_hetero_host" +lr: 0.01 +epochs: 50 +batch_size: 256 +is_binary: True +label_name: 'y' +id_name: 'id' \ No newline at end of file diff --git a/examples/benchmark_quality/hetero_nn/epsilon_5k.yaml b/examples/benchmark_quality/hetero_nn/epsilon_5k.yaml new file mode 100644 index 0000000000..8238c61db5 --- /dev/null +++ b/examples/benchmark_quality/hetero_nn/epsilon_5k.yaml @@ -0,0 +1,18 @@ +guest_model: + bottom: [20, 8] + agg_layer: [8, 8] + top: [8, 1] +host_model: + bottom: [80, 16] + agg_layer: [16, 8] + top: [8, 1] +data_guest: "examples/data/epsilon_5k_hetero_guest.csv" +data_host: "examples/data/epsilon_5k_hetero_host.csv" +guest_table: "epsilon_5k_hetero_guest" +host_table: "epsilon_5k_hetero_host" +lr: 0.01 +epochs: 50 +batch_size: 1024 +is_binary: True +label_name: 'y' +id_name: 'id' \ No newline at end of file diff --git a/examples/benchmark_quality/hetero_nn/hetero_nn.py 
b/examples/benchmark_quality/hetero_nn/hetero_nn.py new file mode 100644 index 0000000000..6477e4d047 --- /dev/null +++ b/examples/benchmark_quality/hetero_nn/hetero_nn.py @@ -0,0 +1,139 @@ +# +# Copyright 2019 The FATE Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# +import argparse +from fate_test.utils import parse_summary_result +from fate_client.pipeline.utils import test_utils +from fate_client.pipeline import FateFlowPipeline +from fate_client.pipeline.components.fate.nn.torch import nn, optim +from fate_client.pipeline.components.fate.nn.torch.base import Sequential +from fate_client.pipeline.components.fate.hetero_nn import HeteroNN, get_config_of_default_runner +from fate_client.pipeline.components.fate.reader import Reader +from fate_client.pipeline.components.fate.psi import PSI +from fate_client.pipeline.components.fate.nn.algo_params import TrainingArguments, SSHEArgument +from fate_client.pipeline.components.fate import Evaluation + + + +def main(config="../../config.yaml", param="./breast_config.yaml", namespace=""): + + # obtain config + if isinstance(config, str): + config = test_utils.load_job_config(config) + parties = config.parties + guest = parties.guest[0] + host = parties.host[0] + + if isinstance(param, str): + param = test_utils.JobConfig.load_from_file(param) + + + pipeline = FateFlowPipeline().set_parties(guest=guest, host=host) + + reader_0 = Reader("reader_0", runtime_parties=dict(guest=guest, host=host)) + reader_0.guest.task_parameters( + namespace=f"experiment{namespace}", + name=param['guest_table'] + ) + reader_0.hosts[0].task_parameters( + namespace=f"experiment{namespace}", + name=param['host_table'] + ) + psi_0 = PSI("psi_0", input_data=reader_0.outputs["output_data"]) + + training_args = TrainingArguments( + num_train_epochs=int(param['epochs']), + per_device_train_batch_size=int(param['batch_size']), + logging_strategy='epoch', + ) + + guest_top = param['guest_model']['top'] + guest_bottom = param['guest_model']['bottom'] + host_bottom = param['host_model']['bottom'] + agg_layer_guest = param['guest_model']['agg_layer'] + agg_layer_host = param['host_model']['agg_layer'] + lr = param['lr'] + + guest_conf = get_config_of_default_runner( + bottom_model=nn.Linear(guest_bottom[0], guest_bottom[1]), + top_model=Sequential( + nn.Linear(guest_top[0], guest_top[1]), + nn.Sigmoid() + ), + training_args=training_args, + optimizer=optim.Adam(lr=lr), + loss=nn.BCELoss(), + agglayer_arg=SSHEArgument( + guest_in_features=agg_layer_guest[0], + host_in_features=agg_layer_host[0], + out_features=agg_layer_guest[1] + ) + ) + + host_conf = get_config_of_default_runner( + bottom_model=nn.Linear(host_bottom[0], host_bottom[1]), + optimizer=optim.Adam(lr=lr), + training_args=training_args, + agglayer_arg=SSHEArgument( + guest_in_features=agg_layer_guest[0], + host_in_features=agg_layer_host[0], + out_features=agg_layer_host[1] + ) + ) + + hetero_nn_0 = HeteroNN( + 'hetero_nn_0', + train_data=psi_0.outputs['output_data'], 
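+        # the same PSI-aligned output is reused as validation data in this benchmark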
validate_data=psi_0.outputs['output_data'] + ) + + hetero_nn_0.guest.task_parameters(runner_conf=guest_conf) + hetero_nn_0.hosts[0].task_parameters(runner_conf=host_conf) + + hetero_nn_1 = HeteroNN( + 'hetero_nn_1', + test_data=psi_0.outputs['output_data'], + predict_model_input=hetero_nn_0.outputs['train_model_output'] + ) + + if param['is_binary']: + metrics = ['auc'] + else: + metrics = ['accuracy'] + + evaluation_0 = Evaluation( + 'eval_0', + runtime_parties=dict(guest=guest), + metrics=metrics, + input_data=[hetero_nn_0.outputs['train_data_output'], hetero_nn_1.outputs['predict_data_output']] + ) + + pipeline.add_tasks([reader_0, psi_0, hetero_nn_0, hetero_nn_1, evaluation_0]) + pipeline.compile() + pipeline.fit() + result_summary = parse_summary_result(pipeline.get_task_info("eval_0").get_output_metric()[0]["data"]) + print(f"result_summary: {result_summary}") + data_summary = {} + + return data_summary, result_summary + +if __name__ == "__main__": + + parser = argparse.ArgumentParser("BENCHMARK-QUALITY PIPELINE JOB") + parser.add_argument("-c", "--config", type=str, + help="config file", default="../../config.yaml") + parser.add_argument("-p", "--param", type=str, + help="config file for params", default="./breast_config.yaml") + args = parser.parse_args() + main(args.config, args.param) diff --git a/examples/benchmark_quality/hetero_nn/hetero_nn_benchmark.yaml b/examples/benchmark_quality/hetero_nn/hetero_nn_benchmark.yaml new file mode 100644 index 0000000000..45069bf684 --- /dev/null +++ b/examples/benchmark_quality/hetero_nn/hetero_nn_benchmark.yaml @@ -0,0 +1,89 @@ +data: + - file: "examples/data/breast_hetero_guest.csv" + meta: + delimiter: "," + dtype: float64 + input_format: dense + label_type: int64 + label_name: y + match_id_name: "id" + match_id_range: 0 + tag_value_delimiter: ":" + tag_with_value: false + weight_type: float64 + partitions: 4 + head: true + extend_sid: true + table_name: breast_hetero_guest + namespace: experiment + role: guest_0 + - file: "examples/data/breast_hetero_host.csv" + meta: + delimiter: "," + dtype: float64 + input_format: dense + match_id_name: "id" + match_id_range: 0 + tag_value_delimiter: ":" + tag_with_value: false + weight_type: float64 + partitions: 4 + head: true + extend_sid: true + table_name: breast_hetero_host + namespace: experiment + role: host_0 + - file: "examples/data/epsilon_5k_hetero_guest.csv" + meta: + delimiter: "," + dtype: float64 + input_format: dense + match_id_name: "id" + match_id_range: 0 + label_type: int64 + label_name: y + tag_value_delimiter: ":" + tag_with_value: false + weight_type: float64 + partitions: 4 + head: true + extend_sid: true + table_name: epsilon_5k_hetero_guest + namespace: experiment + role: guest_0 + - file: "examples/data/epsilon_5k_hetero_host.csv" + meta: + delimiter: "," + dtype: float64 + input_format: dense + match_id_name: "id" + match_id_range: 0 + tag_value_delimiter: ":" + tag_with_value: false + weight_type: float64 + partitions: 4 + head: true + extend_sid: true + table_name: epsilon_5k_hetero_host + namespace: experiment + role: host_0 +hetero_nn_sshe_binary_0: + local: + script: "./nn.py" + conf: "./breast_config.yaml" + FATE-hetero-nn: + script: "./hetero_nn.py" + conf: "./breast_config.yaml" + compare_setting: + relative_tol: 0.05 +hetero_nn_sshe_binary_1: + local: + script: "./nn.py" + conf: "./epsilon_5k.yaml" + FATE-hetero-nn: + script: "./hetero_nn.py" + conf: "./epsilon_5k.yaml" + compare_setting: + relative_tol: 0.05 + + diff --git 
a/examples/benchmark_quality/hetero_nn/nn.py b/examples/benchmark_quality/hetero_nn/nn.py new file mode 100644 index 0000000000..10b588c433 --- /dev/null +++ b/examples/benchmark_quality/hetero_nn/nn.py @@ -0,0 +1,116 @@ +import argparse +import os +import torch as t +import pandas as pd +import math +from sklearn.metrics import roc_auc_score, precision_score, accuracy_score, recall_score, roc_curve, mean_absolute_error, mean_squared_error +from fate_client.pipeline.utils.test_utils import JobConfig + + +class FakeNNModel(t.nn.Module): + + def __init__(self, guest_bottom, host_bottom, guest_top, agg_layer_guest, agg_layer_host, is_binary=True): + super(FakeNNModel, self).__init__() + self.guest_bottom = t.nn.Linear(guest_bottom[0], guest_bottom[1]) + self.host_bottom = t.nn.Linear(host_bottom[0], host_bottom[1]) + self.guest_top = t.nn.Linear(guest_top[0], guest_top[1]) + self.agg_layer_guest = t.nn.Linear(agg_layer_guest[0], agg_layer_guest[1]) + self.agg_layer_host = t.nn.Linear(agg_layer_host[0], agg_layer_host[1]) + self.is_binary = is_binary + + def forward(self, x_g, x_h): + x_g = self.guest_bottom(x_g) + x_h = self.host_bottom(x_h) + x_g = self.agg_layer_guest(x_g) + x_h = self.agg_layer_host(x_h) + x = x_g + x_h + x = self.guest_top(x) + if self.is_binary: + x = t.nn.Sigmoid()(x) + return x + + + +def main(config="../../config.yaml", param="./default_credit_config.yaml"): + + if isinstance(param, str): + param = JobConfig.load_from_file(param) + print('param is {}'.format(param)) + + if isinstance(config, str): + config = JobConfig.load_from_file(config) + print(f"config: {config}") + data_base_dir = config["data_base_dir"] + else: + data_base_dir = config.data_base_dir + + data_dir_path = data_base_dir + '/' + + model = FakeNNModel(param['guest_model']['bottom'], param['host_model']['bottom'], param['guest_model']['top'], + param['guest_model']['agg_layer'], param['host_model']['agg_layer'], param['is_binary']) + guest_data_path = param['data_guest'] + host_data_path = param['data_host'] + id_name = param['id_name'] + label_name = param['label_name'] + + df_g = pd.read_csv(data_dir_path + guest_data_path) + df_g = df_g.drop(columns=[id_name]) + label = df_g[label_name].values.reshape(-1, 1) + df_g = df_g.drop(columns=[label_name]) + df_h = pd.read_csv(data_dir_path + host_data_path) + df_h = df_h.drop(columns=[id_name]) + + # tensor dataset + dataset = t.utils.data.TensorDataset(t.tensor(df_g.values).float(), t.tensor(df_h.values).float(), + t.tensor(label).float()) + + epoch = param['epochs'] + batch_size = param['batch_size'] + lr = param['lr'] + + optimizer = t.optim.Adam(model.parameters(), lr=lr) + data_loader = t.utils.data.DataLoader(dataset, batch_size=batch_size, shuffle=False) + + for i in range(epoch): + + loss_sum = 0 + for x_g, x_h, label in data_loader: + optimizer.zero_grad() + output = model(x_g, x_h) + loss = t.nn.BCELoss()(output, label) + loss.backward() + optimizer.step() + loss_sum += loss.item() + + print(f'epoch {i} loss: {loss_sum}') + + # predict + y_prob = [] + true_label = [] + for x_g, x_h, label in data_loader: + output = model(x_g, x_h) + y_prob.append(output.detach().numpy()) + true_label.append(label.detach().numpy()) + + if param['is_binary']: + # compute auc + import numpy as np + from sklearn.metrics import roc_auc_score + y_prob = np.concatenate(y_prob) + y_prob = y_prob.reshape(-1) + y_true = np.concatenate(true_label) + auc_score = roc_auc_score(y_true, y_prob) + print(f'auc score: {auc_score}') + return {}, {'auc': auc_score} + else: + pass + 
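+        # NOTE: the multiclass branch is left as a stub above; a hypothetical
+        # sketch analogous to the binary branch (not part of the original script):
+        #   y_pred = np.concatenate(y_prob).argmax(axis=-1)
+        #   y_true = np.concatenate(true_label).reshape(-1)
+        #   return {}, {'accuracy': accuracy_score(y_true, y_pred)}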
+
+if __name__ == '__main__':
+    parser = argparse.ArgumentParser("BENCHMARK-QUALITY NN JOB")
+    parser.add_argument("-c", "--config", type=str,
+                        help="config file", default="../../config.yaml")
+    parser.add_argument("-p", "--param", type=str,
+                        help="config file for params", default="./breast_config.yaml")
+    args = parser.parse_args()
+    main(args.config, args.param)
\ No newline at end of file