d294270681/tbnet

TB-Net is a knowledge graph based explainable recommender system.

Paper: Shendi Wang, Haoyang Li, Xiao-Hui Li, Caleb Chen Cao, Lei Chen. Tower Bridge Net (TB-Net): Bidirectional Knowledge Graph Aware Embedding Propagation for Explainable Recommender Systems

TB-Net builds subgraphs of the knowledge graph from user-item interactions and item features, then computes paths in those subgraphs with a bidirectional conduction algorithm. The resulting paths yield explainable recommendation results.

The user-game interaction data and the game feature data from the game platform Steam are publicly available on Kaggle.

Dataset directory: ./data/{DATASET}/, e.g. ./data/steam/.

  • train: train.csv, evaluation: test.csv

Each line contains a <user>, an <item>, the user-item <rating> (1 or 0), and PER_ITEM_NUM_PATHS paths between the item and the user's <hist_item> (a <hist_item> is an item whose user-item <rating> in the historical data is 1).

#format:user,item,rating,relation1,entity,relation2,hist_item,relation1,entity,relation2,hist_item,...,relation1,entity,relation2,hist_item  # module [relation1,entity,relation2,hist_item] repeats PER_ITEM_NUM_PATHS times
  • infer and explain: infer.csv

Each line contains the <user> and <item> to be inferred, a <rating>, and PER_ITEM_NUM_PATHS paths between the item and the user's <hist_item> (a <hist_item> is an item whose user-item <rating> in the historical data is 1). Note that <item> must range over the candidate items in the dataset (all items by default). <rating> can be assigned arbitrarily (0 by default) because it is not used in the inference and explanation phases.

#format:user,item,rating,relation1,entity,relation2,hist_item,relation1,entity,relation2,hist_item,...,relation1,entity,relation2,hist_item  # module [relation1,entity,relation2,hist_item] repeats PER_ITEM_NUM_PATHS times
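The row layout above can be unpacked mechanically. The following is a minimal sketch rather than the code in src/dataset.py: the names `Path`, `Sample` and `parse_row` are hypothetical, and it assumes that rows carry no header and that every field can be kept as a string except `rating`.

```python
from typing import List, NamedTuple


class Path(NamedTuple):
    """One [relation1, entity, relation2, hist_item] group."""
    relation1: str
    entity: str
    relation2: str
    hist_item: str


class Sample(NamedTuple):
    user: str
    item: str
    rating: int
    paths: List[Path]


def parse_row(fields: List[str], per_item_num_paths: int) -> Sample:
    """Split one CSV row into (user, item, rating) plus its paths."""
    expected = 3 + 4 * per_item_num_paths
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}")
    user, item, rating = fields[0], fields[1], int(fields[2])
    # Fields after the first three come in consecutive groups of four.
    paths = [Path(*fields[i:i + 4]) for i in range(3, expected, 4)]
    return Sample(user, item, rating, paths)


# Made-up row with PER_ITEM_NUM_PATHS = 2.
sample = parse_row("7,42,1,3,100,3,8,5,200,5,9".split(","), 2)
print(sample.paths[1].hist_item)
```

The length check makes a mismatch between the file and the configured PER_ITEM_NUM_PATHS fail early instead of silently producing shifted paths.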

Download the data package and extract it under the current project path.

wget https://mindspore-website.obs.myhuaweicloud.com/notebook/datasets/xai/tbnet_data.tar.gz
tar -xf tbnet_data.tar.gz

After installing MindSpore via the official website, you can start training and evaluation as follows:

  • Data preprocessing

Download the data package (e.g. the 'steam' dataset) and put it under the current project path.

wget https://mindspore-website.obs.myhuaweicloud.com/notebook/datasets/xai/tbnet_data.tar.gz
tar -xf tbnet_data.tar.gz
cd scripts

Then run the following commands.

  • Training
bash run_train.sh [DATA_NAME] [DEVICE_ID] [DEVICE_TARGET]

Example:

bash run_train.sh steam 0 Ascend
  • Evaluation

Evaluate the model on the test dataset.

bash run_eval.sh [CHECKPOINT_ID] [DATA_NAME] [DEVICE_ID] [DEVICE_TARGET]

Argument [CHECKPOINT_ID] is required.

Example:

bash run_eval.sh 19 steam 0 Ascend
  • Inference and Explanation

Recommend items to the user given by --user; the number of recommended items is set by --items.

python infer.py \
  --dataset [DATASET] \
  --checkpoint_id [CHECKPOINT_ID] \
  --user [USER] \
  --items [ITEMS] \
  --explanations [EXPLANATIONS] \
  --csv [CSV]

Arguments --checkpoint_id and --user are required.

Example:

python infer.py \
  --dataset steam \
  --checkpoint_id 19 \
  --user 2 \
  --items 1 \
  --explanations 3 \
  --csv test.csv
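Conceptually, --items caps how many of the scored candidate items are returned. A minimal sketch of that selection step follows, assuming each candidate item already has a predicted score; the function name `top_items` and the score dictionary are hypothetical and are not taken from infer.py.

```python
import heapq
from typing import Dict, List, Tuple


def top_items(scores: Dict[str, float], num_items: int) -> List[Tuple[str, float]]:
    """Return the num_items highest-scoring (item, score) pairs, best first."""
    return heapq.nlargest(num_items, scores.items(), key=lambda kv: kv[1])


# Made-up scores for three candidate items.
print(top_items({"a": 0.2, "b": 0.9, "c": 0.5}, 2))
```

`heapq.nlargest` avoids sorting the full candidate list when only a few recommendations are requested.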
.
└─tbnet
  ├─README.md
  ├─scripts
    ├─run_infer_310.sh              # Ascend 310 inference script
    ├─run_train.sh                  # training script
    └─run_eval.sh                   # evaluation script
  ├─data
    ├─steam
      ├─config.json                 # data and training parameter configuration
      ├─infer.csv                   # inference and explanation dataset
      ├─test.csv                    # evaluation dataset
      ├─train.csv                   # training dataset
      └─trainslate.json             # explanation configuration
  ├─src
    ├─aggregator.py                 # inference result aggregation
    ├─config.py                     # parsing parameter configuration
    ├─dataset.py                    # generate dataset
    ├─embedding.py                  # 3-dim embedding matrix initialization
    ├─metrics.py                    # model metrics
    ├─steam.py                      # 'steam' dataset text explainer
    └─tbnet.py                      # TB-Net model
  ├─export.py                       # export mindir script
  ├─preprocess_dataset.py           # dataset preprocess script
  ├─preprocess.py                   # inference data preprocess script
  ├─postprocess.py                  # inference result calculation script
  ├─eval.py                         # evaluation
  ├─infer.py                        # inference and explanation
  └─train.py                        # training
  • preprocess.py parameters
--dataset         'steam' dataset is supported currently
--device_target   run code on GPU or Ascend NPU
--same_relation   only generate paths in which relation1 is the same as relation2
  • train.py parameters
--dataset         'steam' dataset is supported currently
--train_csv       the train csv datafile inside the dataset folder
--test_csv        the test csv datafile inside the dataset folder
--device_id       device id
--epochs          number of training epochs
--device_target   run code on GPU or Ascend NPU
--run_mode        run in GRAPH mode or PYNATIVE mode
  • eval.py parameters
--dataset         'steam' dataset is supported currently
--csv             the csv datafile inside the dataset folder (e.g. test.csv)
--checkpoint_id   id of the checkpoint (.ckpt) file to evaluate
--device_id       device id
--device_target   run code on GPU or Ascend NPU
--run_mode        run in GRAPH mode or PYNATIVE mode
  • infer.py parameters
--dataset         'steam' dataset is supported currently
--csv             the csv datafile inside the dataset folder (e.g. infer.csv)
--checkpoint_id   id of the checkpoint (.ckpt) file to use for inference
--user            id of the user to recommend items to
--items           number of items to recommend
--reasons         number of recommendation reasons to show
--device_id       device id
--device_target   run code on GPU or Ascend NPU
--run_mode        run in GRAPH mode or PYNATIVE mode
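For orientation, the infer.py parameter list could be declared with `argparse` roughly as follows. This is a sketch and not the actual parser in infer.py: the function name `build_infer_parser`, all types, defaults and the spelling of the `--run_mode` choices are assumptions. It follows the list above, which names the flag `--reasons`, whereas the earlier example command uses `--explanations`.

```python
import argparse


def build_infer_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented infer.py parameters."""
    parser = argparse.ArgumentParser(description="TB-Net inference and explanation (sketch)")
    parser.add_argument("--dataset", default="steam", help="dataset folder name")
    parser.add_argument("--csv", default="infer.csv", help="csv file inside the dataset folder")
    parser.add_argument("--checkpoint_id", type=int, required=True, help="checkpoint to load")
    parser.add_argument("--user", type=int, required=True, help="user to recommend items to")
    parser.add_argument("--items", type=int, default=1, help="number of items to recommend")
    parser.add_argument("--reasons", type=int, default=3, help="number of reasons to show")
    parser.add_argument("--device_id", type=int, default=0, help="device id")
    # Assumed choice spellings; the README only says "GPU or Ascend NPU".
    parser.add_argument("--device_target", choices=["GPU", "Ascend"], default="GPU")
    # Assumed choice spellings; the README only says "GRAPH mode or PYNATIVE mode".
    parser.add_argument("--run_mode", choices=["graph", "pynative"], default="graph")
    return parser


args = build_infer_parser().parse_args(["--checkpoint_id", "19", "--user", "2"])
print(args.dataset, args.checkpoint_id, args.user, args.items)
```

Marking `--checkpoint_id` and `--user` as required matches the statement that these two arguments must be supplied.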
python export.py --config_path [CONFIG_PATH] --checkpoint_path [CKPT_PATH] --device_target [DEVICE] --file_name [FILE_NAME] --file_format [FILE_FORMAT]
  • CKPT_PATH is required.
  • CONFIG_PATH is the path to the config.json file, which holds the data and training parameter configuration.
  • DEVICE should be in ['Ascend', 'GPU'].
  • FILE_FORMAT should be in ['MINDIR', 'AIR'].

Example:

python export.py \
  --config_path ./data/steam/config.json \
  --checkpoint_path ./checkpoints/tbnet_epoch19.ckpt \
  --device_target Ascend \
  --file_name model \
  --file_format MINDIR

Before performing inference, export the MINDIR file with the export.py script. Only an example of inference with a MINDIR model is provided.

# Ascend310 inference
bash run_infer_310.sh [MINDIR_PATH] [DATA_PATH] [DEVICE_ID]
  • MINDIR_PATH specifies the path of the MINDIR model to use.
  • DATA_PATH specifies the path of test.csv.
  • DEVICE_ID is optional, default value is 0.

Example:

bash run_infer_310.sh ../model.mindir ../data/steam/test.csv 0

The inference result is saved in the current path. The acc.log file contains a result like this:

auc: 0.8251359368836292
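The `auc` value is the area under the ROC curve: the probability that a randomly chosen positive sample is scored higher than a randomly chosen negative one. As an illustration of what the number measures (not the code in postprocess.py or src/metrics.py), here is a rank-based sketch in plain Python; the function name `roc_auc` is hypothetical, and labels are assumed to be the integers 0 and 1.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC; tied scores share their average rank."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    rank_sum_pos = 0.0
    n_pos = 0
    i = 0
    while i < n:
        # Find the run of equal scores starting at position i.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        pos_in_group = sum(label for _, label in pairs[i:j])
        rank_sum_pos += avg_rank * pos_in_group
        n_pos += pos_in_group
        i = j
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")
    # Mann-Whitney U statistic of the positives, normalised to [0, 1].
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


# Made-up labels and scores.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

A value of 0.5 corresponds to random ranking and 1.0 to a ranking that places every positive above every negative.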

Training Performance

| Parameters | GPU | Ascend NPU |
| --- | --- | --- |
| Model Version | TB-Net | TB-Net |
| Resource | Tesla V100-SXM2-32GB | Ascend 910 |
| Uploaded Date | 2021-08-01 | 2022-06-30 |
| MindSpore Version | 1.3.0 | 1.5.1 |
| Dataset | steam | steam |
| Training Parameter | epoch=20, batch_size=1024, lr=0.001 | epoch=20, batch_size=1024, lr=0.001 |
| Optimizer | Adam | Adam |
| Loss Function | Sigmoid Cross Entropy | Sigmoid Cross Entropy |
| Outputs | AUC=0.8596, Accuracy=0.7761 | AUC=0.8592, Accuracy=0.7741 |
| Loss | 0.57 | 0.59 |
| Speed | 1pc: 90ms/step | 1pc: 80ms/step |
| Total Time | 1pc: 297s | 1pc: 336s |
| Checkpoint for Fine Tuning | 104.66M (.ckpt file) | 671K (.ckpt file) |

Scripts: TB-Net scripts
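The loss function listed for training is sigmoid cross entropy. As a reminder of what that computes, here is a plain-Python sketch of the numerically stable form, averaged over a batch. It is not the MindSpore operator used in train.py; the function name `sigmoid_cross_entropy` and the mean reduction are assumptions.

```python
import math
from typing import Sequence


def sigmoid_cross_entropy(logits: Sequence[float], labels: Sequence[float]) -> float:
    """Mean of -[y*log(sigmoid(x)) + (1-y)*log(1-sigmoid(x))] over the batch."""
    if len(logits) != len(labels) or not logits:
        raise ValueError("logits and labels must be non-empty and of equal length")
    total = 0.0
    for x, y in zip(logits, labels):
        # Equivalent stable form: max(x, 0) - x*y + log(1 + exp(-|x|)).
        total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
    return total / len(logits)


# A logit of 0 means a predicted probability of 0.5, so the loss is log(2).
print(sigmoid_cross_entropy([0.0], [1.0]))
```

The rewritten form never exponentiates a large positive number, so it stays finite for extreme logits.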

Evaluation Performance

| Parameters | GPU | Ascend NPU |
| --- | --- | --- |
| Model Version | TB-Net | TB-Net |
| Resource | Tesla V100-SXM2-32GB | Ascend 910 |
| Uploaded Date | 2021-08-01 | 2022-06-30 |
| MindSpore Version | 1.3.0 | 1.5.1 |
| Dataset | steam | steam |
| Batch Size | 1024 | 1024 |
| Outputs | AUC=0.8252, Accuracy=0.7503 | AUC=0.8486, Accuracy=0.7704 |
| Total Time | 1pc: 5.7s | 1pc: 1.1s |

Inference and Explanation Performance

| Parameters | GPU |
| --- | --- |
| Model Version | TB-Net |
| Resource | Tesla V100-SXM2-32GB |
| Uploaded Date | 2021-08-01 |
| MindSpore Version | 1.3.0 |
| Dataset | steam |
| Outputs | Recommendation Result and Explanation |
| Total Time | 1pc: 3.66s |
  • Initialization of embedding matrix in tbnet.py and embedding.py.

Please check the official homepage.
