LGAR: Zero-Shot LLM-Guided Neural Ranking for Abstract Screening in Systematic Literature Reviews

This repo contains the code used for the paper [TODO: link].

Table of Contents

  • Requirements
  • Required Data
  • First Stage of Ranking (LLM Ranker)
  • Re-Ranking Results of LLMs
  • Evaluation
  • Dense Ranker Only

Requirements

Install the requirements listed in requirements.txt, e.g., using a conda environment or a venv with Python 3.12.
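
For example, one possible setup with conda (the environment name is a placeholder):

    conda create -n lgar python=3.12
    conda activate lgar
    pip install -r requirements.txt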

Required Data

The directory structure of the data folder should look like this:

data
├── synergy
│   ├── info.json
│   └── csv files of dataset
├── tar2019
│   ├── dta
│   │   ├── info.json
│   │   └── csv files of dataset
│   ├── intervention
│   │   ├── info.json
│   │   └── csv files of dataset
│   ├── prognosis
│   │   ├── info.json
│   │   └── csv files of dataset
│   └── qualitative
│       ├── info.json
│       └── csv files of dataset

First Stage of Ranking (LLM Ranker)

The scripts to reproduce the results of this paper are located in ./implementation/src/scripts/:

  • Experiment with different scales: run_scale_experiment.py
  • Experiment with different prompts: run_prompt_experiment.py
  • Experiment with different LLMs: run_llm_experiment.py
  • Experiment with LLM (title only): run_llm_ti_experiment.py

The scripts can be started using a batch file, e.g., when using SLURM on a cluster (see the sketch below). All scripts except the one for the LLM with title only expect to receive the index of the array job, since each invocation runs only the part of the code corresponding to the given index. In all files, you need to set either the path to the directory where the model is stored or the Hugging Face tag of the model.
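
A minimal SLURM array-job sketch for run_prompt_experiment.py (job name, array range, and resource flags are placeholders; that the index is passed as the first command-line argument is an assumption to verify against the script's argument parsing):

    #!/bin/bash
    #SBATCH --job-name=lgar-prompt      # placeholder job name
    #SBATCH --array=0-4                 # one task per prompting technique (indices listed below)
    #SBATCH --time=24:00:00             # adapt resources to your cluster

    # Assumption: the script takes the array-job index as its first argument.
    python ./implementation/src/scripts/run_prompt_experiment.py "$SLURM_ARRAY_TASK_ID"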

Example usage for run_prompt_experiment.py:

  • generate_examples=True: The script creates few-shot examples for zero-shot (index = 0) and CoT prompting (index = 1) for all SLRs of the current dataset (selected by setting the respective paths in config.json) at the specified location (config.folder_path_few_shot_examples).
  • generate_examples=False: Runs the experiment for the prompting technique selected by the provided index (index = 0: 2-shot, index = 1: CoT, index = 2: CoT (n=3), index = 3: 2-shot CoT, index = 4: 2-shot CoT (n=3)); running all indices covers all prompting techniques.
    • The script saves the results to config.llm_client_output_directory_path with the following folder structure:
    ├── Prompting Technique 1
    │   ├── Dataset 1
    │   │   ├── Results of SLR 1
    │   │   │   └── log_file_0.json
    │   │   ├── Results of SLR 2
    │   │   │   └── log_file_0.json
    │   │   ...
    │   ├── Dataset 2
    │   ...
    ├── Prompting Technique 2
    ...
    
    • The results of the LLM ranker are stored in the folder of the respective SLR, in a JSON log file with an index corresponding to the run (i.e., log_file_0.json for runs without self-consistency).
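
For a single run without SLURM, a direct invocation could look like this (same assumption as above that the index is the first command-line argument; generate_examples appears to be toggled inside the script itself):

    # Run the CoT prompting technique (index = 1) with generate_examples=False.
    python ./implementation/src/scripts/run_prompt_experiment.py 1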

Re-Ranking Results of LLMs

This section describes how to re-rank the papers with a secondary (dense) ranker after completing the first stage of the ranking pipeline. There are two options for performing this step: the script evaluate_experiments_single.py or the script evaluate_experiments.py. Both scripts expect config.folder_path_slrs and config.file_path_slr_infos to be set correctly for the dataset that should be evaluated.

Option 1: evaluate_experiments_single.py

  • Evaluates all runs located in the subfolders of the experiment folder to be evaluated.
  • The experiment can be specified by setting config.llm_client_output_directory_path to the desired folder.
  • The specified folder is expected to be one level above the folders of the individual SLRs of a dataset, e.g., "./implementation/data/paper/prompts/synergy/CoT/".
  • To avoid manually adapting the experiment path for every experiment in an SLR dataset folder, you can instead provide the path to the parent folder, e.g., "./implementation/data/paper/prompts/synergy/", and uncomment the respective lines, which select the experiment subfolder by the index provided to the main function (see the sketch below).
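
A hypothetical invocation, assuming the script lives in ./implementation/src/scripts/ like the other scripts and takes the optional index as its first argument:

    # Evaluates all runs under config.llm_client_output_directory_path.
    # The index is only relevant when the parent-folder variant above is
    # used and the respective lines are uncommented.
    python ./implementation/src/scripts/evaluate_experiments_single.py 0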

Option 2: evaluate_experiments.py

  • Expects that config.llm_client_output_directory_path is at the level of a dataset folder, e.g., "./implementation/data/paper/prompts/synergy/"
  • To use this script for re-ranking the papers, set is_lm_only=False and rerank=True.
  • The script then sequentially re-ranks the papers for all SLRs of all configurations stored in config.llm_client_output_directory_path.

If you want to use a different re-ranker, you need to change config.llm_client_config.path_to_reranker to a different dense ranker (if none is provided, BM25 is used as a fallback). To distinguish between different re-ranking queries, you need to set config.llm_client_config.system_message_type accordingly: if you specify system_message_basic, only the title of the respective SLR is used as the query; if you specify system_message_rq, the title and research questions are used.
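
A hedged excerpt of the corresponding config.json entries (the key nesting is assumed from the names above, and all paths are placeholders; check the shipped config.json for the exact schema):

    {
      "folder_path_slrs": "./data/synergy",
      "file_path_slr_infos": "./data/synergy/info.json",
      "llm_client_output_directory_path": "./implementation/data/paper/prompts/synergy/CoT/",
      "llm_client_config": {
        "path_to_reranker": "/path/to/dense/reranker",
        "system_message_type": "system_message_rq"
      }
    }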

Evaluation

This section describes how the experiments can be evaluated using the script evaluate_experiments.py.

  • To generate only the result CSV file without re-ranking the documents again, set rerank=False; the script then directly accesses the already ranked list stored in ranked_df.json for each SLR.
  • Depending on which experiment you want to evaluate, you can provide a function for name_label that renames the experiment abbreviations accordingly (create_run_label_exp1 for the scale experiment, create_run_label_exp2 for the prompt experiment, or create_run_label_exp3 for all other experiments).
  • For the scale experiments, set sort=True to ensure an ascending order of the scales; for all other experiments it should be set to False.
  • To evaluate only the performance of a dense ranker (without the LLM), set is_lm_only=True.
  • As described above, the script expects config.folder_path_slrs and config.file_path_slr_infos to be set correctly for the dataset you want to evaluate. Furthermore, you need to specify which results should be evaluated by setting config.llm_client_output_directory_path to the respective results folder (a sketch of a typical run follows below).
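
The switches above (rerank, name_label, sort, is_lm_only) appear to be configured inside the script rather than passed on the command line (a function such as create_run_label_exp2 cannot be supplied as a CLI flag), so an evaluation-only run is plausibly just:

    # After setting rerank=False, name_label, sort, and is_lm_only inside
    # evaluate_experiments.py as described above:
    python ./implementation/src/scripts/evaluate_experiments.py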

Dense Ranker Only

  • To evaluate only the performance of a dense ranker, you can use the script /src/scripts/run_lm_experiment.py
  • The model paths need to be set according to the models' location.
  • If the script is provided with an index other than 1, the query for the ranker is the title of the SLR; otherwise, the query is the title and research questions of the SLR (see the sketch after this list).
  • The evaluation of the dense ranker can be performed as described in the previous section, but it is necessary to set is_lm_only=True
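
A sketch of both query variants, assuming the index is again the first command-line argument and the script lives in the usual scripts folder:

    # Index other than 1: the query is the SLR title only.
    python ./implementation/src/scripts/run_lm_experiment.py 0
    # Index 1: the query is the SLR title plus research questions.
    python ./implementation/src/scripts/run_lm_experiment.py 1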
