This repository has been archived by the owner on Oct 25, 2024. It is now read-only.

[Engine] Apply the STS task to bge models #673

Merged: 19 commits, Nov 29, 2023

@Zhenzhong1 Zhenzhong1 commented Nov 14, 2023

Type of Change

  • Example add
  • API extend
  • Doc update

Description

  • Add a new directory for the text-embedding task and the Massive Text Embedding Benchmark (MTEB).
  • Verified the BAAI/bge-small-en-v1.5 accuracy with the Neural Engine backend.
  • Verified the BAAI/bge-base-en-v1.5 accuracy with the Neural Engine backend.

Accuracy
int8 bge-base with the Neural Engine on the STS: 82.191

int8 bge-small with the Neural Engine on the STS: 81.43
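For context on how a number like the ones above is typically produced: MTEB's STS tasks usually report the Spearman rank correlation between predicted cosine similarities and human-annotated gold scores. The sketch below assumes that convention applies to the figures in this PR, which the description does not state explicitly. The helper names and the sample values are hypothetical and only illustrate the calculation; the scaling by 100 is likewise an assumption about how the score is displayed.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical values: model cosine similarities and gold STS scores (0-5 scale).
predicted = [0.91, 0.35, 0.62, 0.10]
gold = [4.8, 2.0, 3.5, 0.5]
score = spearman(predicted, gold) * 100  # scaled to a 0-100 range for display
print(f"STS score: {score:.2f}")
```

Because only the ordering matters, a quantized model can shift the absolute similarity values without changing this score, as long as the ranking of sentence pairs is preserved.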

Usage:

from transformers import AutoTokenizer
from intel_extension_for_transformers.transformers import AutoModel

sentences_batch = ['sentence-1', 'sentence-2', 'sentence-3', 'sentence-4']

# Tokenize the batch into NumPy arrays, padded to a common length.
tokenizer = AutoTokenizer.from_pretrained('BAAI/bge-base-en-v1.5')
encoded_input = tokenizer(sentences_batch,
                          padding=True,
                          truncation=True,
                          max_length=512,
                          return_tensors="np")

# The engine takes its inputs as a list in this order.
engine_input = [encoded_input['input_ids'],
                encoded_input['token_type_ids'],
                encoded_input['attention_mask']]

# Load the int8 ONNX model with the Neural Engine embedding runtime.
model = AutoModel.from_pretrained('./model_and_tokenizer/int8-model.onnx', use_embedding_runtime=True)
sentence_embeddings = model.generate(engine_input)['last_hidden_state:0']

print("Sentence embeddings:", sentence_embeddings)
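The snippet above prints the raw 'last_hidden_state:0' output. To compare sentences for an STS-style check, the BGE model cards take the first ([CLS]) token of the last hidden state and L2-normalize it. A minimal sketch of that post-processing follows, assuming the engine output is a NumPy array of shape (batch, seq_len, hidden); this PR does not confirm the shape. A random array stands in for the engine output, and the helper names are hypothetical.

```python
import numpy as np

def cls_pool_and_normalize(last_hidden_state):
    """Take the first-token vector per sentence and scale it to unit length."""
    cls_vectors = last_hidden_state[:, 0, :]
    norms = np.linalg.norm(cls_vectors, axis=1, keepdims=True)
    # Guard against division by zero for an all-zero vector.
    return cls_vectors / np.clip(norms, 1e-12, None)

def cosine_similarity_matrix(embeddings):
    """For unit-length rows, the dot product equals the cosine similarity."""
    return embeddings @ embeddings.T

# Stand-in for model.generate(engine_input)['last_hidden_state:0']:
# 4 sentences, 16 tokens each, hidden size 768 (assumed for bge-base).
rng = np.random.default_rng(0)
fake_hidden_state = rng.standard_normal((4, 16, 768)).astype(np.float32)

embeddings = cls_pool_and_normalize(fake_hidden_state)
similarities = cosine_similarity_matrix(embeddings)
print("Pairwise cosine similarities:")
print(similarities)
```

With real engine output in place of the random array, entry (i, j) of the matrix is the similarity between sentence i and sentence j.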

Expected Behavior & Potential Risk

N/A

How has this PR been tested?

bge benchmark

Dependency Change?

remove C_MTEB

@Zhenzhong1 Zhenzhong1 changed the title apply the STS task [Example] Apply tee STS task to bge models Nov 16, 2023
@Zhenzhong1 Zhenzhong1 changed the title [Example] Apply tee STS task to bge models [Example] Apply the STS task to bge models Nov 17, 2023
@Zhenzhong1 Zhenzhong1 changed the title [Example] Apply the STS task to bge models [Engine] Apply the STS task to bge models Nov 17, 2023
@Zhenzhong1 Zhenzhong1 marked this pull request as ready for review November 28, 2023 08:25

@a32543254 a32543254 left a comment

LGTM

@VincyZhang

Will test LLM deprecated python mode in release tests.

@VincyZhang VincyZhang merged commit 0c4c5ed into main Nov 29, 2023
20 of 21 checks passed
@VincyZhang VincyZhang deleted the zhenzhong/bge-sts branch November 29, 2023 07:56