Inconsistent outputs of MiniGPT-v2 (peft version is 0.2.0) #525

Open
c-hepburn opened this issue Aug 14, 2024 · 0 comments

Hi,

Thank you very much for your work.

I have a problem with image "grounding".

Based on your evaluation code, I put together a short script that applies the model to a single image. When I run it with the prompt "[grounding] please describe this image in details", no grounding is performed: the model outputs only an image caption. When I run the demo locally with the same checkpoint, grounding works. My peft version is 0.2.0.
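To make "grounding is not performed" concrete, here is a minimal sketch of how one could test whether a generated answer contains grounding output. It assumes that MiniGPT-v2 marks grounded phrases as `<p>phrase</p> {<x1><y1><x2><y2>}` with integer coordinates; the helper name `extract_grounded_phrases` and both sample strings are hypothetical and are not actual model output.

```python
import re

# Assumed output format: "<p>phrase</p> {<x1><y1><x2><y2>}" with integer coordinates.
GROUNDED_PHRASE = re.compile(
    r"<p>(?P<phrase>[^<]+)</p>\s*"
    r"\{<(\d{1,3})><(\d{1,3})><(\d{1,3})><(\d{1,3})>\}"
)


def extract_grounded_phrases(answer):
    """Return (phrase, (x1, y1, x2, y2)) pairs found in a generated answer."""
    results = []
    for match in GROUNDED_PHRASE.finditer(answer):
        phrase = match.group("phrase").strip()
        # Groups 2..5 hold the four coordinates; group 1 is the phrase.
        box = tuple(int(match.group(i)) for i in range(2, 6))
        results.append((phrase, box))
    return results


# Hypothetical sample strings, for illustration only.
caption_only = "a man riding a bike down a street"
grounded = "<p>a man</p> {<24><30><78><98>} riding <p>a bike</p> {<20><55><80><99>}"

print(extract_grounded_phrases(caption_only))
print(extract_grounded_phrases(grounded))
```

An empty list corresponds to the caption-only behaviour described above. The pattern only captures the first box that follows each phrase, so it is a presence check rather than a full parser.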

Could you please tell me what the issue might be? Thank you.

The script I run is below:

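# Imports assumed from the MiniGPT-4 repository's evaluation scripts; adjust to your checkout.
import torch
from PIL import Image

from minigpt4.common.config import Config
from minigpt4.common.eval_utils import eval_parser, init_model, prepare_texts
from minigpt4.conversation.conversation import CONV_VISION_minigptv2


def list_of_str(arg):
    # Parse a comma-separated command-line value into a list of strings.
    return list(map(str, arg.split(',')))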

parser = eval_parser()
parser.add_argument("--dataset", type=list_of_str, default='refcoco', help="dataset to evaluate")
parser.add_argument("--res", type=float, default=100.0, help="resolution used in refcoco")
parser.add_argument("--resample", action='store_true', help="resolution used in refcoco")
args = parser.parse_args()

cfg = Config(args)

model, vis_processor = init_model(args)
model.eval()

# Define conversation template and remove system role.
CONV_VISION = CONV_VISION_minigptv2
conv_temp = CONV_VISION.copy()
conv_temp.system = ""

# Load and preprocess an image for evaluation.
index = 301246
image_path = f'/home/models/MiniGPT-4/filtered_flickr/images/{index}.jpg'
img = Image.open(image_path)
img = vis_processor(img)

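# Note: different image size for first and second version of MiniGPT4.
# Only one reshape can apply to a given tensor; the version 2 line is kept active for MiniGPT-v2.
# img = torch.reshape(img, (1, 3, 224, 224))  # version 1
img = torch.reshape(img, (1, 3, 448, 448))  # version 2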

# Prepare text for evaluation using conversation template.
txt = "[grounding] please describe this image in details"
text = prepare_texts(txt, conv_temp)

answer = model.generate(img, text, max_new_tokens=500, do_sample=False)

# Print the generated answer.
print(answer)
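One thing that may be worth checking in the script above: `prepare_texts` receives a single string, whereas batch evaluation code typically passes a list of prompts. If `prepare_texts` builds one prompt per element of its first argument (an assumption that has not been verified here against the installed version), a bare string would be split into characters. The sketch below uses a hypothetical stand-in, `prepare_texts_sketch`, with an assumed prompt template, to show the difference; it does not call the real function.

```python
def prepare_texts_sketch(texts, template="<Img><ImageHere></Img> {}"):
    """Hypothetical stand-in: builds one prompt per element of `texts`."""
    return [template.format(text) for text in texts]


txt = "[grounding] please describe this image in details"

# Passing a bare string: iteration yields one prompt per character.
as_string = prepare_texts_sketch(txt)

# Passing a one-element list: a single prompt with the full instruction.
as_list = prepare_texts_sketch([txt])

print(len(as_string), repr(as_string[0]))
print(len(as_list), repr(as_list[0]))
```

If the real function behaves this way, printing `text` right before `model.generate` would show whether the "[grounding]" task tag actually reaches the model intact, and wrapping the prompt as `[txt]` would be the change to try.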
