feat: Reward model #1271

Asher-hss · 2024-12-04T10:36:39Z

Description

#889

Motivation and Context

#889

I have raised an issue to propose this change (required for new features and bug fixes)

Types of changes

What types of changes does your code introduce? Put an x in all the boxes that apply:

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds core functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation (update in the documentation)
Example (update in the folder of example)

Implemented Tasks

Subtask 1
Subtask 2
Subtask 3

Checklist

Go over all the following points, and put an x in all the boxes that apply.
If you are unsure about any of these, don't hesitate to ask. We are here to help!

I have read the CONTRIBUTION guide. (required)
My change requires a change to the documentation.
I have updated the tests accordingly. (required for a bug fix or a new feature)
I have updated the documentation accordingly.

AveryYay · 2024-12-04T12:19:34Z

camel/models/reward/nemetro_model.py

+                if logprobs
+                else {}
+            )
+        except (KeyError, IndexError):


Maybe add a log here?

Suggested change

except (KeyError, IndexError):

except (KeyError, IndexError):

logging.error(f"Error parsing scores: {e}")

Yeah, I think we shouldn't add the error msg into score dict, better raise error

camel/models/reward/evaluator.py

koch3092

thanks @Asher-hss , left some comments below.

camel/models/reward/base_reward_model.py

camel/models/reward/nemetro_model.py

Wendong-Fan

Thanks @Asher-hss , as mentioned in the issue, we also want to support skywork (Seq. Classifier), could you also add this model?

Asher-hss · 2024-12-10T22:29:45Z

Thanks @Asher-hss , as mentioned in the issue, we also want to support skywork (Seq. Classifier), could you also add this model?

Thanks wendong，I plan to update this model today.

Wendong-Fan

Thanks @Asher-hss ! Left some comments below and added one more commit here 58d7278 based on the review, feel free to review the change, I think we can add the skywork model in another PR to let this PR merged first, but please add unit test first before the merge, thanks!

camel/models/reward/base_reward_model.py

camel/models/reward/evaluator.py

Wendong-Fan · 2024-12-11T12:09:10Z

camel/models/reward/nemetro_model.py

+        )
+
+    @api_keys_required("NVIDIA_API_KEY")
+    def evaluate(self, messages: List[OpenAIMessage]) -> Dict[str, float]:


set messages type as List[Dict[str, str]] would suit for more general use case

camel/models/reward/nemetro_model.py

lightaime · 2024-12-12T21:23:25Z

The model is called [nemotron](https://build.nvidia.com/nvidia/nemotron-4-340b-instruct) not nemetrocan you fix that asap?

Asher-hss · 2024-12-12T21:26:41Z

The model is called [nemotron](https://build.nvidia.com/nvidia/nemotron-4-340b-instruct) not nemetrocan you fix that asap?

I create a new pr to fix it now.

Asher-hss added 2 commits December 4, 2024 01:51

update

9e7e0ae

update

40c7b8d

Asher-hss self-assigned this Dec 4, 2024

Asher-hss requested a review from Wendong-Fan December 4, 2024 10:36

Wendong-Fan changed the title ~~Reward model~~ feat: Reward model Dec 4, 2024

Wendong-Fan added the Model Related to backend models label Dec 4, 2024

Wendong-Fan added this to the Sprint 18 milestone Dec 4, 2024

Wendong-Fan linked an issue Dec 4, 2024 that may be closed by this pull request

[Feature Request] Reward models #889

Closed

2 tasks

minor format fix

16ad481

Wendong-Fan requested review from koch3092 and AveryYay December 4, 2024 11:44

AveryYay approved these changes Dec 4, 2024

View reviewed changes

koch3092 reviewed Dec 6, 2024

View reviewed changes

camel/models/reward/base_reward_model.py Show resolved Hide resolved

camel/models/reward/nemetro_model.py Show resolved Hide resolved

Wendong-Fan reviewed Dec 10, 2024

View reviewed changes

update

a12e10f

Wendong-Fan reviewed Dec 11, 2024

View reviewed changes

Wendong-Fan and others added 2 commits December 11, 2024 20:28

update wendong

58d7278

update

e44abd2

Asher-hss requested a review from Wendong-Fan December 11, 2024 23:14

Asher-hss and others added 4 commits December 11, 2024 20:00

Merge branch 'master' into reward_model

6efcdfd

merge master

8c2ba29

update

6f76be2

Merge branch 'master' into reward_model

783b022

Wendong-Fan approved these changes Dec 12, 2024

View reviewed changes

Wendong-Fan merged commit b3a2fde into master Dec 12, 2024
4 of 6 checks passed

Wendong-Fan deleted the reward_model branch December 12, 2024 19:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Reward model #1271

feat: Reward model #1271

Asher-hss commented Dec 4, 2024

AveryYay Dec 4, 2024

Wendong-Fan Dec 11, 2024

koch3092 left a comment

Wendong-Fan left a comment

Asher-hss commented Dec 10, 2024

Wendong-Fan left a comment •

edited

Loading

Wendong-Fan Dec 11, 2024

lightaime commented Dec 12, 2024

Asher-hss commented Dec 12, 2024

	except (KeyError, IndexError):
	except (KeyError, IndexError):
	logging.error(f"Error parsing scores: {e}")

feat: Reward model #1271

feat: Reward model #1271

Conversation

Asher-hss commented Dec 4, 2024

Description

Motivation and Context

Types of changes

Implemented Tasks

Checklist

AveryYay Dec 4, 2024

Choose a reason for hiding this comment

Wendong-Fan Dec 11, 2024

Choose a reason for hiding this comment

koch3092 left a comment

Choose a reason for hiding this comment

Wendong-Fan left a comment

Choose a reason for hiding this comment

Asher-hss commented Dec 10, 2024

Wendong-Fan left a comment • edited Loading

Choose a reason for hiding this comment

Wendong-Fan Dec 11, 2024

Choose a reason for hiding this comment

lightaime commented Dec 12, 2024

Asher-hss commented Dec 12, 2024

Wendong-Fan left a comment •

edited

Loading