feat: Implement outlines converter model for structured output (#1211) #1318

Merged
Wendong-Fan merged 13 commits into master from outlines-as-services on Dec 17, 2024

Conversation

MuggleJinx
Collaborator

Description

Integrate the outlines library to produce structured output.

Motivation and Context

Close #1211.

Types of changes

What types of changes does your code introduce? Put an x in all the boxes that apply:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds core functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (update in the documentation)
  • Example (update in the folder of example)

Implemented Tasks

  • Implement outlines converter class.
  • Add examples.

Checklist

Go over all the following points, and put an x in all the boxes that apply.
If you are unsure about any of these, don't hesitate to ask. We are here to help!

  • I have read the CONTRIBUTION guide. (required)
  • My change requires a change to the documentation.
  • I have updated the tests accordingly. (required for a bug fix or a new feature)
  • I have updated the documentation accordingly.

Collaborator

@harryeqs harryeqs left a comment

Thanks @MuggleJinx! I have added two comments for reference, please have a look when possible.

MuggleJinx and others added 2 commits December 15, 2024 16:46
Co-authored-by: Harry Ye <116691547+harryeqs@users.noreply.github.com>
@MuggleJinx
Collaborator Author

Thanks @harryeqs, updated!

@harryeqs
Collaborator

> Thanks @harryeqs, updated!

Thanks @MuggleJinx !

Comment on lines 40 to 55
# 1. Using a Pydantic model
class Temperature(BaseModel):
    location: str
    date: str
    temperature: float


output = model.convert_json(
    "Today is 2023-09-01, the temperature in Beijing is 30 degrees.",
    output_schema=Temperature,
)

print(output)
'''
location='Beijing' date='2023-09-01' temperature=30.0
'''
Collaborator

The output format uses '=' here, @Wendong-Fan, not a dict such as {'location': 'Beijing'}.

Collaborator Author

Thanks, I split it into two functions: one returns the Pydantic object, the other returns the internal dict object.
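The distinction discussed in this thread can be sketched as follows. This is a minimal illustration, assuming Pydantic v2 (where `model_dump()` is available). The `Temperature` object is built by hand rather than returned by the converter, so the sketch says nothing about the names or signatures of the two functions in the PR.

```python
from pydantic import BaseModel


class Temperature(BaseModel):
    location: str
    date: str
    temperature: float


# Built by hand for illustration; in the PR this object would come from the converter.
parsed = Temperature(location="Beijing", date="2023-09-01", temperature=30.0)

# Printing a Pydantic model uses its field=value representation.
print(parsed)

# model_dump() (Pydantic v2) returns a plain dict instead.
as_dict = parsed.model_dump()
print(as_dict)
```

The first print gives the `location='Beijing' ...` form shown in the example above, while the second gives a dict keyed by field name.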

Member

@Wendong-Fan Wendong-Fan left a comment

Thanks @MuggleJinx ! Overall LGTM, left some comments below

pyproject.toml (outdated, resolved)
examples/schema_outputs/outlines-converter-example.py (outdated, resolved)

from typing import Any, Callable, List, Literal, Type, Union

import outlines
Collaborator

outlines is not a required package.

Collaborator Author

Sorry, it is needed, for example in outlines.generate.regex.

Member

We can move `import outlines` inside the class to make it optional.
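A minimal sketch of that suggestion, assuming a hypothetical class name `LazyOutlinesConverter` (not the class in this PR): the import is deferred to `__init__`, so importing the module works even when `outlines` is not installed, and the error only surfaces when the class is actually instantiated.

```python
class LazyOutlinesConverter:
    """Hypothetical converter that treats outlines as an optional dependency."""

    def __init__(self) -> None:
        try:
            # Deferred import: only runs when the class is instantiated.
            import outlines
        except ImportError as exc:
            raise ImportError(
                "outlines is required for this converter; "
                "install it to use LazyOutlinesConverter."
            ) from exc
        # Keep a reference so other methods can reach the library.
        self._outlines = outlines
```

With this layout, users who never touch the converter do not need the optional package at all.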

Member

@Wendong-Fan Wendong-Fan left a comment

Thanks @MuggleJinx! Left some comments below and added one commit here: 87ccb62. Feel free to review and check, and let me know if there's any issue~

@@ -67,6 +67,7 @@ sentencepiece = { version = "^0", optional = true }
opencv-python = { version = "^4", optional = true }

# tools
outlines = { version = "^0.1.7", optional = true }
Member

Also add it to the tool and all lists.

)

print(output)
'''
Member

Unify the quoting by using " instead of '.

            case _:
                raise ValueError(f"Unsupported platform: {platform}")

    def convert_regex(self, content: str, regex_pattern: str):
Member

Missing return type hint, even if it is just Any.

        )
        return json_generator(content)

    def convert_type(self, content: str, type_name: type):
Member

Missing return type hint; the same applies to the other methods.
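As an illustration of the requested change, here is a sketch of the two signatures with return annotations added. The class name `ConverterSketch` and the method bodies are stand-ins invented for this example (they use `re` and a plain constructor call, not outlines), so only the annotated signatures relate to the review comment.

```python
import re
from typing import Any


class ConverterSketch:
    """Hypothetical stand-in; the bodies do not call outlines."""

    def convert_regex(self, content: str, regex_pattern: str) -> str:
        # Stand-in logic: return the first substring matching the pattern.
        match = re.search(regex_pattern, content)
        if match is None:
            raise ValueError(f"No match for pattern: {regex_pattern}")
        return match.group(0)

    def convert_type(self, content: str, type_name: type) -> Any:
        # The result type depends on type_name, so Any is the honest hint.
        return type_name(content)


converter = ConverterSketch()
ip = converter.convert_regex("The IP is 192.168.1.1", r"\d+\.\d+\.\d+\.\d+")
number = converter.convert_type("42", int)
```

Using `Any` where the result depends on an argument keeps the hint accurate while still satisfying type checkers that require every public method to be annotated.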

Member

File naming: use _ instead of -.

@Wendong-Fan Wendong-Fan merged commit 89691b8 into master Dec 17, 2024
6 checks passed
@Wendong-Fan Wendong-Fan deleted the outlines-as-services branch December 17, 2024 17:02
Development

Successfully merging this pull request may close these issues.

[Feature Request] Finish the implement of structured output using outlines
5 participants