Added structured generation support to MlxLLM using Outlines #1108
base: develop
Conversation
Awesome, @dameikle! Would you be able to forward some example code and write some tests for this too? Also, normally we work on top of `develop`.
Thanks for the reply @davidberenstein1957. Sorry, I should have noticed I'd branched off `main` instead of `develop` 🙈 Sure thing, I'll add a test. For the example, should I just add it to the docstring on the model?
@dameikle thanks for the quick response 🔥 Try to align the docstring with what we've got for other LLMs. W.r.t. the example code, it helps maintainers to quickly copy-paste it and test the integration :)
@dameikle force-pushed from 55bc71f to e24470c
@davidberenstein1957 Hopefully this rebase has worked and not left too much noise. I've added example code to the docstring as well as a test using the same model the other one uses. You should be able to do something like this with it:

```python
from distilabel.models.llms import MlxLLM
from pydantic import BaseModel


class User(BaseModel):
    name: str
    last_name: str
    email: str


llm = MlxLLM(
    path_or_hf_repo="mlx-community/Meta-Llama-3.1-8B-Instruct-4bit",
    structured_output={"format": "json", "schema": User},
)
llm.load()
output = llm.generate_outputs(
    inputs=[[{"role": "user", "content": "Create a user profile for John Smith"}]]
)
print(output)
# [{'generations': ['{ "name": "John Smith", "last_name": "Smith", "email": "john.smith@email.com" }'], 'statistics': {'input_tokens': [7], 'output_tokens': [26]}}]
```
Looking great! Some minor comments :) The code snippet looks good. For some reason it works in 0.1.13 but not in 0.1.11; however, we can redirect users to a library upgrade if errors occur.
```diff
@@ -63,6 +64,47 @@ def test_generate(self, llm: MlxLLM) -> None:
         assert "input_tokens" in statistics
         assert "output_tokens" in statistics

+    def test_structured_generation_json(self, llm: MlxLLM) -> None:
```
We have structured generation tests in `tests/unit/steps/tasks/structured_outputs/test_outlines.py` — could you add/integrate this test there?
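For reference, a minimal sketch of what that relocated test might look like, reusing the schema and model from the example earlier in this thread (the assertions are illustrative, not the PR's actual test):

```python
from pydantic import BaseModel

from distilabel.models.llms import MlxLLM


def test_structured_generation_json() -> None:
    class User(BaseModel):
        name: str
        last_name: str
        email: str

    llm = MlxLLM(
        path_or_hf_repo="mlx-community/Meta-Llama-3.1-8B-Instruct-4bit",
        structured_output={"format": "json", "schema": User},
    )
    llm.load()
    output = llm.generate_outputs(
        inputs=[[{"role": "user", "content": "Create a user profile for John Smith"}]]
    )
    # If constrained decoding worked, the generation parses against the schema.
    User.model_validate_json(output[0]["generations"][0])
```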
```python
        self.model = model
        self.tokenizer = tokenizer


class MlxLLM(LLM, MagpieChatTemplateMixin):
```
We should also be able to pass the structured output format during class init.
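For illustration, a hedged sketch of the field this suggests, modeled on how other distilabel LLMs declare it (the `OutlinesStructuredOutputType` import path and description text are assumptions):

```python
from typing import Optional

from pydantic import Field

from distilabel.mixins.runtime_parameters import RuntimeParameter
from distilabel.steps.tasks.typing import OutlinesStructuredOutputType  # assumed path


# LLM and MagpieChatTemplateMixin as in the module under review.
class MlxLLM(LLM, MagpieChatTemplateMixin):
    path_or_hf_repo: str
    # Accept the structured output config at init time, like other LLMs.
    structured_output: Optional[RuntimeParameter[OutlinesStructuredOutputType]] = Field(
        default=None,
        description="The structured output format to use across all the generations.",
    )
```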
```python
if TYPE_CHECKING:
    import mlx.nn as nn
    from mlx_lm.tokenizer_utils import TokenizerWrapper


class MlxModel:
```
I think we can import this from `outlines.models.mlxlm`.
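i.e. something along these lines (the `MLXLM` symbol name is an assumption for outlines 0.1.x; verify against the installed version):

```python
# Reuse outlines' bundled MLX wrapper instead of keeping a local copy.
from outlines.models.mlxlm import MLXLM  # name assumed; check your outlines version
```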
```diff
@@ -99,6 +140,7 @@ def load(self) -> None:
             model_config=self.mlx_model_config,
             adapter_path=self.adapter_path,
         )
+        self._wrapped_model = MlxModel(self._model, self._tokenizer)
```
I would create this during the load of the class.
```diff
@@ -101,6 +102,11 @@ def _get_logits_processor(framework: Frameworks) -> Tuple[Callable, Callable]:
         "JSONLogitsProcessor",
         "RegexLogitsProcessor",
     ),
+    "mlx": (
```
Is mlx not implemented for outlines below 0.1?
```diff
@@ -37,10 +37,11 @@
     from llama_cpp import Llama  # noqa
     from transformers import Pipeline  # noqa
     from vllm import LLM as _vLLM  # noqa
+    from distilabel.models.llms.mlx import MlxModel  # noqa
```
I would import this class from `outlines.models.mlxlm` to avoid code duplication.
@davidberenstein1957 thanks for the review, and sorry it's taken a while to look at the comments. I'll work through them and get them resolved. It looks like there was a bug in outlines that stopped this from working correctly in 0.1.11; I'll see if I can do something smart to highlight this to users.
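One hedged option for that (not the PR's actual code; the 0.1.13 floor simply mirrors the version that worked earlier in this thread):

```python
import importlib.metadata

from packaging.version import Version


def _check_outlines_version() -> None:
    """Fail fast with an upgrade hint if the installed outlines predates MLX support."""
    installed = Version(importlib.metadata.version("outlines"))
    if installed < Version("0.1.13"):
        raise RuntimeError(
            f"Structured generation with MlxLLM requires outlines>=0.1.13 (found {installed}); "
            "upgrade with `pip install -U outlines`."
        )
```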
@dameikle I think we can keep it like this and remember it, since we can't do triple constraints. Something like this could be an option too.
Adds initial support for structured generation using Outlines to the MlxLLM.

I've tried to build on what was there, using wrappers to integrate with outlines.py rather than changing the inference approach in the current class, so it could probably be simpler.
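For context, a minimal sketch of that wrapper idea (matching the shape visible in the diffs above; purely illustrative):

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    import mlx.nn as nn
    from mlx_lm.tokenizer_utils import TokenizerWrapper


class MlxModel:
    """Thin adapter exposing the mlx-lm (model, tokenizer) pair to outlines,
    so MlxLLM's existing inference path stays unchanged."""

    def __init__(self, model: "nn.Module", tokenizer: "TokenizerWrapper") -> None:
        self.model = model
        self.tokenizer = tokenizer
```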