Disaggregated serving #365

Merged: 15 commits, Apr 18, 2025
quic-amitraj (Contributor) commented Apr 16, 2025

Adds support for:

  1. `prefill_only`
  2. `mdp_ts_json_path`
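
For readers new to disaggregated serving, here is a minimal sketch of what a `prefill_only` switch could mean at compile time. It assumes the usual split where prefill processes the prompt in chunks of `prefill_seq_len` tokens and decode processes one token per step. The helper `build_specializations`, its dictionary keys, and the meaning given to `prefill_only=False` are hypothetical illustrations, not the library's confirmed API.

```python
from typing import Dict, List, Optional


def build_specializations(
    prefill_seq_len: int,
    ctx_len: int,
    batch_size: int = 1,
    prefill_only: Optional[bool] = None,
) -> List[Dict[str, int]]:
    """Return the input shapes to compile for (hypothetical helper).

    prefill_only=True  -> prefill shape only (prompt-processing node)
    prefill_only=False -> decode shape only (token-generation node); assumed
    prefill_only=None  -> both shapes in a single compiled artifact
    """
    specs: List[Dict[str, int]] = []
    if prefill_only is None or prefill_only:
        # Prefill consumes the prompt in chunks of prefill_seq_len tokens.
        specs.append(
            {"batch_size": batch_size, "seq_len": prefill_seq_len, "ctx_len": ctx_len}
        )
    if prefill_only is None or not prefill_only:
        # Decode generates one token per step.
        specs.append({"batch_size": batch_size, "seq_len": 1, "ctx_len": ctx_len})
    return specs


prefill_specs = build_specializations(32, 128, prefill_only=True)
decode_specs = build_specializations(32, 128, prefill_only=False)
combined_specs = build_specializations(32, 128)
```

Under these assumptions, a prefill node and a decode node can each be compiled with only the shape it needs, while leaving the flag unset keeps both shapes together.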

quic-rishinr and others added 9 commits April 16, 2025 05:30
Signed-off-by: Rishin Raj <quic_rishinr@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
@@ -300,7 +308,7 @@ def _compile(
command.append(f"-custom-IO-list-file={custom_io_yaml}")

# Write mdp_config.json file
if mdp_ts_num_devices > 1:
    if not mdp_ts_json_path:
Suggest combining into one check: `if not mdp_ts_json_path and mdp_ts_num_devices > 1:`

not addressed yet?
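
The review suggestion in this thread amounts to folding the two nested checks into one condition: a default multi-device config is generated only when the caller supplied no `mdp_ts_json_path` and more than one device is requested. A minimal sketch, assuming a hypothetical helper `resolve_mdp_config` and a placeholder JSON layout (the real config schema is not shown in this thread):

```python
import json
from pathlib import Path
from typing import Optional


def resolve_mdp_config(
    compile_dir: Path,
    mdp_ts_num_devices: int,
    mdp_ts_json_path: Optional[str] = None,
) -> Optional[Path]:
    """Return the MDP config path to hand to the compiler, or None."""
    # Single combined check, as suggested in review: generate a default
    # config only when no path was given AND more than one device is used.
    if not mdp_ts_json_path and mdp_ts_num_devices > 1:
        generated = compile_dir / f"mdp_ts_{mdp_ts_num_devices}.json"
        # Placeholder contents; the actual schema is an assumption here.
        config = {"devices": list(range(mdp_ts_num_devices))}
        generated.write_text(json.dumps(config, indent=2))
        return generated
    # A user-supplied path is used as-is; otherwise no config is needed.
    return Path(mdp_ts_json_path) if mdp_ts_json_path else None
```

With this shape, a user-provided file always takes precedence and nothing is written to `compile_dir` in that case.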

ochougul and others added 4 commits April 18, 2025 14:25
Signed-off-by: Onkar Chougule <quic_ochougul@quicinc.com>
Signed-off-by: Onkar Chougule <168134249+ochougul@users.noreply.github.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
ochougul marked this pull request as ready for review April 18, 2025 11:02
ochougul requested a review from quic-rishinr as a code owner April 18, 2025 11:02
Signed-off-by: Onkar Chougule <quic_ochougul@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
quic-amitraj merged commit 3de4072 into main Apr 18, 2025
5 checks passed
quic-xiyushi commented:

Is the prefill_only flag available only for QEFFAutoModelForCausalLM? Why don't we support it for other classes as well, such as multimodal models?

eplatero97 pushed a commit to eplatero97/efficient-transformers that referenced this pull request Apr 29, 2025
Adding support of-
1. `prefill_only`
2. `compile_for` for VLM
3. `mdp_ts_json_path`

---------

Signed-off-by: Rishin Raj <quic_rishinr@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Onkar Chougule <quic_ochougul@quicinc.com>
Signed-off-by: Onkar Chougule <168134249+ochougul@users.noreply.github.com>
Co-authored-by: Rishin Raj <quic_rishinr@quicinc.com>
Co-authored-by: Onkar Chougule <quic_ochougul@quicinc.com>
Co-authored-by: Onkar Chougule <168134249+ochougul@users.noreply.github.com>
eplatero97 pushed six more commits to eplatero97/efficient-transformers that referenced this pull request on Apr 29, 2025. Each carries the same commit message as above; five of them add a trailing `Signed-off-by: eplatero <quic_eplatero@quicinc.com>`.
quic-rishinr deleted the dist_serve branch June 13, 2025 08:39
4 participants