skip checking schedulability in put()
tohtana committed Jan 15, 2024
1 parent 94ebae1 commit 904e500
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion mii/batching/ragged_batching.py
@@ -442,7 +442,8 @@ def make_response(self,
                               finish_reason=finish_reason)

     def put(self, uids: List[int], tokenized_input: List[torch.Tensor]) -> torch.Tensor:
-        return self.inference_engine.put(uids, tokenized_input)
+        # Call inference engine. You can skip checking schedulability because we already checked when scheduling
+        return self.inference_engine.put(uids, tokenized_input, skip_check=True)

     def flush(self, uids: List[int]) -> None:
         for uid in uids:
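The reasoning behind the change (the batch was already validated at scheduling time, so put() does not need to validate it again) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: `ToyEngine`, `ToyScheduler`, `can_schedule`, and the token budget are hypothetical stand-ins, not the real inference engine or its API; only the `skip_check=True` call shape comes from the diff.

```python
from typing import Dict, List


class ToyEngine:
    """Hypothetical stand-in for the inference engine (not the real API)."""

    def __init__(self, max_tokens: int) -> None:
        self.max_tokens = max_tokens
        self.check_calls = 0  # how many times the schedulability check ran

    def can_schedule(self, uids: List[int], lengths: List[int]) -> bool:
        # Assumed check: one length per uid, total tokens within budget.
        self.check_calls += 1
        return len(uids) == len(lengths) and sum(lengths) <= self.max_tokens

    def put(self,
            uids: List[int],
            tokens: List[List[int]],
            skip_check: bool = False) -> Dict[int, int]:
        lengths = [len(t) for t in tokens]
        # The check is redundant when the caller validated the batch already.
        if not skip_check and not self.can_schedule(uids, lengths):
            raise RuntimeError("batch is not schedulable")
        return {uid: n for uid, n in zip(uids, lengths)}


class ToyScheduler:
    """Hypothetical caller that checks once while scheduling."""

    def __init__(self, engine: ToyEngine) -> None:
        self.engine = engine

    def schedule(self, uids: List[int], tokens: List[List[int]]) -> bool:
        return self.engine.can_schedule(uids, [len(t) for t in tokens])

    def put(self, uids: List[int], tokens: List[List[int]]) -> Dict[int, int]:
        # Mirrors the diff: skip the check because schedule() already ran it.
        return self.engine.put(uids, tokens, skip_check=True)


engine = ToyEngine(max_tokens=8)
scheduler = ToyScheduler(engine)
uids, tokens = [1, 2], [[10, 11, 12], [20, 21]]
result = scheduler.put(uids, tokens) if scheduler.schedule(uids, tokens) else {}
```

In this sketch the check runs once per batch instead of twice. The trade-off is that put() now trusts its caller, so any code path that reaches put() without scheduling first would bypass validation.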
