Add left aligned cache support. #133

Merged: 6 commits merged into main on Jun 28, 2024
Conversation

wang2yn84 (Collaborator):

No description provided.

@wang2yn84 wang2yn84 requested review from qihqi and FanhaiLu1 June 21, 2024 23:04
The first review thread is on this excerpt:

```python
# fill mask first
mask = decode_state.mask.at[:, decode_state.current_position].set(0)
if self.env.ring_buffer:
  input_indexes = jnp.full((1,), pos)
```
Collaborator:
What if we change current_position to [batch_size, 1]? Could we then use the same masking logic for both the ring_buffer and non ring buffer cases?

Collaborator Author (wang2yn84):
Not really. In the non ring buffer case, there is a single current_position value that indicates the decoding position for all the batch entries. But for the ring buffer, every batch entry has a different position, so we cannot use current_position here.
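
For illustration only (this is not code from the PR): a minimal sketch of the difference between the two masking paths. The names batch_size, cache_len, and per_batch_pos, and the concrete shapes, are assumptions.

```python
import jax.numpy as jnp

batch_size, cache_len = 4, 8

# Additive attention mask: -inf everywhere, 0 at positions that may be attended.
mask = jnp.full((batch_size, cache_len), float("-inf"))

# Left-aligned (non ring buffer): one scalar position shared by every batch
# entry, so a single column update is enough.
current_position = 3
mask_left_aligned = mask.at[:, current_position].set(0)

# Ring buffer: each batch entry decodes at its own position, so the update has
# to be indexed per row rather than by one shared scalar.
per_batch_pos = jnp.array([3, 5, 0, 7])
mask_ring = mask.at[jnp.arange(batch_size), per_batch_pos].set(0)
```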

Collaborator:
Yes. I mean that if we change current_position to [batch_size, 1], each slot can have its own current_position. In the non ring buffer case, current_position should then be the same as input_pos.

Collaborator Author (wang2yn84):
It would cause a performance regression. Please check test7 in jax_experiments.py: inserting with batching plus a position array takes much longer, roughly 4x to 5x.
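
jax_experiments.py is not shown in this thread, so the following is only a rough, hypothetical stand-in for the kind of comparison test7 makes; every name and shape here is an assumption.

```python
import jax
import jax.numpy as jnp

batch, heads, cache_len, dim = 8, 8, 1024, 128
cache = jnp.zeros((batch, heads, cache_len, dim))
new_entry = jnp.ones((batch, heads, 1, dim))

@jax.jit
def insert_shared_pos(cache, new_entry, pos):
  # One position shared by all batch entries: a single dynamic_update_slice.
  return jax.lax.dynamic_update_slice(cache, new_entry, (0, 0, pos, 0))

@jax.jit
def insert_per_batch_pos(cache, new_entry, pos_array):
  # A different position per batch entry: a scatter driven by a position array.
  return cache.at[jnp.arange(batch), :, pos_array, :].set(new_entry[:, :, 0, :])
```

Timing both paths (for example by calling block_until_ready() on the results) is the kind of comparison test7 makes; the per-batch position-array variant is the one reported above as roughly 4x to 5x slower.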

The next review thread is on this excerpt:

```python
input_pos = jnp.where(
    decode_state.input_pos == 0,
    0,
    decode_state.input_pos + 1 % self.env.cache_len,
)
```
Collaborator:
In the non ring buffer case, can input_pos be larger than the cache length?

Collaborator Author (wang2yn84):
No.

Collaborator:
If not, I feel we don't need the % since input_pos never reaches the cache length.

Collaborator Author (wang2yn84):
We don't have control over this: generate() will keep running even if no new prefill results are inserted, so input_pos can keep advancing.
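
As a toy illustration (not the PR's code) of why the wrap is kept: if generate() keeps stepping a slot without a fresh prefill insert, the position would otherwise run past cache_len. The names and the "0 means empty slot" convention are assumptions, and the increment is parenthesized here for readability.

```python
import jax.numpy as jnp

cache_len = 4
# Per-slot positions; 0 marks a slot with no active request (assumed convention).
input_pos = jnp.array([0, 1, 2, 3])

# Empty slots stay at 0; active slots advance and wrap at cache_len.
next_pos = jnp.where(input_pos == 0, 0, (input_pos + 1) % cache_len)
# next_pos -> [0, 2, 3, 0]: the last slot wraps instead of running past cache_len.
```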

Collaborator:
Thanks for sharing the details!

jetstream_pt/engine.py (review thread resolved)
@qihqi qihqi merged commit 175d956 into main Jun 28, 2024
4 checks passed