Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The Jacobi method and its corresponding code #20

Closed
YingHH1 opened this issue Nov 27, 2023 · 2 comments
Closed

The Jacobi method and its corresponding code #20

YingHH1 opened this issue Nov 27, 2023 · 2 comments

Comments

@YingHH1
Copy link

YingHH1 commented Nov 27, 2023

In the blog, the Jacobi method is directed to the wiki page on Jacobi iterative method for solving linear system. But here we are solving nonlinear system of equations, so what exactly is the the algorithm? Could it be the Gauss-Jacobi method described in https://www.researchgate.net/profile/Romulo-Chumacero/publication/265633573_Solving_Nonlinear_Equations/links/556e350708aec2268308c3d3/Solving-Nonlinear-Equations.pdf

Furthermore, if it is an iterative method for solving nonlinear equations, then wouldn't the solutions being floating point numbers rather than the integers in input_ids for LLM?

I would like to understand the overall structure of the code implementation, and find out more details of the whole Lookahead decoding algorithm (including the solving nonlinear equations using Jacobi method part). However, I couldn't locate the relevant code, so any direction on where to look for code for implementing the Jacobi method would be greatly appreciated.

Many thanks for the help!

@Viol2000
Copy link
Collaborator

Hi, thanks for your interest!

I think Jacobi is similar in non-linear and linear systems. If every problem in the system is a linear function, it is a linear system, or it will be a non-linear system. Moreover, I think Gauss-Seidel and Jacobi are different on the parallelism side, and Jacobi is more parallelized. You can take a look at these two papers: https://proceedings.mlr.press/v139/song21a/song21a.pdf and https://arxiv.org/pdf/2305.10427.pdf.

The previous methods tried to apply Jacobi-iteration in accelerating the language model decoding process, but I think they do not have a very visible speedup. I think one reason is the floating point problem. The Jacobi method does not work well as the input and output are discrete integers rather than floating points in the LLM decoding process.

Our method, though motivated by Jacobi iteration. It is actually different from it. The lookahead branch is like a delayed Jacobi iteration (I do not know its exact name) with several inputs from a much earlier step rather than the last step to solve several non-linear systems (LLM) to obtain output for each window slot. And we capture the tokens from this process. Here is another model detailed discuss. When the N-grams's N=2, it becomes a Jacobi iteration. Here, you can simply think that LLM decoding one token solves its non-linear problem. Parallelly decoding several consecutive tokens solves a non-linear system in parallel or in a Jacobi style.

Suppose you want to dive into the code. I think you should shortly discard the Jacobi iteration and focus on our algorithm instead because it is not exactly the Jacobi you want. You can take a look at and track this variable past_tokens. It updates as a Jacobi iteration to obtain new n-grams.

@YingHH1
Copy link
Author

YingHH1 commented Nov 30, 2023

Thank you for the response. I think I understand the whole algorithm now.

@YingHH1 YingHH1 closed this as completed Nov 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants