
Phi-3 support #1672

Closed
Theodotus1243 opened this issue Apr 23, 2024 · 9 comments


Theodotus1243 commented Apr 23, 2024

A powerful model trained on synthetic data, with a high MMLU score.

The 4K-context-window variant should be easier to support, since it doesn't use LongRoPE.

https://huggingface.co/microsoft/Phi-3-mini-4k-instruct
https://arxiv.org/pdf/2404.14219.pdf


BBC-Esq commented Apr 24, 2024

I second this. The current Phi loader is broken, apparently because of changes Microsoft made to the model after its initial release. At any rate, adapting the Phi loader to the new Phi-3 should be easier than starting from scratch.

jncraton (Contributor) commented

For anyone else researching this, Phi-3 support has been added to the convert_hf_to_gguf.py script in llama.cpp. Perhaps something can be gleaned from there to simplify the implementation of the CT2 converter.
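
For context, one kind of tensor surgery such a converter likely has to perform (this is my assumption from the Phi-3 checkpoint layout, not taken from the actual CT2 or llama.cpp code) is un-fusing weights: Phi-3 stores the Q, K, and V projections as a single fused qkv_proj tensor, whereas a Llama-2-style layout keeps them separate. A toy-sized sketch:

```python
import numpy as np

# Toy-sized stand-in for Phi-3's fused attention projection.
# The real tensor would be (3 * hidden_size, hidden_size); here
# hidden = 4 keeps the example readable.
hidden = 4
qkv_proj = np.arange(3 * hidden * hidden, dtype=np.float32).reshape(3 * hidden, hidden)

# Split the fused weight back into separate Q, K, V matrices,
# the layout a Llama-2-style converter expects.
q_proj, k_proj, v_proj = np.split(qkv_proj, 3, axis=0)

print(q_proj.shape, k_proj.shape, v_proj.shape)
```

Phi-3 similarly fuses the MLP gate and up projections into gate_up_proj, which would need the same treatment.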

vince62s (Member) commented

No worries, it will be done; it's quite easy for the mini-4k since it uses the same architecture as Llama 2.
fyi: https://forum.opennmt.net/t/phi-3-3-8b-llama2-7b-ensemble-just-for-fun/5729


BBC-Esq commented Apr 24, 2024

Is it done yet? I've been waiting patiently for approximately two hours now. ;-)

minhthuc2502 (Collaborator) commented

Hello, I am working on it. Some unexpected problems have appeared.


BBC-Esq commented Apr 25, 2024

I'm not skilled enough to help directly by implementing the code, but if you want me to do any grunt work or research, let me know, dude. Anything to help speed up the process. Thanks!


BBC-Esq commented Apr 25, 2024

I'd like to start learning so I can eventually help. Question: how do I get the actual model architecture to start with? My understanding is that learning a model's structure, what activation functions it uses, and so on, is key to writing additional converters down the road. For example, here's a link:

https://bbycroft.net/llm

Here are some other links I've been gathering toward the goal of eventually contributing a converter, based on first trying to understand the structure of LLMs:

https://github.com/mert-kurttutan/torchview

https://github.com/lutzroeder/netron

Hugging Face sometimes (but not always) has information like this:

[attached screenshot]

Basically, is there any good starting point you'd recommend, dude? Thanks!


BBC-Esq commented Apr 25, 2024

Remember, you're dealing with an idiot who doesn't do this professionally and has never taken an LLM 101 class in college, let alone earned a doctoral degree. ;-) I don't even know what "mlp.down" or "layernorm.weight" means, for example, but I'm willing to learn.
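
To demystify names like "mlp.down_proj" and "layernorm.weight": they are just the dotted paths PyTorch assigns to a module's parameters. A toy block (my own illustrative naming, loosely echoing the Llama/Phi layer layout, not the exact Phi-3 module names) shows where they come from:

```python
import torch.nn as nn

# A toy decoder sub-block whose submodule names loosely echo the
# Llama/Phi naming scheme (illustrative only, not the real layout).
class ToyBlock(nn.Module):
    def __init__(self, d_model=16, d_ff=32):
        super().__init__()
        self.input_layernorm = nn.LayerNorm(d_model)  # -> "input_layernorm.weight"
        self.mlp = nn.ModuleDict({
            "up_proj": nn.Linear(d_model, d_ff),      # -> "mlp.up_proj.weight"
            "down_proj": nn.Linear(d_ff, d_model),    # -> "mlp.down_proj.weight"
        })

block = ToyBlock()
names = {name for name, _ in block.named_parameters()}
for name in sorted(names):
    print(name)
```

Running `named_parameters()` on a real checkpoint (or just `print(model)` after loading it with transformers) dumps the full module tree the same way, which is exactly the map a converter has to translate.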

minhthuc2502 (Collaborator) commented

PR #1680 adds the converter for Phi-3.
