Phi-3 support #1672
Comments
I second this. The current phi loader is broken, apparently because of some changes that Microsoft made to the model after it was initially released. At any rate, adapting the phi loader to the new phi3 should be easier than starting from scratch.
For anyone else researching this, phi3 support has been added to the convert_hf_to_gguf.py script in llama.cpp. Perhaps something can be gleaned from there to simplify the implementation of the ct2 converter.
No worries, it will be done; it's quite easy for the mini-4k since it uses the same architecture as Llama 2.
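For anyone who wants to check the "same architecture as Llama 2" claim themselves, here is a rough sketch that compares the Hugging Face configs of the two models. The Llama-2 repo is gated, so substitute any Llama-2 checkpoint you can access, and the exact set of config fields shown is an assumption that may vary between checkpoint revisions.

```python
# Rough comparison of the Phi-3-mini-4k and Llama-2 configs.
# Note: meta-llama/Llama-2-7b-hf is gated; swap in any Llama-2 checkpoint
# you have access to. Missing fields print as None.
from transformers import AutoConfig

phi3 = AutoConfig.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct", trust_remote_code=True
)
llama2 = AutoConfig.from_pretrained("meta-llama/Llama-2-7b-hf")

fields = (
    "hidden_size",
    "intermediate_size",
    "num_hidden_layers",
    "num_attention_heads",
    "num_key_value_heads",
    "hidden_act",
    "rms_norm_eps",
    "rope_theta",
    "vocab_size",
)
for field in fields:
    print(f"{field}: phi3={getattr(phi3, field, None)}  llama2={getattr(llama2, field, None)}")
```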
Is it done yet? I've been waiting patiently for approximately two hours now. ;-)
Hello, I am working on it. Some unexpected problems have appeared.
I'm not skilled enough to help directly by implementing the code, but if you want me to do any grunt work or research, let me know, dude... anything to help speed up the process. Thanks!
I'd like to start learning so I can eventually help. Question: how do I get a look at the actual model architecture to start with? It's my understanding that learning the model's structure, which activation functions are used, and so on is key to writing additional converters down the road. For example, here are some links I've been gathering with the goal of eventually contributing a converter, based on first trying to understand the structure of LLMs:
https://github.com/mert-kurttutan/torchview
https://github.com/lutzroeder/netron
Hugging Face sometimes (but not always) has this kind of information. Basically, is there any good starting point you'd recommend, dude? Thanks!
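Not an official recipe, but one low-tech way to see a model's structure, without torchview or netron, is to load it with transformers and print the module tree plus the parameter names and shapes; that is also where names like "mlp.down_proj" or "input_layernorm.weight" come from. A minimal sketch, assuming you are fine with downloading the full weights (the checkpoint name is just an example):

```python
# Print a model's module tree and its parameter names/shapes.
# Loading the model downloads the full weights (several GB for Phi-3-mini).
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    torch_dtype=torch.float32,
    trust_remote_code=True,  # Phi-3 shipped with custom modeling code at release
)

# The nested module tree: embeddings, attention blocks, MLPs, norms, lm_head.
print(model)

# The flat view a converter actually works with: weight names and shapes.
for name, param in model.named_parameters():
    print(name, tuple(param.shape))
```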
Remember, you're dealing with an idiot who doesn't do this for a living and has never taken an LLM 101 class in college, let alone earned a doctoral degree. ;-) I don't even know what "mlp.down" or "layernorm.weight" mean, for example, but I'm willing to learn.
PR #1680 adds the converter for Phi-3.
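Assuming the converter from PR #1680 lands, usage would presumably follow the standard CTranslate2 Transformers workflow; treat this as a sketch rather than the PR's actual code. The output directory name, the chat prompt template, and whether trust_remote_code is still required are assumptions on my part:

```python
import ctranslate2
import transformers

# 1. Convert the Hugging Face checkpoint to the CTranslate2 format
#    (only works once Phi-3 support is merged into the converter).
converter = ctranslate2.converters.TransformersConverter(
    "microsoft/Phi-3-mini-4k-instruct",
    trust_remote_code=True,  # assumption: Phi-3 used custom modeling code at release
)
converter.convert("phi3-mini-4k-ct2", quantization="int8")

# 2. Generate with the converted model.
generator = ctranslate2.Generator("phi3-mini-4k-ct2", device="cpu")
tokenizer = transformers.AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

# Assumed Phi-3 chat template; check the model card for the exact format.
prompt = "<|user|>\nWhat does CTranslate2 do?<|end|>\n<|assistant|>\n"
tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode(prompt))

results = generator.generate_batch([tokens], max_length=128)
print(tokenizer.decode(tokenizer.convert_tokens_to_ids(results[0].sequences[0])))
```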
Original issue description:
A powerful model trained on synthetic data, with a high MMLU score. The 4K-context variant should be the easier one to support, since it does not use LongRoPE.
https://huggingface.co/microsoft/Phi-3-mini-4k-instruct
https://arxiv.org/pdf/2404.14219.pdf
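On the LongRoPE point: whether a given Phi-3 checkpoint needs it can, in principle, be read from its config, since the long-context variant carries a rope_scaling entry while the 4k variant should not. A hedged sketch; the exact contents of rope_scaling depend on the checkpoint revision:

```python
# Check which Phi-3 checkpoints declare a rope scaling scheme in their config.
from transformers import AutoConfig

for repo in (
    "microsoft/Phi-3-mini-4k-instruct",
    "microsoft/Phi-3-mini-128k-instruct",
):
    config = AutoConfig.from_pretrained(repo, trust_remote_code=True)
    scaling = getattr(config, "rope_scaling", None)
    kind = scaling.get("type") if isinstance(scaling, dict) else None
    print(f"{repo}: rope_scaling={kind}  max_position_embeddings={config.max_position_embeddings}")
```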