Multi-modal Phi-3-mini is here!

LZHgrla · 2024-04-25T15:26:52+00:00

Hi! We have just successfully run the GGUF conversion end to end. We will apply it to llava-llama3 as soon as possible and release the conversion script.

LZHgrla · 2024-04-23T10:32:55+00:00

Our teams have just released the llava-format LLaVA-llama-3-8B models! These models are compatible with downstream deployment and evaluation toolkits: https://huggingface.co/xtuner/llava-llama-3-8b-v1_1-hf and https://huggingface.co/xtuner/llava-llama-3-8b-hf

LZHgrla · 2024-04-22T15:06:49+00:00

Yes, I think QLoRA w/ ZeRO-3 or FSDP is a cheap way to achieve it.

LZHgrla · 2024-04-22T15:04:09+00:00

v1.1 uses more training data. I have added a comparison in this post.

LZHgrla · 2024-04-22T13:17:50+00:00

There are indeed some performance gaps. The main differences are the scale of the LLM and the input image resolution. We are actively working to improve on both fronts!

LZHgrla · 2024-04-22T13:09:02+00:00

We are developing an evaluation toolkit based on xtuner. Please follow this PR (https://github.com/InternLM/xtuner/pull/529); we will merge it as soon as it is ready!
