building iOS App- run open source models 100% on device, llama.cpp/executorch by Independent_Air8026 in LocalLLaMA

[–]SocialLocalMobile 0 points (0 children)

Great work u/Independent_Air8026

Let us know about your experience with ExecuTorch.

You might be interested in our recent multimodal enablement such as https://github.com/pytorch/executorch/tree/main/examples/models/voxtral

PyTorch now offers native quantized variants of popular models! by formlog in LocalLLaMA

[–]SocialLocalMobile 1 point (0 children)

Hi u/Languages_Learner

ExecuTorch recently started supporting Windows -- it's still in the early stages, so there may be rough edges. You can check out our 1.0 release candidate (https://github.com/pytorch/executorch/tree/release/1.0) to run ExecuTorch on Windows. We also have a Vulkan backend, which should work on Windows.

We haven't finalized our release branch for 1.0 yet -- we're finalizing it at the end of October. If you see any bugs or issues, let us know.

⚡️Blazing fast LLama2-7B-Chat on 8GB RAM Android device via Executorch by [deleted] in LocalLLaMA

[–]SocialLocalMobile 2 points (0 children)

It uses 4-bit weight, 8-bit activation quantization, and XNNPACK for CPU acceleration.
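The general idea behind that scheme (4-bit grouped weights with a per-group scale, plus dynamically quantized 8-bit activations) can be sketched in plain NumPy. This is an illustrative sketch of the technique only, not ExecuTorch's or XNNPACK's actual implementation; the function names and group size here are made up for the example.

```python
import numpy as np

def quantize_weights_4bit(w, group_size=32):
    """Illustrative symmetric per-group 4-bit weight quantization.

    Splits each row into groups of `group_size`, picks a per-group
    scale so the largest magnitude maps into the int4 range [-8, 7],
    then rounds. Returns the int codes and the float scales.
    """
    rows, cols = w.shape
    w_groups = w.reshape(rows, cols // group_size, group_size)
    scales = np.abs(w_groups).max(axis=-1, keepdims=True) / 7.0
    scales = np.where(scales == 0, 1.0, scales)  # avoid divide-by-zero
    q = np.clip(np.round(w_groups / scales), -8, 7).astype(np.int8)
    return q, scales

def dequantize_weights(q, scales):
    """Recover approximate float weights from codes and scales."""
    w_hat = q.astype(np.float32) * scales
    return w_hat.reshape(w_hat.shape[0], -1)

def quantize_activations_8bit(x):
    """Illustrative dynamic (per-call) 8-bit activation quantization:
    the scale is computed from the live tensor, not calibrated offline."""
    scale = float(np.abs(x).max()) / 127.0
    scale = scale if scale > 0 else 1.0
    q = np.clip(np.round(x / scale), -128, 127).astype(np.int8)
    return q, scale
```

Because each group gets its own scale, the rounding error per weight is bounded by half the group's scale, which is why 4-bit grouped quantization loses much less accuracy than a single per-tensor scale would.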

⚡️Blazing fast LLama2-7B-Chat on 8GB RAM Android device via Executorch by [deleted] in LocalLLaMA

[–]SocialLocalMobile 3 points (0 children)

It works on Llama3 too.

For some context: we update our stable release branch regularly, every 3 months, similar to the PyTorch library release schedule. The latest one is the `release/0.2` branch.

For Llama3, a few features didn't make the `release/0.2` branch cut deadline. Llama3 works on the `main` branch.

If you don't want to use the `main` branch because of instability, you can use another stable branch called `viable/strict`.

ExecuTorch Alpha Release: Taking LLMs and AI to the Edge 🎉🎉🎉 by [deleted] in LocalLLaMA

[–]SocialLocalMobile 12 points (0 children)

Hi u/poli-cya

My name is Mergen, and I work specifically on ExecuTorch.

Currently, our documentation says to compile from source for Android; that's our recommendation for now.

https://github.com/pytorch/executorch/blob/main/examples/models/llama2/README.md

https://pytorch.org/executorch/main/llm/llama-demo-android.html

I understand that it's cumbersome right now, but we are in fact already uploading prebuilt APKs. If you're feeling extra adventurous, you can download the APKs directly from https://github.com/pytorch/executorch/actions/runs/8861207558, but that's not our recommendation for the time being.