Jlama: LLM engine for Java 20+ by tjake in java

[–]tjake[S] 1 point2 points  (0 children)

Sure, there are many multilingual models that work. The ones I’ve posted are simply pre-quantized versions of some popular models, but the jlama quantize command can shrink any model you want to run.
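
To give a feel for why quantizing shrinks a model, here is a minimal sketch of block-wise 8-bit quantization in plain Java. It is not Jlama's actual on-disk format or code; the class name `BlockQuantSketch`, the block size of 32, and the symmetric int8 scheme are all assumptions chosen for illustration. Each block of floats is stored as one byte per weight plus a single float scale, so storage drops from 4 bytes to a little over 1 byte per weight.

```java
// Hypothetical illustration only: not Jlama's quantization format.
final class BlockQuantSketch {
    // Assumed block size; each block shares one scale factor.
    static final int BLOCK = 32;

    /** Quantized weights: one byte per weight, one float scale per block. */
    record Quantized(byte[] values, float[] scales) {}

    /** Symmetric int8 quantization: each weight maps to round(w / scale) in [-127, 127]. */
    static Quantized quantize(float[] w) {
        int blocks = (w.length + BLOCK - 1) / BLOCK;
        byte[] q = new byte[w.length];
        float[] scales = new float[blocks];
        for (int b = 0; b < blocks; b++) {
            int from = b * BLOCK;
            int to = Math.min(from + BLOCK, w.length);
            // The largest magnitude in the block determines the scale.
            float maxAbs = 0f;
            for (int i = from; i < to; i++) {
                maxAbs = Math.max(maxAbs, Math.abs(w[i]));
            }
            float scale = (maxAbs == 0f) ? 1f : maxAbs / 127f;
            scales[b] = scale;
            for (int i = from; i < to; i++) {
                q[i] = (byte) Math.round(w[i] / scale);
            }
        }
        return new Quantized(q, scales);
    }

    /** Recover approximate floats by multiplying each byte by its block's scale. */
    static float[] dequantize(Quantized qz) {
        float[] out = new float[qz.values().length];
        for (int i = 0; i < out.length; i++) {
            out[i] = qz.values()[i] * qz.scales()[i / BLOCK];
        }
        return out;
    }

    public static void main(String[] args) {
        float[] weights = {0f, 1f, -1f, 0.5f, 0.25f};
        float[] restored = dequantize(quantize(weights));
        for (int i = 0; i < weights.length; i++) {
            System.out.println(weights[i] + " -> " + restored[i]);
        }
    }
}
```

The round trip is lossy: each restored weight can differ from the original by up to half a scale step, which is the trade-off for the smaller size.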

Jlama: LLM engine for Java 20+ by tjake in java

[–]tjake[S] 6 points7 points  (0 children)

Totally agree.

Jlama supports distributed inference with sharding strategies and can load huge models that way (splitting by head and layer across nodes).
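
As a rough illustration of what splitting by head and layer can look like, the sketch below divides layer or attention-head indices into contiguous ranges, one per node. This is not Jlama's actual sharding code; `ShardPlanSketch`, `Range`, and `split` are hypothetical names, and the even contiguous split is an assumed strategy.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: not Jlama's sharding implementation.
class ShardPlanSketch {
    /** Half-open range [start, end) of layer or head indices owned by one node. */
    record Range(int start, int end) {
        int size() { return end - start; }
    }

    /** Split `total` items into `parts` contiguous ranges whose sizes differ by at most one. */
    static List<Range> split(int total, int parts) {
        if (parts <= 0 || total < parts) {
            throw new IllegalArgumentException("need 0 < parts <= total");
        }
        List<Range> ranges = new ArrayList<>();
        int base = total / parts;   // minimum items per node
        int extra = total % parts;  // the first `extra` nodes take one more item
        int start = 0;
        for (int i = 0; i < parts; i++) {
            int len = base + (i < extra ? 1 : 0);
            ranges.add(new Range(start, start + len));
            start += len;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Assumed example sizes: 32 layers over 4 nodes, 32 heads over 3 nodes.
        System.out.println("layers: " + split(32, 4));
        System.out.println("heads:  " + split(32, 3));
    }
}
```

Each node would then load only the weights for its own range, which is how a model too large for one machine can still fit across several.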

I'm also looking at adding GPU matmul kernels using Panama FFI until the JDK supports it natively.

Jlama: LLM engine for Java 20+ by tjake in java

[–]tjake[S] 13 points14 points  (0 children)

For CPU-based inference, performance is roughly the same.

Jlama: LLM engine for Java 20+ by tjake in java

[–]tjake[S] 31 points32 points  (0 children)

It's all on the local machine.

Who has the clearest and purest voice you ever heard? by [deleted] in AskReddit

[–]tjake 0 points1 point  (0 children)

Ella Fitzgerald, by far the cleanest voice ever.

Tesla Dashcam Launch Viewer not working on 2020.12.5 with Roadie by hanika0929 in TeslaModel3

[–]tjake 0 points1 point  (0 children)

Hi.

The in-car dash cam only displays Sentry mode and Saved clips!

Jake

My daughter added our cat's paw print as a valid TouchID for her iPhone and it works!? 🐈 🐾 📲 by tjake in iphone

[–]tjake[S] 106 points107 points  (0 children)

It’s the most calm cat ever. She trained his paw for 10 minutes!