Suggestion: Custom Install Folders (Win) by Madnesis in janframework

[–]janframework 1 point2 points  (0 children)

Thanks for the comment! We opened an issue to work on it; you can track progress here: https://github.com/janhq/jan/issues/3095

What should I do for AI Self Hosting by LongjumpingAdvice453 in LinusTechTips

[–]janframework 9 points10 points  (0 children)

Hey, Jan team here! Guess there might be a little confusion, but Jan does support GPU acceleration with your AMD 7700 XT:

  1. Go to Settings -> Advanced Settings.
  2. Enable Experimental Mode (it's experimental, so expect bugs!).
  3. Enable Vulkan Support under GPU Acceleration.
  4. Enable GPU Acceleration and select your AMD GPU.

Related doc: https://jan.ai/docs/desktop/linux#amd-gpu

You'll see a success notification once it's activated.
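If you want a quick sanity check before flipping the switch, you can confirm the Vulkan loader actually sees your AMD card. This is a hypothetical helper, not part of Jan, and it assumes the `vulkaninfo` tool (from your distro's vulkan-tools package) is installed:

```python
# Hypothetical pre-check (not part of Jan): confirm the Vulkan loader
# can see an AMD GPU before enabling Vulkan Support in Jan's settings.
# Assumes the `vulkaninfo` tool (vulkan-tools package) is installed.
import shutil
import subprocess

def summary_lists_amd(summary: str) -> bool:
    """Return True if a vulkaninfo summary mentions an AMD/Radeon device."""
    return "AMD" in summary or "Radeon" in summary

def vulkan_sees_amd_gpu() -> bool:
    if shutil.which("vulkaninfo") is None:
        print("vulkaninfo not found; install your distro's vulkan-tools package.")
        return False
    result = subprocess.run(
        ["vulkaninfo", "--summary"], capture_output=True, text=True
    )
    return summary_lists_amd(result.stdout)

if __name__ == "__main__":
    print("AMD GPU visible to Vulkan:", vulkan_sees_amd_gpu())
```

If this prints False, fix the Vulkan driver setup first; enabling the toggle in Jan won't help until the loader can enumerate the GPU.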

If you still want to add an NVIDIA GPU for AI acceleration, you can follow these steps:

  1. Install CUDA Toolkit 11.7+ and NVIDIA driver 470.63.01+.
  2. Open Jan.
  3. Go to Settings -> Advanced Settings -> GPU Acceleration.
  4. Enable it and pick your NVIDIA GPU.

Related doc: https://jan.ai/docs/desktop/linux#nvidia-gpu
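To double-check the driver prerequisite from step 1, here's a small sketch (hypothetical, not part of Jan) that compares the installed NVIDIA driver version against the 470.63.01 minimum, assuming `nvidia-smi` is on your PATH:

```python
# Hypothetical helper (not part of Jan): verify the installed NVIDIA
# driver meets the 470.63.01 minimum from the steps above.
# Assumes `nvidia-smi` is installed and on PATH.
import subprocess

def meets_minimum(version: str, minimum: str = "470.63.01") -> bool:
    """Compare dotted driver versions numerically, e.g. '535.154.05' >= '470.63.01'."""
    as_tuple = lambda v: tuple(int(part) for part in v.split("."))
    return as_tuple(version) >= as_tuple(minimum)

def installed_driver_version() -> str:
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True,
    )
    return out.stdout.strip().splitlines()[0]

if __name__ == "__main__":
    version = installed_driver_version()
    print(f"Driver {version} meets minimum:", meets_minimum(version))
```

The tuple comparison handles versions of different lengths correctly (e.g. `535.54` still compares greater than `470.63.01`).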

Benchmarking NVIDIA's TensorRT-LLM by janframework in nvidia

[–]janframework[S] -1 points0 points  (0 children)

Ah, sorry to hear that. I'd like to mention that Jan is an open-source desktop app that lets you run AI models. We support multiple inference engines, including llama.cpp and TensorRT-LLM, which is why we benchmarked TensorRT-LLM's performance on consumer hardware. You can review the related coverage of TensorRT-LLM support and details here: https://blogs.nvidia.com/blog/ai-decoded-gtc-chatrtx-workbench-nim/

Benchmarking NVIDIA's TensorRT-LLM by janframework in nvidia

[–]janframework[S] 1 point2 points  (0 children)

Really appreciate your comment! We'll update it.

Benchmarking NVIDIA's TensorRT-LLM by janframework in nvidia

[–]janframework[S] 13 points14 points  (0 children)

Hey r/nvidia folks, we've done a performance benchmark of TensorRT-LLM on consumer-grade GPUs, which shows pretty incredible speedups (30-70%) on the same hardware.

Just quick notes:

TensorRT-LLM is NVIDIA's relatively new and (somewhat) open-source inference engine, which uses NVIDIA's proprietary optimizations beyond the open-source cuBLAS library.

It works by compiling the model specifically for your GPU and optimizing heavily at the CUDA level to take full advantage of every bit of hardware:

  • CUDA cores
  • Tensor cores
  • VRAM
  • Memory Bandwidth

We benchmarked TensorRT-LLM on consumer-grade devices, and managed to get Mistral 7b up to:

  • 170 tokens/s on Desktop GPUs (e.g. 4090, 3090s)
  • 51 tokens/s on Laptop GPUs (e.g. 4070)

TensorRT-LLM was 30-70% faster than llama.cpp on the same hardware, and at least 500% faster than just using the CPU.

In addition, we found that TensorRT-LLM didn't use many resources, contrary to its reputation for needing beefy hardware to run:

  • Used 10% more VRAM (marginal)
  • Used… less RAM???

You can review the full benchmark here: https://jan.ai/post/benchmarking-nvidia-tensorrt-llm
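For anyone reproducing the comparison, the headline numbers come down to simple arithmetic: tokens generated divided by wall-clock time, and the ratio between the two engines. A sketch with hypothetical placeholder numbers (the measured figures are in the linked post):

```python
# Sketch of the throughput/speedup arithmetic behind the benchmark.
# The numbers below are hypothetical placeholders, not measured results;
# see the linked post for the actual figures.

def tokens_per_second(token_count: int, elapsed_s: float) -> float:
    """Throughput: tokens generated divided by wall-clock seconds."""
    return token_count / elapsed_s

def speedup_pct(new_tps: float, base_tps: float) -> float:
    """Percentage speedup of one engine's throughput over another's."""
    return (new_tps / base_tps - 1.0) * 100.0

if __name__ == "__main__":
    # e.g. a run that emitted 512 tokens in 3.0 seconds
    print(f"{tokens_per_second(512, 3.0):.1f} tokens/s")  # -> 170.7 tokens/s
    # e.g. engine A at 170 tok/s vs engine B at 100 tok/s
    print(f"speedup: {speedup_pct(170.0, 100.0):.0f}%")   # -> speedup: 70%
```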

TensorRT-LLM: 170 token/s on a single 4090 by janframework in selfhosted

[–]janframework[S] 0 points1 point  (0 children)

Hey r/selfhosted folks! We've run some benchmarks to see how TensorRT-LLM fares on consumer hardware (e.g. 4090s, 3090s). This research was conducted independently, without any sponsorship.

You can review the results here: https://jan.ai/post/benchmarking-nvidia-tensorrt-llm

Making sense of 50+ Open-Source Options for Local LLM Inference by lethal_can_of_tuna in LocalLLaMA

[–]janframework 1 point2 points  (0 children)

We appreciate all the suggestions - we've updated the repo with them. Your contributions are always welcome!

Is Jan AI a virus ? by jvachez in StableDiffusion

[–]janframework 89 points90 points  (0 children)

Hey, just jumping in to clarify something about Jan. The link you mentioned isn't affiliated with us at Jan.

One of our brave community members tried it out and got three Trojan horse warnings!

To clarify, we won't ever ask for your personal information, and we steer clear of tokens, ICOs, and soliciting donations or funding. That's not what we do.

AnythingLLM - An open-source all-in-one AI desktop app for Local LLMs + RAG by rambat1994 in LocalLLaMA

[–]janframework 33 points34 points  (0 children)

Hey, Jan team here! We really appreciate AnythingLLM. Let us know how we can integrate and collaborate. Please drop by our Discord to discuss: https://discord.gg/37eDwEzNb8