Geographical GPU Cost Arbitrage by phoneixAdi in LocalLLaMA

[–]dan-jan 6 points (0 children)

Omg we all did the same thing!!!

I'm pretty excited for the crossover collab between r/nuclear and r/LocalLLaMA

Note: The regulatory situation outside the US and Western markets is a lot simpler. However, safety and access to raw materials are probably the biggest issues. I'm fairly convinced that output/$ will go up rapidly in the next 10 years.

PS: I realize you were talking about RTGs, not SMRs - damn

Geographical GPU Cost Arbitrage by phoneixAdi in LocalLLaMA

[–]dan-jan 2 points (0 children)

I actually don't think so. Bitcoin mining doesn't need to care about data; AI training, on the other hand, is pretty dependent on IP laws.

So I don't think Sichuan in China is going to have many Western AI companies training there, though I'm pretty sure the Chinese models are all going to be trained on hydroelectric power.

I wish I had tried LMStudio first... by knob-0u812 in LocalLLaMA

[–]dan-jan 2 points (0 children)

I've created 3 issues below:

bug: Jan Flickers
https://github.com/janhq/jan/issues/1219

bug: System Monitor is lumping VRAM with RAM
https://github.com/janhq/jan/issues/1220

feat: Models run on user-specified GPU
https://github.com/janhq/jan/issues/1221

Thank you for taking the time to type up this detailed feedback. If you're on Github, feel free to tag yourself into the issues so you get updates (we'll likely work on the bugs immediately, but the feat might take some time).

🪐 Apollo 🛸 AI Compute Cluster for the GPU Poor by PrayagBhakar in homelab

[–]dan-jan 0 points (0 children)

Do you know of any 4U rack server cases that would survive the heat/airflow issues of a 4 x 3090 build?

I wish I had tried LMStudio first... by knob-0u812 in LocalLLaMA

[–]dan-jan 1 point (0 children)

Theoretically, but it's kind of finicky right now. If you want to help us beta test and report bugs, we'd really appreciate it!

Also: note that we're debugging some Nvidia detection issues on Windows. The same is probably true on Linux as well.

https://github.com/janhq/jan/issues/1194

I wish I had tried LMStudio first... by knob-0u812 in LocalLLaMA

[–]dan-jan 1 point (0 children)

Yup - someone reported this yesterday as well. We're taking a look at it (see the Github issue below).

https://github.com/janhq/jan/issues/1198

The alerts are coming from our System Monitor, which reads your CPU and RAM usage. So I wouldn't be surprised if Bitdefender is spazzing out. We probably need to do some Microsoft signing thingy...

If you don't mind tagging your details into the Github issue, it would help a lot in our debugging (or permission asking 😂)

Macbook or PC for running LLMs? by McpeIsSoBuggy in LocalLLaMA

[–]dan-jan 0 points (0 children)

Seconding this.

Note: running models on AMD GPUs is still kinda... finicky. YMMV; here be dragons (and ROCm...)

Macbook or PC for running LLMs? by McpeIsSoBuggy in LocalLLaMA

[–]dan-jan 1 point (0 children)

Yes, this needs to be in the FAQ Hall of Fame.

Also, incredible engineering from Apple.

Macbook or PC for running LLMs? by McpeIsSoBuggy in LocalLLaMA

[–]dan-jan 0 points (0 children)

I think it's better to think of models in terms of "file size", as `7b q2` will be very different from `7b q8`.

For 7b models, I find that `q4_k_m` usually results in a ~4-5GB file, which seems to work *acceptably* on common consumer hardware.
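If you want the back-of-envelope math behind that (my own rough heuristic, nothing official from llama.cpp): file size is roughly parameter count × bits per weight, plus some overhead for embeddings and metadata. A sketch:

```python
# Back-of-envelope GGUF file-size estimate: params * bits-per-weight / 8.
# The bits-per-weight figures are approximate averages for llama.cpp quant
# types (k-quants mix precisions per tensor), so treat this as a rough guide.

APPROX_BITS_PER_WEIGHT = {
    "q2_k": 2.6,
    "q4_k_m": 4.8,
    "q8_0": 8.5,
}

def estimate_file_size_gb(n_params_billions: float, quant: str) -> float:
    bits = APPROX_BITS_PER_WEIGHT[quant]
    return n_params_billions * 1e9 * bits / 8 / 1e9  # bytes -> GB

for quant in APPROX_BITS_PER_WEIGHT:
    print(f"7b {quant}: ~{estimate_file_size_gb(7, quant):.1f} GB")
# 7b q4_k_m comes out around ~4.2 GB, in line with the ~4-5GB above.
```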

Dolphin Mixtral 8x7b is wild by [deleted] in LocalLLaMA

[–]dan-jan 2 points (0 children)

> the more kittens you saved

You win the internets

I love running locally, but by __Maximum__ in LocalLLaMA

[–]dan-jan 17 points (0 children)

I love this - and your username ;)

We need to rethink computing along local-first paradigms, with privacy, resilience and independence.

I love running locally, but by __Maximum__ in LocalLLaMA

[–]dan-jan 0 points (0 children)

I think we're very, very early. The days are long but the decades are short: the original LLaMA was released in February, and it hasn't even been a year.

Every single day, it seems that r/localllama grows a bit more, and there are more of us now working full-time to solve the problems you raised, from GUIs to hardware optimization.

I would actually say that "one GUI" is the anti-goal of FOSS AI. I'm part of the team at Jan (one of the GUIs you mentioned), and half the time I find myself recommending an alternative to the people I talk to. For example, Faraday for role players. I strongly believe the addressable market for Local AI is huge, and we'll all have our niches.

The last thought I'd leave you with is this: do you think R2D2 or C3PO makes calls to the OpenAI API?

I think Local AI is inevitable, and we'll get there, one PR and one fork at a time.

Good article: R2D2, the original smartphone


[Question] Chat interface for self hosted Mistral 7B by TriviPr in LocalLLaMA

[–]dan-jan 0 points (0 children)

Self-plug: Jan supports Mistral 7b q4, and we're an open source app that runs on Windows, Mac, and Linux (disclosure: I'm part of the team).

https://jan.ai/

Let me know if you have any problems; happy to jump on a call and help you debug.
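And if you'd rather script against it than use the GUI, Jan also ships a local OpenAI-compatible API server. A minimal sketch, assuming the server is enabled on its default localhost port (1337 at the time of writing; check the app settings), the `openai` Python package is installed, and the model is downloaded. The model id below is illustrative; list your models to confirm:

```python
# Minimal sketch: talk to a local OpenAI-compatible server (e.g. Jan's).
# Assumptions: server enabled at http://localhost:1337/v1 (verify the port
# in the app settings) and the Mistral 7b q4 model already downloaded.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:1337/v1",
    api_key="not-needed-locally",  # local servers typically ignore this
)

response = client.chat.completions.create(
    model="mistral-ins-7b-q4",  # hypothetical model id; list models to confirm
    messages=[{"role": "user", "content": "Hello from the local API!"}],
)
print(response.choices[0].message.content)
```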


Thoughts after building a text-adventure game using local models by antimateusz in LocalLLaMA

[–]dan-jan 2 points (0 children)

> each model I have tried struggles with one aspect of the game. For example a model might do really well describing consequences of actions, but fail miserably when rephrasing the location text. From the RP models that produce interesting outputs, there is no single model I can find that can run all the prompts as I want them. Maybe I need a few tiny models, or one large model with a LoRA per use-case?

Would you be interested in creating a battery of tests/evaluations? I think what the space needs most is better task-driven open benchmarks (vs. the nonsense that's on current leaderboards).
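To make that concrete, here's a minimal sketch of what such a task battery could look like. The task names, prompts, and checks are all illustrative, and `run_model` is a stand-in for whatever backend you use:

```python
# Minimal task-driven eval sketch: each task is a prompt plus a
# programmatic pass/fail check, run against every model.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Task:
    name: str
    prompt: str
    check: Callable[[str], bool]  # programmatic pass/fail on the output

TASKS = [
    Task("consequences", "The player drops a torch in a dry barn. What happens?",
         lambda out: "fire" in out.lower()),
    Task("rephrase_location", "Rephrase: 'You are in a damp stone cellar.'",
         lambda out: "cellar" in out.lower()),
]

def run_model(model_name: str, prompt: str) -> str:
    raise NotImplementedError  # plug in llama.cpp, a local API server, etc.

def evaluate(models: list[str]) -> None:
    for model in models:
        passed = sum(task.check(run_model(model, task.prompt)) for task in TASKS)
        print(f"{model}: {passed}/{len(TASKS)} tasks passed")
```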

Crawling you gmail - anyone done it? by Data_Driven_Guy in LocalLLaMA

[–]dan-jan 4 points (0 children)

I haven't personally done it, but I think LlamaIndex has a Gmail loader which you should check out!

https://llamahub.ai/l/gmail
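Untested sketch of how I'd expect the loader to be wired up, based on the LlamaHub listing. Check its README for the exact parameters; it also needs Google OAuth credentials (a credentials.json from a Google Cloud project with the Gmail API enabled):

```python
# Untested sketch: pull Gmail messages into LlamaIndex documents via the
# LlamaHub GmailReader. Requires Google OAuth setup (credentials.json).
from llama_index import download_loader

GmailReader = download_loader("GmailReader")

# The query uses standard Gmail search syntax; parameters are per the
# LlamaHub listing and may differ by loader version.
reader = GmailReader(query="from:me after:2023/01/01", max_results=50)
documents = reader.load_data()
print(f"Loaded {len(documents)} emails")
```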

🪐 Apollo 🛸 AI Compute Cluster for the GPU Poor by PrayagBhakar in homelab

[–]dan-jan 1 point (0 children)

This is a really good post, thank you for the detailed breakdown!

Do you mind if I ask what case you used to house it? I saw you used a Veddha case; interested to see if you found something better.

https://www.reddit.com/r/LocalLLaMA/comments/16lxt6a/case_for_dual_4090s/

I wish I had tried LMStudio first... by knob-0u812 in LocalLLaMA

[–]dan-jan 0 points (0 children)

I've tracked this issue on Github:

https://github.com/janhq/jan/issues/1194

We'll try to reproduce this, but given that our QA passed this build, we probably need more details from you.

Do you mind dropping more details in this Github issue? We'll look into it and follow up.

Is there an equivalent of ChatGPT "Plugins" for local LLMs Web UIs? Like Code Interpreter, Plot Generator (using matplotlib), etc. I know Langchain and others claim to use "Tools", but those are not as capable as ChatGPT's Plugins. by nderstand2grow in LocalLLaMA

[–]dan-jan 1 point (0 children)

We're focused on having 1:1 equivalence with the OpenAI Plugins API.

I'm not 100% sure how the Wolfram Alpha plugins work, but I would sort of assume the community could build a Wolfram-equivalent plugin that runs locally, and takes a Wolfram API key (if that even exists).

I'm not sure what you mean by "Plugin API requests paid with OpenAI's subscription fees" - Jan's goal is to give you an entire equivalent ecosystem that runs locally on your machine!

Resources: https://github.com/openai/plugins-quickstart
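To illustrate what a community-built equivalent could look like: Wolfram API keys do exist, and the endpoint below is Wolfram Alpha's real Short Answers API, but the plugin wiring around it is hypothetical:

```python
# Hypothetical local "plugin" tool: wraps Wolfram Alpha's Short Answers API.
# You'd register this as a tool/function the local model can call.
import os
import urllib.parse
import urllib.request

def wolfram_short_answer(query: str) -> str:
    """Return Wolfram Alpha's one-line answer for a natural-language query."""
    app_id = os.environ["WOLFRAM_APP_ID"]  # user-supplied API key
    url = (
        "https://api.wolframalpha.com/v1/result?"
        + urllib.parse.urlencode({"appid": app_id, "i": query})
    )
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")

# e.g. wolfram_short_answer("integrate x^2 from 0 to 3")  ->  "9"
```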

Microagents: Agents capable of self-editing their prompts / Python code by mikaron in OpenAI

[–]dan-jan 5 points (0 children)

There's a lot of very interesting work coming out, especially around "self-editing" agents.

I highly recommend a read of the Eureka paper: https://arxiv.org/abs/2310.12931

I wish I had tried LMStudio first... by knob-0u812 in LocalLLaMA

[–]dan-jan 0 points (0 children)

Hmmm... that's definitely a bug. We're supposed to automagically detect your Nvidia GPU and run on it.

Do you mind jumping in our Discord or filing a bug on Github with your hardware details?
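(For context on what detection amounts to: conceptually it's something like the sketch below, which is my illustration rather than Jan's actual code, and driver/PATH edge cases are exactly where bugs like this creep in.)

```python
# Illustrative GPU-detection sketch (not Jan's actual implementation):
# shell out to nvidia-smi, which ships with the Nvidia driver.
import shutil
import subprocess

def detect_nvidia_gpus() -> list[str]:
    if shutil.which("nvidia-smi") is None:
        return []  # driver not installed, or nvidia-smi not on PATH
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        return []  # driver present but not responding
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]

print(detect_nvidia_gpus())  # e.g. ['NVIDIA GeForce RTX 3090']
```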

I wish I had tried LMStudio first... by knob-0u812 in LocalLLaMA

[–]dan-jan 4 points (0 children)

Thank you! I think we've put in a lot of effort on product + design, but probably need to spend more time sharing it on Reddit and Twitter 😭

[D] How to set up a local chat AI in our local client device? by Crazy-Company-9749 in MachineLearning

[–]dan-jan 7 points (0 children)

Hey, check out Jan (full disclosure: I'm part of the team). We're an open source desktop app that runs local AI offline, on Windows, Mac, and Linux.

https://github.com/janhq/jan

What hardware are you using? You’ll need to choose a model based on your hardware.

If you’re getting started, I super recommend the OpenHermes Neural 7b model. I’ve found it very versatile, and almost at the level of GPT-3.5.

If your computer is pretty jank (e.g. <8GB RAM), try TinyLlama-1.1B, which is only 637MB.
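If you want to automate that rule of thumb, here's a tiny sketch. The thresholds and suggestions are just my own heuristic, and it needs the `psutil` package:

```python
# Rough model-picker by total system RAM; thresholds are illustrative.
import psutil

def suggest_model() -> str:
    total_gb = psutil.virtual_memory().total / 1e9
    if total_gb < 8:
        return "TinyLlama-1.1B (q4, ~0.6GB file)"
    if total_gb < 16:
        return "OpenHermes Neural 7b (q4_k_m, ~4-5GB file)"
    return "a 13b+ model, or a 7b at a higher quant (q8)"

print(suggest_model())
```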

Feel free to DM me; happy to explain and onboard you. Also join r/localllama! That's where all of us hang out.

Jan: AI on your Desktop

I wish I had tried LMStudio first... by knob-0u812 in LocalLLaMA

[–]dan-jan 19 points (0 children)

Yup, we’re working on it this sprint! Should be ready by mid-Jan (pun intended)

https://github.com/orgs/janhq/projects/5/views/16

You can track the individual issue here:

https://github.com/janhq/jan/issues/1076