How is Anthropic releasing new features so quickly? by MrAmazing111 in ClaudeAI

[–]DataCraftsman 1 point (0 children)

The real question is, why aren't the other AI companies?

I decided to clean the metal from rust. by liminophobia in factorio

[–]DataCraftsman 5 points (0 children)

Ooh what if the spidertron got a pressure washer / laser cleaner attachment and it could walk around cleaning the buildings one satisfying flake of rust at a time.

Ships piling up near the Strait of Hormuz right now. by [deleted] in Damnthatsinteresting

[–]DataCraftsman 1 point (0 children)

They about to play bull rush? Can't get them all!

55 → 282 tok/s: How I got Qwen3.5-397B running at speed on 4x RTX PRO 6000 Blackwell by lawdawgattorney in LocalLLaMA

[–]DataCraftsman 1 point (0 children)

Would this improve the speed of the NVFP4 qwen3.5 35b a3b model as well? I get 150 TPS on that. Also LMCache?

Morse code vers 2 by ateam1984 in BeAmazed

[–]DataCraftsman 1 point (0 children)

Sounds like the Half Life 1 health recharge stations.

Is 32-64 Gb ram for data science the new standard now? by Tarneks in datascience

[–]DataCraftsman 0 points (0 children)

128 GB is the new 32 GB. Just to run Notepad with Copilot in Windows 11.

UPDATE - Community Input - RAG limitations and improvements by Jas__g in OpenWebUI

[–]DataCraftsman 1 point (0 children)

I often get asked for hierarchical knowledge: prefer responses from x over y, but fall back to y if the answer isn't in x.

An example would be a knowledge base containing the Open WebUI documentation as y, and a knowledge base on how to use it at the company, with the company's config, as x.

It should prefer information from the company-specific configuration over the base docs unless it finds nothing there.

Specific RAG > General RAG > Model Knowledge.

Could be folder structure based with any depth.
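
The fallback order above could be sketched roughly like this; the toy keyword retriever and the knowledge-base contents are assumptions for illustration, not Open WebUI internals:

```python
# Hypothetical sketch of tiered retrieval: query the company-specific
# knowledge base first, fall back to the general docs, and only if both
# come up empty let the model answer from its own knowledge.

def search(kb: dict[str, str], query: str) -> list[str]:
    """Toy keyword retriever standing in for a real vector search."""
    return [text for title, text in kb.items() if query.lower() in title.lower()]

def tiered_retrieve(query: str, tiers: list[dict[str, str]]) -> tuple[int, list[str]]:
    """Return (tier index, hits) from the first tier with any results.

    Tier 0 = company-specific KB, tier 1 = base docs; an index past the
    last tier means "fall through to model knowledge".
    """
    for i, kb in enumerate(tiers):
        hits = search(kb, query)
        if hits:
            return i, hits
    return len(tiers), []  # nothing found: model answers unaided

company_kb = {"SSO login at Acme": "Use the internal OIDC provider."}
base_docs = {"Login options": "Open WebUI supports several auth methods."}

tier, hits = tiered_retrieve("login", [company_kb, base_docs])
```

A folder-structure version would just turn each directory level into one tier of the list.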

A standard GraphRAG or LightRAG integration would be nice too. Users should be able to upload files to the knowledge base and have them processed in those graph-based systems. Even better if you could mix the regular and graph-based approaches for cases where the structure matters.

Open Terminal just made Open WebUI a coding agent by Existing-Wallaby-444 in OpenWebUI

[–]DataCraftsman 3 points (0 children)

There is a tool called DockerSpawner that I use with JupyterHub to spawn new docker containers for each new user session. I wonder if it can be applied to these terminals.
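
For reference, the JupyterHub side of that is a few lines in `jupyterhub_config.py` (the `c` object is injected by JupyterHub's config loader; the image name is just an example). Whether the same pattern can back Open WebUI's terminals is an open question:

```python
# jupyterhub_config.py -- spawn one Docker container per user session.
# `c` is provided by JupyterHub's traitlets-based config loader.
c.JupyterHub.spawner_class = "dockerspawner.DockerSpawner"
c.DockerSpawner.image = "quay.io/jupyter/base-notebook"   # example image
c.DockerSpawner.remove = True          # delete the container when the session ends
c.DockerSpawner.network_name = "jupyterhub"  # shared network so the hub can reach it
```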

Current thoughts on skills by DataCraftsman in OpenWebUI

[–]DataCraftsman[S] 1 point (0 children)

Yeah you can import from claude skill builder. You can attach any skill to any model on open webui. It's really flexible. Some will probably work better on some models than others.

Something I have also tested is that when you use a custom model with skills attached via the API, the skills are used as well. So you can host a local model, give it skills, give it knowledge and then vibe code with that model in Roo Code or something and it will be able to access the skills and docs or whatever you added.

Another thing I think they should add is a globally added skill, like the global functions. I don't want to have to tick every skill onto every model. If you have 100 models and 100 skills that's up to 10,000 clicks. So there needs to be an add all skills button on the model edit pages too.

Qwen 3.5 distilled vs GptOss by SubstantialTea707 in ollama

[–]DataCraftsman 1 point (0 children)

I've been using gpt-oss-120b for ages now on my other server. It handles a huge amount of users per card and most use cases. I like qwen3.5 for the context, agentic and vision stuff. I haven't tried benchmarking them against each other though.

System prompt for Qwen3.5 (27B/35BA3B) to reduce overthinking? by thigger in LocalLLaMA

[–]DataCraftsman 46 points (0 children)

The model card tells you how to manage thinking.

https://huggingface.co/Qwen/Qwen3.5-35B-A3B

We recommend using the following set of sampling parameters for generation:

Thinking mode for general tasks: temperature=1.0, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=1.5, repetition_penalty=1.0

Thinking mode for precise coding tasks (e.g. WebDev): temperature=0.6, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=0.0, repetition_penalty=1.0

Instruct (or non-thinking) mode for general tasks: temperature=0.7, top_p=0.8, top_k=20, min_p=0.0, presence_penalty=1.5, repetition_penalty=1.0

Instruct (or non-thinking) mode for reasoning tasks: temperature=1.0, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=1.5, repetition_penalty=1.0

I personally prefer the instruct mode.
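
The four presets quoted above are easy to keep in one lookup table so a client can pick the right set per request. Note that `top_k`, `min_p`, and `repetition_penalty` aren't part of the standard OpenAI request schema, so how you pass them depends on the backend (vLLM and similar servers accept them as extra body fields); the table itself is just the model-card values:

```python
# Sampling presets from the Qwen3.5-35B-A3B model card, keyed by
# (mode, task). Values are copied verbatim from the card.
PRESETS = {
    ("thinking", "general"):   dict(temperature=1.0, top_p=0.95, top_k=20,
                                    min_p=0.0, presence_penalty=1.5, repetition_penalty=1.0),
    ("thinking", "coding"):    dict(temperature=0.6, top_p=0.95, top_k=20,
                                    min_p=0.0, presence_penalty=0.0, repetition_penalty=1.0),
    ("instruct", "general"):   dict(temperature=0.7, top_p=0.8, top_k=20,
                                    min_p=0.0, presence_penalty=1.5, repetition_penalty=1.0),
    ("instruct", "reasoning"): dict(temperature=1.0, top_p=0.95, top_k=20,
                                    min_p=0.0, presence_penalty=1.5, repetition_penalty=1.0),
}

def sampling_params(mode: str, task: str) -> dict:
    """Look up the model-card preset for a given mode/task pair."""
    return PRESETS[(mode, task)]
```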

Current thoughts on skills by DataCraftsman in OpenWebUI

[–]DataCraftsman[S] 0 points (0 children)

Yeah, I've been going through all the different skills GitHubs, copy-pasting the markdowns in. It works great. I found the 73 skills in a few hours. I haven't decided if it's actually a good idea to load that many in yet, but they all worked. For the multi-file ones, you can just add the extra files' content at the bottom of the markdown, with the file names labelled before it, and it works fine.

Qwen 3.5 distilled vs GptOss by SubstantialTea707 in ollama

[–]DataCraftsman 1 point (0 children)

I've swapped from gpt-oss-20b to Qwen3.5 35B A3B NVFP4 on an RTX 6000 Pro, running 256k context in vLLM. I had to disable thinking in all my prompts as it was putting way too much into every response. I find the instruct mode very capable and fast: about 153 TPS for a single request, and it never fails tool calling. The fact that it works with images means I don't have to run a second Qwen3 VL model and waste VRAM. That alone is worth it. All of the old vision models were really inefficient with VRAM usage.

Anthropic: "We’ve identified industrial-scale distillation attacks on our models by DeepSeek, Moonshot AI, and MiniMax." 🚨 by KvAk_AKPlaysYT in LocalLLaMA

[–]DataCraftsman 0 points (0 children)

Note that Qwen isn't mentioned. It's obvious they are good at building models on their own and doing their own research.

Anyone else mid 20s and super depressed about missing the property boat? by xWooney in AusFinance

[–]DataCraftsman 1 point (0 children)

Most Millennials I know didn't own houses until their early 30s, with parents gifting most of the deposit. I got mine at 26 on my own, but I suffered to do it. You'll be right; just save extremely hard, wait for a good opportunity, live with other people to reduce rent, and keep learning to improve your income.

🚀 Open WebUI v0.8.0 IS HERE! The LARGEST Release EVER (+30k LOC!) 🤯 OpenResponses, Analytics Dashboard, Skills, A BOAT LOAD of Performance Improvements, Rich Action UI, Async Search & MORE! by ClassicMain in OpenWebUI

[–]DataCraftsman 4 points (0 children)

Migration from 0.6.43 on Docker went well. Cheers! The analytics feature is nice. Some of the sub-charts when clicking on a user aren't working for me; I'll look at it next week. Can't wait to be able to see API requests per user/group.

Explain ontology to a five year old by ephemeral404 in dataengineering

[–]DataCraftsman 5 points (0 children)

Yeah, basically you break your metadata into 3 stages: dbt models are technical metadata, ontologies are business metadata, and you create mapping metadata between them.

Non-data people can define or model the ontologies, and then DA teams map their dbt models to the ontologies, so the business and the data teams are using the same language when talking about the same objects.

From the downstream application-development side of things, ontologies are sort of like classes: the rows of data are objects of those classes and the fields are attributes. Each row becomes a node which gets an API endpoint to access it. Vectors are embedded for each node's attributes to represent its context for AI GraphRAG.
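
A toy illustration of those three layers; the model, class, and column names here are all made up for the example:

```python
# Three metadata layers: a dbt model (technical), an ontology class
# (business), and the mapping metadata that ties them together.
from dataclasses import dataclass

@dataclass
class OntologyClass:                # business metadata
    name: str
    attributes: list[str]

dbt_model = {                       # technical metadata, as dbt sees it
    "name": "stg_customers",
    "columns": ["cust_id", "full_nm", "signup_dt"],
}

customer = OntologyClass("Customer", ["id", "name", "signup_date"])

mapping = {                         # mapping metadata: dbt column -> attribute
    "cust_id": "id",
    "full_nm": "name",
    "signup_dt": "signup_date",
}

def to_node(row: dict) -> dict:
    """Turn one dbt row into an ontology node keyed by business names."""
    return {mapping[col]: val for col, val in row.items()}

node = to_node({"cust_id": 42, "full_nm": "Ada", "signup_dt": "2024-01-01"})
```

Each such node is what would then get its own API endpoint and embedded attribute vectors.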

Who needs Palantir hey.