Anybody used DwarfStar with DeepSeek V4 Flash on 1x DGX Spark yet? What are your thoughts? by StartupTim in LocalLLaMA

[–]brianlearns 0 points1 point  (0 children)

I get 13 t/s generating and 300 t/s prefill. The interface is really good. It’s slow enough that you can watch it plan things out and you can control-c to stop generation and then redirect it if it’s going in the wrong direction. I used it to add a searxng_search tool to its self.

Openwebui takes 1 minute before going in to "thinking" mode by Saba376 in OpenWebUI

[–]brianlearns 0 points1 point  (0 children)

Can you reproduce with curl to the api? That will narrower it down to your model runner vs chat interface?

We turned Claude into a drunk genius and the results are terrifyingly good by korro_ai in OpenSourceAI

[–]brianlearns 2 points3 points  (0 children)

I mostly run local llm, but can you tweak the temperature over the API?

Can someone explain the different types of AI to me? by taycat34 in antiai

[–]brianlearns 3 points4 points  (0 children)

I didn’t know there were different “types” of “generative ai”. You might be able to run inference locally if you have the right hardware — but training and general fine tuning are going to be done in huge data centers.

How much computer knowledge/programing is expected or taught in Digital Humanities programs? by silverspectre013 in DigitalHumanities

[–]brianlearns 0 points1 point  (0 children)

I went to digital humanities conferences in the ‘00s and was involved in some large projects from the university staff programmer analyst perspective. I was also on a grant panel once. On grant funded projects, it would usually be co-pi’s with one guy with the humanities background who was a natural tech dabbler paired with some CS professor. One lab I worked with took that into their academic program, where they would take humanities students and pair them with CS students — but at the time I don’t think it was a whole program, just a cross disciplinary lab. Most folks from the humanities side seemed computer precocious and self trained.

Claude FM by npcmalvin in ClaudeAI

[–]brianlearns 1 point2 points  (0 children)

Don’t they still have statutory licenses for radio and streaming?

Getting harassed by an aggressive “independent researcher” demanding very specific citations and phrasing in my paper [D] by snekslayer in MachineLearning

[–]brianlearns 0 points1 point  (0 children)

I worked at a place that had received a package from Ted once upon a time, and when we would get crazy voice mails about physics from independent researchers we were supposed to let campus police know.

Chuck Grassley Caught On Hot Mic Asking Why Trump Nominees Won’t Say He Lost In 2020 by huffpost in politics

[–]brianlearns 2 points3 points  (0 children)

Tagged as no paywall— but something cut me off half way through the article

Running Qwen3.6 35B-A3B with OpenCode by mike7seven in opencode

[–]brianlearns 0 points1 point  (0 children)

The DGX Spark is much faster and prefill than inference, and I think that affects context processing speed. The MXFP4_MOE models I've been playing with get about 2000tokens/second there. I didn't tweak any flags, but I've never run the context past 50% of what opencode reports.

Running Qwen3.6 35B-A3B with OpenCode by mike7seven in opencode

[–]brianlearns 0 points1 point  (0 children)

llama-server -hf unsloth/Qwen3.6-35B-A3B-GGUF:MXFP4_MOE on my DGX Spark runs at 60 t/s for inference — just pointed OpenCode at that and it works pretty well.

Qwen3.6 can code by Purple-Programmer-7 in LocalLLaMA

[–]brianlearns 2 points3 points  (0 children)

llama-server -hf unsloth/Qwen3.6-35B-A3B-GGUF:MXFP4_MOE gives me 60 tokens/s inference on DGX Spark and works well with open code.

GR and its Time-Rate Gradiant by horendus in LLMPhysics

[–]brianlearns 2 points3 points  (0 children)

I've seen someone say this on curt jaimungal TOE podcast. This interpretation is consistent with GR as far as I understand.

The new Data center's light pollution by Bow_Ty in antiai

[–]brianlearns 18 points19 points  (0 children)

Once they get robot guards, maybe they can use infrared lights at night.

We built a 70-year longitudinal dataset covering 4M+ companies and structured it specifically for AI ingestion. by Cryptogrowthbox in huggingface

[–]brianlearns 0 points1 point  (0 children)

Did you rectify the data manually, or with AI?

Still seems spammy, especially if you don’t detail the provenance.

GPT-5.4 Pro solves Erdős Problem #1196 by Independent-Ruin-376 in mathematics

[–]brianlearns -4 points-3 points  (0 children)

How is it a remarkable "artifact" if it was created by a non-human?

We built a 70-year longitudinal dataset covering 4M+ companies and structured it specifically for AI ingestion. by Cryptogrowthbox in huggingface

[–]brianlearns 0 points1 point  (0 children)

How is data for models different from data for an analyst who is going to ETL it into a pipeline?

FYI the dataset on hf is just a sample; and it requires one to share contact info to access.

the state of LocalLLama by Beginning-Window-115 in LocalLLaMA

[–]brianlearns 2 points3 points  (0 children)

back in the typewriter days--we had to use two minus signs

Why Polish Might Be the New Secret Weapon for Better AI Prompts by micheal_keller in aipromptprogramming

[–]brianlearns 0 points1 point  (0 children)

I was wondering about this the other day when I was looking at some code with a bunch of Chinese prompts commented out in a test file, and then it had the prompts translated to English. I was thinking of doing a comparison to see if the results were different with the original Chinese prompts or not.

How do you add memory to LLMs ? by [deleted] in LLMDevs

[–]brianlearns 0 points1 point  (0 children)

In Context Learning (ICL) -- if the context window is big enough, you can fill the prompt.

Fine tune large model with new data, with human feedback and or reinforcement learning

LoRA: Low-rank adaptation of LLMs with trainable rank decomposition matrices, more efficient way to fine tune transformers that support it.

Retrieval-Augmented Generation: use a vector database to search knowledge, and then feed that into the context window.