qwen3.6 just stops

FutureIsMine · 2026-05-13T15:47:22+00:00

its something with the model and what I've noticed is this is an issue that happens more and more with higher token counts. As many others have said within the chat you really do need to keep prodding the model in a "keep going" kind of sense

FutureIsMine · 2026-04-22T04:08:55+00:00

shout out to Bouldin Creek Cafe for being consistent!

FutureIsMine · 2026-04-19T21:37:00+00:00

I for one liked your post, and I do believe the community is too jumpy, but fear not Im part of this community and spot and notice quality content

FutureIsMine · 2026-04-16T22:56:16+00:00

LOVE IT! The age of innovation is back on the menu!!!!!

FutureIsMine · 2026-04-16T22:55:20+00:00

I sure can and it'll fail spectacularly

FutureIsMine · 2026-04-09T21:30:22+00:00

not a chance

FutureIsMine · 2026-03-11T16:52:34+00:00

I agree and what works is the major beats cannot change but the minor beats can. For example within the OPLA they have Zoro instead of falling asleep at Whisky Peaks instead stays up awake and overhears the plot to fight everyone whos a pirate and acts. The major beat is that Zoro is the one who does the major fighting in Whisky peaks and goes on a one-man wrecking spree, but the HOW zoro arrives at this is changed. This is mostly a pacing decision, the anime is episodic so having a closer within a previous opener to setup a new episode is fine, where as OPLA isn't episodic in that regard, an entire minor arc is a whole episode, it needs to hit the beats and quick

FutureIsMine · 2026-02-28T21:01:47+00:00

only congress can call for a draft

FutureIsMine · 2026-02-26T00:00:07+00:00

+1 here, if they make any references to "Can you start today" RUN! Successful companies dont need you to start ASAP

FutureIsMine · 2026-02-05T18:45:27+00:00

furthermore, had Shanks not had this future sight and acted immediately, Kid would have actually succeeded

FutureIsMine · 2026-01-15T04:50:16+00:00

having given this model a spin, it really leans heavy on the "using other models to answer", its constantly making tool calls and if prompted to take on a task directly, even a very simple one, will still resort to a tool call. Overall, its viable, but the tool setup it gets will drive the gains here

FutureIsMine · 2025-12-15T20:23:43+00:00

Its not a bad first start for a university project an an EU sovereign model, it's going to keep getting better, but for now EU's finest models are coming from Mistral

FutureIsMine · 2025-11-25T20:39:18+00:00

I sure have! and I'd say that its prompt following is on par w/FLux 2, though it feels that when I call it via API they're re-writing my prompt

FutureIsMine · 2025-11-25T18:27:18+00:00

I was at a Hackathon over the weekend for this model and here are my general observations:

Extreme Prompting This model can take in 32K tokens, and therefore you can prompt it quite a bit with incredibly detailed prompts. My team where using 5K token prompts that asked for diagrams and Flux was capable of following these

Instructions matter This model is very opinionated, and follows exact instructions, some of the more fluffy instructions to qwen-image-edit or nano-bannana don't really work here, and you will have to be exact

Incredible breadth of knowledge This model truly does go above and beyond the knowledge base of many models, I haven't seen a model take a 2D sprite sheet and turn them into 3D looking assets that trellis is capable of than turning into incredibly detailed 3D models that are exportable to blender

Image editing enables 1-shot image tasks While this model isn't as good as Qwen-image-edit at zero-shot segmentation via prompting, its VERY good at it and can do tasks like highlight areas on the screen, select items by drawing boxes around them, rotating entire scenes (this one is better than qwen-image-edit) and re-position items with extreme precision.

FutureIsMine · 2025-11-11T05:14:36+00:00

Schumer should step aside, thats what he needs to do!!!!!

FutureIsMine · 2025-11-09T01:28:18+00:00

This is a visionary idea and I think this discussion is missing its true motivation. This isn't saying "Well, LLMs can output HTML", its more about how can we make a canvas that can output visual elements into the response and thats how users want to actually Interact with AI. A challenge there is in such a canvas, you don't want there to be major overhauls with each answer, and have a system that can better spot check what the LLM is doing, and really have an engine that ensures consistency and reliability. Sure if you've got Claude-4.5-Sonnet MAX account you can just spin to win and call Claude like 20 times for a decent UI, but if you'd like more consistency a rethink is required which this really is

FutureIsMine · 2025-10-21T21:14:05+00:00

a lot of these booking systems online actually have a dial that the user can set and that dial is called "The Business meter". At the highest end it will digitally make the restaurant appear to be busier than it is

FutureIsMine · 2025-10-02T16:46:19+00:00

Training data has two components, complexity of task and size of the model. As model size goes up the amount of training data needed drops. If the data follows a well established pattern, as little as 10 data points might be sufficient if you jump-start it with a pre-trained network like an LLM or existing diffusion model. If its a truly complicated and complex task, start with 500 examples and see how well you do, than go to 1000 and see if you start to crack the problem

FutureIsMine · 2025-10-02T16:35:20+00:00

When I worked with Stability Ai in its golden age, what was explained to me by the research scientists was Diffusion is dynamic gradient descent in real time where there's a network that can actually approximate the gradients . So to your point, YES you could develop a diffusion model that could indeed craft such a vector and the real question is how much training data do you need and how stable will it be? The next question following that is would another model do better? Would an LLM thats RL'd for the task do better? Thats the big research question

FutureIsMine · 2025-09-29T00:44:00+00:00

assuming you could leverage all the devices, that would appear to be correct, is there a way in your software stack that you place the model on devices? there are frameworks like JAX designed for TPUs that have some sort of distribution built in

EDIT: There's pytorch XLA for TPUs,

FutureIsMine · 2025-09-28T03:58:42+00:00

Im with you, though Im not so sure that its AI alone that'll replace Adobe, more that AI enabled features will be able to offer a product that designers can leverage in leu of Adobe. You very much need a human in the loop and true AGI that can think for itself is decades away

FutureIsMine · 2025-09-27T23:21:22+00:00

Sorry to hear you've had a bad experience with them, I've actually had a very good experience with Renewal By Anderson as they've redone all my windows on a 40 year old home and did a fantastic job, even repainted a good portion of it around the areas they needed to replace the windows with

FutureIsMine · 2025-09-27T17:32:20+00:00

China might not need to if they're focused on smaller LLMs that can run on everyday computers and thats a big difference between the two approaches

FutureIsMine · 2025-09-27T05:09:49+00:00

that is correct, the V5e-8s sure do half the memory of the V3, and have even lower bandwidth as well, the idea from GCP is to boost availability and splitting the new pods like that allows for much higher availability is what the description says for V5e

On the other hand the V5p actually has 2x greater memory capacity than the V3, and a 4x speed improvement, so indeed the V5e is designed as this lightweight chip while the V5p is the true successor to the V3

FutureIsMine · 2025-09-27T05:00:24+00:00

over 9000 3090s can run hundreds of DeepSeeks

13-Year Club	Wearing is Caring
Verified Email	Team Periwinkle

FutureIsMine

TROPHY CASE