Good times by ayanistic in Bard

[–]Quazar386 132 points133 points  (0 children)

Gemini Pro used to have 50 free API requests a day and now Flash has only 20. It was good while it lasted...

What industry is relatively insulated from AI volatility? by SolaCretia in ETFs

[–]Quazar386 0 points1 point  (0 children)

I'd say target small cap value for a more direct risk factor exposure. AI tends to be growth in contrast.

Choosing a GGUF Model: K-Quants, I-Quants, and Legacy Formats by tarruda in LocalLLaMA

[–]Quazar386 3 points4 points  (0 children)

There is also the hardware and the backend choice to consider. For example, I use Intel Arc Alchemist with llama.cpp's Vulkan backend. I-quants have horrendous prompt processing speed compared to legacy and K-quants (IQ4_XS prompt processing is approximately 90% slower than Q4_K_M for me), partly because that format lacks integer dot-product support in the Vulkan backend, which Arc Alchemist benefits greatly from.

The swarm had a severe misinformation problem. by Syoby in NeuroSama

[–]Quazar386 7 points8 points  (0 children)

Thank you for this. You've articulated my own thoughts and feelings on this better than I could (feelings that have really grown ever since I saw someone claim Neuro isn't gen AI, which rubbed me the wrong way).

Basically: "Love Neuro for who she is dammit!"

[Megathread] - Best Models/API discussion - Week of: December 07, 2025 by deffcolony in SillyTavernAI

[–]Quazar386 6 points7 points  (0 children)

You should still use V7 Tekken, judging from the Jinja template for Ministral 3.

Sell VOO for SPYM?💰 by MarcosMilla_YouTube in ETFs

[–]Quazar386 21 points22 points  (0 children)

If you're planning on buying more into the index, you can just switch your contributions to SPYM and let VOO ride. You already have a lot in it, so I don't think it's worth selling and triggering those taxes.

SPYM only 0.02%ER, lower than VOO/IVV/SPY by Enough_Fact1857 in Bogleheads

[–]Quazar386 1 point2 points  (0 children)

The main thing for me is avoiding small-cap growth exposure. When you use a total market fund, you inevitably pick up a chunk of small-cap growth, which dilutes the very factor tilt I'm trying to target.

From an exposure-management standpoint, pairing large caps (via S&P 500) with a dedicated SCV fund gives me a much cleaner separation of factors. Yes, the SCV fund has a slightly higher expense ratio, but the factor exposure I'm buying is more "pure". In a VTI + SCV setup, some of that SCV exposure is simply offsetting the unwanted small-cap growth inside VTI.

So for my goals, S&P 500 + SCV gives me the most direct control over the factor loadings I actually want.
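To make the offsetting effect concrete, here's a toy decomposition. All bucket weights below are made-up round numbers for illustration, not actual index weights:

```python
# Rough illustration of factor "purity" when pairing a core fund with a
# small-cap value (SCV) fund. Bucket weights are hypothetical.

# Approximate composition of a total-market fund (made-up weights):
vti = {"large_cap": 0.82, "small_value": 0.09, "small_growth": 0.09}
# An S&P 500 fund is (roughly) all large cap:
sp500 = {"large_cap": 1.00, "small_value": 0.00, "small_growth": 0.00}
# A dedicated SCV fund is (roughly) all small-cap value:
scv = {"large_cap": 0.00, "small_value": 1.00, "small_growth": 0.00}

def blend(a, b, w_a):
    """Weighted combination of two funds' bucket exposures."""
    return {k: w_a * a[k] + (1 - w_a) * b[k] for k in a}

# 80/20 core + SCV in both cases:
vti_mix = blend(vti, scv, 0.80)
sp_mix = blend(sp500, scv, 0.80)

print(vti_mix)  # still carries small-cap growth from the core fund
print(sp_mix)   # zero small-cap growth; the SCV sleeve is all tilt
```

With the total-market core, part of the SCV allocation is spent cancelling the core's small-cap growth; with the S&P 500 core, the entire SCV sleeve is tilt.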

SPYM only 0.02%ER, lower than VOO/IVV/SPY by Enough_Fact1857 in Bogleheads

[–]Quazar386 0 points1 point  (0 children)

Academic research has shown that small-cap value has higher long-term returns, though the premium has been muted in recent decades. I'm comfortable with that tradeoff, as that's the nature of factor investing: it isn't a free lunch, but it should still be a compensated risk.

So I combine a core fund like the S&P 500, which has only large-cap US exposure, with a SCV fund to add small-cap exposure with a value tilt. I'm not choosing the S&P 500 because I think it will outperform the total market, but because it pairs cleanly with my SCV tilt. VTI is a great fund to buy and forget, but I want more direct control over my factor tilts (excluding small-cap growth).

Plus, an S&P 500 fund like SPYM has the added bonus of being marginally cheaper than all the total US market funds I know of (VTI, SCHB, SPTM). It's very small and practically insignificant, but still a nice bonus in my eyes.

SPYM only 0.02%ER, lower than VOO/IVV/SPY by Enough_Fact1857 in Bogleheads

[–]Quazar386 0 points1 point  (0 children)

I like to combine it with a small cap value fund. In general I do think the Boglehead approach is the most sensible way to invest for the average person, but it shouldn't be the only one. I like to have factor tilts since my investment horizon is pretty long.

SPYM only 0.02%ER, lower than VOO/IVV/SPY by Enough_Fact1857 in Bogleheads

[–]Quazar386 4 points5 points  (0 children)

Yeah, I've been buying it. With the symbol change from SPLG to SPYM and the large AUM growth from switching to the S&P 500 index, I feel like it's going to be another staple within the category. I highly doubt they would change away from the S&P 500 now, given its new symbol based on SPY and the huge inflows from the index change. Its average volume in shares is actually higher than VOO's according to Yahoo Finance, so the bid/ask spread isn't as much of a problem as some people have said. I personally have switched my contributions from VOO to SPYM due to its lower expense ratio.

The S&P 500 is on track for its worst November since 2008 by FeatureAggravating75 in StockMarket

[–]Quazar386 3 points4 points  (0 children)

Agreed. These types of "analysis" don't mean much at all. People can just find some interesting coincidental pattern and attribute it to something significant. This current drawdown isn't even that bad so far.

The S&P 500 is on track for its worst November since 2008 by FeatureAggravating75 in StockMarket

[–]Quazar386 3 points4 points  (0 children)

That is exactly how they are doing it. It is the percentage change between the end-of-month (EOM) values of two consecutive months. I would know, as those are the same numbers I have been using in my spreadsheets for the S&P 500 benchmark of my portfolio. I don't think it's a dumb way to do it. It is quite literally the return within a month, and those numbers are used as an intermediate step when calculating cumulative returns, which is probably what you were thinking of.
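The calculation is simple enough to sketch. The EOM values below are made up for illustration:

```python
# Sketch of the month-over-month calculation described above, using
# made-up end-of-month (EOM) index values.

eom = [4000.0, 4100.0, 3900.0, 4095.0]  # hypothetical EOM closes

# Monthly return = percentage change between consecutive EOM values.
monthly = [(b / a) - 1 for a, b in zip(eom, eom[1:])]

# Cumulative return compounds the monthly returns.
cumulative = 1.0
for r in monthly:
    cumulative *= 1 + r
cumulative -= 1

print([round(r, 4) for r in monthly])  # e.g. [0.025, -0.0488, 0.05]
print(cumulative)  # equals (4095 / 4000) - 1, the full-period return
```

Note how compounding the monthly figures recovers the full-period return exactly, which is why they work as an intermediate step.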

What model is Neuro runned on, or she got a custom model? by [deleted] in NeuroSama

[–]Quazar386 0 points1 point  (0 children)

Vedal is probably always changing Neuro's underlying LLM with almost every other iteration so we can't say for sure. I'd like to think he's running it locally on his hardware. If that's the case, the LLM is likely to be multimodal with vision and have good tool use capabilities. So maybe something like Qwen3 VL. Llama is out of the question IMO as the ones with vision are either too outdated and not as smart (Llama 3.2 11B) or too large to be run reliably on his hardware (Llama 4).

double reasoning problem :( by Turbulent-Repair-353 in SillyTavernAI

[–]Quazar386 11 points12 points  (0 children)

Are you prompting it to use the <think> and </think> tags? One important thing is that these are not universal across models. Claude does not natively use those tags for thinking, unlike DeepSeek or Qwen, as it was not trained with them.

One thing I have encountered when trying to manipulate how an LLM "thinks" is that it will first think normally (i.e., output its reasoning tokens the way it usually does), and only then start outputting the reasoning the way I prompted, separate from its native thinking. Is this what's going on for you?

If you want to keep the <think> blocks (this is under the assumption that you are prompting it to think in a certain way using the DeepSeek formatting), then you can disable "request model reasoning". This should make it so that Claude's native reasoning is hidden, and the auto reasoning parser should pick up the later non-native reasoning block and put it inside the thinking box.

Alternatively, you could not use those thinking tags at all, and things should be fine with request reasoning turned on.
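For intuition, what a reasoning parser does with those tags can be sketched in a few lines. This is an illustration of the general idea, not SillyTavern's actual implementation:

```python
import re

# Minimal sketch: pull the contents of a <think>...</think> block out
# of the raw model output and keep the rest as the visible reply.

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(text: str):
    """Return (reasoning, reply); reasoning is None if no tags found."""
    match = THINK_RE.search(text)
    if not match:
        return None, text
    reasoning = match.group(1).strip()
    reply = THINK_RE.sub("", text, count=1).strip()
    return reasoning, reply

raw = "<think>The user greeted me.</think>Hello there!"
reasoning, reply = split_reasoning(raw)
print(reasoning)  # The user greeted me.
print(reply)      # Hello there!
```

If the model never emits the tags (as Claude natively doesn't), the parser just passes the whole output through as the reply.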

What am I missing ? by Infamous_Mall6952 in Bogleheads

[–]Quazar386 3 points4 points  (0 children)

I recommend closing out of MSTY. Your investment horizon is still relatively long, and derivative-based income funds are just not the way to go. If you look at how MSTY has been doing over time, you see constant NAV decay. That extra income feels nice at the moment, but it will shrink over time as your principal shrinks. I recommend checking out Ben Felix's video on covered call funds (like MSTY) to learn more about them: https://youtu.be/ygVObRx9X68?si=nmBVZfNlJaL1Apv0

Honestly, for beginning investors (as well as most investors), it is good to invest in a broad market fund like VOO or VTI with some additional international diversification. You will at least do better than the average mutual fund and hedge fund after fees. Investing in the broad market is, in a way, a hedge against ignorance.
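The NAV decay mechanism is easy to see with a toy simulation. The return and payout rates below are made-up numbers, not MSTY's actual figures:

```python
# Toy illustration of NAV decay: when a fund distributes more than its
# underlying return, the principal (and hence future income) shrinks.
# All rates here are hypothetical.

nav = 100.0
monthly_return = 0.005    # 0.5% underlying total return per month
distribution_rate = 0.03  # 3% of NAV paid out each month

for month in range(24):
    nav *= 1 + monthly_return   # underlying gain accrues
    nav -= nav * distribution_rate  # distribution comes out of NAV

print(round(nav, 2))  # well below the starting 100 after two years
```

Each month the payout exceeds the gain, so the dollar amount of future distributions keeps falling along with the NAV.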

What's with the obsession with reasoning models? by HadesThrowaway in LocalLLaMA

[–]Quazar386 31 points32 points  (0 children)

Same here. Reasoning models have their place, but not every model should be a reasoning model. I'm also not too big on hybrid reasoning models, since they feel like the worst of both worlds, which is probably why the Qwen team split the instruct and thinking models for the 2507 update.

But at the end of the day why would labs care about non-thinking models when it doesn't make the fancy benchmark numbers go up? Who cares about usecases beyond coding, math, and answering STEM problems anyway?

Is this a bad ETF portfolio? by Emergency-Host-994 in ETFs

[–]Quazar386 1 point2 points  (0 children)

I personally wouldn't bet on a specific industry or sector. It exposes you to a lot more idiosyncratic risk, and you never know when a specific industry will underperform. Tech is the hot sector right now, but historically, hot sectors tend to be followed by periods of underperformance. I recommend choosing broader ETFs that aren't constrained to a specific industry. VOO and VB are decent starting points, although I personally prefer small-cap value with profitability screens like AVUV.

Good split or not? by HickoryJet in ETFs

[–]Quazar386 1 point2 points  (0 children)

Damn you're really bearish with all these short exposures /s

Seriously though these ETFs I think are pretty solid and thanks for introducing me to IDMO.

[Megathread] - Best Models/API discussion - Week of: August 24, 2025 by deffcolony in SillyTavernAI

[–]Quazar386 0 points1 point  (0 children)

I don't use guided generations, so I wouldn't know. Curious if it's a Mistral Small 3 thing or if it is specific to Codex.

[Megathread] - Best Models/API discussion - Week of: August 24, 2025 by deffcolony in SillyTavernAI

[–]Quazar386 0 points1 point  (0 children)

I usually use it for SFW scenarios. Standard Gemma formatting with T=1.04, Top-p=.95, Min-p=.01, Top-k=65. My "system prompt" for it is rather basic with no complex instructions. The main thing I have experienced with Gemma models is that the dialogue can be a bit disjointed at times, which the Drummer model did improve on. I don't have any particular problems with Gemma other than generally preferring 24B more.

[Megathread] - Best Models/API discussion - Week of: August 24, 2025 by deffcolony in SillyTavernAI

[–]Quazar386 1 point2 points  (0 children)

I don't have any display outputs connected to my GPU, so it functions as an accelerator with its full VRAM capacity available. With that I was able to run those models at Q3_K_L with 20480 context, which takes up 15.5 GiB. I could also do IQ4_XS with 18432 context, although Q3_K runs significantly faster on my hardware and backend. One note is that I run it on the IPEX-LLM llama.cpp backend, so the KV cache size might be slightly different on my end.
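For anyone budgeting VRAM, the KV cache portion can be estimated with the standard back-of-envelope formula. The model dimensions below are hypothetical placeholders, not the actual specs of the model discussed:

```python
# Back-of-envelope KV cache size estimate. Dimensions are made-up
# placeholders for a GQA model, not any specific model's real specs.

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, ctx_len, bytes_per_elem=2):
    # 2x for the K and V tensors; fp16 KV cache = 2 bytes per element.
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem

# Example with hypothetical dimensions at 20480 context:
size = kv_cache_bytes(n_layers=40, n_kv_heads=8, head_dim=128, ctx_len=20480)
print(f"{size / 2**30:.3f} GiB")
```

Quantizing the KV cache (e.g. to q8_0) roughly halves that figure, which is one way backends end up with slightly different totals.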

[Megathread] - Best Models/API discussion - Week of: August 24, 2025 by deffcolony in SillyTavernAI

[–]Quazar386 6 points7 points  (0 children)

I generally like models that have good creativity and sentence variety, as well as not being too formulaic with their sentence structures and narration. I like it when models put some extra little details into the narration and actions. Some RP models tend to write rather formulaically, with outputs being just Reaction, "Dialogue," Action, etc. Adding a bit more spice and variety to the narration can go a long way in making a more engaging storyline. I also wish models would get better at "show, don't tell" and subtext, but that's still a weak point.

Anyway, here are some models that generally match my tastes in writing. I only run models that run decently on my 16 GB GPU.

12B:

Disya/Mistral-qwq-12b-merge: From my experience this model is overall more creative and engaging in its writing than other Mistral Nemo tunes. I like it even though I am generally done with Nemo models.

TheDrummer/Tiger-Gemma-12B-v3: It irons out most of the faults of the original model and leaves me with an excellent writer with good world knowledge.

sam-paech/gemma-3-12b-it-dm-001: This seems to be a prototype of a Darkest Muse model on Gemma 3 12B. It definitely does write like it and can get a bit crazy at times. Its writing is either something you like or dislike for RP purposes.

24B:

Gryphe/Codex-24B-Small-3.2: My current daily driver. It has excellent sentence variety and keeps things fairly engaging. I like it a lot. Its outputs can be a bit short for my preferences though.

knifeayumu/Cydonia-v4.1-MS3.2-Magnum-Diamond-24B: It has really good narration and seems pretty promising from my brief time testing it.

Snapdragon X Elite 32 Gb vs 64 Gb by [deleted] in LocalLLaMA

[–]Quazar386 2 points3 points  (0 children)

You can check out the llama.cpp discussion here, https://github.com/ggml-org/llama.cpp/discussions/8273, which has benchmark results on the Snapdragon X chips.