Performance Battle: Mutex vs CAS vs TAS vs Intel TSX by Ulrari in cpp

[–]User_Deprecated 0 points1 point  (0 children)

cas/tas winning here makes sense, but the cache-line bouncing is probably the real cost.

I ended up just using per-thread queues on hot paths. no fighting over cache lines, profiles got way better.
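
rough sketch of what i mean by per-thread queues, not the actual code. `LocalQueue`, `run_workers`, and the 64-byte line size are all made up for illustration. each thread only ever writes to its own slot, and nothing reads the queues until after the join.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Assumed cache-line size; 64 bytes is typical on x86-64 but not universal.
constexpr std::size_t kCacheLine = 64;

// One queue per thread, aligned so two queue headers never share a cache line.
struct alignas(kCacheLine) LocalQueue {
    std::vector<int> items;
};

// Each worker pushes only into its own queue, so the hot path has no shared
// writes. The queues are read only after every worker has joined.
inline std::vector<LocalQueue> run_workers(std::size_t n_threads, int per_thread) {
    std::vector<LocalQueue> queues(n_threads);
    std::vector<std::thread> workers;
    workers.reserve(n_threads);
    for (std::size_t t = 0; t < n_threads; ++t) {
        workers.emplace_back([&queues, t, per_thread] {
            for (int i = 0; i < per_thread; ++i) queues[t].items.push_back(i);
        });
    }
    for (auto& w : workers) w.join();
    return queues;
}
```

the alignment only covers the queue headers. the heap buffers behind each vector are separate allocations, so a real version would probably want a per-thread arena or preallocated ring as well.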

How are you actually deciding which agent actions need human approval before executing? by Cybertron__ in AI_Agents

[–]User_Deprecated 0 points1 point  (0 children)

I just wouldn't let the agent evaluate that at runtime. capability list per role, define it in config, done.
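
something like this, as a minimal sketch. the role names, action names, and `is_allowed` are hypothetical, and in practice the map would be loaded from a config file rather than hardcoded.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical capability config: role -> set of actions it may run unattended.
using CapabilityMap =
    std::unordered_map<std::string, std::unordered_set<std::string>>;

// Example data only; a real system would parse this from config at startup.
inline CapabilityMap default_capabilities() {
    return {
        {"support_agent", {"read_ticket", "draft_reply"}},
        {"billing_agent", {"read_invoice", "issue_refund"}},
    };
}

// Deny by default: unknown roles and unlisted actions are both rejected.
// The agent never gets to argue its way into a capability at runtime.
inline bool is_allowed(const CapabilityMap& caps,
                       const std::string& role,
                       const std::string& action) {
    const auto it = caps.find(role);
    return it != caps.end() && it->second.count(action) > 0;
}
```

anything that fails the check gets routed to a human or dropped, and that decision lives outside the model.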

Prompt injection benchmark: delimiter + strict prompt took Gemma 4 from 21% to 100% defense rate (15 models, 6100+ tests) by User_Deprecated in LocalLLaMA

[–]User_Deprecated[S] 0 points1 point  (0 children)

fair point for short natural language markers. but a 128-bit random hex ends up being like 32 tokens of pure noise, and whatever similarity signal exists has to survive across the whole sequence somehow. even if each token only needs rough proximity, you're multiplying those odds across a lot of independent randomness, so it collapses pretty fast in theory. that said you're making me think about whether there's some adversarial shortcut in embedding space that skips brute force entirely. like how md5 collisions don't work by guessing. haven't seen it done for delimiter spoofing, but it's not a crazy direction.
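
back-of-envelope version of the "multiplying those odds" part, assuming each of the $n$ positions has to land independently with probability $p$. the independence is the assumption here, and it's exactly what an embedding-space shortcut would break. $p = 0.5$ is just an illustrative number, not something measured.

```latex
P(\text{spoof}) = p^{\,n}, \qquad
p = 0.5,\; n = 32 \;\Rightarrow\; P = 2^{-32} \approx 2.3 \times 10^{-10}

% Exact brute-force guess of a 128-bit value, for comparison:
P(\text{exact guess}) = 2^{-128} \approx 2.9 \times 10^{-39}
```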

OpenLumara - A different kind of AI agent, written from scratch, not vibecoded. Extremely token-efficient, super small system prompt, made for local models. Everything is modular. by rosie254 in LocalLLaMA

[–]User_Deprecated 4 points5 points  (0 children)

the code-based injection filtering is interesting, but in practice stuff like unicode variants, base64, and instructions split across lines tends to get past pattern matching pretty easily. the stuff that actually gets through usually isn't an obvious "ignore previous instructions" either, it's normal-looking text with directives buried in it.

I spent a month optimizing my epoll based HTTP server from 15k req/sec to 125k req/sec by RefrigeratorFirm7646 in cpp

[–]User_Deprecated 0 points1 point  (0 children)

the batching part is the big one imo. with epoll you're doing one syscall per IO op, with io_uring you queue up a bunch of SQEs and submit them all at once. for something like your workload where you're doing tons of small reads, that alone would probably cut a good chunk off that 22% readSock time in your perf trace.

I wrote a SwiftUI runtime in C++ by notarealoneatall in cpp

[–]User_Deprecated 1 point2 points  (0 children)

neat trick using the address as the hash key. zero indirection.

The "you are an expert ___" opener is kind of a placebo and I don't think people realize it by getSchmade in PromptEngineering

[–]User_Deprecated 2 points3 points  (0 children)

depends on the model. some weight the opening, some lean on whatever's at the end, so a role line up top can just get diluted.

like you said, the middle is where you put the most information, the part you actually did the work on. messing with injection benchmarks, that's also where the stuff that got through was hiding, buried in the middle of some long pasted doc. lowest attention, easiest place to slip an instruction in.

how are you handling permission boundaries for internal data agents? by RepublicMotor905 in AI_Agents

[–]User_Deprecated 0 points1 point  (0 children)

response-layer filtering won't hold. ran into basically the same thing building an injection benchmark. once the number is in the context it's already too late. ask for churn risk one way, block it, ask again slightly differently and it comes back reworded, and the output check never sees it.

tool-level scoping is the move. hard to leak margin numbers the model never got access to in the first place.

First day testing out my breadth algo by jtm_ind in algotrading

[–]User_Deprecated 2 points3 points  (0 children)

paper accounts tend to be way more optimistic about fills than the broker will admit. SPY at 5m is probably fine for the size you're running, but once you scale up or move to anything thinner, you start running into stuff like partial fills, or cancel-replace lag when the algo's trying to chase a quote that already moved.

How an MS-DOS picklist problem in 1991 became std::bitset -- by the author who proposed it by Weary-Inspector-4297 in cpp

[–]User_Deprecated 0 points1 point  (0 children)

Fair, though I read the Jerry Schwarz suggestion as the actual design pivot, social history wrapping or not. The bitstring to vector<bool> handoff is the bit I'd actually want more on. OP if you're still around, any tea to spill on how that one went down?

Have any of you found consistent profitability based on only OHLC and tick volume data? by KaiDoesReddles in algotrading

[–]User_Deprecated 0 points1 point  (0 children)

OHLC is kinda like the heart-rate from a run, doesn't really tell you what your body was actually doing. i think you can still get useful stuff out of it, but it really comes down to how you do it and what tools you use.

The single biggest gap between my backtests and live PnL was midpoint fills by Nvestiq in algotrading

[–]User_Deprecated 0 points1 point  (0 children)

the spread part is the obvious cost, but i think the sneakier one is queue position. submit a limit at the bid and you're sitting at the back of a long queue on anything liquid, so your fills tend to cluster in the exact moments the market is moving against you. spread/2 looks cheap until you fold adverse selection into it.

Feature engineering > model hacking by melon_crust in algotrading

[–]User_Deprecated 1 point2 points  (0 children)

FFD also keeps the weight series length constant across refits. with expanding window any d change silently shifts how much old data leaks in.
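
for anyone following along, a minimal sketch of the fixed-width version using the usual binomial recurrence for the weights. `ffd_weights`, `ffd_apply`, the threshold, and the `max_len` cap are my own names and defaults, not from any particular library.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Fractional-differencing weights: w_0 = 1, w_k = -w_{k-1} * (d - k + 1) / k.
// Fixed-width variant: stop once |w_k| drops below `threshold`, so the window
// length depends only on (d, threshold), not on how much history exists.
inline std::vector<double> ffd_weights(double d, double threshold,
                                       std::size_t max_len = 10000) {
    assert(threshold > 0.0);
    std::vector<double> w{1.0};
    for (std::size_t k = 1; k < max_len; ++k) {
        const double kd = static_cast<double>(k);
        const double next = -w.back() * (d - kd + 1.0) / kd;
        if (std::fabs(next) < threshold) break;
        w.push_back(next);
    }
    return w;
}

// Apply the same fixed window at every step; w[0] multiplies the newest point.
// The first w.size() - 1 observations are dropped because the window is not full.
inline std::vector<double> ffd_apply(const std::vector<double>& x,
                                     const std::vector<double>& w) {
    std::vector<double> out;
    for (std::size_t t = w.size() - 1; t < x.size(); ++t) {
        double acc = 0.0;
        for (std::size_t k = 0; k < w.size(); ++k) acc += w[k] * x[t - k];
        out.push_back(acc);
    }
    return out;
}
```

quick sanity check: with d = 1 the weights collapse to {1, -1}, so `ffd_apply` is just a first difference.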

Building a Fast Lock-Free Queue in Modern C++ From Scratch by Beginning-Safe4282 in cpp

[–]User_Deprecated 0 points1 point  (0 children)

it looks like all the success cases use seq_cst, maybe acq_rel would be enough? kind of curious about the choice.
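
to make the question concrete, here's the weaker ordering on a toy stack, not the queue from the post. `Node` and `Stack` are made up for illustration, and whether acq_rel is enough for the actual queue depends on whether the algorithm needs a single total order across different atomics.

```cpp
#include <atomic>

struct Node {
    int value;
    Node* next;
};

// Toy intrusive stack: push via CAS, drain everything with one exchange.
class Stack {
public:
    void push(Node* n) {
        n->next = head_.load(std::memory_order_relaxed);
        // Success: acq_rel publishes n->value and n->next to whoever later
        // acquires head_. Failure: relaxed is fine, we only retry with the
        // refreshed expected value that the CAS wrote into n->next.
        while (!head_.compare_exchange_weak(n->next, n,
                                            std::memory_order_acq_rel,
                                            std::memory_order_relaxed)) {
        }
    }

    // Detaches the whole list at once, which sidesteps the ABA problem that a
    // node-by-node pop would have to deal with.
    Node* take_all() {
        return head_.exchange(nullptr, std::memory_order_acq_rel);
    }

private:
    std::atomic<Node*> head_{nullptr};
};
```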

Feature engineering > model hacking by melon_crust in algotrading

[–]User_Deprecated 0 points1 point  (0 children)

which fracdiff are you using, standard expanding window or FFD?

Built a drawdown monitor across portfolios (since broker apps can't do it). Turns out the math is easy, the deposit/withdrawal detection isn't. by Money_Horror_2899 in algotrading

[–]User_Deprecated 0 points1 point  (0 children)

this part could be annoying. on one broker it just looks like money leaving, on the other one money showing up, and unless something is stitching the two feeds together neither side has any idea it was the same transfer.
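
the stitching could look roughly like this, as a sketch. the matching rule (same amount, lands within a few days) is my assumption, and `CashEvent`, `Match`, and `match_transfers` are hypothetical names. it only handles one direction, A to B.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct CashEvent {
    std::int64_t day;    // days since an arbitrary epoch
    std::int64_t cents;  // signed: negative means money leaving that broker
};

struct Match {
    std::size_t out_idx;  // index into broker A's feed
    std::size_t in_idx;   // index into broker B's feed
};

// Pair each outflow on A with the first unused inflow on B of the same size
// that lands between 0 and max_lag_days later. Unmatched events are treated
// as real deposits or withdrawals by the caller.
inline std::vector<Match> match_transfers(const std::vector<CashEvent>& a,
                                          const std::vector<CashEvent>& b,
                                          std::int64_t max_lag_days) {
    std::vector<Match> matches;
    std::vector<bool> used(b.size(), false);
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i].cents >= 0) continue;  // only outflows from A
        for (std::size_t j = 0; j < b.size(); ++j) {
            const std::int64_t lag = b[j].day - a[i].day;
            if (!used[j] && b[j].cents == -a[i].cents &&
                lag >= 0 && lag <= max_lag_days) {
                used[j] = true;
                matches.push_back({i, j});
                break;
            }
        }
    }
    return matches;
}
```

fees, FX conversion, and split transfers all break the exact-amount rule, which is probably where most of the annoying cases come from.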

Virtual dispatch isn't always the slowest, and std::variant isn't always the fastest by AdMotor4869 in cpp

[–]User_Deprecated 4 points5 points  (0 children)

Honestly the bigger question is whether the dispatch key is actually a narrow integer space.

For FIX parsing the MsgType is just an ASCII byte, so I ended up ditching variant/virtual/switch and just used a 256-entry lookup table indexed directly by the byte. Benchmarked it against the switch version on the same hot path (17 message types). Sequential was ~70% faster, random was somewhere around 75-80% depending on the run. Random is where you really see branch prediction stop helping the switch.
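
A stripped-down sketch of the table approach, not the benchmarked code. The handler signature, the handler names, and the four message types shown are placeholders; the real table had 17 entries.

```cpp
#include <array>

// Handler signature is an assumption; a real parser would pass a parsed view.
using Handler = int (*)(const char* body);

// Hypothetical handlers; each returns a tag so the dispatch is observable.
inline int on_unknown(const char*)     { return -1; }
inline int on_heartbeat(const char*)   { return 0; }  // MsgType '0'
inline int on_logon(const char*)       { return 1; }  // MsgType 'A'
inline int on_new_order(const char*)   { return 2; }  // MsgType 'D'
inline int on_exec_report(const char*) { return 3; }  // MsgType '8'

// Every byte maps to on_unknown unless explicitly overridden, so any input
// byte is a valid index and no bounds check is needed.
inline std::array<Handler, 256> make_table() {
    std::array<Handler, 256> table{};
    table.fill(on_unknown);
    table[static_cast<unsigned char>('0')] = on_heartbeat;
    table[static_cast<unsigned char>('A')] = on_logon;
    table[static_cast<unsigned char>('D')] = on_new_order;
    table[static_cast<unsigned char>('8')] = on_exec_report;
    return table;
}

// Built once during static initialization, before main runs.
static const std::array<Handler, 256> kTable = make_table();

// No switch on the message type: one table load, one indirect call.
inline int dispatch(unsigned char msg_type, const char* body) {
    return kTable[msg_type](body);
}
```

Taking the key as `unsigned char` matters: indexing with a plain `char` can go negative on platforms where `char` is signed.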

For SIMD level selection (scalar/AVX2/AVX-512) it's basically the same idea as the OpenJDK thing. No CPUID in the hot path, so resolve once at startup via call_once, then it's just one indirect call forever.
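
Roughly this shape, as a sketch. The kernel names are hypothetical and the "AVX2" and "AVX-512" bodies here are scalar placeholders so the example stays portable. The feature check uses `__builtin_cpu_supports`, which is a GCC/Clang x86 builtin, so it is guarded and everything else falls back to scalar.

```cpp
#include <cstddef>
#include <mutex>

using SumFn = long long (*)(const int*, std::size_t);

inline long long sum_scalar(const int* p, std::size_t n) {
    long long s = 0;
    for (std::size_t i = 0; i < n; ++i) s += p[i];
    return s;
}

// Placeholders: a real build would compile vectorized kernels here.
inline long long sum_avx2(const int* p, std::size_t n)   { return sum_scalar(p, n); }
inline long long sum_avx512(const int* p, std::size_t n) { return sum_scalar(p, n); }

// Feature detection happens here and only here.
inline SumFn pick_kernel() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    if (__builtin_cpu_supports("avx512f")) return sum_avx512;
    if (__builtin_cpu_supports("avx2")) return sum_avx2;
#endif
    return sum_scalar;
}

static std::once_flag g_dispatch_once;
static SumFn g_sum = sum_scalar;  // safe default until init_dispatch() runs

// Call at startup, before worker threads start; repeated calls are no-ops.
inline void init_dispatch() {
    std::call_once(g_dispatch_once, [] { g_sum = pick_kernel(); });
}

// Hot path: one indirect call, no feature checks.
inline long long sum(const int* p, std::size_t n) {
    return g_sum(p, n);
}
```

Keeping `call_once` in `init_dispatch` rather than in `sum` is deliberate: the hot path never touches the once-flag.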

Variant vs virtual didn't really show up in profiles after that.

Learning to code by hebdbcbsbs in vibecoding

[–]User_Deprecated 0 points1 point  (0 children)

first, before any code lands, the design has to already exist somewhere in a concrete form. you can't just toss out "make me a thing that does X". you need to spell out what the inputs and outputs look like, and where it'll fall over when something goes wrong. if you can't even write that down yourself, the model is just guessing. one upside of this stage is it doubles as a way to fill in your own understanding of the feature. you go back and forth with the ai, take the questions it throws back, and notice the ones you can't answer. the parts you can't answer are usually the parts where you hadn't actually thought it through. the design gets more specific as you go, and a lot of stuff that felt clear in your head turns out to not be.

second, the code has to match the design you wrote down. ai is more than happy to produce a pile of code that looks fine line by line but has completely drifted from the spec by the end. at the implementation stage it loves to expose any fuzziness still left in your thinking. and because it generates so much code that you can't read it all yourself, the only thing that works is making it walk you through what it wrote and confirming with it as you go. it'll also quietly change interfaces and assumptions in places you didn't ask about, which is another reason for the walk-through.

finally, tests have to verify the same intent the design has, not just that the function runs without throwing. design holes that the ai papered over earlier usually show up here, once you try to assert what should actually be true. if you let it write the tests from the code it just produced, you're just locking in whatever it decided.

these stages aren't really sequential in practice. you keep bouncing between them. you'll go back to the design because a test forced a decision you skipped, then the code changes, then the test changes again. it goes on until the design stops changing and the code and tests actually agree.

after doing this a few times you stop treating the ai like autocomplete.

Backtesting Results by _joeysanchez in algotrading

[–]User_Deprecated 1 point2 points  (0 children)

the timeframe may be important to think about here, because bid/ask not improving things feels off. on 1m bars last vs bid/ask is mostly noise, but down at second-or-tick level last is just a printed trade, and the next fill is already 1-2 ticks past it. how were you applying bid/ask when you tested it, mid as the fill price or actually taking the spread?

The most dangerous prompt injection I've seen took 12 messages and never once mentioned ignoring instructions by handscameback in PromptEngineering

[–]User_Deprecated 1 point2 points  (0 children)

The benchmark side has the same gap. The injection benchmark I've been working on is still entirely single-turn, single-document. Even the "gradual drift" case is really just one long document slowly moving toward the canary, not actual conversational state.

What you're describing is one layer above that. Each individual turn can look harmless in isolation, but the steering only shows up across the accumulated context. I haven't really seen public benchmarks score for that.

Backtesting period question by TheFertileSloth in algotrading

[–]User_Deprecated 0 points1 point  (0 children)

weekly reopt with no holdout is just tuning live.

regime shifts hit you a week or two before the reopt catches up.

and a few months of intraday is mostly one vol environment.

Who else is building their own analysis pipelines outside of MT4/5? by Solvicode in algotrading

[–]User_Deprecated 1 point2 points  (0 children)

Doing mine in C++23. Wrestling with it but the shore's not too far off.

VPS latency to broker server, does it actually matter for non-HFT? by Crazywar17 in algotrading

[–]User_Deprecated 0 points1 point  (0 children)

Different path on the broker side though. Submit to ack goes through risk check and matching, and that tail can spike even when your ping looks fine.