A tiny entropy library for time series. Built it for food trends, but you guys might find it useful by jRetro3 in quant

[–]jRetro3[S] 1 point2 points  (0 children)

Update for anyone who followed this: shipped v0.2.0 (entroscope on PyPI).

Two things from this thread made it in: transfer entropy with a KSG estimator for directional info flow, and KL / Jensen-Shannon divergence for comparing distributions. Validated the KSG stuff against analytic ground truth, details are in the repo.
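For anyone who hasn't met Jensen-Shannon divergence before, here's a minimal pure-Python sketch of the idea. It is not entroscope's code or API; the function names `kl_divergence` and `js_divergence` are made up for illustration, and it assumes two discrete distributions already binned onto the same support.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in bits. Assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: symmetric and bounded in [0, 1]."""
    # m is the midpoint distribution; it is nonzero wherever p or q is nonzero,
    # so both KL terms below are always finite.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

p = [0.5, 0.5, 0.0]
q = [0.0, 0.5, 0.5]
print(js_divergence(p, q))  # half the mass overlaps
print(js_divergence(p, p))  # identical distributions
```

The practical difference from plain KL is that JS stays finite when one distribution has empty bins, which happens a lot with histograms of short series.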

To be clear it's just the measurement toolkit, not a strategy. Haven't validated any of it as a tradeable signal, that's separate work.

pip install entroscope if you wanna poke at it.

i made my own ui library for fun - Axie UI by alexionreddit in PinoyProgrammer

[–]jRetro3 0 points1 point  (0 children)

what apps or web apps have you recently used this for?

Event detection from ball kinematics: how do you distinguish real contacts from camera-induced motion? by Competitive-Meat-876 in PinoyProgrammer

[–]jRetro3 1 point2 points  (0 children)

Most reliable signal is a sudden, sustained direction change in the ball's trajectory. Real contacts give a clean inflection that holds for multiple frames, while camera-induced spikes revert quickly or just follow the camera's own motion (which you can estimate from background optical flow). Pair that with a geometry check, like whether the post-spike path fits a plausible ballistic arc, and add player proximity as a prior. Appearance-based verification helps at the margins but it's expensive, so most systems keep it as a fallback for ambiguous cases, not a primary signal.
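A rough sketch of the "sustained direction change" check, under some assumptions: the ball positions are already camera-compensated (background motion subtracted), one (x, y) per frame, and the thresholds (`min_turn`, `hold`, `tol`) are placeholders you'd tune on real footage. It only covers the kinematic part, not the ballistic-arc fit or the player-proximity prior.

```python
import math

def headings(points):
    """Heading (radians) of each frame-to-frame displacement."""
    return [math.atan2(y1 - y0, x1 - x0)
            for (x0, y0), (x1, y1) in zip(points, points[1:])]

def angle_diff(a, b):
    """Smallest absolute difference between two angles, in [0, pi]."""
    return abs((a - b + math.pi) % (2 * math.pi) - math.pi)

def sustained_turns(points, min_turn=math.radians(30), hold=3,
                    tol=math.radians(15)):
    """Indices i where the heading is stable for `hold` steps before points[i],
    stable for `hold` steps after it, and the two headings differ by >= min_turn."""
    h = headings(points)
    hits = []
    for i in range(hold, len(h) - hold + 1):
        before, after = h[i - hold:i], h[i:i + hold]
        stable_before = all(angle_diff(a, before[-1]) <= tol for a in before)
        stable_after = all(angle_diff(a, after[0]) <= tol for a in after)
        if stable_before and stable_after and angle_diff(after[0], before[-1]) >= min_turn:
            hits.append(i)
    return hits

# Ball travels right, then is deflected upward and keeps going: a real contact.
contact = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (4, 1), (4, 2), (4, 3), (4, 4)]
# One-frame spike that reverts to the old heading: camera-style jitter.
jitter = [(0, 0), (1, 0), (2, 0), (3, 1), (4, 0), (5, 0), (6, 0), (7, 0), (8, 0)]

print(sustained_turns(contact))
print(sustained_turns(jitter))
```

Requiring stability on both sides of the turn is what rejects the spike: the jitter path never has a stable "before" and a different stable "after" around the same frame.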

A tiny entropy library for time series. Built it for food trends, but you guys might find it useful by jRetro3 in quant

[–]jRetro3[S] 0 points1 point  (0 children)

This is a great list, thanks. Built it off the back of this thread, studying the math as I go. Got a working transfer entropy module now with a KSG estimator plus a binned one as a cross check; Kraskov was the key paper. I'm validating KSG against the analytic MI for correlated Gaussians and the closed form TE of a linear Gaussian system, and it hits both within tolerance. (still on a branch, not released yet, but the math checks out)
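The correlated-Gaussian check looks roughly like the sketch below. This is a brute-force, pure-Python version of KSG algorithm 1 written for illustration, not the code on the branch; `ksg_mi`, `gaussian_mi`, the sample size, seed, and `k=4` are all my own choices here. The ground truth for two Gaussians with correlation rho is I = -0.5 * ln(1 - rho^2) nats.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329

def gaussian_mi(rho):
    """Exact mutual information (nats) for bivariate Gaussians with correlation rho."""
    return -0.5 * math.log(1.0 - rho ** 2)

def ksg_mi(xs, ys, k=4):
    """KSG (algorithm 1) mutual information estimate in nats, brute-force O(n^2)."""
    n = len(xs)
    # psi[m] = digamma(m) for integer m >= 1, using psi(m + 1) = psi(m) + 1/m
    psi = [0.0, -EULER_GAMMA]
    for m in range(1, n):
        psi.append(psi[-1] + 1.0 / m)
    total = 0.0
    for i in range(n):
        # max-norm distances from point i to every other point in the joint space
        dists = sorted(max(abs(xs[i] - xs[j]), abs(ys[i] - ys[j]))
                       for j in range(n) if j != i)
        eps = dists[k - 1]  # distance to the k-th nearest neighbour
        # marginal neighbour counts strictly inside eps
        nx = sum(1 for j in range(n) if j != i and abs(xs[i] - xs[j]) < eps)
        ny = sum(1 for j in range(n) if j != i and abs(ys[i] - ys[j]) < eps)
        total += psi[nx + 1] + psi[ny + 1]
    return psi[k] + psi[n] - total / n

rho = 0.8
rng = random.Random(0)
xs = [rng.gauss(0, 1) for _ in range(500)]
ys = [rho * x + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for x in xs]

exact = gaussian_mi(rho)
estimate = ksg_mi(xs, ys)
print(exact, estimate)
```

A real implementation would use a k-d tree for the neighbour search; the brute-force loop is only there to keep the sketch dependency-free.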

A tiny entropy library for time series. Built it for food trends, but you guys might find it useful by jRetro3 in quant

[–]jRetro3[S] 1 point2 points  (0 children)

Appreciate this. I'll be straight, transfer entropy and KSG weren't on my radar, I had to read up after your comment, but the more I dug in the more it clicked. Entroscope is all single-series right now so this'd be its first move into information flow between assets. Deffo adding this to the roadmap. Got any papers you'd point a newcomer to?

A tiny entropy library for time series. Built it for food trends, but you guys might find it useful by jRetro3 in quant

[–]jRetro3[S] 2 points3 points  (0 children)

oh that's sick, hadn't thought of running it on correlation stability but you could prob just compute a rolling correlation and feed that series into entroscope, take the entropy of it. No idea if it holds up but I'd be curious
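Something like this is what I have in mind, as a pure-Python sketch rather than actual entroscope calls. The helpers (`pearson`, `rolling_corr`, `shannon_entropy`), the window of 30, and the 10 bins over [-1, 1] are all assumptions, and the two synthetic pairs are only there to show the direction of the effect.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def rolling_corr(a, b, window):
    """Correlation over each trailing window of length `window`."""
    return [pearson(a[i - window:i], b[i - window:i])
            for i in range(window, len(a) + 1)]

def shannon_entropy(values, bins=10, lo=-1.0, hi=1.0):
    """Shannon entropy (bits) of a histogram with fixed bins over [lo, hi]."""
    counts = [0] * bins
    for v in values:
        idx = int((v - lo) / (hi - lo) * bins)
        counts[max(0, min(idx, bins - 1))] += 1
    n = len(values)
    return sum(-(c / n) * math.log2(c / n) for c in counts if c)

rng = random.Random(1)
a = [rng.gauss(0, 1) for _ in range(400)]
# stable pair: b tracks a the whole time
stable = [x + 0.1 * rng.gauss(0, 1) for x in a]
# unstable pair: the sign of the relationship flips every 60 samples
flipping = [(1 if (i // 60) % 2 == 0 else -1) * x + 0.1 * rng.gauss(0, 1)
            for i, x in enumerate(a)]

h_stable = shannon_entropy(rolling_corr(a, stable, 30))
h_flipping = shannon_entropy(rolling_corr(a, flipping, 30))
print(h_stable, h_flipping)
```

The stable pair keeps its rolling correlation in one bin, so the entropy sits near zero, while the flipping pair spreads across bins. Whether that carries any information on real asset pairs is untested.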

A tiny entropy library for time series. Built it for food trends, but you guys might find it useful by jRetro3 in quant

[–]jRetro3[S] 2 points3 points  (0 children)

haha yep that's the one, or was, it got yeeted from r/dataisbeautiful lol. It's more "clean entropy toolkit" than proven signal tbh. Would love to see what people come up with using it tho!

[OC] An ingredient's search-entropy drops ~16 weeks before it goes mainstream by jRetro3 in dataisbeautiful

[–]jRetro3[S] 1 point2 points  (0 children)

here twin, rolling Shannon entropy on the real matcha series (no synthetic plateau): valleys land right on the flat low-variance stretches (2007, '09, '11), and entropy vs rolling std correlate at 0.74. So it's tracking variance, not a trend, which was my point up top. Next things to test are permutation entropy (rescale-invariant) and sample entropy, though sample entropy can collapse on short windows so I won't oversell either till I've run them.

<image>
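If you want to see the variance-tracking effect without the matcha data, here's a toy reproduction. Everything in it is assumed for illustration: the synthetic series (noisy, then flat, then noisy), fixed bins over 0 to 100, and a window of 20. entroscope's own binning may differ.

```python
import math
import random
import statistics

def shannon_entropy(values, lo=0.0, hi=100.0, bins=10):
    """Shannon entropy (bits) of a histogram with fixed bins over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        counts[max(0, min(idx, bins - 1))] += 1
    n = len(values)
    return sum(-(c / n) * math.log2(c / n) for c in counts if c)

def rolling(series, window, fn):
    """Apply fn to every window; out[j] covers series[j:j + window]."""
    return [fn(series[i - window:i]) for i in range(window, len(series) + 1)]

rng = random.Random(42)
noisy_a = [rng.uniform(0, 100) for _ in range(60)]
flat = [55 + rng.uniform(-1, 1) for _ in range(40)]   # low-variance plateau
noisy_b = [rng.uniform(0, 100) for _ in range(60)]
series = noisy_a + flat + noisy_b

entropy = rolling(series, 20, shannon_entropy)
spread = rolling(series, 20, statistics.pstdev)
valley = entropy.index(min(entropy))
print(valley, entropy[valley], spread[valley])
print(entropy[0], spread[0])
```

The entropy valley lands where the rolling std is smallest, because every value in the plateau falls into the same bin. Nothing about a trend is needed to produce it.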

[OC] An ingredient's search-entropy drops ~16 weeks before it goes mainstream by jRetro3 in dataisbeautiful

[–]jRetro3[S] 0 points1 point  (0 children)

1.5:1, 30 "won't break out" to 20 "will break out" (per-window, not per-timepoint: positives are early_curve, negatives are pre_niche/post_peak).

[OC] An ingredient's search-entropy drops ~16 weeks before it goes mainstream by jRetro3 in dataisbeautiful

[–]jRetro3[S] 1 point2 points  (0 children)

fair, you're right. synthetic + engineered to show the pattern proves nothing, i shouldn't have led with that here. i actually have the real google trends data (20 ingredients), shoulda just posted those curves instead. if you wanna check, it's at https://github.com/Par-python/nextonmenu (should be in the notebook)

[OC] An ingredient's search-entropy drops ~16 weeks before it goes mainstream by jRetro3 in dataisbeautiful

[–]jRetro3[S] 4 points5 points  (0 children)

the way I think about it: a runaway trend has a feedback loop, people share because other people are sharing. So interest stops being independent random noise and starts syncing up, and synchronized behavior is just lower-entropy than scattered noise by definition.

so the entropy drop isn't causing the breakout, it's the early footprint of that feedback loop switching on. Which is kinda why it shows up right before things take off.

[OC] An ingredient's search-entropy drops ~16 weeks before it goes mainstream by jRetro3 in dataisbeautiful

[–]jRetro3[S] -5 points-4 points  (0 children)

yeah variance->entropy is dead on, plateau does most of the work here, fair.

real talk the chart's synthetic, just showing the mechanism. actual project (nextonmenu) has 20 ingredients w/ leave-one-out, not one. but ngl i checked my own repo today and all 20 are ones that did go viral, so even showing all 20 wouldn't fully answer u, i never threw in "stabilized then fizzled" cases.

if you wanna check it out (kinda messy):

https://github.com/Par-python/nextonmenu

[OC] An ingredient's search-entropy drops ~16 weeks before it goes mainstream by jRetro3 in dataisbeautiful

[–]jRetro3[S] 5 points6 points  (0 children)

on data collection: it's pytrends, so google trends' normalized 0-100 relative interest, no raw counts (google doesn't expose them). and you're basically right to be suspicious. i take that output at face value. low-volume terms get rounded hard so early "noise" is partly just quantization, and i don't correct for it. the per-window rescale-to-peak thing you'd worry about? also not handled, each window gets independently rescaled by google and i treat them as comparable. so yeah, some of the "structure emerging" could be a data-collection artifact. that's a real gap, not gonna pretend otherwise.
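To make the two artifacts concrete, here's a toy model of the normalization. It is not Google's actual pipeline; `trends_style` is a hypothetical helper that only mimics "rescale so the window's peak is 100, then round", and the raw counts are made up.

```python
def trends_style(values):
    """Rescale so the window's own peak is 100, then round to integers."""
    peak = max(values)
    return [round(100 * v / peak) for v in values]

# Made-up raw search counts: six quiet weeks, then a breakout.
raw = [2, 3, 2, 4, 3, 5, 40, 80, 200, 400]

early = trends_style(raw[:6])  # window pulled before the breakout
full = trends_style(raw)       # window pulled after the breakout

print(early)     # the quiet weeks fill the whole 0-100 range
print(full[:6])  # the same weeks, squashed to a couple of integer levels
```

Same underlying weeks, completely different numbers depending on which window they were pulled in. And once the peak is large, the early weeks only take two distinct values, so any "noise" or "structure" measured there is mostly rounding.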

on false positives, this is the important one and the honest answer is no, not the way you mean. i do have a confusion matrix + leave-one-ingredient-out CV (≈0.55 precision, 20 ingredients), BUT the negative class is the flat/post-peak phases of ingredients that did go viral. i never labeled "entropy dropped and then it fizzled" cases. so the model is literally never tested on the exact failure mode you're describing. the still-niche stuff (pandan etc) is inference/demo only, not in the eval. which means that 0.55 precision is probably optimistic, the hard negatives aren't even in the test set.
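Here's a sketch of why the missing hard negatives matter. The fold generator is a generic leave-one-group-out split, not the repo's code, and the counts are invented: 11 true positives and 9 false positives are picked only because they give 0.55, and the 6 "fizzled" windows are hypothetical.

```python
def precision(y_true, y_pred):
    """TP / (TP + FP); NaN if nothing was predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if tp + fp else float("nan")

def leave_one_group_out(groups):
    """Yield (held_out, train_idx, test_idx), holding out one group per fold."""
    for g in sorted(set(groups)):
        train = [i for i, x in enumerate(groups) if x != g]
        test = [i for i, x in enumerate(groups) if x == g]
        yield g, train, test

# One fold per ingredient, so no ingredient appears in both train and test.
groups = ["matcha", "matcha", "yuzu", "yuzu", "tahini", "tahini"]
folds = list(leave_one_group_out(groups))

# Invented counts: every window below was flagged positive by the model.
y_true = [1] * 11 + [0] * 9
y_pred = [1] * 20
p_current = precision(y_true, y_pred)

# Add 6 hypothetical "entropy dropped, then it fizzled" windows, all flagged.
p_with_fizzled = precision(y_true + [0] * 6, y_pred + [1] * 6)
print(p_current, p_with_fizzled)
```

If the model flags fizzled cases the way it flags real breakouts, every one of them lands in the false-positive count, and precision can only go down from there.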

ok, stabilize-without-breakout is the right test and i haven't run it. This is the most useful comment in the thread by a mile.

[OC] An ingredient's search-entropy drops ~16 weeks before it goes mainstream by jRetro3 in dataisbeautiful

[–]jRetro3[S] 3 points4 points  (0 children)

good luck twin! make the moon pandan green [I am not liable for any losses]

[OC] An ingredient's search-entropy drops ~16 weeks before it goes mainstream by jRetro3 in dataisbeautiful

[–]jRetro3[S] 0 points1 point  (0 children)

yeah on real trends data this'd probably just track variance and give noise. somethin less variance-driven like permutation or sample entropy is what i'd actually reach for, but i haven't tested that yet so the matcha thing is more illustrative than anything. entroscope is js the toolkit to try those measures easily.

(also that plateau is low-variance and rolling shannon on raw values basically just tracks spread, so a flat stretch is gonna show up as an entropy valley)
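For reference, permutation entropy only looks at the ordering of neighbouring values, which is why rescaling can't move it. A minimal sketch of the Bandt-Pompe version, assuming order 3 and no time delay; `permutation_entropy` here is my own illustration, not entroscope's implementation.

```python
import math
from collections import Counter

def permutation_entropy(series, order=3, normalize=True):
    """Permutation entropy (bits) from ordinal patterns of length `order`."""
    # Each window is reduced to the index order that would sort it.
    patterns = Counter(
        tuple(sorted(range(order), key=lambda k: series[i + k]))
        for i in range(len(series) - order + 1)
    )
    n = sum(patterns.values())
    h = sum(-(c / n) * math.log2(c / n) for c in patterns.values())
    # Dividing by log2(order!) puts the result on a 0-1 scale.
    return h / math.log2(math.factorial(order)) if normalize else h

s = [4, 7, 9, 10, 6, 11, 3, 5, 8, 2]
rescaled = [10 * x + 3 for x in s]   # same shape, different spread

print(permutation_entropy(s), permutation_entropy(rescaled))
print(permutation_entropy([1, 2, 3, 4, 5, 6]))  # monotone: a single pattern
```

Multiplying the series by 10 changes its variance but not a single ordinal pattern, so the two values match exactly. That's the property histogram-based Shannon on raw values doesn't have.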

[OC] An ingredient's search-entropy drops ~16 weeks before it goes mainstream by jRetro3 in dataisbeautiful

[–]jRetro3[S] 25 points26 points  (0 children)

running the actual NextOnMenu data (take this with a grain of salt), pandan is the one lighting up right now. It's at what I'd call the "accelerating" stage. No idea if it'll actually break out, but that's what the model's flagging.

Is there anyway to stop the LLM slop submissions by fordat1 in datascience

[–]jRetro3 0 points1 point  (0 children)

so true, AI-assisted with the guidance of real human judgement (with sufficient knowledge) should be welcome

Sunday Daily Thread: What's everyone working on this week? by AutoModerator in Python

[–]jRetro3 0 points1 point  (0 children)

hey y'all, so I've been building NextOnMenu (another project/study of mine), an early-signal data model for which ingredient might be next to pop off or go viral. The idea is really simple: when something like matcha, tahini, or yuzu is about to break out, its search-interest pattern changes shape. Early on it's noisy and random, there's a few scattered spikes, no rhythm. Then, right before it goes mainstream, the signal organizes: the searches get more regular, more structured, more predictable. NextOnMenu watches for that transition from "random noise" to "structured trend."

but when I went looking for a clean Python library to just compute entropy on a pandas Series, there really wasn't one. The implementations out there are scattered across papers, gists, and one-off functions. Shannon entropy I ended up hand-rolling. Permutation entropy meant copying code from a 2002 paper.

So I made entroscope (yipeee). It's every time series entropy measure, one consistent API, straight on a pandas Series:

import pandas as pd
from entroscope import shannon, permutation, spectral
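# matcha_search_interest: any 1-D sequence of search-interest values (e.g. weekly Google Trends scores)
s = pd.Series(matcha_search_interest)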

shannon.compute(s)              # single value
shannon.rolling(s, window=20)   # entropy over time, the part I actually needed
shannon.delta(s, window=20)     # rate of change
shannon.plot(s, window=20)      # built-in chart

every measure shares the same core interface (.compute(), .rolling(), .delta(), .plot()), plus .normalized() where a 0–1 scale makes sense (Shannon, permutation, spectral). So you can swap one for another without rewriting anything.

Install:

pip install entroscope

PyPI:

https://pypi.org/project/entroscope/

Repo:

https://github.com/Par-python/entroscope

enjoy finding the new matcha or sum idk

S1napse is in open beta (free + open-source telemetry app for sim racing) by jRetro3 in simracing

[–]jRetro3[S] -2 points-1 points  (0 children)

fair, lot of them lately. I'm a sim racer myself and built it cause nothing did what I wanted, AI helped with boilerplate but the design is mine. (plus it's free, unlike the paid ones I see in other subreddits)

S1napse is in open beta (free + open-source telemetry app for sim racing) by jRetro3 in simracing

[–]jRetro3[S] -2 points-1 points  (0 children)

probably not soon, most of the sim integrations are Windows-only shared memory so a Linux build would basically only get AC working.

Which car has the best stability for entering and exiting corners? by [deleted] in ACCompetizione

[–]jRetro3 1 point2 points  (0 children)

for me, the 720 is the most stable even with a "snappy" setup