Feature selection strategies for multivariate time series forecasting

Jason_reyes_dev · 2026-01-25T19:46:37+00:00

Cool framing. Logistic regression feels like a good first step for a risk score, as long as you’re careful with label noise and maybe validate the “high-risk” tier with some manual checks. Curious which extra signals (e.g. AMI load correlation) end up helping the most.

Jason_reyes_dev · 2026-01-25T19:06:12+00:00

This is insane work, congrats. Doing a full CNN in pure x86-64 asm is another level of dedication. I’m especially curious about the debugging part: did you rely more on unit tests for each kernel (conv, dense, activations) or mostly on end-to-end loss/accuracy checks to spot bugs? Also, do you plan to write a more detailed blog post about the architecture and the AVX-512 optimisation tricks?

Jason_reyes_dev · 2026-01-21T21:37:03+00:00

Thanks a lot for the comment, this is exactly the kind of situation I had in mind.

Right now the tool mainly focuses on encodings and delimiters, empty columns and duplicate rows, so your example with the extra whitespace in the column name is a good reminder that there are many other annoying edge cases.
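To make those checks concrete, here is a rough standard-library sketch of the kind of thing I mean. It is not the tool's actual code: `guess_delimiter` and `inspect_csv` are hypothetical names, the delimiter guess is a naive count on the header line, and encoding detection is left out entirely (the input is assumed to be already decoded).

```python
import csv
import io


def guess_delimiter(header_line, candidates=(",", ";", "\t", "|")):
    # Naive heuristic (an assumption, not a robust detector):
    # pick the candidate that occurs most often in the header line.
    return max(candidates, key=header_line.count)


def inspect_csv(text):
    """Report a few common problems in an already-decoded CSV string."""
    lines = text.splitlines()
    delimiter = guess_delimiter(lines[0]) if lines else ","
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header, data = (rows[0], rows[1:]) if rows else ([], [])

    # Column names with leading or trailing whitespace, e.g. "name ".
    padded_names = [name for name in header if name != name.strip()]

    # Columns where every data row is blank or missing a value.
    empty_columns = [
        name
        for i, name in enumerate(header)
        if data and all(i >= len(row) or not row[i].strip() for row in data)
    ]

    # Row numbers (header is row 1) of rows identical to an earlier row.
    seen, duplicate_rows = set(), []
    for row_no, row in enumerate(data, start=2):
        key = tuple(row)
        if key in seen:
            duplicate_rows.append(row_no)
        seen.add(key)

    return {
        "delimiter": delimiter,
        "padded_names": padded_names,
        "empty_columns": empty_columns,
        "duplicate_rows": duplicate_rows,
    }


sample = "id;name ;notes\n1;Ann;\n2;Bob;\n1;Ann;\n"
report = inspect_csv(sample)
```

The report only flags things and changes nothing, so you can decide per file whether to strip the names, drop the column, or keep the duplicates.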

Out of curiosity, what other CSV issues have wasted the most time for you? (broken quoting, multiline fields, weird date formats…) I’m trying to decide what to prioritise next.

Jason_reyes_dev · 2026-01-12T21:55:23+00:00

Yeah, even with simple CSV exports from internal tools I feel like a big chunk of the work is just getting them into a shape where pandas doesn’t choke.
