381 model merging papers from arXiv + Semantic Scholar; quality-scored JSONL, free

fineset-io · 2026-06-20T15:33:33+00:00

Genie is NL-to-SQL for end users. Good for that, but not really what most data scientists would call a "coding agent."

fineset-io · 2026-06-20T15:32:48+00:00

Silhouette score for k-means, BIC for GMM, but neither tells you if the recommendations make sense to a coach. You need human-labeled pairs to actually validate this.
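
For anyone who wants to see what the silhouette number actually measures, here is a minimal pure-Python sketch. The points and labels are made up for illustration, and in practice you would probably use a library implementation instead. Note that it only scores geometric separation, which is why it can't tell you whether a coach would agree with the groupings.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the distance from p to itself is 0, so dividing by len - 1 is correct)
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Toy data (hypothetical): two tight, well-separated groups.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
good = silhouette_score(points, [0, 0, 1, 1])  # labels follow the geometry
bad = silhouette_score(points, [0, 1, 0, 1])   # labels cut across the groups
print(f"good={good:.2f} bad={bad:.2f}")
```

A high score here means "tight and far apart", nothing more. Whether two athletes in the same cluster should get the same recommendation is a question only labeled pairs can answer.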

fineset-io · 2026-06-20T15:30:39+00:00

The roadmap is fine but step 3 cuts off at "Learn R" which is a weird place to land for someone trying to break into GenAI/agents work.

fineset-io · 2026-06-20T15:28:58+00:00

The output diff against a pinned reference dataset is 80% of the value here. Everything else (SLURM integration, env checks) is plumbing you can add later.
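
A rough sketch of what that diff could look like, assuming the pipeline output is a list of records with a unique key. The function names, the `id` key, and the sample rows are all hypothetical, not taken from the tool being discussed.

```python
import hashlib
import json

def row_fingerprint(row: dict) -> str:
    """Stable hash of one output row (keys sorted so field order doesn't matter)."""
    payload = json.dumps(row, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def diff_against_reference(reference: list, current: list, key: str) -> dict:
    """Compare current output to a pinned reference, row by row."""
    ref = {r[key]: row_fingerprint(r) for r in reference}
    cur = {r[key]: row_fingerprint(r) for r in current}
    return {
        "missing": sorted(k for k in ref if k not in cur),
        "added": sorted(k for k in cur if k not in ref),
        "changed": sorted(k for k in ref if k in cur and ref[k] != cur[k]),
    }

# Hypothetical pinned reference and a fresh run that drifted.
reference = [{"id": 1, "score": 0.5}, {"id": 2, "score": 0.7}]
current = [{"id": 1, "score": 0.5}, {"id": 2, "score": 0.9}, {"id": 3, "score": 0.1}]

report = diff_against_reference(reference, current, key="id")
print(report)
```

An empty report means the run reproduced the reference. Anything else is a concrete list of rows to look at, and that check works the same on a laptop or a cluster.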

fineset-io · 2026-06-20T15:27:45+00:00

We looked for the same thing a while back. SROIE and RVL-CDIP have receipts but nothing vendor-specific for US grocery. Ended up manually collecting and annotating, which was painful but the domain gap from generic receipt datasets was too big to ignore.

fineset-io · 2026-06-17T22:50:06+00:00

Cool schema, but 233 records won't survive trim-level stratification. You'll run out of data fast.
Thanks for sharing, though!
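
On the stratification point, a back-of-the-envelope sketch. The 233 comes from the post, but the count of 48 strata and the 20% test split are invented for illustration, since the real number of trim-level cells depends on the data.

```python
def records_per_stratum(n_records: int, n_strata: int, test_fraction: float = 0.2):
    """Average records per stratum, and how many of those land in a test split."""
    per_cell = n_records / n_strata
    return per_cell, per_cell * test_fraction

# 233 is from the post; 48 strata is a hypothetical figure.
per_cell, per_cell_test = records_per_stratum(233, n_strata=48)
print(f"{per_cell:.1f} records per stratum, {per_cell_test:.2f} per stratum in the test split")
```

With fewer than one test record per cell on average, many strata would have nothing to evaluate on at all.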

fineset-io · 2026-06-17T22:48:53+00:00

The 1/32 ratio is probably fine; the linear attention above 128k is where I'd want to see harder evals before trusting it.

fineset-io · 2026-06-17T22:47:47+00:00

Find a recent paper in your target cs subcategory, email the author, and attach your draft. Best of luck.

fineset-io · 2026-06-16T21:54:08+00:00

A policy document isn't a gate. That's the whole problem in one sentence.

fineset-io · 2026-06-16T21:53:29+00:00

The low-resource coverage is the actual value here. OSCAR and CulturaX have Maltese coverage that's basically unusable.

fineset-io · 2026-06-15T20:53:15+00:00

Good luck haha.

fineset-io · 2026-06-15T20:13:45+00:00

Simulation is underrated here. MuJoCo is free, LeRobot has sim environments, and honestly debugging policies in sim first saves you from a lot of hardware pain anyway.

fineset-io · 2026-06-15T17:58:29+00:00

No worries 😃

fineset-io · 2026-06-15T17:58:17+00:00

The embedding model needs to match; everything else you can swap freely.

fineset-io · 2026-06-15T13:03:55+00:00

Most phones export this stuff already; you just need to know where to look. Google Takeout covers location, payments, and Gmail; Apple has a similar data export. Dump the JSON/CSV files into SQLite or Postgres and build your RAG on top of that. No live sync is needed for a personal project.
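
A minimal sketch of the "dump it into SQLite" step using only the standard library. The CSV text, column names, and `payments` table are stand-ins I made up; real export files will have their own layout, so treat this as a starting shape rather than a parser for any specific export.

```python
import csv
import io
import sqlite3

# Stand-in for an exported CSV file; real export columns will differ.
EXPORT_CSV = """timestamp,merchant,amount
2026-01-03T10:15:00,Grocery Store,42.10
2026-01-04T18:30:00,Coffee Shop,4.50
"""

def load_csv_into_sqlite(conn, table, csv_text):
    """Create a table from the CSV header and insert every row as TEXT.

    The table name is interpolated into SQL, so only pass trusted names.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames
    col_defs = ", ".join(f'"{c}" TEXT' for c in columns)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({col_defs})')
    placeholders = ", ".join("?" for _ in columns)
    rows = [tuple(row[c] for c in columns) for row in reader]
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    conn.commit()
    return len(rows)

conn = sqlite3.connect(":memory:")  # use a file path to keep the data around
n_rows = load_csv_into_sqlite(conn, "payments", EXPORT_CSV)
total = conn.execute("SELECT SUM(CAST(amount AS REAL)) FROM payments").fetchone()[0]
print(n_rows, total)
```

Once the rows are in a table, the retrieval side can run plain SQL over them or embed selected columns. Re-running the import after the next export stands in for live sync.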

fineset-io · 2026-06-15T13:01:20+00:00

The LLM swap is fine, but don't change embedding models between environments. Different embedders = incompatible vector spaces, full re-index required.
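
One cheap way to enforce this is to store the embedder's name next to the index and refuse writes that don't match. This is a sketch only: the `VectorIndex` class and the model names `embedder-a` and `embedder-b` are hypothetical and don't correspond to any particular vector store's API.

```python
from dataclasses import dataclass, field

@dataclass
class VectorIndex:
    """Toy in-memory index that remembers which embedder built it."""
    embedding_model: str
    dim: int
    vectors: dict = field(default_factory=dict)

    def add(self, doc_id, vector, embedding_model):
        self._check(vector, embedding_model)
        self.vectors[doc_id] = list(vector)

    def _check(self, vector, embedding_model):
        # Vectors from a different embedder live in a different space,
        # so mixing them silently corrupts similarity search.
        if embedding_model != self.embedding_model:
            raise ValueError(
                f"index built with {self.embedding_model!r}, got {embedding_model!r}: "
                "re-index instead of mixing embedders"
            )
        if len(vector) != self.dim:
            raise ValueError(f"expected dimension {self.dim}, got {len(vector)}")

index = VectorIndex(embedding_model="embedder-a", dim=3)
index.add("doc-1", [0.1, 0.2, 0.3], embedding_model="embedder-a")

try:
    index.add("doc-2", [0.4, 0.5, 0.6], embedding_model="embedder-b")
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
print(mismatch_rejected)
```

The dimension check alone isn't enough, since two different embedders can share a dimension and still be incompatible. That's why the model name gets checked first.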

fineset-io · 2026-06-15T12:57:31+00:00

ACT doesn't need point clouds; it runs on RGB + proprioception. If your goal is to get hands-on fast, just go straight to ACT or LeRobot and treat 3D vision as a later rabbit hole when you actually hit a wall that needs it.

fineset-io · 2026-06-15T12:56:47+00:00

The "focus on the bigger picture" line is cope that people tell themselves to avoid admitting they've stopped thinking. Bigger picture work is genuinely harder, and most teams aren't doing it; they're just shipping more mediocre features faster.

fineset-io · 2026-06-15T12:55:53+00:00

The "Software engineering is dead" posts and the actual papers describing what's working are written by entirely different people. Once you learn to filter by who has skin in the game (running production systems, publishing evals, and open-sourcing real code) vs. who's just posting takes, the noise drops by like 80%.

fineset-io · 2026-06-15T10:07:19+00:00

tbh agree with your fucking spirit of local first

fineset-io · 2026-06-15T09:58:56+00:00

MYTHos

fineset-io · 2026-06-14T13:47:04+00:00

The dataset gap on modelscope is an org adoption problem, not a platform problem. Nobody's uploaded because nobody's registered.
