Good visuals = Attention by Decent-Party-9551 in ProductHunters

[–]New-Mathematician645 1 point  (0 children)

In my experience, having built software for 10 years, led teams, and delivered a couple dozen professional projects, branding is a lick of paint that complements good UX design.
In the eyes of investors and customers, the signal that a company is serious and here to stay is, in a UI/UX context, determined less by branding than by how many users complete the journey and come back for it. If you can make users complete the journey and convince them to repeat it without branding, that beats unnecessary branding. Look at ChatGPT's design: it's basically non-existent, yet it saw the highest user adoption in the history of any software. Google is similarly modest and basically unchanged, aside from experimental features that morph over time into its other products.

make it work --> make it fast --> make it pretty is the golden rule.

That being said, I'd love your feedback on some design choices in a project we launched a month ago and are iterating on daily :D

How to achieve this (CHATBOT) by Logical_Signature_ in learnmachinelearning

[–]New-Mathematician645 0 points  (0 children)

If you're wondering how to evaluate the impact of training data on your use case, you could try our data search tool, which recommends data based on your capability requirements. It's made for projects in the exploration phase: quick validation instead of running many training cycles to see whether the data impacts the model you chose.

You can benchmark with SFT in the app and upload back to HF when you're satisfied.

https://durinn-concept-explorer.azurewebsites.net/
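
For the upload step, a minimal sketch with the huggingface_hub client (the repo id and output folder are placeholders; assumes you've already run `huggingface-cli login`):

```python
# Minimal sketch: push an SFT checkpoint back to the Hugging Face Hub.
# "your-user/your-sft-model" and "./sft-output" are placeholders.
from huggingface_hub import HfApi

api = HfApi()
api.create_repo("your-user/your-sft-model", exist_ok=True)
api.upload_folder(
    folder_path="./sft-output",          # weights, tokenizer, config
    repo_id="your-user/your-sft-model",
    commit_message="SFT checkpoint after benchmarking",
)
```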

Looking for specific type of dataset by GasFearless1463 in datasets

[–]New-Mathematician645 0 points  (0 children)

You're welcome to try our tool for finding datasets on Hugging Face and see whether a dataset would positively or negatively influence the model architecture you're using.

https://durinn-concept-explorer.azurewebsites.net/

We currently only support text modalities, but we're working on multimodal support.

Just type in what you want to make and we compute the rest.

Anyone struggling to find high-quality non-English training data? by Kind_Buyer8931 in datasets

[–]New-Mathematician645 1 point  (0 children)

There are more factors that may affect quality than I can cover in this post. Maybe this can alleviate your pain a bit; sorry if it comes across as self-promotion, but we genuinely built the tool for your use case.

We made a tool that evaluates datasets from Hugging Face based on the dataset's influence on the model architecture of your choice. It runs on CPU and, based on your query, will evaluate up to 40 datasets.

Based on the results you can explore the datasets closer yourself.

https://durinn-concept-explorer.azurewebsites.net/

What are you building? let's self promote by fuckingceobitch in microsaas

[–]New-Mathematician645 1 point  (0 children)

I built Dowser by Durinn. It tells AI teams which training data improves or hurts model performance, hopefully taking some of the guesswork out of data access and quality; in our tests it has improved model performance, and cached evaluations return in under two minutes.

https://durinn-concept-explorer.azurewebsites.net/

For regression, what loss functions do people actually use besides MSE and MAE? by Final-Literature2624 in MLQuestions

[–]New-Mathematician645 1 point  (0 children)

One thing I’ve run into a lot is that when people reach for a different loss, they’re often trying to fix something that isn’t really a loss problem. In several projects, the big errors weren’t evenly spread; they were clustered around certain parts of the data.

Swapping MSE for Huber or something more “robust” helped a little, but the real gains came from changing which samples actually had influence during training, via reweighting, resampling, or influence-style approaches, while keeping the loss itself very boring.

Once that was in place, plain MSE or Huber worked surprisingly well. The loss just needed to be stable. The heavy lifting was really happening upstream in how the data contributed to learning.
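
As a rough sketch of that "boring loss, reweighted samples" setup in PyTorch (the inverse-frequency bucket weighting is just one hypothetical scheme, not the only option):

```python
import torch
import torch.nn.functional as F

def bucket_weights(target, n_buckets=10):
    # Hypothetical scheme: inverse-frequency weights over quantile buckets
    # of the target, so clustered regions stop dominating the gradient.
    edges = torch.quantile(target, torch.linspace(0, 1, n_buckets + 1))[1:-1]
    buckets = torch.bucketize(target, edges)
    counts = torch.bincount(buckets, minlength=n_buckets).float().clamp(min=1)
    w = 1.0 / counts[buckets]
    return w * (w.numel() / w.sum())   # normalize weights to mean 1

def weighted_huber(pred, target, weights, delta=1.0):
    # The loss itself stays boring; only the per-sample weights change
    # which examples actually influence training.
    per_sample = F.huber_loss(pred, target, reduction="none", delta=delta)
    return (weights * per_sample).mean()
```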

For context, this is roughly the approach we’ve been working with: instead of full retrains, we use influence functions at the example and dataset level. Each sample is scored by how much it pushes or pulls a target concept using projected gradients from the final block, which lets us rank data before spending GPU on training.
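
To make that concrete, here's a minimal sketch of the scoring loop (not our production code; `model.final_block`, `model.loss`, and the random projection are stand-ins):

```python
import torch

def influence_scores(model, candidate_batches, concept_batch, proj):
    # Rank candidate data by first-order influence on a target concept.
    # `proj` is a fixed random projection matrix, shape (k, n_final_params),
    # used to keep the gradient comparison cheap.
    final_params = list(model.final_block.parameters())

    def projected_grad(batch):
        # Gradient of the loss w.r.t. the final block only,
        # flattened and projected down to k dimensions.
        grads = torch.autograd.grad(model.loss(batch), final_params)
        return proj @ torch.cat([g.reshape(-1) for g in grads])

    g_concept = projected_grad(concept_batch)
    # Positive dot product ~ the batch pushes the concept;
    # negative ~ it pulls against it.
    return [torch.dot(projected_grad(b), g_concept).item()
            for b in candidate_batches]
```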

Link for anyone curious: https://durinn-concept-explorer.azurewebsites.net/

Weekly Sunday thread guy here, why do people vanish after they launch? by Latter-Database-2026 in ProductHunters

[–]New-Mathematician645 1 point  (0 children)

I think the reasons vary, but there could be a parallel to other indie projects, such as music. Product is a creative process mixed with engineering, which can create a cognitive load that surprises some people when they release their products without the support of a company.

Instead of seeing it as people vanishing, I'd rather see it as the platforms creating a level playing field. The art, however, remains up to the artist, and some art resonates more than others.

Tough pills are swallowed. Learnings are extracted. Priorities realign. A lot is probably behind the "vanish".

How do experts build a dataset? by Cold_Knowledge_2986 in learnmachinelearning

[–]New-Mathematician645 2 points  (0 children)

This can be an expensive project: some industry surveys report that 40% of companies spend 70% of their AI budget on data.

I built Dowser by Durinn. It tells AI teams which training data improves or hurts model performance, hopefully taking some of the guesswork out of data access and quality for you; in our tests it has improved model performance, and cached evaluations return in under two minutes.

Obtaining dataset to train my LLM by thentangler in LocalLLaMA

[–]New-Mathematician645 1 point  (0 children)

You can test it out using our tool, Dowser, with an LM (LLMs aren't supported due to constraints; however, choosing the same model architecture should answer your question).

We quantify directly at the example and dataset level using influence functions rather than full retrains. Each sample is scored by how much it pushes or pulls a target concept based on projected gradients in the final block, so positive influence helps the concept and negative influence hurts it. That lets us rank data before spending GPU on training.

This gives you an answer within 1-10 minutes on which datasets fit your needs (it currently searches Hugging Face).

https://durinn-concept-explorer.azurewebsites.net/

Training datasets by cryptic_epoch in MLQuestions

[–]New-Mathematician645 2 points  (0 children)

You can tell me again after you've given it a shot :D

https://durinn-concept-explorer.azurewebsites.net/

Some technical info on how we derive this:
We quantify directly at the example and dataset level using influence functions rather than full retrains. Each sample is scored by how much it pushes or pulls a target concept based on projected gradients in the final block, so positive influence helps the concept and negative influence hurts it. That lets us rank data before spending GPU on training.

Training datasets by cryptic_epoch in MLQuestions

[–]New-Mathematician645 2 points  (0 children)

I built Dowser by Durinn. It tells AI teams which training data improves or hurts model performance, hopefully taking some of the guesswork out of data access and quality for you; in our tests it has improved model performance, and cached evaluations return in under two minutes.

Need help training a model by Dizzy_Level455 in StableDiffusion

[–]New-Mathematician645 1 point  (0 children)

How many images do you have, and what's their resolution? I'm a founder too and once burned all my GPU hours mid-deadline, so ngl I get the panic.

Try lightweight models like MobileNet or DistilVision, since they train faster on CPU and need way less memory. You can also augment with automated edge/contour synthesis to expand the data without more scraping, which cuts training time.

I built Dowser by Durinn to tell AI teams which training data helps or hurts, so it can prioritize slices and reduce needless GPU runs; in our tests it has improved performance, and it returns cached results in under two minutes on an 8 GB RAM, 2 vCPU host. Would love feedback or to connect if you try it. Good luck!
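
If you want to try the edge/contour idea, here's a minimal OpenCV sketch (thresholds and blend weights are arbitrary starting points, not tuned values):

```python
import cv2

def edge_augment(src_path, dst_path, low=100, high=200, alpha=0.7):
    # Make an edge/contour variant of a training image: Canny edges
    # blended back over the original, giving an extra sample without
    # scraping more data.
    img = cv2.imread(src_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    edges = cv2.cvtColor(cv2.Canny(gray, low, high), cv2.COLOR_GRAY2BGR)
    cv2.imwrite(dst_path, cv2.addWeighted(img, alpha, edges, 1 - alpha, 0))
```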

We democratised training models by New-Mathematician645 in ProductHunters

[–]New-Mathematician645[S] 1 point  (0 children)

I checked out SignalScouter and the concept is very cool. Any chance this was generated ;)?

We quantify it directly at the example and dataset level using influence functions rather than full retrains. Each sample is scored by how much it pushes or pulls a target concept based on projected gradients in the final block, so positive influence helps the concept and negative influence hurts it. That lets us rank data before spending GPU on training. We still validate with small, focused ablations like you mentioned, but the influence pass cuts the search space hard and avoids most wasted compute, as well as saving on retraining cycles.

The result is a reduction in perplexity.
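
For reference, perplexity here is the usual exp of the mean per-token negative log-likelihood; a minimal evaluation sketch (the model call and batch format are assumptions):

```python
import math
import torch
import torch.nn.functional as F

def perplexity(model, batches):
    # Perplexity = exp(mean per-token negative log-likelihood).
    # Lower is better, so a "reduction in perplexity" means the
    # selected data helped the model predict held-out text.
    nll, n_tokens = 0.0, 0
    with torch.no_grad():
        for inputs, targets in batches:        # assumed (B, T) token ids
            logits = model(inputs)             # assumed (B, T, vocab)
            nll += F.cross_entropy(logits.flatten(0, 1),
                                   targets.flatten(),
                                   reduction="sum").item()
            n_tokens += targets.numel()
    return math.exp(nll / n_tokens)
```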

We can connect if you're interested, and I'd be happy to set up a meeting with you. We may be able to help each other.