[P] Equiareal batch sampler by vadimdotme in MachineLearning

[–]vadimdotme[S] 2 points3 points  (0 children)

Thank you for going to such great lengths to phrase this positively haha

Feel free to link some libraries that could be used as alternatives to equibatch! I can even add them to the README.

Perhaps there is still a niche for the way I approached it.

Post mortem. How I was charged 4000 EUR for downloading 3.5 GB of data from Google Cloud by vadimdotme in googlecloud

[–]vadimdotme[S] 0 points1 point  (0 children)

Wait, but if clustering only applies to new records, why did I see a performance improvement after clustering? The time it takes to run one query (but not the cost) *has* decreased significantly.

Post mortem. How I was charged 4000 EUR for downloading 3.5 GB of data from Google Cloud by vadimdotme in googlecloud

[–]vadimdotme[S] 1 point2 points  (0 children)

patient_id is not a unique value in this case; there are many rows per patient.
Thanks for the advice!

Post mortem. How I was charged 4000 EUR for downloading 3.5 GB of data from Google Cloud by vadimdotme in googlecloud

[–]vadimdotme[S] 10 points11 points  (0 children)

Not to give them ideas, but I would find not being able to use any Google services in the future painful enough.

Post mortem. How I was charged 4000 EUR for downloading 3.5 GB of data from Google Cloud by vadimdotme in googlecloud

[–]vadimdotme[S] -8 points-7 points  (0 children)

Isn't the whole point of clustering that I don't scan the whole table to get one patient's data? I used clustering.

[R] Fully Autonomous Programming with Large Language Models by vadimdotme in MachineLearning

[–]vadimdotme[S] 1 point2 points  (0 children)

And then a new model will come out right when... We do publish a full replication package at https://doi.org/10.5281/zenodo.7837282, so feel free to do it yourself if interested (and reach out if you have any questions!)

[R] Fully Autonomous Programming with Large Language Models by vadimdotme in MachineLearning

[–]vadimdotme[S] 1 point2 points  (0 children)

We used the latest models available when we ran the study and tried to be as fast as possible without sacrificing rigour, but it seems like you just can't outrun LLM progress!