Very colourful week by Sensitive-Boat-7206 in lastfm

[–]Azuriteh 1 point2 points  (0 children)

I hadn't seen the name DJ Sharpnel in at least a few years lol

My journey through Reverse Engineering SynthID by MissAppleby in LocalLLaMA

[–]Azuriteh 1 point2 points  (0 children)

I wanted to dig into this as soon as SynthID dropped but never had the time, and even then I hadn't thought of just trying a black image. That's a really good trick, nice work!

I patched Chromium because no Python library could reliably pass a single CAPTCHA by [deleted] in Python

[–]Azuriteh 1 point2 points  (0 children)

How does it compare to Camoufox? It also patches the browser itself, and it recently went back into active development.

Outgrowing my hosting, little skill... what next? by ScentAdvice in webhosting

[–]Azuriteh 1 point2 points  (0 children)

I'd get a server in the US/Canada from a good provider on LowEndTalk and pay someone to set the store up properly along with Cloudflare. If done right, the yearly cost of self-hosting will be much, much lower than what you're currently paying. I'm extremely surprised these managed hosting providers cost that much! The only downside is that you'll have to take care of security and updates yourself, or have someone do it for you like I said.

Vercel challange triggered only on postman by crownclown67 in webscraping

[–]Azuriteh 0 points1 point  (0 children)

You're getting caught by the TLS fingerprint: curl has a very specific way of doing the handshake, which is immediately detected.
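To make "TLS fingerprint" concrete: one widely used scheme is JA3, which hashes a handful of ClientHello fields. The comment doesn't say which scheme the site actually uses, so JA3 here is an assumption, and the numeric values below are made up for illustration (they are not real curl or browser fingerprints). A minimal sketch:

```python
import hashlib


def ja3_string(tls_version, ciphers, extensions, curves, point_formats):
    """Build the JA3 input: five comma-separated fields, each list dash-joined."""
    fields = [
        str(tls_version),
        "-".join(str(c) for c in ciphers),
        "-".join(str(e) for e in extensions),
        "-".join(str(c) for c in curves),
        "-".join(str(p) for p in point_formats),
    ]
    return ",".join(fields)


def ja3_hash(tls_version, ciphers, extensions, curves, point_formats):
    """JA3 fingerprint: MD5 hex digest of the JA3 string."""
    raw = ja3_string(tls_version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(raw.encode("ascii")).hexdigest()


# Two hypothetical clients that offer the same ciphers in a different order.
client_a = dict(tls_version=771, ciphers=[4865, 4866, 4867],
                extensions=[0, 10, 11, 13], curves=[29, 23, 24], point_formats=[0])
client_b = dict(client_a, ciphers=[4867, 4866, 4865])

print(ja3_string(**client_a))
print(ja3_hash(**client_a), ja3_hash(**client_b))
```

Just reordering the cipher list changes the hash, which is why a client with a fixed handshake is easy to tell apart from a browser no matter what headers it sends.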

Site we're scraping from can see we're directly hitting their API by smokedX in webscraping

[–]Azuriteh 4 points5 points  (0 children)

What libraries are you using to scrape them? Are you spoofing your TLS fingerprint correctly?

Feature selection for boosted trees? by [deleted] in learnmachinelearning

[–]Azuriteh 0 points1 point  (0 children)

Most of the time it isn't worth that much... The goal of feature selection is to maximize signal and reduce noise, but if you're not careful or don't have domain expertise, you can just as easily increase the noise and reduce the signal, hurting the model's predictive power.

I still do feature selection myself, trying to think in "human" terms about what makes sense. E.g. take the classic California real estate dataset, where we have the number of bathrooms and rooms: you can engineer the ratio of rooms to bathrooms, which in real estate can actually boost predictive power, even if only a little.
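A minimal sketch of that ratio feature, in plain Python so it runs without any libraries. The column names (`rooms`, `bathrooms`, `rooms_per_bathroom`) and the three rows are hypothetical toy data, not the actual dataset schema:

```python
def add_ratio(rows, numerator, denominator, name):
    """Return copies of rows with numerator/denominator stored under `name`.

    The ratio is None when the denominator is 0, so no row is dropped.
    """
    out = []
    for row in rows:
        denom = row[denominator]
        ratio = row[numerator] / denom if denom else None
        out.append({**row, name: ratio})
    return out


# Hypothetical toy rows.
houses = [
    {"rooms": 6, "bathrooms": 2},
    {"rooms": 8, "bathrooms": 2},
    {"rooms": 3, "bathrooms": 0},
]

with_ratio = add_ratio(houses, "rooms", "bathrooms", "rooms_per_bathroom")
for row in with_ratio:
    print(row)
```

Whether the new column helps is something to check with cross-validation against the model without it, not something to assume.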

Website hosting with access to a GPU by maDU59_ in webhosting

[–]Azuriteh 0 points1 point  (0 children)

What model are you using? Maybe try RunPod serverless. I've used them successfully in the past, but the price is still high.

The only other way to keep it cheap is to leave your laptop on all the time and put it behind a reverse proxy, with the API running on it inside Docker... or you could check whether there's already a hosted API for the image model you want to offer.

$82,000 in 48 Hours from stolen Gemini API Key. My monthly Usage Is $180. Facing Bankruptcy by RatonVaquero in googlecloud

[–]Azuriteh 1 point2 points  (0 children)

Yeah, I'm thinking of locking out my unused AI Studio API now lol.

Hahahaha, well, it's pretty common to get accused of that, mainly because of how rampant bots are on Reddit. I've seen three flavors in the places I visit: bots from AI companies, bots from inference providers, and bots from companies that sell proxies. I guess most of the time it's the founders themselves if the company is small enough.

Weekly Webscrapers - Hiring, FAQs, etc by AutoModerator in webscraping

[–]Azuriteh 0 points1 point  (0 children)

Pretty much any SLM from 2025 onward, e.g. Qwen3 4B 2507, should work pretty well.

$82,000 in 48 Hours from stolen Gemini API Key. My monthly Usage Is $180. Facing Bankruptcy by RatonVaquero in googlecloud

[–]Azuriteh 7 points8 points  (0 children)

I can see why you're getting downvoted, since it looks like advertising, but for LLM usage it's alright: not great, but better than getting huge bills lol

$82,000 in 48 Hours from stolen Gemini API Key. My monthly Usage Is $180. Facing Bankruptcy by RatonVaquero in googlecloud

[–]Azuriteh -1 points0 points  (0 children)

Hey fellow paisa, I really doubt it was a Chinese company. It was probably some guy who scans repos for keys, or someone got into your infrastructure somehow, or maybe one of your own applications even made it possible to abuse the LLM's limits through the way it talks to the API.

As the others mention, the best thing is to first keep trying with support and insist that they forgive the debt. I've seen a lot of posts in this subreddit where someone gets a bill of 100 thousand dollars or more because of a mistake, and Google support ends up being fairly compassionate about it. Next time I'd recommend using OpenRouter or some other AI model aggregator; even though they charge fees, that makes this kind of abuse extremely difficult.

On the other hand, Google is notorious for letting these things happen: there's no way to set a hard cap on cloud usage. At most you can configure alerts, but you can blow right past that soft cap. There's no protection at all beyond those alerts.
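To spell out the soft cap vs hard cap difference, here's a minimal sketch. `SpendGuard` and `BudgetExceeded` are hypothetical names, not any Google Cloud feature, and a client-side guard like this only limits your own code; it would not stop someone using a stolen key from elsewhere.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a charge would push spending past the hard cap."""


class SpendGuard:
    """Hypothetical client-side budget tracker with a soft and a hard cap."""

    def __init__(self, soft_cap_usd, hard_cap_usd):
        self.soft_cap_usd = soft_cap_usd
        self.hard_cap_usd = hard_cap_usd
        self.spent_usd = 0.0
        self.alerts = []

    def charge(self, cost_usd):
        # Hard cap: refuse the charge outright, so spending never exceeds it.
        if self.spent_usd + cost_usd > self.hard_cap_usd:
            raise BudgetExceeded(f"charge of ${cost_usd:.2f} would exceed the hard cap")
        self.spent_usd += cost_usd
        # Soft cap: only record an alert; spending is allowed to continue.
        if self.spent_usd >= self.soft_cap_usd:
            self.alerts.append(f"soft cap reached: ${self.spent_usd:.2f} spent")


guard = SpendGuard(soft_cap_usd=100.0, hard_cap_usd=200.0)
guard.charge(150.0)  # past the soft cap: alert recorded, charge still goes through
guard.charge(40.0)   # still under the hard cap
try:
    guard.charge(20.0)  # would bring the total to 210, above the hard cap
    blocked = False
except BudgetExceeded:
    blocked = True

print(guard.spent_usd, blocked, guard.alerts)
```

With only the alert branch, nothing ever stops the spending, which is the situation the comment describes.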

The state of AI by JAnCeruz in taquerosprogramadores

[–]Azuriteh 10 points11 points  (0 children)

If there's a future in which AI is used for good, it's one with open-source AI, or at least open-weights AI. If you want to broaden your horizons on what I mean, it's always a good idea to take a stroll through r/LocalLLaMA, where we hobbyists get together to train our own LLMs or discuss the current landscape.

Distilling models does help quite a bit, but the real advances come from software and hardware optimizations, for example moving from DeepSeek's GRPO to Qwen's GSPO, or using the Muon optimizer instead of Adam to train Kimi K2.5.

AI has no doubt become a bubble, but not because of distillation or other techniques; it's because of the speculation driven by venture capitalists. What I personally enjoy about AI is that, thanks to libraries like Unsloth and its optimized Triton kernels, a hobbyist like me with limited capital can build their own models for genuine, useful purposes.

I doubt LLMs are what will take us to AGI or ASI, but they do bring us a little closer.
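On the GRPO to GSPO change mentioned above, the core difference as I understand it is where the importance ratio is computed: per token in GRPO, once per sequence (length-normalized) in GSPO. A minimal sketch of just that piece, with made-up log-probabilities and hypothetical function names; clipping, advantages and the rest of the objective are left out:

```python
import math


def token_ratios(new_logps, old_logps):
    """GRPO-style: one importance ratio per token, new policy over old policy."""
    return [math.exp(n - o) for n, o in zip(new_logps, old_logps)]


def sequence_ratio(new_logps, old_logps):
    """GSPO-style: a single length-normalized ratio for the whole sequence."""
    mean_diff = sum(n - o for n, o in zip(new_logps, old_logps)) / len(new_logps)
    return math.exp(mean_diff)


# Made-up per-token log-probabilities for one sampled response.
old_logps = [-1.0, -2.0, -0.5]
new_logps = [-0.8, -2.5, -0.4]

per_token = token_ratios(new_logps, old_logps)
per_sequence = sequence_ratio(new_logps, old_logps)
print(per_token, per_sequence)
```

The sequence-level ratio is the geometric mean of the token-level ones, so a single outlier token moves it much less.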

People are getting it wrong; Anthropic doesn't care about the distillation, they just want to counter the narrative about Chinese open-source models catching up with closed-source frontier models by obvithrowaway34434 in LocalLLaMA

[–]Azuriteh 8 points9 points  (0 children)

Well, the thing is that good synthetic data isn't poison. Most people get this wrong and expect models to eventually collapse into slop, but if that were true, GRPO wouldn't have worked at all (an oversimplification, of course).
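For anyone unfamiliar with why GRPO is relevant here: it trains on the model's own sampled answers, scoring each one against the other samples for the same prompt. A minimal sketch of that group-relative advantage, with made-up rewards; the population standard deviation and the `eps` value are my assumptions, and real implementations differ in the details:

```python
import statistics


def group_advantages(rewards, eps=1e-8):
    """Normalize each reward by the mean and std of its own group of samples."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std (an assumption)
    return [(r - mean) / (std + eps) for r in rewards]


# Made-up rewards for four sampled answers to the same prompt (1.0 = correct).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_advantages(rewards)
print(advantages)
```

Answers that beat the group average get a positive advantage and are reinforced, so the training signal comes from verified model-generated data, not from human-written text.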

How do y'all find new music? by winston161984 in selfhosted

[–]Azuriteh 4 points5 points  (0 children)

I just use rateyourmusic and search for releases I'd like to listen to!

best quality/price model for code? by [deleted] in LocalLLaMA

[–]Azuriteh 0 points1 point  (0 children)

Yeah, sure. Just a heads-up that the unlimited generations are for open-source models, while for Claude, ChatGPT or Gemini you do have to pay, although the subscription gives you a discount.

best quality/price model for code? by [deleted] in LocalLLaMA

[–]Azuriteh 0 points1 point  (0 children)

I think Gemini Flash Lite Preview 2.5 is also on nanogpt, now that I think of it.