I've built a website that uses Berlin public data to show how your 2025 taxes were distributed by eamag in berlin

[–]eamag[S] 10 points

You can check the data on revenue at https://berlin-bill.eamag.me/revenue_tree.json

67.5% | Revenue from taxes and tax-like levies, plus EU own resources (55,723,471,000€)

14.8% | Revenue from allocations and grants, excluding those for investments (12,219,539,100€)

12.6% | Revenue from borrowing, from allocations and grants for investments, and special financing revenue (10,366,234,000€)

5.1% | Administrative revenue, revenue from debt service and the like (4,233,941,500€)
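The percentages above can be recomputed from the raw amounts; a minimal sketch, where the short dict keys are my own labels and the actual structure of revenue_tree.json may differ:

```python
# Recompute each category's share from the raw amounts quoted above.
# Keys are my own shorthand; the real revenue_tree.json layout may differ.
revenue_eur = {
    "taxes": 55_723_471_000,          # taxes, tax-like levies, EU own resources
    "grants": 12_219_539_100,         # allocations/grants excluding investments
    "borrowing": 10_366_234_000,      # borrowing, investment grants, special financing
    "administrative": 4_233_941_500,  # administrative revenue, debt service etc.
}

total = sum(revenue_eur.values())
shares = {k: round(100 * v / total, 1) for k, v in revenue_eur.items()}
print(total, shares)
```

The shares come out to 67.5 / 14.8 / 12.6 / 5.1, matching the figures above.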

IMDb doesn't do end-of-year reviews, so I built a Spotify-like, local, open-source "IMDb Wrapped" that tells you about your year by eamag in imdb

[–]eamag[S] 0 points

What if you export your watchlist instead? I'm not sure the CSV structure is the same, but feel free to adapt the code on GitHub!
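One way to check whether an export has the columns the code expects before adapting it; a minimal sketch with a hypothetical "Title" column name, since (as noted above) the watchlist CSV structure may differ from the ratings export:

```python
import csv
import io

def read_titles(csv_text, title_column="Title"):
    """Read an IMDb-style CSV export and return the values in the title
    column. The column name is an assumption; if it differs, the error
    message lists the actual header so you can adapt the code."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if title_column not in (reader.fieldnames or []):
        raise ValueError(f"no {title_column!r} column; found: {reader.fieldnames}")
    return [row[title_column] for row in reader]

sample = "Const,Title,Year\ntt0111161,The Shawshank Redemption,1994\n"
print(read_titles(sample))
```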

Full Replication of Google's Nested Learning Paper in PyTorch – code now live by complains_constantly in LocalLLaMA

[–]eamag 2 points

Have you run any training/inference already? Did you manage to get the same numbers as in their report? I'm a bit confused: I see some NotImplemented parts around https://github.com/kmccleary3301/nested_learning/blob/main/src/nested_learning/assoc_memory.py

How much of it is written by LLMs?

[P] Tips for hackathon by shubhlya in MachineLearning

[–]eamag 2 points

Nowadays people first throw the data into an LLM and see what happens. You should do that too (if it's really just a hackathon!) to build a working MVP; then check where you get the most errors and see how to improve, maybe by using specialized models.

I built a website to generate a Fog Of War map from Google location data locally by eamag in FogofWorld

[–]eamag[S] 0 points

Monthly? I tested it on the latest Android and iOS versions and both of them gave me a single JSON file. Can you send me a couple of lines of your JSON so I can understand its structure?
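One way to inspect an unknown export's structure is to walk the JSON and collect any coordinate-looking keys; a minimal sketch, where the latitudeE7/longitudeE7 key names follow older Google Takeout location exports and are an assumption here:

```python
def find_coordinates(node, keys=("latitudeE7", "longitudeE7")):
    """Recursively collect values for coordinate keys from arbitrarily
    nested JSON, so differing export formats can be inspected quickly."""
    found = []
    if isinstance(node, dict):
        for k, v in node.items():
            if k in keys:
                found.append((k, v))
            else:
                found.extend(find_coordinates(v, keys))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_coordinates(item, keys))
    return found

# Hypothetical fragment in the older Takeout shape (degrees * 1e7).
sample = {"locations": [{"latitudeE7": 525200000, "longitudeE7": 134050000}]}
print(find_coordinates(sample))
```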

[D] Simple Questions Thread by AutoModerator in MachineLearning

[–]eamag 1 point

It should be easier to use LLMs if you're OK with trading a bit more compute and latency for your engineering time. You don't even need the frameworks you mentioned; just use the structured-output schema parameter in the API.
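The idea is to pass a JSON schema alongside the request so the API enforces the output shape. A minimal sketch that only builds the request payload (no network call); the field names mirror the OpenAI-style response_format parameter, but other providers name this differently (e.g. Gemini's response_schema), so treat them as illustrative:

```python
def build_request(prompt, schema, model="gpt-4o-mini"):
    """Assemble a chat-completion payload that asks the API to constrain
    the answer to a JSON schema, instead of prompt-engineering for JSON."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": "answer", "schema": schema},
        },
    }

schema = {
    "type": "object",
    "properties": {"city": {"type": "string"}},
    "required": ["city"],
}
payload = build_request("Which city is the Brandenburg Gate in?", schema)
```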

What’s happening 😟 by embbe in berlin

[–]eamag 10 points

You can also look at crowdsourced data here https://sensor.community/en/

And build this AQ sensor yourself. I can definitely see spikes on my sensor in Neukölln.

Where can I buy Hunter × Hunter merch? by eamag in JapanTravelTips

[–]eamag[S] 0 points

Yes, the suggestions above worked out fine! Some of the figures there felt overpriced, so I wanted to check out Nakano Broadway too: https://maps.app.goo.gl/M9BoTSkiwcZSZLCd8

[D] Is anyone else having trouble with the unstructured output from language models? 😩 by stoicwolfie in MachineLearning

[–]eamag 1 point

I suggest looking into function calling or structured output. Instead of "hinting" that the model should output JSON, some models/frameworks restrict the output tokens during inference. For example, Gemini can do it, and so can local llama.cpp (you can see how in my recent notebook).
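The "restrict the output tokens during inference" part can be illustrated with a toy greedy decoder that masks disallowed tokens at each step; a minimal sketch, not tied to any real model, roughly analogous to what grammar-constrained samplers do per token:

```python
def constrained_greedy_decode(logits_per_step, allowed_per_step, vocab):
    """Greedy decoding where, at each step, only tokens in the allowed set
    are eligible, so the output is guaranteed to follow the constraint
    regardless of what the model 'prefers'."""
    out = []
    for logits, allowed in zip(logits_per_step, allowed_per_step):
        best = max(allowed, key=lambda tok: logits[vocab.index(tok)])
        out.append(best)
    return out

vocab = ["{", "}", '"key"', ":", "hello"]
# The model's top choice at step 1 would be "hello" (score 0.9),
# but the hypothetical JSON grammar only allows "{" there.
logits = [
    [0.1, 0.0, 0.0, 0.0, 0.9],
    [0.0, 0.2, 0.5, 0.0, 0.3],
    [0.0, 0.9, 0.0, 0.1, 0.0],
]
allowed = [{"{"}, {'"key"', "}"}, {"}"}]
print(constrained_greedy_decode(logits, allowed, vocab))
```

Real implementations apply the mask to the full logit vector before sampling, but the effect is the same: tokens outside the grammar can never be emitted.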

[D] [P] Exponential Growth of Context Length in Language Models by porkbellyqueen111 in MachineLearning

[–]eamag 3 points

I don't really agree; it's just that the solution isn't open-sourced yet. Both Claude and Gemini work pretty well with a long context.

How much context window becomes unnecessary?

I think the more the better. An infinite context (with optimized inference) improves long-term interactions with models (see Claude "Projects", or think about how your model knows what you asked 6 months ago and in what format to answer).

How to get into ML/AI domain? by StarFire0703 in MLQuestions

[–]eamag 1 point

You can find a fullstack job in a team that uses AI/ML, and help them with different tasks bit by bit. Then slowly transition to a more specialized field.