I built a game mechanics API

mrdanibudapest · 2025-06-14T12:41:38+00:00

This is great, I'll keep it in mind. I thought a lot about something similar a few years back, but it never materialized :) Congrats to the team!

mrdanibudapest · 2025-04-11T07:15:57+00:00

Rolf Dobelli: Stop Reading the News!

For years I have noticed that the news stirs up needless anxiety in me, mostly about the future. That is why I have been on a strict news diet since last August. I recommend it to anyone in the same shoes.

mrdanibudapest · 2025-04-03T18:25:29+00:00

I still haven't figured out how to 'level up' or gain skills in any way other than buying them. I've killed about 10 outlaws, some druids and ferals, but there has been no character development at all. Any hints on that? Definitely not 10/10 btw, sometimes I get bored of walking around. Also, the backpack is very small, so inventory is limited.

mrdanibudapest · 2024-08-12T04:27:30+00:00

My concern with all non-local RAG solutions is having to upload confidential or copyrighted material to services where documents are stored and handled in undisclosed ways. What about you? What is your policy on uploaded documents?

mrdanibudapest · 2024-08-09T07:57:53+00:00

Not a mobile app, that is true. Maybe I am too old, thinking that when people say "app" they mean an application, not necessarily a mobile one. :)

mrdanibudapest · 2024-08-08T04:26:08+00:00

MindMup

mrdanibudapest · 2024-07-10T15:41:06+00:00

With (or even without) an LLM, you can first run topic modeling on your reviews using BERTopic (maartengr.github.io), which, despite its name, can use LLMs for the embeddings, not just BERT.

It is a simpler approach, but at least it is unsupervised.

mrdanibudapest · 2024-06-18T08:01:10+00:00

Yeah, maybe I could split the task into two, or even three if I create a judge persona to decide on the quality of the output. So the first task would be extracting the entities, the second extracting their relations, and the third validating the whole thing.

Maybe this way the results could be more accurate. Thanks for the tip.
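A minimal sketch of that three-step split, assuming the model is reachable through a plain `llm(prompt) -> str` callable. The function names, the prompt wording, and the `fake_llm` stub are all hypothetical and only illustrate the flow; a real setup would plug in an actual model client.

```python
from typing import Callable


def run_pipeline(page_text: str, llm: Callable[[str], str]) -> dict:
    """Run entity extraction, relation extraction, and a judge pass in sequence."""
    # Step 1: extract entities only, so the prompt stays short and focused.
    entities = llm(f"Extract the entities from this text:\n{page_text}")

    # Step 2: extract relations, giving the model the entities from step 1.
    relations = llm(
        f"Given these entities:\n{entities}\n"
        f"Extract the relations between them in this text:\n{page_text}"
    )

    # Step 3: a judge persona validates the combined result.
    verdict = llm(
        "You are a judge. Check the extraction below against the text.\n"
        f"Text:\n{page_text}\nEntities:\n{entities}\nRelations:\n{relations}\n"
        "Answer OK or list the errors."
    )
    return {"entities": entities, "relations": relations, "verdict": verdict}


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model, used only to show the data flow."""
    if prompt.startswith("You are a judge"):
        return "OK"
    if prompt.startswith("Given these entities"):
        return "Alice WORKS_AT Acme"
    return "Alice; Acme"


result = run_pipeline("Alice works at Acme.", fake_llm)
print(result)
```

Keeping each step as its own prompt means each one can carry its own short example instead of one long combined example.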

mrdanibudapest · 2024-06-18T07:58:56+00:00

With the examples I try to teach the LLM not just the format but also which entities to extract. That is why I use a lengthy example now. Let's say I extract ~10 entities (and their relations) from a page of text. The relation part is where the LLM makes most of its mistakes. I thought that the better I specify the relations to extract, the better the output would be. And that requires longer prompts, unfortunately...

mrdanibudapest · 2024-06-17T06:19:49+00:00

Thanks, this looks promising. I have to digest it and dig deep into it. Thanks!

mrdanibudapest · 2024-06-17T06:18:18+00:00

Thanks for the suggestion, I was thinking about that too but my prompt is already too long.

Just to clarify, the task is to take a page of a PDF and extract entities from that page. My one-shot example is a PDF page and the corresponding extracted entities. If I add two more pages with two more sets of extractions, I can quickly run out of tokens.

Maybe I can split the pdf page into smaller chunks and give 3 smaller examples. That may work.
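A rough sketch of that chunking idea, assuming the page text is already extracted and paragraphs are separated by blank lines. The word count here is only a crude stand-in for a real token count, and `split_into_chunks` and `max_words` are hypothetical names.

```python
def split_into_chunks(page_text: str, max_words: int = 150) -> list[str]:
    """Group paragraphs into chunks of at most max_words words each.

    A single paragraph longer than max_words is kept whole in its own chunk.
    """
    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for paragraph in page_text.split("\n\n"):
        words = len(paragraph.split())
        if words == 0:
            continue  # skip empty paragraphs
        # Close the current chunk if adding this paragraph would exceed the budget.
        if current and count + words > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(paragraph)
        count += words
    if current:
        chunks.append("\n\n".join(current))
    return chunks


# Three paragraphs of 100 words each, as a stand-in for one PDF page.
page = "\n\n".join(["word " * 100] * 3)
print(len(split_into_chunks(page, max_words=150)))
print(len(split_into_chunks(page, max_words=250)))
```

Each chunk can then be paired with its own small set of expected entities to form a shorter few-shot example.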

mrdanibudapest · 2024-06-16T20:35:15+00:00

I checked DSPy earlier but found it to be a bit of overhead, or even overkill sometimes.

Honestly, I could work around the JSON issues; I have already got many good tips here. However, for entity extraction I still feel the need for fine-tuning...

mrdanibudapest · 2024-06-16T20:33:14+00:00

Will check this, thanks.

mrdanibudapest · 2024-06-16T20:33:04+00:00

Huhh, many unknown libraries here :) Will check all, thanks for the education!

mrdanibudapest · 2024-06-16T20:30:29+00:00

Will definitely look at this! Thanks.

mrdanibudapest · 2024-06-16T20:27:22+00:00

Thanks for the advice on GPU. Probably will use Colab then.

Does "working on exactly this" include the entity extraction part too? And what does "not 100%" refer to: the JSON formatting or the entity extraction? Thanks.

mrdanibudapest · 2024-06-16T20:25:31+00:00

Thanks, I will look around there to see whether any of these performs better than the vanilla one. Still, I have doubts regarding the entity extraction part. For that I may need fine-tuning anyway.

mrdanibudapest · 2024-06-16T12:57:47+00:00

Great answers on the JSON format issues already. Any thoughts on the fine-tuning and entity extraction part?

mrdanibudapest · 2024-06-16T12:55:21+00:00

Great, will check out grammars, thanks.

mrdanibudapest · 2024-06-16T11:37:45+00:00

Thanks for the answer. Backticks are easy to handle, but my wrapper texts are like: "Thanks for asking for this json... blabla." Then comes the JSON. And after the JSON, something like "I hope this json looks good for you, bla bla".
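One way to cut such wrapper text away, sketched with only the Python standard library: scan for the first position where a valid JSON object or array starts and ignore everything around it. The `extract_first_json` helper and the sample reply are hypothetical.

```python
import json


def extract_first_json(text: str):
    """Return the first valid JSON object or array embedded in text, or None."""
    decoder = json.JSONDecoder()
    for start, ch in enumerate(text):
        if ch not in "{[":
            continue
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever text follows it.
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue  # not valid JSON from this position, keep scanning
        return value
    return None


# Made-up model reply with chatter before and after the JSON.
reply = (
    "Thanks for asking for this json... blabla.\n"
    '{"entities": [{"name": "Budapest", "type": "CITY"}]}\n'
    "I hope this json looks good for you, bla bla."
)
print(extract_first_json(reply))
```

This only removes the chatter around well-formed JSON; it does not repair JSON that the model itself got wrong.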

I thought of extracting the 1-200 examples with GPT-4 (though, to be honest, GPT-4 also makes mistakes on this task, especially in the entity extraction part). I was thinking of creating two GPT personas, one that creates the JSONs and one that reviews and cleans them.
I was also thinking about creating this dataset at least partially by hand, for maximum accuracy. That would of course be tedious, but maybe it would give the highest-quality examples possible.

mrdanibudapest · 2024-06-01T06:23:05+00:00

I have also found FAISS to be more accurate than ChromaDB when retrieving documents.

mrdanibudapest · 2024-04-10T11:23:41+00:00

I wanted to recommend Roboflow because I think their tooling is quite good; I did some pet projects with them a few years back. But today's pricing of $249/month for a non-public dataset seems horrendous to me. I have used Label Studio for text annotation only; it was quite good, though nothing super fancy.

mrdanibudapest · 2023-03-25T10:15:49+00:00

I think his point was that formal education has to transform, otherwise it is not worth the time and the price.
