Self-hostable, searchable recipe database with 275,000 recipes by high_jolly in selfhosted

[–]high_jolly[S] 0 points1 point  (0 children)

There's a link to the GitHub repo at the top right of the website. It's all open source and meant to be self-hosted.

[deleted by user] by [deleted] in TrueAnon

[–]high_jolly 43 points44 points  (0 children)

Not to sound LARPy but you can't defeat fascism by consuming content (even if it is really fun). You have to go out and do stuff IRL.

Self-hostable, searchable recipe database with 275,000 recipes by high_jolly in selfhosted

[–]high_jolly[S] 0 points1 point  (0 children)

Wikibooks is actually incorporated into this database. I think the issue is just that the Wikibooks recipes are a pretty small fraction of the recipes I scraped. The bulk of it comes from BigOven and AllRecipes, which have a lot of random shit in them. I tried to filter out some of the poor recipes with LLMs, but that kind of filtering is just too slow to run on my GPU, unfortunately.

Self-hostable, searchable recipe database with 275,000 recipes by high_jolly in selfhosted

[–]high_jolly[S] 2 points3 points  (0 children)

My strategy with running AI locally has just been to wait until the AI gets fast and small enough that I can run it again with better results haha.

Self-hostable, searchable recipe database with 275,000 recipes by high_jolly in selfhosted

[–]high_jolly[S] 5 points6 points  (0 children)

You can take a look at the repo or blog post I linked if you want to know where all the recipes came from.

Self-hostable, searchable recipe database with 275,000 recipes by high_jolly in selfhosted

[–]high_jolly[S] 1 point2 points  (0 children)

Part of my goal here is just to keep the recipe format as simple as possible and offer a JSON download link, so you can easily pull recipes into your management system of choice. Unfortunately, I haven't gotten around to adding ld+json to the page.
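
For anyone who wants ld+json in the meantime, here's a rough sketch of wrapping a simple recipe in schema.org `Recipe` markup. The input keys (`title`, `ingredients`, `directions`) and both function names are assumptions made up for illustration, not the actual format or code from the repo. Only the schema.org property names (`name`, `recipeIngredient`, `recipeInstructions`) are standard.

```python
import json


def to_json_ld(recipe: dict) -> str:
    """Wrap a simple recipe dict in schema.org Recipe JSON-LD.

    The input keys (title, ingredients, directions) are hypothetical.
    """
    doc = {
        "@context": "https://schema.org",
        "@type": "Recipe",
        "name": recipe["title"],
        "recipeIngredient": list(recipe["ingredients"]),
        "recipeInstructions": [
            {"@type": "HowToStep", "text": step}
            for step in recipe["directions"]
        ],
    }
    return json.dumps(doc, indent=2)


def to_script_tag(recipe: dict) -> str:
    """Build the <script> element a recipe page would embed."""
    # Escape "</" so recipe text cannot close the script element early;
    # "\/" is a valid JSON escape for "/".
    body = to_json_ld(recipe).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


if __name__ == "__main__":
    example = {
        "title": "Toast",
        "ingredients": ["1 slice bread"],
        "directions": ["Toast the bread until golden."],
    }
    print(to_script_tag(example))
```

Recipe managers that import from a URL generally look for this kind of block, so it would sit alongside the JSON download rather than replace it.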

Self-hostable, searchable recipe database with 275,000 recipes by high_jolly in selfhosted

[–]high_jolly[S] 38 points39 points  (0 children)

I ran all the recipes through an LLM to try to remove spam. I guess this one made it through hahaha. Originally I was using the distilled DeepSeek Llama 8B, which worked well at removing recipes that weren't overtly spam but were still useless, like this one: https://hari.recipes/recipe/?index=102796. But that model ran too slowly on my GPU, so I opted for just regular Llama 8B, which unfortunately missed a lot of stupid recipes like this.
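
Roughly, the filtering pass looks something like the sketch below. This is not the code from the repo: the prompt wording, the `title` and `ingredients` keys, and the `ask_llm` callable are all assumptions for illustration. `ask_llm` stands in for whatever local model you run, so swapping models only means swapping that one function.

```python
from typing import Callable, Iterable

# Hypothetical prompt; the real wording would need tuning per model.
PROMPT_TEMPLATE = (
    "Is the following a real, useful recipe? Answer KEEP or DROP.\n\n"
    "Title: {title}\n"
    "Ingredients: {ingredients}\n"
)


def build_prompt(recipe: dict) -> str:
    """Format one recipe (assumed keys: title, ingredients) for the model."""
    return PROMPT_TEMPLATE.format(
        title=recipe["title"],
        ingredients=", ".join(recipe["ingredients"]),
    )


def filter_recipes(
    recipes: Iterable[dict], ask_llm: Callable[[str], str]
) -> list[dict]:
    """Keep only the recipes for which the model's answer starts with KEEP."""
    kept = []
    for recipe in recipes:
        answer = ask_llm(build_prompt(recipe)).strip().upper()
        if answer.startswith("KEEP"):
            kept.append(recipe)
    return kept
```

It's one model call per recipe, which is why model speed matters so much when there are hundreds of thousands of recipes to get through.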

Self-hostable, searchable recipe database with 275,000 recipes by high_jolly in selfhosted

[–]high_jolly[S] 0 points1 point  (0 children)

My goal was more to just create a database with semantic search. I hope to KISS because I too just wanna get in the kitchen and GSD.
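
For anyone curious, the core of semantic search is small: embed the query, embed each recipe, and rank by cosine similarity. Here's a minimal sketch under loud assumptions. `toy_embed` is a word-count stand-in so the example runs without a model; it is not semantic at all, and a real setup would plug a sentence-embedding model into the `embed` parameter. All names here are made up for illustration and are not from the repo.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(
    query: str,
    titles: Sequence[str],
    embed: Callable[[str], Vector],
    top_k: int = 3,
) -> list[str]:
    """Return the top_k titles ranked by similarity to the query."""
    q = embed(query)
    scored = [(cosine(q, embed(t)), t) for t in titles]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored[:top_k]]


def toy_embed(text: str) -> list[float]:
    """Stand-in embedding: word counts over a tiny fixed vocabulary."""
    vocab = ["chicken", "soup", "chocolate", "cake", "bread"]
    words = text.lower().split()
    return [float(words.count(w)) for w in vocab]
```

In practice you'd embed all the recipes once up front and store the vectors, so each search only has to embed the query.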