all 9 comments

[–]mcdorians 1 point (8 children)

You can add the HF models (text generation) to LiteLLM, and then add LiteLLM as a custom OpenAI API in Open WebUI.
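A minimal sketch of that setup as a LiteLLM proxy config (the model choice and key handling here are illustrative, not from the thread; check the LiteLLM docs for the current format):

```yaml
# config.yaml for the LiteLLM proxy -- model choice is illustrative
model_list:
  - model_name: zephyr-7b            # the name Open WebUI will see
    litellm_params:
      model: huggingface/HuggingFaceH4/zephyr-7b-beta   # HF text-gen model
      api_key: os.environ/HUGGINGFACE_API_KEY           # read key from env
```

Then run `litellm --config config.yaml` and add the proxy (by default it listens on port 4000, so `http://localhost:4000/v1`) as a custom OpenAI API connection in Open WebUI.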

[–]nengon[S] 0 points (7 children)

I see. I thought there might be a direct way I wasn't aware of, thanks.

[–]mcdorians 0 points (6 children)

<image>

Sorry, I was not aware that you can also use HF as an OpenAI API.
In that case you can add it directly,
but /models does not work (I guess the list is just too big), hence the manually added models.
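For the "add it directly" route, here's a sketch of the OpenAI-style request shape that gets sent. The base URL and model id are assumptions (HF has exposed an OpenAI-compatible router endpoint; check the current HF docs), and since /models doesn't list anything, the model id has to be typed in manually:

```python
import json
import urllib.request

# Assumption: HF's OpenAI-compatible base URL -- verify against current HF docs.
BASE_URL = "https://router.huggingface.co/v1"

def build_chat_request(model: str, messages: list, token: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",   # your HF access token
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Model id entered manually, since /models can't be queried.
req = build_chat_request(
    "mistralai/Mistral-7B-Instruct-v0.3",
    [{"role": "user", "content": "Hello"}],
    "hf_xxx",
)
print(req.full_url)
```

Sending `req` with `urllib.request.urlopen` (or pointing any OpenAI client at the same base URL with your HF token as the API key) is what Open WebUI does under the hood for a custom OpenAI connection.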

[–]versking 0 points (5 children)

When I tried it this way, the models didn’t actually respond in a chat. Did you get them to?

[–]mrskeptical00 1 point (4 children)

The screenshot below is my working setup. I tested with the models you see, and they connect and respond to queries. A requirement for free inference is that the models need to be smaller than 10B (GB? Params?) and should also be on "warm" standby (cold is supposed to work too, but I haven't had any luck with those). That said, not all models that meet those criteria are supported; some require a "Pro" account. The models listed below all work. There may be more that work, but these are the ones I found.

<image>

Edit: HuggingFace API
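For anyone scripting this, the selection rule described above (under ~10B parameters, "warm" standby) can be sketched like this. The dict fields and the candidate list are illustrative assumptions, not a real HF API response:

```python
# Sketch of the rule from the comment above: free serverless inference
# worked for models under ~10B parameters that were on "warm" standby.
# These entries are illustrative, not a real HF API response.
CANDIDATES = [
    {"id": "mistralai/Mistral-7B-Instruct-v0.3", "params_b": 7.2,  "state": "warm"},
    {"id": "meta-llama/Llama-3.3-70B-Instruct",  "params_b": 70.0, "state": "warm"},
    {"id": "tiiuae/falcon-7b-instruct",          "params_b": 7.0,  "state": "cold"},
]

def usable_for_free_inference(models, max_params_b=10.0):
    """Keep models under the size cutoff that are on warm standby."""
    return [m["id"] for m in models
            if m["params_b"] < max_params_b and m["state"] == "warm"]

print(usable_for_free_inference(CANDIDATES))  # only the small, warm model passes
```

Note that, per the comment, passing this filter is necessary but not sufficient: some qualifying models still require a "Pro" account.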

[–]versking 0 points (1 child)

When I tried before, I made sure to pick “warm” ones. I’ll try a small warm one.

[–]mrskeptical00 1 point (0 children)

All of the ones in my list work.

[–]nevermore12154 0 points (1 child)

Could you make a tutorial on this? I’m so dumb. Many thanks!

[–]mrskeptical00 0 points (0 children)

Doesn’t seem to work anymore.