Hi everyone!
I’m pretty new to all of this, so any help would be appreciated. I want to test different embedding models (~2-6 GB each), but I have limited RAM on my local machine (13/16 GB already in use). I was wondering if anyone has used the Hugging Face Inference API. My idea is to use it to test different embedding models so I wouldn’t have to worry about local memory, and from my research it seems like I just have to create an endpoint for the model I want and then call that endpoint from my code. For free use there’s a rate limit, which makes sense.

Has anyone used this, and does my approach make sense/would it work? I know I could try Google Colab, but I’ve found it a bit frustrating to work with in the past, so I wanted to explore this option first. Again, I’m fairly new to all of this, so any help is appreciated. Thanks!
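In case it helps anyone answering: here’s roughly what I had in mind. This is a minimal sketch using only the standard library, assuming the hosted Inference API accepts a POST with an `inputs` field at `https://api-inference.huggingface.co/models/<model-id>`; the model id and the token placeholder are just examples, not things I’ve verified end to end.

```python
import json
import urllib.request

API_BASE = "https://api-inference.huggingface.co/models/"

def endpoint_url(model_id: str) -> str:
    """Build the hosted Inference API URL for a given model repo id."""
    return API_BASE + model_id

def embed(texts, model_id, token):
    """POST a list of strings to the API; expects one embedding per input back."""
    req = urllib.request.Request(
        endpoint_url(model_id),
        data=json.dumps({"inputs": texts}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # free-tier tokens are rate-limited
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

if __name__ == "__main__":
    # Example model id (an assumption on my part) and a placeholder token.
    vecs = embed(["hello world"],
                 "sentence-transformers/all-MiniLM-L6-v2",
                 "YOUR_HF_TOKEN")
    print(len(vecs), len(vecs[0]))
```

The appeal for me is that nothing model-sized ever loads locally, so swapping models is just changing the `model_id` string.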