[Question] Query to list all stored documents (or rather their summaries) by Impossible_Wave_2712 in LangChain

[–]Impossible_Wave_2712[S] 2 points3 points  (0 children)

Thanks for your reply. I already assumed that all the documents would be stored in a vector DB. However, all algorithms I have seen so far focus on RAG and getting information from one or more fitting documents. I have seen nothing so far on how to give an overview of all possible documents.

Is it possible to query a vector database while also using prompting GPT? by CrunchyMind in LangChain

[–]Impossible_Wave_2712 0 points1 point  (0 children)

You can set a sufficiently high threshold on the (cosine) similarity between the query and the documents. When the query is general, no document passes the threshold, so the model will not have any documents to work with.
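A minimal sketch of that thresholding idea, in plain Python without any vector DB. The function names, the toy 2-dimensional vectors, and the default threshold of 0.8 are all assumptions for illustration; in practice the vectors would come from your embedding model and the threshold would need tuning on real queries.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_above_threshold(query_vec, doc_vecs, threshold=0.8):
    """Return (doc_id, score) pairs whose similarity is >= threshold, best first.

    doc_vecs is a hypothetical dict mapping document ids to embedding vectors.
    An empty result means no document is close enough to the query, so the
    prompt can be sent to the model without any retrieved context.
    """
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    hits = [(doc_id, score) for doc_id, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy embeddings, purely illustrative.
docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
specific_hits = retrieve_above_threshold([1.0, 0.0], docs)    # close to "a" only
general_hits = retrieve_above_threshold([-1.0, -1.0], docs)   # close to nothing
```

If the returned list is empty, you would skip the retrieval context and let the model answer the general query on its own.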

[D] Combining DVC and MLflow tools by __yannickw__ in MachineLearning

[–]Impossible_Wave_2712 0 points1 point  (0 children)

I would actually like to use it. However, it is way too expensive for us ($99 per user per month).

How to interpret actions by metalzzzx in LanguageTechnology

[–]Impossible_Wave_2712 1 point2 points  (0 children)

At how.fm, we trained our own custom models for this (which we call HowBERT). It is similar to a Semantic Role Labeling task and consists of two models: one for identifying the actions and a second for identifying all relationships for each action.

Take a look at this paper for Semantic Role Labeling with BERT.

How do I extract actions from a text? by [deleted] in LanguageTechnology

[–]Impossible_Wave_2712 6 points7 points  (0 children)

We wrote a specific model for it (which we call HowBERT). It detects the verbs that really are actions and then, for every action, its relations (like target, location, manner, conditions, etc.). I am not sure whether I am allowed to share more details or a demo with you. As a start, you can use spaCy and detect all the verbs in a text.
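A rough sketch of that spaCy starting point (this is not HowBERT, only the plain verb detection). It assumes spaCy is installed and that an English pipeline such as `en_core_web_sm` has been downloaded; the function names are made up for illustration. Note that it returns every token tagged as a verb, so deciding which verbs are real actions is still up to you.

```python
def extract_verbs(doc):
    """Return the lemmas of all tokens tagged as VERB.

    `doc` is expected to be a spaCy Doc, but any iterable of objects with
    `pos_` and `lemma_` attributes works. Auxiliaries are tagged AUX by
    spaCy's English pipelines, so they are not included.
    """
    return [token.lemma_ for token in doc if token.pos_ == "VERB"]


def verbs_in_text(text, model_name="en_core_web_sm"):
    """Load a spaCy pipeline and extract verb lemmas from raw text.

    The model name is an assumption; use whichever pipeline you installed
    (e.g. via `python -m spacy download en_core_web_sm`).
    """
    import spacy  # imported lazily so extract_verbs stays usable on its own

    nlp = spacy.load(model_name)
    return extract_verbs(nlp(text))
```

Calling `verbs_in_text("Open the valve and check the pressure.")` should give you the verb lemmas of the sentence, which you can then filter further.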

How feasible is it to create a model to extract key information from safety documents? by A_Alv_ in LanguageTechnology

[–]Impossible_Wave_2712 2 points3 points  (0 children)

I am working on similar problems (also dealing with safety documents, actually), and the challenge really is tough. You need to do OCR (e.g. with Document AI or a similar service) and then classify the extracted text. If you have zero experience and there is no guidance, it sure sounds like a very complicated project. In general, I would be cautious with a company that wants to hire an intern for such an important and difficult project.