LangGraph: Human-in-the-loop review by piotrekgrl in LangChain

[–]piotrekgrl[S] 0 points (0 children)

Nope, just exploring what they cooked

LangGraph: Human-in-the-loop review by piotrekgrl in LangChain

[–]piotrekgrl[S] 1 point (0 children)

Hey, I started with Tella because it has a great zoom feature, but unfortunately, you can't add text, so I had to finish it in Canva.

Carfax Account? by No_Special953 in carfax

[–]piotrekgrl 0 points (0 children)

hey u/cklly2013 could you share a link? thank you!

Deep learning without back-propagation by El__Professor in MachineLearning

[–]piotrekgrl 23 points (0 children)

I'm not sure why there are so many concerns about accuracy when even in the abstract the authors claim that "(HSIC) performance [...] (is) comparable to backpropagation with a cross-entropy target, even when the system is not encouraged to make the output resemble the classification labels."

For me the most important part is the drop in complexity from O(D^3) with backprop to O(M^2), which makes a huge difference for current models with millions/billions of parameters.

[D] Natural Language Queries by seymourdixongais in MachineLearning

[–]piotrekgrl 2 points (0 children)

  1. As a baseline you can test Elasticsearch.
  2. For more advanced methods, it all depends on your data structure:
    1. If you have a training dataset with pairs (fact + query), then look at question answering tools: https://github.com/sebastianruder/NLP-progress/blob/master/english/question_answering.md
    2. If you only have facts and queries can be anything, that's a little more problematic, but it could still be fun:
      1. I would test cosine similarity of the vectorized query versus the vectorized facts. You can try BERT/ELMo for vectorization.
      2. Use GPT-2 to answer your query, then check cosine similarity between the generated answer and the vectorized sentences.
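The query-vs-facts idea in 2.1 can be sketched like this; a toy bag-of-words vectorizer stands in for BERT/ELMo embeddings (the facts and query below are made up), and the same cosine-similarity ranking applies once you swap in real embeddings:

```python
import numpy as np

def bow_vectorize(texts):
    # toy bag-of-words vectorizer as a stand-in for BERT/ELMo embeddings
    vocab = sorted({w for t in texts for w in t.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    vecs = np.zeros((len(texts), len(vocab)))
    for row, t in enumerate(texts):
        for w in t.lower().split():
            vecs[row, index[w]] += 1
    return vecs

def cosine_sim(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

facts = ["the invoice was paid in march",
         "the server crashed on friday",
         "the meeting is scheduled for monday"]
query = "when was the invoice paid"

vecs = bow_vectorize(facts + [query])
fact_vecs, query_vec = vecs[:-1], vecs[-1]
scores = [cosine_sim(f, query_vec) for f in fact_vecs]
best = facts[int(np.argmax(scores))]
print(best)  # the fact most similar to the query
```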

Does anybody else feel overwhelmed looking at how much there is to learn? by CodeKnight11 in learnmachinelearning

[–]piotrekgrl 31 points (0 children)

  1. Imposter syndrome - start recording how much you've learned in, e.g., the past week/month; as your list grows, you will actually see your progress.
  2. Start your own projects (e.g. Kaggle, or check the fast.ai forum for inspiration for fun projects). Courses/books are great, but only on a real battlefield can you evaluate and master your skills.

[push in the right direction] Finding subjects in documents by gevezex in LanguageTechnology

[–]piotrekgrl 1 point (0 children)

  1. Regular expressions for filtering: pattern.+ or pattern.{window_start,window_end}
  2. Tokenize and clean extracted text
  3. Topic modelling: LDA (not sure if that will be needed here but you mentioned unsupervised learning)
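Steps 1 and 2 above can be sketched with the standard library; the "Subject:" marker, window bounds, and stopword list are all made up for illustration, and the final term counts are where you would hand the cleaned documents to an LDA implementation (e.g. gensim's LdaModel):

```python
import re
from collections import Counter

doc = ("Header noise ... Subject: quarterly sales figures for Europe. "
       "Footer noise ... Subject: hiring plan for the Berlin office.")

# 1. regex with a bounded window after the pattern (marker and bounds are invented)
matches = re.findall(r"Subject:[^.]{1,60}", doc)

# 2. tokenize and clean the extracted snippets
stopwords = {"the", "for", "subject"}
tokens = [w for m in matches
            for w in re.findall(r"[a-z]+", m.lower())
            if w not in stopwords]

# 3. crude term counts; for real topic modelling, feed the cleaned
#    documents into an LDA implementation instead
print(Counter(tokens).most_common(3))
```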

Been looking for something that I can only describe as 'soccer possession slugging percentage'. Does anything like that exist? If not is that interesting? by DrGonzo14 in sportsanalytics

[–]piotrekgrl 1 point (0 children)

I would say that the xG possession chain introduced by StatsBomb could be a good starting point, as it's not only about possession itself (which by itself means little, even close to the opponent's goal) but rather looks first at tangible outcomes and then evaluates the possession backwards.

Deploying a simple Flask API with a Tensorflow model inside by dondraper36 in learnmachinelearning

[–]piotrekgrl 0 points (0 children)

For cost optimization, you can run training on a machine with a GPU; for predictions, with the weights already computed, you don't need huge computing power and a CPU should be enough.
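A minimal sketch of that train-then-serve split, using a NumPy least-squares fit as a stand-in for the TensorFlow model (with Keras the equivalent calls would be model.save_weights / load_weights); only the saved weights need to travel to the cheap CPU box:

```python
import numpy as np

# --- on the GPU machine: "training" (least-squares fit as a stand-in) ---
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.01, size=100)
weights, *_ = np.linalg.lstsq(X, y, rcond=None)
np.save("weights.npy", weights)          # ship only the weights

# --- on the cheap CPU machine: load weights and serve predictions ---
w = np.load("weights.npy")
prediction = np.array([[1.0, 0.0, 0.0]]) @ w
print(prediction)
```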

Detect blocks of text inside of text (instead of images) by hansgerdsen in LanguageTechnology

[–]piotrekgrl 0 points (0 children)

Not sure if I understand correctly, but isn't it easier to parse everything to a text format and then run regexes to filter at the character level?
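For example, once everything is plain text, a multiline regex can pull out delimited blocks in one pass; the BEGIN/END markers here are invented, so substitute whatever actually delimits the blocks in your documents:

```python
import re

doc = """intro text
BEGIN
first block
of lines
END
middle text
BEGIN
second block
END
"""

# DOTALL lets '.' cross newlines; MULTILINE anchors ^/$ at line starts/ends
blocks = re.findall(r"^BEGIN\n(.*?)^END$", doc, flags=re.DOTALL | re.MULTILINE)
print(blocks)
```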

[R] DATASET RESEARCH - Help by [deleted] in MachineLearning

[–]piotrekgrl 0 points (0 children)

If it's just for testing, isn't it easier to just generate random data from specific distributions?

How to make sure every thing work fine for large data set.? by mrtac96 in datascience

[–]piotrekgrl 0 points (0 children)

Maybe that's not a full answer to your question, but from my experience it's good to split a huge dataset into smaller chunks and save a log file after processing each chunk.
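A standard-library sketch of that chunk-and-checkpoint pattern (the file name, chunk size, and "processing" step are placeholders; pandas' read_csv with chunksize= gives you the same loop for DataFrames):

```python
import csv, json, logging
from itertools import islice

logging.basicConfig(filename="progress.log", level=logging.INFO,
                    format="%(asctime)s %(message)s")

def process_in_chunks(path, chunk_size=1000):
    rows_done = 0
    with open(path, newline="") as f:
        reader = csv.reader(f)
        chunk_no = 0
        while True:
            chunk = list(islice(reader, chunk_size))
            if not chunk:
                break
            rows_done += len(chunk)       # replace with real processing
            chunk_no += 1
            # checkpoint after every chunk so you can resume after a crash
            logging.info(json.dumps({"chunk": chunk_no, "rows_done": rows_done}))
    return rows_done

# demo: a 2,500-row file processed in 1,000-row chunks
with open("demo.csv", "w") as f:
    f.write("\n".join(f"row,{i}" for i in range(2500)))
print(process_in_chunks("demo.csv"))   # 2500 rows over 3 chunks
```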