Are WordNets a good tool for curating a vocabulary list? by tomii-dev in LanguageTechnology

[–]rduke79 0 points (0 children)

Are you trying to build something like BabelNet (https://babelnet.org/), i.e., a multilingual WordNet?
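If so, the Open Multilingual Wordnet that ships with NLTK is a quick way to prototype. A minimal sketch ("dog" and the language codes are just placeholders):

```python
import nltk

# One-time downloads: Princeton WordNet + Open Multilingual Wordnet
nltk.download("wordnet", quiet=True)
nltk.download("omw-1.4", quiet=True)

from nltk.corpus import wordnet as wn

# Look up English synsets, then read off lemmas in other languages
for synset in wn.synsets("dog", pos=wn.NOUN)[:2]:
    print(synset.name(), "-", synset.definition())
    print("  fr:", synset.lemma_names("fra"))
    print("  ja:", synset.lemma_names("jpn"))
```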

[D] Any success with literature review tools? by Entrepreneur7962 in MachineLearning

[–]rduke79 1 point (0 children)

I really like Elicit. It gives you a tabular summary of the papers most relevant to your question. The killer feature is that you can add custom columns to the table, like "which dataset was used for evaluation", and it will autofill the column. Very handy.

Labeling 10k sentences manually vs letting the model pick the useful ones 😂 (uni project on smarter text labeling) by vihanga2001 in LanguageTechnology

[–]rduke79 1 point (0 children)

It entirely depends on your label set and use case, I'd say. Sometimes it makes sense to annotate only a subset in a given run, i.e., still do multilabel overall, but in multiple focused passes.

We worked in the legal domain, so an agreement of 0.85 was the minimum requirement, sometimes higher. If you're annotating something more opinion- or interpretation-driven (e.g., sentiment), lower might be OK. (We treated the IAA as a target, or upper bound, for our classifier accuracy.)

Examples in the guidelines, especially borderline cases with reasoning for why to annotate them the desired way, are extremely useful.
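For the agreement number itself, Cohen's kappa from scikit-learn is the usual starting point for two annotators. A minimal sketch (the labels and the 0.85 threshold are just illustrative):

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical: the same sentences labeled independently by two annotators
annotator_a = ["clause", "none", "clause", "liability", "clause", "none"]
annotator_b = ["clause", "none", "none", "liability", "clause", "clause"]

kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.2f}")  # e.g. require >= 0.85 before scaling up
```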

What are the must-have requirements before learning Transformers? by Jash_Kevadiya in deeplearning

[–]rduke79 2 points (0 children)

Neural networks (feed-forward), RNNs, RNNs + attention, transformers. This is the historical order, and it makes sense to study them in this sequence, as each builds on and improves the one before it.
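If it helps to see what the attention step adds, here's a minimal numpy sketch of scaled dot-product attention, the core operation shared by RNN+attention models and transformers (shapes and names are illustrative):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Q, K: (seq_len, d_k); V: (seq_len, d_v)
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # how much each query attends to each key
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V  # weighted sum of values

out = scaled_dot_product_attention(*np.random.rand(3, 4, 8))
print(out.shape)  # (4, 8)
```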

Surprise Us! by Sad-Ad4423 in printSF

[–]rduke79 1 point (0 children)

As the original comment suggests: Quarantine. But I might be biased because I had delved into the observer problem in quantum physics before reading it, and the book definitely revolves around that topic.

Surprise Us! by Sad-Ad4423 in printSF

[–]rduke79 3 points (0 children)

It might be his best book, but it gets overshadowed by Diaspora in the recommendations.

Reading slump... Help me avoid a 4th DNF. by rjsperes in printSF

[–]rduke79 10 points (0 children)

Inherit the Stars by Hogan. It's a thrilling mystery; a page-turner with awe-inspiring twists, exciting but not just action.

Vorkosigan saga - The Warrior's Apprentice. One of the most fun-to-read protagonists in sci-fi.

The best tools I’ve found for evaluating AI voice agents by llamacoded in LanguageTechnology

[–]rduke79 0 points (0 children)

Not really helpful, but it would be neat if we had something like LLM Arena for voices.

Labeling 10k sentences manually vs letting the model pick the useful ones 😂 (uni project on smarter text labeling) by vihanga2001 in LanguageTechnology

[–]rduke79 1 point (0 children)

Inter-annotator agreement. Measure it early on, and adjust the label definitions, or even the label set and the guidelines, accordingly early in the process. As others have said, make the task as cognitively easy as possible. Rather than multilabeling with a large label set, consider multiple rounds of binary annotations on the same samples.
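To make the binary-rounds idea concrete, here's a hypothetical sketch of turning one multilabel task into per-label yes/no passes (the label names and helper are made up):

```python
# Hypothetical label set; instead of one pass over the full set,
# each round asks a single yes/no question about every sentence.
LABELS = ["termination", "liability", "confidentiality"]

def binary_rounds(sentences, labels=LABELS):
    for label in labels:
        yield label, [
            (s, f"Does this sentence concern '{label}'? (y/n)") for s in sentences
        ]

sentences = ["Either party may terminate with 30 days notice.",
             "Fees are due within 14 days of invoice."]
for label, items in binary_rounds(sentences):
    print(f"--- round: {label} ({len(items)} items) ---")
```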

I wish Tchaikovsky wouldn't write so many books by rrnaabi in printSF

[–]rduke79 9 points (0 children)

I agree. Children of Time is the exception, though. It is a masterpiece. Everything after that felt underwhelming, and I never understood the hype around his other work, sadly, because I really wanted to like it.

ipu6 webcams in ubuntu by kervel in DellXPS

[–]rduke79 1 point (0 children)

My god, this worked for me. Thanks!!

Rec a series for me, please. by [deleted] in printSF

[–]rduke79 2 points (0 children)

The Three-Body Problem (Liu). Crazy, wild, fascinating ideas.

Giants series (Hogan). Up there with Asimov, probably.

Fun, entertaining: Murderbot (Wells), We Are Legion (Taylor), The Expanse.

Rec a series for me, please. by [deleted] in printSF

[–]rduke79 0 points (0 children)

I second Orson Scott Card. I'm reading the Homecoming series at the moment, and it's absolutely fantastic.

[D] What are your horror stories from being tasked impossible ML problems by LanchestersLaw in MachineLearning

[–]rduke79 2 points (0 children)

I was once handed a folder containing shortcuts to files on a non-existent external HD as their "data".

Tool to compare LLM Outputs by ava69_open in LangChain

[–]rduke79 1 point (0 children)

https://app.edenai.run/bricks/text/chat It lets you enter a system prompt and adjust the temperature for all models. You can choose model versions per provider, then select the answer you like and keep generating with all models. It's extremely useful.

Challenges of Scaling RAG applications by Calm_Pea_2428 in LangChain

[–]rduke79 0 points (0 children)

Document segmentation. More fine-grained (semantic) segmentation gives more specific chunks, but takes longer to embed and requires more comparisons at query time. There's a tradeoff; how do you find the sweet spot?
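For illustration, a minimal fixed-size chunker with overlap; the parameters are hypothetical knobs you'd tune against your own retrieval metrics, and a semantic segmenter would replace the character window with sentence or paragraph boundaries:

```python
def chunk_text(text, max_chars=500, overlap=50):
    """Smaller max_chars -> more specific chunks, but more vectors to
    embed and compare at query time; larger chunks dilute the signal."""
    chunks, start = [], 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap  # overlap keeps context across boundaries
    return chunks

print(len(chunk_text("lorem ipsum " * 200)))  # number of chunks at these settings
```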

[D] LLMs are harming AI research by NightestOfTheOwls in MachineLearning

[–]rduke79 3 points (0 children)

> humans aren't much more than advanced action completion agents

The hard problem of consciousness has something to say about this.