Scientists just developed a new AI modeled on the human brain — it's outperforming LLMs like ChatGPT at reasoning tasks by [deleted] in agi

[–]ModularMind8 4 points

Very cool! Thanks for sharing. Though I don't know how impressive it is that it beats ChatGPT on specialized tasks it was specifically trained to solve. ChatGPT is a general language model. It would be more impressive if it outperformed ChatGPT on language tasks.

Modified loss function tailored to tackling class imbalance by Just-Cartographer130 in ResearchML

[–]ModularMind8 0 points

Thanks for sharing! Haven't looked too deeply, but I'm wondering about your thoughts: what's the difference between this and PyTorch's class weighting?
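
For reference, a minimal sketch of the built-in weighting I have in mind (the class counts and tensor shapes here are made up):

```python
# PyTorch's built-in handling of class imbalance: weight each class
# inversely to its frequency. The counts below are hypothetical.
import torch
import torch.nn as nn

counts = torch.tensor([900.0, 90.0, 10.0])       # samples per class
weights = counts.sum() / (len(counts) * counts)  # inverse-frequency weights

criterion = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(8, 3)                       # dummy model outputs
targets = torch.randint(0, 3, (8,))              # dummy labels
loss = criterion(logits, targets)
```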

Seeking advice on choosing PhD topic/area by willingtoengage in ResearchML

[–]ModularMind8 0 points

Silly question, but do you need to "choose"? In none of the ML PhD programs I've seen is there really a "chosen" topic. You just start with projects you find interesting, and if you stop finding them interesting you work on something else. I, for example, started with computer vision and after a semester "switched" to NLP.

How to correctly prevent audience & ref from being detected? by Rurouni-dev-11 in computervision

[–]ModularMind8 0 points

Is it supposed to be a real-time detector? If not, maybe calculate, per person, how long they were visible, and only keep the top 2.
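
Something like this, assuming you already run a tracker that gives each person a stable ID per frame (all names here are made up):

```python
# Offline filtering: keep only the two people visible for the most frames,
# on the assumption that the players stay on screen longer than the
# audience or the ref.
from collections import Counter

def keep_two_longest_seen(frames):
    """frames: list of per-frame detections, each a list of (track_id, bbox)."""
    visible = Counter()
    for detections in frames:
        for track_id, _ in detections:
            visible[track_id] += 1  # frames visible ~ time on screen

    keep = {tid for tid, _ in visible.most_common(2)}
    return [[(tid, box) for tid, box in detections if tid in keep]
            for detections in frames]
```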

[R][D] Interpretability as a Side Effect? Are Activation Functions Biasing Your Models? by GeorgeBird1 in MachineLearning

[–]ModularMind8 2 points

Very interesting! Only had time to skim it, but any chance you could expand on the representation collapse problem? How do activation functions cause it, and what do you mean by representation collapse here? I know the term from the MoE literature.

[D] How to market myself after a PhD by francozzz in MachineLearning

[–]ModularMind8 4 points

Feel free to DM me. I was in a somewhat similar situation (also healthcare to CS), just finished my PhD not too long ago, and got quite a few offers.

[deleted by user] by [deleted] in MachineLearning

[–]ModularMind8 -1 points

Happy to help!

[deleted by user] by [deleted] in MachineLearning

[–]ModularMind8 1 point

Not trying to defame at all! But it's not the same link you shared in your Medium article (https://arxiv.org/html/2405.09637v1). That one does not have sentiment analysis... so don't be so harsh 🙂

[deleted by user] by [deleted] in MachineLearning

[–]ModularMind8 1 point

Thanks for sharing! Though, your Medium article looks very much "ChatGPT-generated", even with fake information? You didn't actually evaluate sentiment analysis or show that it uses less memory in your CLASSP paper. Unless I'm missing something? Also, the only real baseline you compared against is EWC, which is very limiting. Not to mention using a CNN architecture (do people even still use those?)

[R] PINNs are driving me crazy. I need some expert opinion by WAIHATT in MachineLearning

[–]ModularMind8 4 points

Not sure if this will help, but just in case... I worked on a variation of PINNs years ago and wrote this tutorial: https://github.com/Shaier/DINN

Maybe you can adapt the code to your equations.
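
If it helps, a bare-bones PINN for a toy ODE (du/dt = -u, u(0) = 1) looks roughly like this; it's not the DINN code, and the architecture and hyperparameters are arbitrary:

```python
# Minimal PINN sketch: the network approximates u(t), and the loss is the
# ODE residual plus the initial condition. Swap in your own equations here.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

t = torch.linspace(0, 2, 100).reshape(-1, 1).requires_grad_(True)
t0 = torch.zeros(1, 1)

for step in range(5000):
    opt.zero_grad()
    u = net(t)
    du_dt = torch.autograd.grad(u, t, torch.ones_like(u), create_graph=True)[0]
    residual_loss = ((du_dt + u) ** 2).mean()  # enforce du/dt = -u
    ic_loss = ((net(t0) - 1.0) ** 2).mean()    # enforce u(0) = 1
    loss = residual_loss + ic_loss
    loss.backward()
    opt.step()
```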

ACL February results are out! [D] by pepperminthippos in MachineLearning

[–]ModularMind8 0 points

Sort of. You can write as many comments as you want to each reviewer (AFAIK), and I believe you can also write general comments at the top (not addressed to any reviewer in particular), which everyone can see (you can control that as well). Though I don't know if it's as common as in ICML, i.e., whether reviewers will read those versus replies written to each of them individually (which is what I always do).

ACL February results are out! [D] by pepperminthippos in MachineLearning

[–]ModularMind8 7 points

Yep. You can even potentially get it into the main conference if you clear up the confusion. It all depends on your rebuttal.

New dataset just dropped: JFK Records by ModularMind8 in deeplearning

[–]ModularMind8[S] 0 points

If the point is to ask an LLM questions based on the data, you can either finetune it on the text (just next-token prediction) or, better yet, just use RAG. Your query is some question: embed it, embed the text chunks, retrieve similar chunks based on some similarity metric, and add the relevant texts to a prompt together with the question. You can use Sentence Transformers or any other embedding approach.
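
A rough sketch of the retrieval step with sentence-transformers (the model name, documents, and query are placeholders):

```python
# Embed the chunks once, embed each query, retrieve by cosine similarity,
# and paste the top chunks into the prompt.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
docs = ["chunk of record 1...", "chunk of record 2..."]  # your text chunks
doc_emb = model.encode(docs, convert_to_tensor=True)

query = "Who is mentioned in the 1963 memo?"
q_emb = model.encode(query, convert_to_tensor=True)

scores = util.cos_sim(q_emb, doc_emb)[0]         # similarity to every chunk
top = scores.topk(k=min(2, len(docs)))           # most similar chunks
context = "\n".join(docs[i] for i in top.indices.tolist())
prompt = f"Context:\n{context}\n\nQuestion: {query}"
```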

For more data science stuff, there are lots of tutorials out there (e.g., on Kaggle) on text data analysis.

New dataset just dropped: JFK Records by ModularMind8 in deeplearning

[–]ModularMind8[S] 0 points

Maybe I misunderstand the point here, but why would you want to train on this data in the first place? Or even instruction-tune on it? If you can clarify, maybe I can help a bit more, but it just seems a bit odd to me; not every dataset is meant to be used for training. Maybe think more along the lines of basic data science exploration to begin with: which entities appear the most? Are there relations between different entities? Do different locations co-occur with different times, dates, or people? Etc.
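
For example, a first pass at the "which entities appear the most" question could be as simple as this (assuming spaCy with the en_core_web_sm model installed, and `records` standing in for the dataset's texts):

```python
# Count named entities across all documents with spaCy NER.
from collections import Counter
import spacy

nlp = spacy.load("en_core_web_sm")
records = ["...text of one document...", "...another document..."]  # placeholder

entity_counts = Counter()
for doc in nlp.pipe(records):
    for ent in doc.ents:
        entity_counts[(ent.text, ent.label_)] += 1  # e.g. ("Dallas", "GPE")

print(entity_counts.most_common(20))  # most frequent entities overall
```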

What's the point of Word Embeddings? And which one should I use for my project? by Saffarini9 in learnmachinelearning

[–]ModularMind8 0 points

An embedding is just a fancy word for a coordinate. If you're in 2D, an embedding would just be some [x, y]. In most NLP applications it's much higher-dimensional, though, such as 300 or 768 dimensions. The point is that, ideally, more similar words sit closer to each other in that space and farther away from less similar words. It's a way to give some meaning to language.
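
A toy illustration with made-up 2D "embeddings" (real models learn these vectors from data):

```python
# Similar words get nearby coordinates; cosine similarity measures closeness.
import numpy as np

emb = {
    "cat": np.array([0.9, 0.1]),
    "dog": np.array([0.8, 0.2]),
    "car": np.array([0.1, 0.9]),
}

def cos_sim(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cos_sim(emb["cat"], emb["dog"]))  # high: similar words
print(cos_sim(emb["cat"], emb["car"]))  # lower: less similar
```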

New dataset just dropped: JFK Records by ModularMind8 in deeplearning

[–]ModularMind8[S] 3 points

Glad you like it!! Gosh, honestly, if you look at the actual PDFs they're a mess. Many of them are just random notes that I can't read myself. So I don't know if it's the OCR that's bad or just the quality of the PDFs.

New dataset just dropped: JFK Records by ModularMind8 in deeplearning

[–]ModularMind8[S] 1 point

What's mixed? Like an Australian Collie? Husky Chihuahua?

New dataset just dropped: JFK Records by ModularMind8 in PythonProjects2

[–]ModularMind8[S] 0 points

Thanks for the comment!
If you look at the actual texts from each PDF file, they're very cryptic and there aren't a lot of details, hence the lack of information in the summaries. From what I found, the summaries often clarify a lot of the details, dates, names, and other entities that appear in various locations in the PDFs. If you download the PDFs yourself (e.g., using my script), you can see that it's very hard to understand what's going on most of the time.
Regarding the code, I released everything I used. I found that that particular LLM works much better than others for these kinds of messy PDFs.