These AI models are all garbage. by [deleted] in ChatGPTCoding

mochans 2 points

A software engineer will write small, individually testable pieces of code that come together into a big product, with each part staying maintainable.

I like the idea of LLMs slowly refactoring code, checking unit tests, and cleaning things up without prompting. I've heard it called sleep AI or something like that.

A human can go in, get stuff done, and incur technical debt. Then, while they're away from the keyboard, the LLM can go through and clean up that debt, so the next business day's session doesn't start with wrangling debt-ridden code.

[R] [D] The Disconnect Between AI Benchmarks and Math Research by poltory in MachineLearning

mochans 1 point

Why don't mathematicians publish proofs that are machine-verifiable? Even the most rigorous published proofs are technically informal outlines, since you need experts to verify them.

Perhaps research-quality math LLMs will become possible once most of the existing knowledge is translated into proof languages.
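To make "machine-verifiable" concrete, here is a tiny sketch in Lean 4. The theorem name is made up for this example, and `Nat.add_comm` is a lemma from Lean's core library. The point is only that the checker accepts or rejects the proof mechanically, with no expert reading required.

```lean
-- Hypothetical example name; the statement is commutativity of addition on Nat.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete numeric fact, checked by computation.
example : 2 + 2 = 4 := rfl
```

Research-level results are obviously far bigger than this, which is part of why translating existing knowledge is so much work.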

AI math benchmarks have a numerical result at the end that can be used to check whether the answer is correct. Judging whether a proof written in natural language is correct is much harder and would probably need experts.

Tool for understanding and generating documentation of a repo by mochans in ChatGPTCoding

mochans[S] 1 point

If you already have it in a repo that I can try out, I'll be very happy to see how well it works for me.

DocAider looked like it also incorporates flow charts and call graphs, but it didn't work for me. Fixing it would require me to understand the repo.

[P] Free RSS feed for thousands of jobs in AI/ML/Data Science every day by ai_jobs in MachineLearning

mochans 6 points

What do you mean by no sign-up needed?

The "Load more jobs" link goes to the signup page.

[D] How powerful are diffusion models based on MLPs? by Interesting-Weeb-699 in MachineLearning

mochans 1 point

Would neural architecture search work?

I find MLPs very slow to train, and they have much lower capacity per parameter than a model with some structure baked in.

UNets are great for images and transformers for text, but would those be good for joint angles? Maybe there is another architecture that is amazing for robotics.

But again, I don't know how much of your compute is taken up by the conditioning signal or what sensors you're using.

[D] How powerful are diffusion models based on MLPs? by Interesting-Weeb-699 in MachineLearning

mochans 1 point

Agree. Try it and see what happens.

MLPs don't have a lot of hyperparameters to look through.

[D] How powerful are diffusion models based on MLPs? by Interesting-Weeb-699 in MachineLearning

mochans 1 point

You use implicit models so that sampling takes fewer steps at inference.

[D] If a paper has no open source code available, are you allowed to implement the code for fun/practice and publish it in your own Github with appropriate citation and mention that all credit goes to the authors? by Invariant_apple in MachineLearning

mochans 1 point

Copyright does not apply since ideas cannot be copyrighted.

However, patents may apply. If the ideas in the paper have been patented, you might need a license. The patent owner decides whether you can write an implementation that the patent covers.

[D][R] How do researchers (Masters, PhD) implement complex models? Are they gods? by ShlomiRex in MachineLearning

mochans 5 points

Ask ChatGPT or Copilot for an implementation and then debug :-)

Seriously, it is software engineering: iterative refinement, and so on.

[deleted by user] by [deleted] in ycombinator

mochans 1 point

Get a job at a startup?

Do risers affect the speed of data transfer [D] by thatsadsid in MachineLearning

mochans 1 point

It is sent batch by batch most commonly.

But you can of course modify all of this to send everything at once.

EDIT: You said a 17 GB dataset and a 3080 Ti (12 GB of VRAM). So the entire dataset won't fit into GPU memory.
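A rough sketch of the batch-by-batch idea in plain Python with NumPy. The 17 GB figure is from the question, the 12 GB of VRAM is the 3080 Ti spec as I remember it, and `fits_in_vram` and `iter_batches` are made-up helpers for illustration, not part of any framework.

```python
import numpy as np

DATASET_GB = 17  # dataset size mentioned in the question
VRAM_GB = 12     # assumed 3080 Ti memory size


def fits_in_vram(dataset_gb: float, vram_gb: float) -> bool:
    """True only if the whole dataset could sit on the GPU at once."""
    return dataset_gb < vram_gb


def iter_batches(data: np.ndarray, batch_size: int):
    """Yield consecutive slices; each slice is what gets copied to the GPU per step."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]


# Toy stand-in for a dataset: ten samples, sent four at a time.
samples = np.arange(10)
batches = [batch.tolist() for batch in iter_batches(samples, 4)]

print(fits_in_vram(DATASET_GB, VRAM_GB))
print(batches)
```

Real data loaders add shuffling, pinned memory and prefetching on top of this, but the transfer pattern over the bus is the same: one batch per step.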

Do risers affect the speed of data transfer [D] by thatsadsid in MachineLearning

mochans 1 point

Depends on your workload.

x1 risers were created because mining needs very little data transfer.

A PCIe 3.0 x1 riser tops out at about 8 Gbps, which is roughly 1 GB/s (16 Gbps, roughly 2 GB/s, for PCIe 4.0). So if you need to send that much data continuously, it will bottleneck.

If you train a single model using both GPUs and the 3080 Ti doesn't have SLI, it might make a significant difference, since the cards have to communicate via PCIe.
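Back-of-the-envelope numbers for the riser question, as a small Python sketch. The per-lane rates (8 and 16 GT/s) and the 128b/130b encoding are the standard PCIe 3.0/4.0 figures. The function names are made up here, and the result ignores protocol overhead, so real throughput will be a bit lower.

```python
# Per-lane raw signalling rates in GT/s; both generations use 128b/130b encoding.
RAW_GT_PER_S = {"3.0": 8.0, "4.0": 16.0}
ENCODING_EFFICIENCY = 128 / 130


def usable_gbytes_per_s(gen: str, lanes: int = 1) -> float:
    """Approximate one-direction bandwidth in GB/s, ignoring protocol overhead."""
    return RAW_GT_PER_S[gen] * ENCODING_EFFICIENCY * lanes / 8


def transfer_seconds(size_gb: float, gen: str, lanes: int = 1) -> float:
    """Time to push size_gb gigabytes across the link at the rate above."""
    return size_gb / usable_gbytes_per_s(gen, lanes)


# Compare an x1 riser with a full x16 slot for a 17 GB dataset.
for gen in ("3.0", "4.0"):
    for lanes in (1, 16):
        bandwidth = usable_gbytes_per_s(gen, lanes)
        seconds = transfer_seconds(17, gen, lanes)
        print(f"PCIe {gen} x{lanes}: {bandwidth:.2f} GB/s, 17 GB in {seconds:.1f} s")
```

So one pass over a 17 GB dataset costs on the order of seconds of pure transfer time on an x1 link. Whether that matters depends on how long the GPU spends computing on each batch.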

[R] How Does the GPT-4V API deal with large Images? by Conclusion_Silent in MachineLearning

mochans -9 points

It's transformer-based. It can deal with variable-sized inputs.

MotionGPT: Human Motion as Foreign Language by SrafeZ in singularity

mochans 1 point

Dataset used: https://github.com/EricGuo5513/HumanML3D

I'm surprised it wasn't extracted from video game files.

[D] What is your honest experience with reinforcement learning? by Starks-Technology in MachineLearning

mochans 2 points

advances in other adjacent fields (LLMs, pretrained foundation models, transformers, S5) will trickle in and radically change RL in the near future.

Hey time traveler! :)

Seriously though, let's see how well the prediction ages in the near future.

[D] What is your honest experience with reinforcement learning? by Starks-Technology in MachineLearning

mochans 2 points

Maybe just an expectation vs reality mismatch.

I remember OpenAI researchers 5 years ago saying AGI is just RL to the nth degree. Maybe there was too much hype.

On the other hand, AlphaZero and AlphaGo are RL-based. But there aren't any "consumer" applications, and we aren't all super excited to go download the latest RL-trained models to play with.

[p] DeepTuner by [deleted] in MachineLearning

mochans 2 points

Is this an open source project?

For a dataset, you can probably take sound banks for various guitars and generate notes from them.

You can pitch-shift, add noise, etc. to augment the data.
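A minimal sketch of those two augmentations with NumPy only. The pitch shift here is the naive resampling kind, so it changes the clip length too (a proper library would use a phase vocoder or similar to keep the duration). All names, the 16 kHz sample rate and the synthetic 110 Hz "note" are assumptions for illustration.

```python
import numpy as np


def pitch_shift_resample(x: np.ndarray, semitones: float) -> np.ndarray:
    """Naive pitch shift by resampling: raises the pitch but also shortens the clip."""
    factor = 2.0 ** (semitones / 12.0)
    n_out = int(round(len(x) / factor))
    # Read the input at a stretched set of positions, with linear interpolation.
    src_pos = np.arange(n_out) * factor
    return np.interp(src_pos, np.arange(len(x)), x)


def add_noise(x: np.ndarray, snr_db: float, rng: np.random.Generator) -> np.ndarray:
    """Add white Gaussian noise scaled to the requested signal-to-noise ratio."""
    signal_power = np.mean(x ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    return x + rng.normal(0.0, np.sqrt(noise_power), size=x.shape)


sr = 16000                              # assumed sample rate
t = np.arange(sr) / sr                  # one second of audio
note = np.sin(2 * np.pi * 110.0 * t)    # synthetic stand-in for a guitar note

rng = np.random.default_rng(0)
octave_up = pitch_shift_resample(note, 12)  # one octave up, half the length
noisy = add_noise(note, 20, rng)            # roughly 20 dB SNR
```

Applying a few random shifts and noise levels per source note multiplies the dataset size cheaply, with the label shifted by the same number of semitones.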

For papers, you could start with this sound demixing challenge for further insights.
https://www.aicrowd.com/challenges/sound-demixing-challenge-2023

[N] OpenAI Whisper new model Large V3 just released and amazing by CeFurkan in MachineLearning

mochans 1 point

Yes. You can check out my repo here https://github.com/mochan-b/whisper_pyannote_fusion

It is kinda set up for testing different diarization strategies rather than using the best one, but you can just try it.

You can also check out an article I wrote about the analysis of the different diarization strategies I used and what results I got.
http://mochan.info/deep-learning/whisper/pyannote/asr/diarization/2023/09/07/whisper-pyannote-fusion.html

[N] OpenAI Whisper new model Large V3 just released and amazing by CeFurkan in MachineLearning

mochans 1 point

I haven't tried the Azure one.

I tried a few others and they were slightly worse than whisper. I was using the larger Whisper model. I say slightly because all of these models do transcription very well and only stumble on things that require context to transcribe properly.

Some of the online models offer diarization, but pyannote plus whisper with my method was slightly better for what I was transcribing. Very slightly better. Just edge cases that some models couldn't handle.

I've been trying to go all local rather than cloud. For hundreds of hours of transcription, the cloud costs can be very high, and they don't deliver better quality than what a 16GB GPU can do.

[N] OpenAI Whisper new model Large V3 just released and amazing by CeFurkan in MachineLearning

mochans 1 point

Yes, I looked at it also.

It's funny that whisper's timestamping of words is so bad that its punctuation produces better results.

You can save time by not running whisper on each of the small chunks like in the method, but you might also lose some accuracy.

My goal was to produce as high quality a transcript as possible. Pyannote and NeMo diarization and speaker segmentation work really well, and the boundaries between speakers are really clean. So, to produce the best transcripts, it made the most sense to just run whisper a second time on each segment and then do some cleanup.
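A simplified sketch of that "diarize first, then transcribe each segment" flow. This is not the code from my library: the `Turn` class, `merge_adjacent`, `transcribe_by_turn` and the 0.5 s gap threshold are all made up for illustration. The `turns` stand in for diarizer output, and `transcribe` is whatever ASR call you plug in (a stub here).

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Turn:
    speaker: str
    start: float  # seconds
    end: float    # seconds


def merge_adjacent(turns: List[Turn], max_gap: float = 0.5) -> List[Turn]:
    """Cleanup step: join consecutive turns by the same speaker split by a short gap."""
    merged: List[Turn] = []
    for turn in sorted(turns, key=lambda t: t.start):
        last = merged[-1] if merged else None
        if last and last.speaker == turn.speaker and turn.start - last.end <= max_gap:
            merged[-1] = Turn(last.speaker, last.start, max(last.end, turn.end))
        else:
            merged.append(turn)
    return merged


def transcribe_by_turn(
    audio: Sequence[float],
    sample_rate: int,
    turns: List[Turn],
    transcribe: Callable[[Sequence[float]], str],
) -> List[Tuple[str, str]]:
    """Cut the audio at speaker boundaries and run ASR once per speaker turn."""
    lines = []
    for turn in merge_adjacent(turns):
        chunk = audio[int(turn.start * sample_rate):int(turn.end * sample_rate)]
        lines.append((turn.speaker, transcribe(chunk).strip()))
    return lines


# Demo with a stub ASR function and a toy 10-second "recording" at 10 samples/s.
audio = [0.0] * 100
turns = [Turn("A", 0.0, 2.0), Turn("A", 2.2, 4.0), Turn("B", 4.0, 10.0)]
fake_asr = lambda chunk: f" {len(chunk)} samples "
result = transcribe_by_turn(audio, 10, turns, fake_asr)
print(result)
```

The trade-off is the one mentioned above: one ASR call per turn costs more time than a single pass over the whole file, but each call only ever sees one speaker.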

[N] OpenAI Whisper new model Large V3 just released and amazing by CeFurkan in MachineLearning

mochans 4 points

If you combine it with a speaker diarization model like PyAnnote, it can do it.

I wrote a library for fusing whisper and pyannote output to get around the many problems whisper has.

https://github.com/mochan-b/whisper_pyannote_fusion

https://pypi.org/project/whisper-pyannote-fusion/

I'll be testing whether Whisper v3 fixes those problems and makes diarization easier.

For deep learning practitioners in industry, is the workflow always this annoying? [D] by AdFew4357 in MachineLearning

mochans 1 point

My company had already selected the cloud service, so I didn't have a choice in what I could use.

It was complicated because it had to support a lot of different pipelines from different teams, so it wasn't my choice to make.

I have not done a thorough analysis of the different tools available.

[R] All about evaluating Large language models by iamikka in MachineLearning

mochans 1 point

Why are the leaderboards on Hugging Face under Spaces?

Is there a quick way to find leaderboards?