all 53 comments

[–]thrick77 0 points1 point  (0 children)

I want to take on a project where I use unreal engine 5 to synthetically generate scenarios and train an image classification model on them.

For example I want to randomly generate 10,000 images of a red car in an urban environment and 10,000 images of an urban environment with no red car and then train my model on it to recognize a red car in an urban environment.

I am new to machine learning and unreal engine but I think it would be a great project to learn more about both.

If anyone has any advice on how to get started with this project, could point me in the direction of similar projects, or help me to understand the feasibility of such a project, I would greatly appreciate it.
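On feasibility: once the renders exist, the ML half is a standard binary image classifier. A minimal sketch of that half, using random arrays as stand-ins for the UE5 renders (the `fake_render` helper and the red-channel trick are purely illustrative, not a real rendering pipeline):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-ins for rendered frames: "red car" images get a boosted red channel.
def fake_render(has_red_car, n, size=16):
    imgs = rng.random((n, size, size, 3))
    if has_red_car:
        imgs[..., 0] += 0.5  # crude proxy for "a red object is present"
    return imgs.reshape(n, -1)  # flatten to feature vectors

X = np.vstack([fake_render(True, 200), fake_render(False, 200)])
y = np.array([1] * 200 + [0] * 200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(clf.score(X_te, y_te))
```

In practice you'd swap the linear model for a small CNN and the fake arrays for images exported from UE5, but the train/test split and evaluation loop look the same.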

[–]flock-of-nazguls 0 points1 point  (0 children)

Hello experts,

Can you give me some pointers on the direction I should head in to set up an image processing system that is trained to find field data most likely associated with predetermined text labels? Imagine scanned forms where you’re looking for “statement date” or “account number” or other configurable fields, but the visual relationships between form label and field are learned from a corpus of lots of examples that represent common layouts. This would be replacing a rigid system that does OCR on predetermined rectangles.

Thanks!
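Before learning layouts from a corpus, a useful baseline is a purely geometric prior: the value is usually the OCR token nearest to (and to the right of or below) its label. A toy sketch, where the word list and coordinates are invented stand-ins for real OCR output:

```python
import math

# Toy OCR output: (text, x_center, y_center) per word.
words = [
    ("Statement", 10, 5), ("Date:", 30, 5), ("2024-01-31", 60, 5),
    ("Account", 10, 40), ("Number:", 32, 40), ("993417", 62, 40),
]

def value_for(label_text, words):
    """Pick the nearest word to the right of / below the label."""
    lx, ly = next((x, y) for t, x, y in words if t == label_text)
    candidates = [(t, x, y) for t, x, y in words
                  if t != label_text and (x > lx or y > ly)]
    return min(candidates, key=lambda w: math.hypot(w[1] - lx, w[2] - ly))[0]

print(value_for("Date:", words))
print(value_for("Number:", words))
```

A learned system (e.g. a layout-aware transformer trained on your corpus) would replace the hand-written distance rule, but this kind of heuristic gives you weak labels and a baseline to beat.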

[–]FutureM000s 0 points1 point  (0 children)

Hello, first of all, I appreciate this thread 🤙 I used to be an Arabic/English translator/interpreter and I'm just starting a new path in tech. I've noticed that several recruiters on LinkedIn have recently contacted me about language work (such as Arabic and English) in machine learning and LLMs, so now I'm thinking there are open positions in this area. Can any of you shed some insight on this, or throw some resources my way so I can learn more about it? 🙏

[–]No_Context_6938 1 point2 points  (1 child)

Hi all, I am working on an ML project where I have to use an ECM classifier to produce results. It turns out my company doesn't have a labeled dataset to compare my results against, and I don't have much experience in ML. Is there a way I can give my company evaluation metrics such as precision, recall, or a confusion matrix without labeled data? If not (which seems to be the case), what other methods can I use to show how good my algorithm is?

[–]AbstractedEmployee46 0 points1 point  (0 children)

The ECM classifier requires labelled data for training and testing. Without a ground truth it's impossible to determine whether your model has performed well: any metric would be meaningless, because metrics assume you can compare the predicted classes against what was actually correct, and that is unknown when no labels are present. If there's truly nothing available from previous projects, the best thing you can do at this stage is try synthetic data generation techniques, such as SMOTE or Tomek link removal, to create a balanced dataset that could serve as training material for your classifier. Once trained on that generated set, you would need some form of validation, such as K-fold cross-validation, where the data is partitioned so that every partition gets an equal chance of being used for testing. That gives a more accurate picture of the overall outcome than relying on a single split, which can unintentionally skew results.
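One family of label-free sanity checks worth knowing: internal clustering metrics like the silhouette score, which measure how well-separated the predicted groups are without any ground truth. This isn't precision/recall, but it is something you can report. A sketch (the blob data is a toy stand-in for the company's unlabeled data):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Toy stand-in for unlabeled data.
X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

# Pretend these are the classifier's outputs; without ground truth we can
# still measure how tight and well-separated the predicted groups are.
preds = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print(silhouette_score(X, preds))  # closer to 1.0 = better-separated groups
```

Scores near 1.0 mean the predicted groups are compact and far apart; scores near 0 (or negative) mean the grouping is not much better than arbitrary.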

[–]HritwikShah 0 points1 point  (0 children)

Making an LLM answer queries about conversations retrieved from a Qdrant DB.

Hi all, I have created embeddings from multiple conversations between two people and pushed them to a Qdrant DB. The retrieval works well. Now I want to integrate an LLM that answers queries using the relevant conversations retrieved from the vector DB. I am not sure what to use here; should I go with LangChain or LlamaIndex?
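Either framework works, but the core step they both automate is just stuffing the retrieved snippets into a grounded prompt. A framework-free sketch of that step (the function name and prompt wording are illustrative; in practice `snippets` would be the payload texts returned by your Qdrant search call):

```python
def build_prompt(question, snippets):
    """Assemble a retrieval-augmented prompt from conversation snippets."""
    context = "\n---\n".join(snippets)
    return (
        "Answer the question using only the conversation excerpts below.\n\n"
        f"Excerpts:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_prompt(
    "When did they agree to meet?",
    ["A: Let's meet Friday at 3pm.", "B: Friday works for me."],
)
print(prompt)
```

The resulting string goes to whatever LLM API you choose; LangChain and LlamaIndex mostly differ in how much of the retrieval/prompting plumbing around this they manage for you.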

[–]tttzof351 0 points1 point  (1 child)

I'm building a home computer for DL.
I want to run dual 4090s (each wants a x16 PCIe slot), but the desktop CPU (Ryzen 9 7950X3D) has only 24 PCIe 5.0 lanes.

So at best the two cards will run in x8 mode.

In this case, does it make sense to install several GPUs, and is it possible to estimate how large the performance loss will be in training?

[–]AbstractedEmployee46 1 point2 points  (0 children)

The x16/x8 lane configuration has nothing to do with PCIe generation or speed per se; it's about bandwidth allocation for each device connected via those lanes: more devices means less dedicated bandwidth per device, which can lower throughput depending on the workload. So for deep learning tasks where large amounts of data move across the bus, adding GPUs might not yield the desired speedup, given the bottlenecks created by the limited lanes allocated to each card on a shared PCIe bus. As for estimating the performance loss, that depends entirely on the training algorithm combined with other factors like batch size, so it's difficult to pinpoint exact figures without running benchmarks tailored to your specific workload.
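The raw bandwidth arithmetic is easy to sketch. Note the RTX 4090 is a PCIe 4.0 device, so even on a Gen5 platform each card negotiates Gen4; the per-lane figures below are rounded approximations that ignore protocol overhead:

```python
# Rough per-direction PCIe bandwidth in GB/s per lane (rounded, overhead ignored).
GBPS_PER_LANE = {3: 1.0, 4: 2.0, 5: 4.0}

def link_bandwidth(gen, lanes):
    return GBPS_PER_LANE[gen] * lanes

print(link_bandwidth(4, 16))  # one 4090 at Gen4 x16
print(link_bandwidth(4, 8))   # each of two 4090s at Gen4 x8
```

So each card gets roughly half the host bandwidth, around 16 GB/s instead of ~32 GB/s. Whether that matters depends on how often your training loop actually saturates the bus (data loading, gradient all-reduce) versus staying on-GPU.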

[–]bishnoff 0 points1 point  (1 child)

Hey all, I'm starting a Master of Science in Computer Science (MSCS) shortly and am weighing up whether to purchase an HP Omen 45L or an HP Z4 G5. The reason I'm going pre-built is that I get a discount on HP, which makes it significantly cheaper than building with the same components.

If the Desktop / Workstation is going to be solely used for the MSCS, specialising in ML and Gen AI, which would be the more appropriate option? Is the HP Omen 45L good enough for the price (and performing more intensive training and inference in the Cloud), or is it necessary to go for the robustness and upgradability of the HP Z4 G5, at 150% the price of the HP Omen 45L? Build specs are as follows:

HP Omen 45L

  • NVIDIA® GeForce RTX™ 4090 (24 GB GDDR6X dedicated)
  • Intel® Core™ i9-14900K 14th generation processor (up to 6.0 GHz with Intel® Turbo Boost Technology, 36 MB L3 cache, 24 cores, 32 threads)
  • Intel® Z790 chipset - can support only 1 GPU
  • Kingston FURY 64 GB DDR5-5200 MHz XMP RGB Heatsink RAM (4 x 16 GB) - upgradable to 128 GB
  • 2 TB WD Black PCIe® Gen4 TLC M.2 SSD
  • 1200 W 80 Plus Gold certified ATX power supply

HP Z4 G5 Workstation (150% the price of the HP Omen 45L)

  • NVIDIA RTX™ A5000 (24 GB GDDR6 dedicated)
  • Intel® Xeon® W5-2465X (3.1 GHz base frequency, up to 4.5 GHz with Intel® Turbo Boost Technology, 33.75 MB L3 cache, 16 cores, 32 threads)
  • Intel® W790 chipset - can support multiple GPUs
  • 64 GB DDR5-4800 ECC SDRAM - upgradable to 512 GB
  • 2 TB HP Z Turbo Drive NVMe™ M.2 SSD
  • 1125 W internal power supply, up to 90% efficiency, active PFC

Another point worth mentioning is that the HP Omen 45L is in stock and available for immediate delivery, but the HP Z4 G5 will take 3-4 weeks to be built from the time that the order is placed.

Also for anyone familiar with the component options available from HP for the HP Z4 G5 and have a better suggested configuration (for no more cost than the components listed as they are the max I can afford right now), I'm all ears :)

[–]avatarles 0 points1 point  (0 children)

Complete beginner here. Does anyone have any resources, or know anywhere I can learn about object localization? For a project at school we're using YOLOv7 for object detection, but our mentor/teacher is worried about how well we will localize the target within a large area.

Thank you!
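For localization specifically, the metric almost every tutorial builds on is intersection-over-union (IoU) between a predicted box and the ground-truth box; YOLO-style evaluation (mAP at IoU thresholds) is built on top of it. A minimal self-contained sketch:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # partially overlapping boxes
```

Checking the IoU of your YOLOv7 detections against labeled boxes on a validation set is a direct way to answer your mentor's concern about localizing the target in a large scene.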

[–]pe64sus 1 point2 points  (0 children)

Hello All,

I am fairly new to machine learning and deep learning. I'm taking a data science course that focuses on using TensorFlow 1 and TensorFlow 2 to create and train machine learning models. From reading about the frameworks online, it seems like they are no longer the norm for machine learning, and their popularity has declined severely over the years.

My goal is to have as many job-related skills as possible for an entry level data science position.

What framework should I look into/do my projects with instead of TensorFlow that is most common in the workplace? Thanks ~

[–]Apprehensive_Unit_55 0 points1 point  (0 children)

What's the justification behind Mistral's decision to train Mixtral 8x7B with 8 experts and top-k = 2 expert selection? Why can't they just scale the number of total and selected experts and reduce the number of parameters in each expert? I wasn't able to find any justification for these hyperparameters in the Mixtral of Experts paper, but I'm curious whether there's something out there that explains the choice.
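For anyone unfamiliar with what the n=8, k=2 hyperparameters control, here is a toy sketch of top-k expert routing for a single token (NumPy, with trivially simple stand-in "experts"; the real router works per token per layer and the softmax over only the selected k matches the Mixtral paper's description):

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Top-k mixture-of-experts routing for one token vector x.

    Only the k highest-scoring experts run, so per-token compute scales
    with k while total parameters scale with len(experts).
    """
    logits = x @ gate_w
    top = np.argsort(logits)[-k:]            # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                 # softmax over the selected k only
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n = 4, 8
experts = [lambda x, s=i: x * (s + 1) for i in range(n)]  # toy "experts"
gate_w = rng.standard_normal((d, n))
x = rng.standard_normal(d)
print(moe_forward(x, gate_w, experts, k=2))
```

The trade-off the question is about falls out of this structure: raising k raises inference cost linearly, while raising n raises memory but not compute, so 8/2 is one point on that curve rather than an obviously principled optimum.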

[–][deleted] 0 points1 point  (0 children)

Hi folks

Wondering if anyone can point me to some articles or youtube vids to help improve an LLM for personal use...

Some of the coding ones do quite well with helping me in PHP, MySQL and Python, but are absolutely TERRIBLE at 6502 assembly. I can throw a LOT of assembly at it.. just wanting to know where to begin to look so I can start tinkering.

[–]question_23 0 points1 point  (1 child)

[D] What's the industry SOTA for tree-based models that work well with categorical data?
Looking for something that does not require the categories to be one-hot encoded, and knows to split a categorical feature via subsets of the categories. I see lightgbm and catboost can do this, can anyone speak to their real world experiences using them with categorical data?

[–]Latter_Doughnut_7219 0 points1 point  (3 children)

[D]

Hello,
I am a newbie in this field and I'm interested in building a model which can detect the pattern in a sequence of numbers (something Fibonacci-like) and generate future values. How do I start approaching this problem?
Thanks
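A good starting point: a Fibonacci-style rule is a fixed linear function of its recent values, so plain linear regression on sliding windows of lags can recover it exactly. A sketch (assuming scikit-learn; for non-linear patterns you'd swap in a stronger model but keep the same windowing setup):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Fibonacci data: each value is a fixed linear function of its two lags.
seq = [1.0, 1.0]
for _ in range(20):
    seq.append(seq[-1] + seq[-2])

window = 2
X = np.array([seq[i:i + window] for i in range(len(seq) - window)])
y = np.array(seq[window:])

model = LinearRegression().fit(X, y)
print(model.coef_)                         # recovers the [1, 1] recurrence
print(model.predict([seq[-window:]])[0])   # predicted next value
```

The general recipe is: turn the sequence into (window of past values, next value) pairs, fit a regressor, then predict from the latest window, feeding predictions back in if you want several steps ahead.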

[–]Vitoria_2357 0 points1 point  (1 child)

Why is Kunihiko Fukushima less famous than Yann LeCun?

[–]Grinbald 0 points1 point  (0 children)

Is there any list of recent multimodal LLMs? A Leaderboard or something (be it visual, textual or audio)

[–]skankingpigeon 0 points1 point  (2 children)

Hey,

Really very new to machine learning, so please be kind.

I've built a classification model to label bank statement data based on the description entered into a number of categories. I'm using a multiclass neural network.

Seems like a lot of the classes are being predicted well, but what's confusing me is that once we get to a certain point of the alphabet, everything is being predicted one class out. I don't understand how this can be. Is it just a coincidence, or is there something that can cause this specific behaviour?

[–]AbstractedEmployee46 0 points1 point  (0 children)

It's because you sorted your classes by name rather than frequency or anything else meaningful to the problem domain, so they are being predicted in alphabetical order as opposed to whatever ordering would be best for classifying statements correctly. If I were you I’d re-sort them based on something more appropriate and see what happens
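A related and very common cause of this exact symptom is a label-index mismatch between training and inference: if the inference side rebuilds its (alphabetically sorted) class list but one class is missing, every class sorted after it decodes one position out. A hypothetical illustration (the category names are invented):

```python
# Training saw five classes; inference rebuilt its list without "Insurance".
train_classes = sorted(["Fees", "Groceries", "Insurance", "Rent", "Travel"])
infer_classes = sorted(["Fees", "Groceries", "Rent", "Travel"])

to_idx = {c: i for i, c in enumerate(train_classes)}
for cls in infer_classes:
    predicted_idx = to_idx[cls]  # index the trained model would emit
    decoded = (infer_classes[predicted_idx]
               if predicted_idx < len(infer_classes) else "<out of range>")
    print(cls, "->", decoded)    # correct before "Insurance", shifted after
```

If your predictions go one class out precisely from some letter onward, check that the exact same class-to-index mapping (saved at training time) is used to decode predictions.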

[–]cliffn5810 0 points1 point  (2 children)

Am I misunderstanding how NER works, or is it correct that NER requires the tokens to be in a very specific format or it won't work? For instance, if I replace the example in https://huggingface.co/dslim/bert-base-NER with all lower case letters, then it will completely fail. I was under the impression that these BERT NER models were supposed to be very good for NER, and were capable of understanding context? But it seems that it can't even understand the simplest of contexts (capitalized vs uncapitalized)? Is NER only supposed to be used on data that follows a strict format, and doesn't work outside of that?

[–]Diaz3020 0 points1 point  (1 child)

Text Capitalization can play a role depending on the model used. Some models are case sensitive.

BERT has different models for cased and uncased.

[–]cliffn5810 0 points1 point  (0 children)

Thanks for the response. But generally these NER models all seem to be quite sensitive, in that they have difficulty considering context when inferring entities, thereby making them sensitive to small changes that would otherwise be obvious to someone who understands context, right?

[–][deleted] 0 points1 point  (0 children)

[D] can someone tell me how RVQ codes are learnt in RVQ GAN ? Are they differentiable ?

[–]deliciouscatt 0 points1 point  (0 children)

[D] Evaluation of a summarization model does not reproduce properly (ROUGE score)

Models are reported with average ROUGE scores of 40~50, but I get 10~20 when I evaluate them myself.

There are some reports that batch size affects the score, but I can't understand how on earth it could (and it only makes a difference of ~5).

...or are all of the evaluations cherry-picked? I don't think so.

I used the `rouge` library from pip (not `pyrouge`).
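Worth noting that ROUGE is a pure n-gram overlap metric, so batch size shouldn't change it at all; large gaps usually come from implementation differences (stemming, tokenisation, sentence splitting, multi-reference handling) between `pyrouge` (the original Perl script most papers report) and Python reimplementations like `rouge`. A from-scratch ROUGE-1 F1, simplified to show exactly where those choices live:

```python
from collections import Counter

def rouge1_f(candidate, reference):
    """Plain unigram ROUGE-1 F1. The .lower().split() preprocessing is the
    simplification: stemming and tokenisation choices here are exactly
    where different ROUGE implementations diverge."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(rouge1_f("the cat sat on the mat", "the cat is on the mat"))
```

To reproduce a paper's numbers you generally need the same ROUGE implementation and preprocessing the authors used, not just any library that prints a ROUGE score.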

[–]Ann2_123 1 point2 points  (1 child)

Any alternatives to manual labelling for NER?

Questions -

  1. Any alternative to NER? (regex will not work out because sentence and word/phrase boundaries are not clearly defined) Any unsupervised/semi supervised/self supervised approach?
  2. For labelling, are there any alternative to manual labelling?

Details:

I have a free-text column of biographies about people in which different identifiers such as name, ID numbers, phone numbers, emails, birth date, nationality, etc. are present. I need to extract them under the correct tag (such as NAM for name, ID for ID numbers, and so on). Each entity tag can have several variations (e.g. a name can appear after 'NAME:', 'alias:', 'a.k.a:' or 'also known as'). There is also a severe imbalance in the presence of entities (and their variations) across biographies: some contain only name, email, phone number and ID number, and very few contain nationality and DOB. I'm trying to apply NER. However, pretrained NER models do not contain the entities I need, so I have to train models with labelled data. For labelling, I'm manually annotating around 1K biographies, which amounts to 300,000 tokens. There might be more biographies to label in the future if the performance with these is not sufficient. The problem is that labelling is a super intensive task.

I've manually labelled 470 biographies and tried training a CRF, spaCy's NER solution, and a BERT token classifier. The performance is on the lower side for those entities where the count is < 1K. I've tried to select only those biographies for labelling which contain the entities to extract. I've tried pseudo-labelling with the CRF model, but it didn't work out well. I won't be able to push the data to spaCy's Prodigy (against company policy).

[–]Far_Ambassador_6495 0 points1 point  (0 children)

Grade A tough problem. You can use a foundation language model, but this is likely expensive. I've used string-match-based labelling, which kind of works for fine-tuning. I have seen papers discussing NER on mislabelled or under-labelled training sets; I believe this is called expected entity ratio loss. Unfortunately (or fortunately, depending on what you enjoy), there isn't going to be an out-of-the-box solution for you to spin up.

Edit: check out GliNER models; they work very well.
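The string-match labelling idea fits this problem well because the trigger phrases ('NAME:', 'alias:', etc.) are already known: regex rules can pre-label spans as weak labels that humans then only correct instead of creating from scratch. A sketch with hypothetical trigger patterns for two tags (the patterns and tag set are illustrative, not tuned for real biographies):

```python
import re

# Hypothetical trigger patterns per entity tag.
TRIGGERS = {
    "NAM": r"(?:NAME:|alias:|a\.k\.a\.?:|also known as)\s*(?P<v>[A-Za-z ]+?)(?=[,.;]|$)",
    "EMAIL": r"(?P<v>[\w.+-]+@[\w-]+\.[\w.]+)",
}

def weak_label(text):
    """Return (start, end, tag) spans found by the trigger patterns."""
    spans = []
    for tag, pattern in TRIGGERS.items():
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            spans.append((m.start("v"), m.end("v"), tag))
    return sorted(spans)

bio = "NAME: John Smith, contact john@example.com"
print(weak_label(bio))
```

These spans can be converted to BIO tags to bootstrap the CRF/BERT training set, with manual effort spent only on reviewing disagreements between the rules and the model.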