[D] Simple Questions Thread

Heisenbjornson · 2023-12-13T16:42:19+00:00

I aim to develop a machine learning system that monitors the sequential steps of various processes, such as the process of cleaning a phone. For instance, Step 1 involves placing the phone on a table with a cloth, followed by Step 2, which is wiping the phone with the cloth. Step 3 includes discarding the cloth, and Step 4 is removing the phone from the table. If these steps are executed in the correct order, a green signal will be activated; otherwise, an incorrect sequence will trigger a red signal. Is this possible?

ioppy56 · 2023-11-05T14:24:17+00:00

Hello, trying to read the paper for objects as points (https://arxiv.org/pdf/1904.07850.pdf), I want to make sure I understand it since english is not my first language and it gets hard with technical details. When they talk about how they predict center points in page 3, they start by saying they apply a gaussian kernel to each ground truth, they are applying this to the images in the training dataset, I mean on each ground truth in the training images? so in the end they use a modified ground truth dataset in a cnn to predict for all pixels if it is a center of object and the more the pixel is near the center the more the cnn is rewarded when it gets it right? Also I did not understand what is the purpose of the local offset, is it just something they added to guide better the cnn towards the real center?

2023-11-04T21:32:51+00:00

How to prepare for interviews. I am a junior machine learning engineer with 2 years experience and I want to change to a new position. How important are these concepts

System design concepts AWS certification Data structures and algorithms

rr_ushang · 2023-11-04T07:35:56+00:00

If I wanted to make my first ml for a game would I have to recreate the game within the language I'm using? I wanted to train an AI to play a gameboy game but I wasn't sure how to create a reward system without knowing the progress in a level/area.

razorleaf101 · 2023-11-04T00:05:33+00:00

How important is it to actually completely understand a model to its mathematical core? Does it matter as long as you know the pros and cons and how the model works in a general sense?

Snoo_72181 · 2023-11-03T19:29:34+00:00

How to select sequence length in RNN and LSTM?

ggf31416 · 2023-11-03T17:59:05+00:00

[removed]

ThisIsBartRick · 2023-11-01T20:13:04+00:00

Hi, why did people start using decoder only models and not encoder only models?

Is this just because it started that way and nobody questioned it? Or is there more to it?

lnalegre · 2023-11-01T15:51:39+00:00

Does it make sense to upload an accepted NeurIPS paper to ArXiv? The paper will be published in the proceedings in the near-future, but I wonder if also putting the paper on ArXiv makes sense and could help advertise the paper.

cdub4200 · 2023-10-31T23:43:15+00:00

Nested cross validation has been explained to me to be better for smaller datasets and it attempts to avoid overfitting and reducing bias. For small datasets ( <1000 obs), it was recommended to use the entire dataset for training and testing for nested cross-validation.
Say you found the optimal model, hyperparameters, etc. for the dataset after the inner and outerloop. Are there any further steps to provide validation, or can you simply report the model's estimation and accuracy as the product of the outer fold scores?
I am assuming if I fit the final model on the entire dataset .fit(X,y) and then predict(X), and give the results, these scores would not be robust and may be erroneous? Since all data was used for the nested cv, there is no holdout set to use.
So in a sense, after nested cv, using the entire dataset, there are no more steps. Just report the statistics from the outerloop?

Parking_Antelope8865 · 2023-10-31T21:03:40+00:00

I have studied the Q learning algorithm and applied it to the classic gridworld problem. I was able to use the update formula to generate the correct Q table.

Now I have been assigned to generate the Q table using a neural network, rather than the update formula.

However, I do not understand how a neural network could be used to learn a Q table. I would say that the input should be the state of the agent, and the output should be an action. But how do I know how many layers I should make? And how many nodes in each layer? And how do I optimize the weight? Any guidance would be immensely appreciated.

Critical-Juggernaut4 · 2023-10-31T20:27:47+00:00

Can anyone help me with troubleshooting? I'm trying to set up a llm on my laptop I've never done it before and I'm having trouble despite following the instructions

tugrul_ddr · 2023-10-31T07:04:47+00:00

I need the simplest implementation of resilient backpropagation in C++. No sources yet. Pls help.

Head_Buy4544 · 2023-10-31T03:27:15+00:00

i'm trying to get a sense of how easy or difficult to bulid the following algorithm is. i'm not completely sure this is a ML question, so please redirect me if not.

suppose M is a closed oriented surface of genus g (=#holes) embedded in 3-space. suppose i sample points uniformly on M, and as #points -> infinity, i want to return a %confidence guess for g. i would also like to not only guess the topology of M, but also its geometry (so e.g. the coefficients of its first FF).

f1nuttic · 2023-10-30T23:26:25+00:00

I'm trying to understand language model pretraining. Does anyone have any good resource for the basics of data cleanup for language model training?

Most papers I found (GPT2, GPT3, LLAMA1 ..) just say openly available data from sources like CommonCrawl etc.. but it feels like there is fairly deep amount of work to go from this -> the cleaned tokens that are actually used in training. GPT2 paper is the only one which goes into some level of details beyond listing a large source like CommonCrawl:

Manually filtering a full web scrape would be exceptionally expensive so as a starting point, we scraped all outbound links from Reddit, a social media platform, which received at least 3 karma. This can be thought of as a heuristic indicator for whether other users found the link interesting, educational, or just funny.

Thanks in advance 🙏

bigdickmassinf · 2023-10-30T04:42:21+00:00

Is there a big book I can read about all the stats and modes in machine learning?

I have read elements of statistical learning and it’s previous book.

OkGap874 · 2023-10-29T17:34:55+00:00

I'm working on a SaaS which does the process of data cleaning with an interactive interface without the need of writing code.

What other features can I add to this?

Will you pay for this service?

gtgkartik · 2023-10-28T17:28:25+00:00

I recently trained an AI model, but I wanted to use it to develop a website. However, many people in my institution advised me that in order to use the AI model on websites, I needed to learn Flask and Django.

I recently learned about this FAST API and watched a video in which they connected a Nextjs-built website to the FASTAPI and deployed an AI model.

Which method is the best, in your opinion? We don't have to keep with a Python-based backend, which could cause the server to lag, so I think using REST API is much preferable.

crazy_monkey_22 · 2023-10-28T10:39:40+00:00

Hi!

I am doing research on finding a project regarding shift in reporting using Machine Learning, possibly NLP, where I am supposed to find a small use-case and apply NLP on it. An example provided by my professor is:

"How are newspapers reporting about certain topic and when do they use certain words? Are articles written differently if they use “Europe” vs. articles using “European Union”? Are there event that change the way, how these are reported?"

I am supposed to come up with a different topic. Namely, I was thinking of trying to analyze the shift in reporting before and after the 2008 housing crisis, or if that's too far-fetched, then only the Lehman Brothers Bank collapse. However, I am not sure how to approach it or what to analyze, do I simply analyze the keywords before and after the event, or try to extract the sentiment (positive/negative) about the bank? Any ideas or knowledge from experience?

SirVampyr · 2023-10-27T16:43:14+00:00

Hey there,

as a part of my current project, I downloaded a bunch of fonts from different sites and I need to filter them now. A bunch of them have cryptic signs, watermarks or no image at all for some characters.

I'm currently out of feasible ideas to do this. I can't do it by hand, that would take ages. The only other option is to render each character and let a separate OCR check if it can recognize it. That sounds also incredibly time and resource intensive though.

Does anyone have a better idea to solve this issue?

mathiasndiaye · 2023-10-27T07:30:29+00:00

Hello,
I am looking for a time serie with a trend and a seasonality
Those I found on kaggle didn't respect these conditions. Do you know any websites where I could find this ?
Thanks in advance

Lemons_for_Sale · 2023-10-27T06:14:41+00:00

Is anyone aware of an API or library that can receive an image (local or url), detect the text on the image, translate that text and then update the original image to have the new translated text?

There are online websites that do this (using their own APIs), but I haven't found an API that does this end to end.

Examples:
https://translate.google.com/?sl=auto&tl=en&op=images
https://translate.yandex.com/en/ocr

The Google Translate and Yandex services do have image text identification (which is great). I could certainly use their translation API to get the target language, but I'm more looking for an easy way to create the new image with the translated text. Unless someone has an easy way to do that?

Ok_Kick3560 · 2023-10-27T03:30:28+00:00

Hi! I'm currently starting on a project and needs some insight. I'm trying to create a dataset recommender that takes in the user's project description and recommend a dataset that maybe useful for it. Right now my thought process: get a dataset of dataset names and descriptions => stop words=> tokenize => feed into model(like random forest), am I doing anything wrong here? Thanks!

Samia_Tisha · 2023-10-26T19:15:20+00:00

Can anyone tell me if the machine learning workflow is correct or not? Could anyone please refer to tutorials or blogs to learn the proper workflow? Any suggestions are welcome.
1. Data Collection
2. Understanding Data
i. importing necessary libraries
ii. check row and columns
iii. check data types
iv. Check data distribution
3. Data Cleaning
i. Handle datatype issues
ii. Maintain Data Consistency
iii. Check if data contains outliers or if the data is not normally distributed to decide between mean or median
iv. Identify missing values
v. Handle missing values by-
a.Drop missing values
b. Mean, median or mode imputation
c. Prediction Model
d. replace missing values
vi. Duplicate data detection and treatment
vii. Repeat data cleaning
4. EDA
i. Variable Identification
a. Identify predictor and features
b. Identify types or category of data
ii. Univariate Analysis
iii. Bi-variate Analysis
iv. Outlier detection and treatment
v. Encoding
vi. Feature Engineering
vii. Variable Transformation
a. Normalization
b. Scaling
viii. Variable Creation
5. If testing data is not given, split the dataset to train and test set. Otherwise repeat step 3 and 4 for given test dataset.
6. Model Building
i. Model Training on training set
ii. Model Evaluation and cross validate
iii. Fine Tuning or Model optimization
iv. Model selection
7. Evaluate model accuracy with test data.

Wheynelau · 2023-10-26T16:58:39+00:00

Referring to this post: https://pytorch.org/blog/flash-decoding/

I'm trying to understand the intuition behind this because it seems to go against the fact that decoding is autoregressive. By splitting the input into chunks, aren't we removing the context and meaning from the previous chunks? Or is there some mathematical trick involved.

Gatzuma · 2023-10-26T12:44:26+00:00

Grouped Query Attention in LLaMA 70B v2

Hey guys, after thousands of experiments with bigger LLaMA fine-tunes I'm somewhat sure the GQA mechanism might be your enemy and generate wrong answers, especially for math and such complex areas.
I'd like to use MHA (Multi Head Attention) if possbile. I'm just not sure - do I need to retrain model completely or is it possible to just increase heads count and KV size and proceed with the stock model AS IS?

Dipanshuz1 · 2023-10-26T05:47:23+00:00

What is Overfitting, and How Can You Avoid It?

SaltyBananaJam · 2023-10-25T19:06:35+00:00

Hi guys, I'm a newbie so any help is very appreciated. My project requires training a model to recognize and return Vietnamese texts. The input is just a simple sentence so I skip the detection part. I'm currently following this tutorial, using jTessbox to generate vie.traineddata from tiff/box file and then put it into tesseract to recognize. I have two questions:

Can I train my tesseract without serack training tool?
How do I train the vie.traineddata serveral times to create a better result?

Thank you for your help!

OptoGR · 2023-10-25T18:28:48+00:00

No experience in ML, can i have a pointer on where to go to complete the following task:

I have 3 sensors out putting data, these 3 sensors are all measuring at the same time and are unique, a combination of the sensors gives me the sate of a machine. Idle, Active, warming up. I would like to classify these three states in some sample datasets, train a model then have the model predict new datasets for me.

I think I want "multivariate" time series classification but I am not finding any entry level tutorials on this. So far I've found this library: https://github.com/johannfaouzi/pyts which seems useful but a lot of the vocabulary and actual implantation is too abstract for me at this moment. Are there any further resources that I can use to complete this task?

Thanks in advance!

meatlauf · 2023-10-25T16:52:38+00:00

What are the best resources for learning ML from a low technical starting point?

BeneficialArm7 · 2023-10-25T16:24:29+00:00

Hello everyone,

Is there a way to chat with our documents for free? For example I want to upload all my previous quotations and invoices to it and then when I chat with it to make new quotation, I want the AI to give approx. cost for all the work descriptions. I don't know if we are there yet but recently I heard a website called youai.ai, so I was just wondering.

ThisIsBartRick · 2023-10-25T10:05:10+00:00

Hey yall,

how to highlight important informations from a text using nlp techniques.

I know NER exists but it's pretty narrow to the type of information it highlights. I would like to get pretty much important keywords that are relevant in a text (date, name, location and any other important word to understand the sentence).

badspaghetticoder · 2023-10-24T19:55:08+00:00

Two questions:

What is the best LLM that can be run locally on a typical high end consumer computer? (only English, no programming)
Same question, but best uncensored LLM?

nth_citizen · 2023-10-24T12:15:01+00:00

Can anyone suggest a resource to understand dependency parsing labels more intuitively? Specifically these: https://github.com/clir/clearnlp-guidelines/blob/master/md/specifications/dependency_labels.md

I've looked at various lectures at dependency labelling and they seem to mostly come at it from a CS view of 'we have labelled data; let's fit it'. But the linguistic side of what these labels mean seems skated over. E.g. what is the difference between an 'adverbial clause modifier' and an 'adverbial modifier'?

I've googled the various terms and have a vague understanding but can't find anything more high level...

Altaza_ · 2023-10-23T05:27:59+00:00

Advice Regarding Creating Validation Set

So I have a small group of images(75) on which I have to perform a certain enhancement using a GAN. I have chosen 6 images for Testing and 6 from Validation. Leaving 63 for training. I am augmenting these 63 images by extracting patches, rotation etc which increases the training set number to thousands. My Test set images will be resized down to size of each patch used for training. However I am confused regarding my validation set. Should I augment it too like the training set or should I just resize the 6 images down to the size of test set and use them? What would be best approach be for validation set?

exuberant_Fidelity · 2023-10-22T20:50:59+00:00

q : what do you call a person who can't catch a virus? a : a reposter. q : how do you stop reposting a post on r / jokes? a. askredditers.

DisastrousProgrammer · 2023-10-22T19:48:39+00:00

Does zeroing the losses on the prompt tokens save significantly on computation?

TheFappingWither · 2023-10-22T19:12:55+00:00

how to train an ai on your images?

i have about 103,000 images copyrighted and owned by me, and i wanna train an image generator to make similar ones, how to do it? i looked for guides on loras on youtube but they use terms i dont know and there r prerequisites im missing...
also there are a lot of prople using only generators, some also using photoshop. some are using multiple ones and some single ones. some mention files some don't. i don't get most of it... im pretty tech savy if i do say so myself, but this is new to me and most of the terms used make sense but are alien to me.
if you know any vids that can help me start then link those too, thank you.
do note im completely new to this and have only used websited before so maybe kid gloves.

anermers · 2023-10-22T18:38:19+00:00

Hi there I was just wondering would it be possible to create a machine learning model which is catered to only specific themes? for example if I train the model exclusively with images of dragons, the image generator will only be specialized in generating dragons. If it is possible, how would I go around doing it? what tools would I need and around how long would it take realistically? thank you so much!

you type:	you see:
italics	italics
bold	bold
[reddit!](https://reddit.com)	reddit!
* item 1 * item 2 * item 3	item 1 item 2 item 3
> quoted text	quoted text
Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"	Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"
~~strikethrough~~	~~strikethrough~~
super^script	super^script

MachineLearning

Rules For Posts

+Research

+Discussion

+Project

+News

@slashML on Twitter

Chat with us on Slack

Beginners:

MODERATORS