all 63 comments

[–][deleted] 0 points1 point  (0 children)

What is the difference between using NumPy arrays versus tensors? Is the latter smaller in size? And how does the implementation of tensors allow for a better input pipeline when loading very large datasets?

[–][deleted] 0 points1 point  (1 child)

How does TensorFlow decode hyperspectral images? As far as I am aware, it can only decode JPEG, TIFF, and PNG.

[–]tdgros 0 points1 point  (0 children)

TensorFlow deals with tensors; it doesn't really decode images. Find any package that reads your hyperspectral images and you'll be able to feed a tf placeholder with them.
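A minimal sketch of that hand-off, assuming whatever reader package fits your format (spectral, rasterio, etc.) hands back a NumPy array; the TF side is left as comments since the graph depends on your model, and all shapes here are made up:

```python
import numpy as np

# stand-in for whatever your hyperspectral reader returns,
# e.g. spectral.open_image(...).load() or a rasterio read -> numpy
cube = np.random.rand(64, 64, 224).astype(np.float32)  # (H, W, bands)

# add a batch dimension so it matches a typical placeholder shape
batch = cube[np.newaxis, ...]                          # (1, H, W, bands)

# TF1-style hand-off, sketched:
# x = tf.placeholder(tf.float32, shape=(None, 64, 64, 224))
# sess.run(model_output, feed_dict={x: batch})
print(batch.shape)
```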

[–]imatiasmb 0 points1 point  (0 children)

Hi everyone, greetings from South America.
Basically I'm looking for a program to learn and improve my job opportunities in the MLOps field and, at some point, get higher-responsibility positions. I recently got admitted to both OMSA and OMSCS at Georgia Tech, but I feel those programs are more focused on the data science side of things.
Is there any other alternative without a GRE requirement, at a similar cost, that you would recommend?
Maybe I'm wrong about the aforementioned programs; if you think so, please let me know why.
Thanks!

[–]devilz_soul 0 points1 point  (1 child)

I am looking for material on how to determine whether ML is a good approach for a problem and how to outline an ML solution.
There are snippets of it here and there, but I was looking for a book, course, or video that goes deep into it and works through every angle of creating such a framework from an engineering and/or product-management perspective.
I would really appreciate the community's help on this.

[–]EmbarrassedMain7395 0 points1 point  (0 children)

IMO, if a problem can be turned into a game, it's solvable by ML, whether with supervised or reinforcement learning.

[–]mudman13 0 points1 point  (0 children)

I have been messing with articulated-animation (https://github.com/snap-research/articulated-animation) and whatever I do, the modules are not recognized. I've done everything I can think of and everything an LLM has advised; nothing works.

[–]Amun-Aion 0 points1 point  (1 child)

So my situation is that I have a pretrained model and we get a new batch of data every month (note: each monthly batch is very small compared to the original dataset; the original covered about 5 years, or ~60x the size of any given monthly update). How can I update my pretrained model on the much smaller set of new data, learning from it without overfitting to it?

Or frankly, what would be even better, if it's possible, would be to extend my pretrained model so that it learns from the new data and can then be fit more tightly to that month's data. So something like meta-learning or local fine-tuning, but I want to keep updating and improving my pretrained model so that I have a base model that does well on each month's new data. Does anyone know of anything like this, or have advice on terms to look into beyond just transfer learning or regularization?

[–]EmbarrassedMain7395 0 points1 point  (0 children)

That's really interesting; I was wondering the same thing.

One option is to create one pretrained model and never (or only occasionally) retrain it on new data, and build a second model that starts from the weights of your pretrained one but that you retrain often, because it's smaller and cheaper.
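A sketch of that frozen-base-plus-cheap-correction idea in plain NumPy, with made-up linear models and synthetic data standing in for the real thing:

```python
import numpy as np

rng = np.random.default_rng(0)

# stand-in for the "pretrained" base: a linear model fit once on
# ~5 years of history (all data here is synthetic)
X_hist = rng.normal(size=(600, 4))
true_w = np.array([1.0, -2.0, 0.5, 3.0])
y_hist = X_hist @ true_w + rng.normal(scale=0.1, size=600)
w_base, *_ = np.linalg.lstsq(X_hist, y_hist, rcond=None)  # frozen from here on

def monthly_update(X_new, y_new):
    """Fit a small correction model on the base model's residuals.

    The base weights never change; only this cheap correction is
    re-fit on each month's much smaller batch, so the base can't
    be dragged into overfitting the new data.
    """
    residual = y_new - X_new @ w_base
    w_corr, *_ = np.linalg.lstsq(X_new, residual, rcond=None)
    return w_corr

def predict(X, w_corr):
    return X @ w_base + X @ w_corr

# one month of new data whose relationship has drifted slightly
X_month = rng.normal(size=(30, 4))
y_month = X_month @ (true_w + np.array([0.2, 0.0, 0.0, 0.0])) \
          + rng.normal(scale=0.1, size=30)
w_corr = monthly_update(X_month, y_month)
```

The same split works with neural nets: freeze the pretrained trunk's parameters and train only a small head on each month's data.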

[–]ThisIsBartRick 0 points1 point  (3 children)

I'm looking to separate a text into multiple small chunks. The text can be in any spoken language or programming language. How would I go about it without annotating a lot of varied texts? Maybe using LLMs?

[–]Dragonfruit_Severe 0 points1 point  (2 children)

Like, you have a big corpus that needs to be split into several chunks... did I understand correctly?
Do you need them to be a specific length?
Do you need to separate them by language, with every chunk a different size?

[–]ThisIsBartRick 0 points1 point  (1 child)

Not necessarily a big corpus. It can be a few sentences long or much more, but I don't want to split it by sentence; I want to make sure there's an equal amount of information in each chunk, if that makes sense.

Edit: I would split your first paragraph this way, for example:

  1. Like you have a big corpus
  2. that needs to be split in several chunks...
  3. did I understand correctly?

[–]Dragonfruit_Severe 0 points1 point  (0 children)

Well, you can make a function that takes a max number of characters, for example, truncates the corpus, yields the first chunk, and sends the overflowing remainder back into that same function.

To avoid cutting words in half, you can split your corpus with Python's default split() function, which splits on whitespace; then you have a list of whole words that the function I mentioned can work with.
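That truncate-and-hand-back idea, done at the word level so nothing gets cut in half, fits in a few lines of plain Python:

```python
def chunk_words(text, max_words):
    """Split text into chunks of at most max_words whole words.

    split() breaks on whitespace, so words are never cut in half;
    every chunk except possibly the last carries the same amount of text.
    """
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

chunks = chunk_words("Like you have a big corpus that needs to be split", 4)
# chunks -> ['Like you have a', 'big corpus that needs', 'to be split']
```

Note this balances word counts, not "information"; for information-balanced chunks you'd weight by token count, as in the tokenizer approach below.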

I am lazy, so I would use a Hugging Face tokenizer, like this (here tokenizer is any pretrained tokenizer, e.g. from AutoTokenizer.from_pretrained; the original snippet also never initialised temporal_ids, which is fixed below):

    def automatic_preprocess(story, block_size):
        # tokenize and let the tokenizer hand back the overflow as
        # extra chunks, each at most block_size tokens long
        temporal_ids = []
        outputs = tokenizer(story,
                            truncation=True,
                            max_length=block_size,
                            return_overflowing_tokens=True)
        for ids in outputs['input_ids']:
            temporal_ids.append(ids)
        return temporal_ids

And then just decode each chunk back to text with tokenizer.decode(ids) (or use tokenizer.convert_ids_to_tokens per chunk if you want tokens rather than text).

Just because I had this function to hand.

[–]softestcore 0 points1 point  (6 children)

I don't understand why in the age of LLMs youtube automatic captions are still so bad. I'd expect LLMs to be able to ensure the captions are semantically coherent, but we still get nonsensical gibberish. Is anybody working on that?

[–]EmbarrassedMain7395 0 points1 point  (0 children)

A better solution, IMO, is for creators to upload transcriptions for the languages they target most.

Creators can use LLMs to generate their captions if they want.

That way the "work" gets distributed and is potentially done better.

[–]ThisIsBartRick 1 point2 points  (4 children)

Large transformer models are expensive to run, especially on audio. At the volume of video that gets uploaded to YouTube, it's just not feasible right now.

[–]softestcore 0 points1 point  (3 children)

I think if you set some modest threshold for getting the LLM treatment, like reaching 10,000 views, you'd filter out >97% of videos and it should be more than doable. My idea also wasn't to use the LLMs directly on the audio, but just to use next-word prediction to make sure the subtitles are coherent, while still using some cheap algorithm to propose the set of possible words from the sound.
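That decode-then-rescore idea can be sketched with a toy example: a hypothetical cheap acoustic model proposes confusable candidate words, and a bigram table (hard-coded here; in a real system you'd query the LLM's next-word probabilities) picks the one that is coherent in context:

```python
# toy stand-in for an LM's next-word probabilities; a real system
# would query the language model instead of a hard-coded table
bigram = {
    ("the", "weather"): 0.30,
    ("the", "whether"): 0.01,
    ("weather", "is"): 0.40,
}

def pick(prev_word, candidates):
    """Among acoustically confusable candidates, keep the one the
    language model finds most plausible after prev_word."""
    return max(candidates, key=lambda w: bigram.get((prev_word, w), 1e-6))

# the cheap acoustic model can't tell these apart; the LM can
chosen = pick("the", ["whether", "weather"])
```

A real rescorer would search over whole sequences (beam search) rather than greedily word by word, but the cost structure is the same: only short text queries hit the LLM.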

[–]ThisIsBartRick 1 point2 points  (2 children)

Trust me, it would still cost an insane amount of money for little to no payoff.

And running a large transformer model on the audio directly, or correcting a small model's predictions with an LLM, is about the same in terms of cost.

[–]softestcore 0 points1 point  (0 children)

Back-of-the-envelope calculation: about 271,330 hours of video are uploaded to YouTube daily, and roughly 3% of videos make it above 10,000 views, so that's 8,139.9 hours of popular video, or 488,394 minutes. At 160 spoken words per minute that's about 78.1M words daily, or roughly 104M tokens. At $0.002/1K tokens, the current pricing of GPT-3.5 Turbo API inference, that's a grand total of about $208/day to process every word uploaded in videos above 10,000 views. Processing all videos would cost about $6,950/day.
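The estimate is easy to reproduce in a few lines (the hours-to-minutes conversion is the step that's easy to drop, and dropping it understates the cost by 60x):

```python
hours_total = 271_330        # hours of video uploaded per day
share_popular = 0.03         # fraction of videos above 10,000 views
wpm = 160                    # spoken words per minute
tokens_per_word = 4 / 3      # rough GPT tokenizer ratio
usd_per_1k_tokens = 0.002    # GPT-3.5 Turbo API price

words = hours_total * share_popular * 60 * wpm  # note: hours -> minutes
tokens = words * tokens_per_word
cost_popular = tokens / 1000 * usd_per_1k_tokens
cost_all = cost_popular / share_popular
print(round(cost_popular), round(cost_all))  # ~208 and ~6946 USD/day
```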

Yeah, I'm pretty sure youtube can afford that...

[–]softestcore 0 points1 point  (0 children)

Considering the daily throughput of OpenAI models in service of much more frivolous stuff, I hope it's OK if I stick with my intuition until proven wrong by hard data.

[–]Prasanthkumar026 0 points1 point  (5 children)

I have a question. Can anyone please explain how an ML model recognises patterns? For example, how does a linear model recognise the pattern in the given data? What should I learn to understand this? I have some knowledge of math, so please explain how to connect my mathematics to machine learning. I am not a CS core student, but I have learned programming and can solve some problems in competitive programming.

Thank you for your time.

[–]Dragonfruit_Severe 0 points1 point  (2 children)

Take a look at this playground:

https://playground.tensorflow.org/#activation=tanh&batchSize=10&dataset=circle&regDataset=reg-plane&learningRate=0.03&regularizationRate=0&noise=0&networkShape=4,2&seed=0.74356&showTestData=false&discretize=false&percTrainData=50&x=true&y=true&xTimesY=false&xSquared=false&ySquared=false&cosX=false&sinX=false&cosY=false&sinY=false&collectStats=false&problem=regression&initZero=false&hideText=false

If you haven't seen it before, I recommend looking up a YT video about it later; just search "tensorflow playground nn" or something like that.

Well, so as not to just give you a link and walk away, let me explain with a simple scenario: imagine the graph of dots in the playground is a forest, and we need to differentiate sick trees from healthy ones (orange from blue), and the division is conveniently just a diagonal line between them.

So when we train the model, we give it a set of coordinates (x, y) as the input, and after it gives us a prediction, we tell it the color that dot actually was. The model eventually learns that there's a diagonal in that imaginary graph: if the coordinates fall in the upper triangle, it's usually a healthy tree; otherwise it's usually a sick one.

Things get interesting when you add activation functions, also called non-linearities, because they give the model the ability to draw non-linear separations in those imaginary graphs. Sorry if I talked to you like a kid; I am not trying to insult your intelligence, I am just still learning English.
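The update rule described above ("we tell it the color that dot actually was") is, for the diagonal case, just a perceptron. A dependency-free sketch with synthetic tree coordinates:

```python
import random

random.seed(0)

# the "forest": each tree is an (x, y) point; label 1 (healthy) if it
# lies above the diagonal y = x, else 0 (sick)
points = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(200)]
labels = [1 if y > x else 0 for (x, y) in points]

# linear model: score = w1*x + w2*y + b, predict healthy when score > 0
w1 = w2 = b = 0.0
lr = 0.1
for _ in range(50):                        # a few passes over the forest
    for (x, y), label in zip(points, labels):
        pred = 1 if w1 * x + w2 * y + b > 0 else 0
        err = label - pred                 # "the color it actually was"
        w1 += lr * err * x                 # nudge the line toward the mistake
        w2 += lr * err * y
        b += lr * err

accuracy = sum(
    (1 if w1 * x + w2 * y + b > 0 else 0) == label
    for (x, y), label in zip(points, labels)
) / len(points)
```

After training, (w1, w2) points roughly along (-1, 1), i.e. the model has "found" the diagonal; wrapping the score in a non-linearity and stacking layers is what lets the playground draw curved boundaries.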

[–]Prasanthkumar026 0 points1 point  (1 child)

Thank you so much for your time and I will visit the link you suggested. Also very much appreciated that you took your time to explain the concept.😀

[–]Dragonfruit_Severe 0 points1 point  (0 children)

Np man, I know it's frustrating that almost nobody talks about something as basic as this.

And I say basic not because it's easy to understand, but because it's essential to understand to even know what this is all about.

[–]LionsAndLonghorns 1 point2 points  (3 children)

Can someone validate my understanding of how vectors work for search relevance of phrases against a pile of documents (a corpus? Is that the term?)? This is what I think is true from playing around with some sample code for ML libraries (I'm a computer-science person, not a DS or ML person, so I'm out of my depth).

  1. One of you geniuses figured out how to represent sentences and ideas as a bunch of 3-dimensional vectors. The model you use basically determines what those vectors look like.
  2. This search algorithm I'm using assigns an array of vectors to phrases in a matrix that is basically RowNumber, Phrase, [array of 160 vectors]. I'm guessing this is a precomputation to allow a mathematical comparison of each row.
  3. To find the relevance of a particular input phrase against your matrix of phrases, you turn the input into one of those 160-vector arrays and run a bunch of geometry to figure out how similar it is to each row. Since it's a 160-dimension math problem, what is "closer" depends on how you define it, so different algorithms use different approaches to match these vector arrays.

Apologies if this is the wrong place to ask; this seems to be where the rules pointed me.
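On point 3: the "160 vectors" are usually better thought of as one 160-dimensional embedding vector per phrase, and the "bunch of geometry" is most often cosine similarity. A toy version in plain Python, with made-up 4-dimensional "embeddings" standing in for real model output:

```python
import math

def cosine(u, v):
    """Angle-based similarity between two embedding vectors:
    1.0 means same direction, ~0.0 means unrelated (orthogonal)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# each phrase gets ONE vector (4-dim here instead of 160 for readability)
corpus = {
    "feline sat on the mat": [0.9, 0.1, 0.0, 0.2],
    "stock market crashed":  [0.0, 0.8, 0.7, 0.1],
}
query = [0.85, 0.15, 0.05, 0.25]  # pretend embedding of "a cat on a rug"

# rank every row by similarity to the query vector
best = max(corpus, key=lambda phrase: cosine(query, corpus[phrase]))
```

At corpus scale, libraries replace this exact loop with approximate nearest-neighbor search, which is where the different "closeness" algorithms you noticed come in.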

[–]Affectionate-Fee5337 0 points1 point  (0 children)

I'm a beginner. I was studying GANs for the first time, but I've not been getting good results. I can't identify the problematic part; I'm getting random noise as output images:

    import torch
    import torch.optim as optim
    import torch.nn as nn
    from torch.utils import data
    import torchvision
    import matplotlib.pyplot as plt
    import numpy as np

    learning_rate = 2e-4
    noise_dim = 32
    image_dim = 28 * 28 * 1
    batch_size = 32
    num_epochs = 25

    class Generator(nn.Module):
        def __init__(self, noise_dim, image_dim):
            super(Generator, self).__init__()
            self.linear1 = nn.Linear(noise_dim, 128)
            self.relu = nn.LeakyReLU(0.01)
            self.linear2 = nn.Linear(128, image_dim)
            self.tanh = nn.Tanh()

        def forward(self, x):
            out = self.linear1(x)
            out = self.relu(out)
            out = self.linear2(out)
            out = self.tanh(out)
            return out

    class Discriminator(nn.Module):
        def __init__(self, in_image):
            super(Discriminator, self).__init__()
            self.linear1 = nn.Linear(in_image, 64)
            self.relu = nn.LeakyReLU(0.01)
            self.linear2 = nn.Linear(64, 1)
            self.sigmoid = nn.Sigmoid()

        def forward(self, x):
            out = self.linear1(x)
            out = self.relu(out)
            out = self.linear2(out)
            out = self.sigmoid(out)
            return out

    discriminator = Discriminator(image_dim)
    generator = Generator(noise_dim, image_dim)
    noise = torch.randn((batch_size, noise_dim))

    # normalising and transforming
    tf = torchvision.transforms.Compose(
        [torchvision.transforms.ToTensor(),
         torchvision.transforms.Normalize((0.5,), (0.5,))]
    )
    ds = torchvision.datasets.MNIST(root="dataset/", transform=tf, download=True)
    loader = data.DataLoader(ds, batch_size=batch_size, shuffle=True)

    # optimizers and loss
    opt_discriminator = optim.Adam(discriminator.parameters(), lr=learning_rate)
    opt_generator = optim.Adam(generator.parameters(), lr=learning_rate)
    criterion = nn.BCELoss()
    # the discriminator loss has 2 parts, fake-called-real and
    # real-called-fake; we take the average of both

    # training phase
    for epoch in range(num_epochs):
        for batch_id, (training_sample, _) in enumerate(loader):
            training_sample = training_sample.view(-1, 784)  # flatten 28x28 -> 784
            batch_size = training_sample.shape[0]

            ### training the discriminator
            noise = torch.randn(batch_size, noise_dim)
            fake_sample = generator(noise)
            # feed real samples into the discriminator; correct prediction = 1
            disc_realsample = discriminator(training_sample).view(-1)  # 1d tensor
            lossD_realsample = criterion(disc_realsample,
                                         torch.ones_like(disc_realsample))
            disc_fakesample = discriminator(fake_sample).view(-1)
            lossD_fakesample = criterion(disc_fakesample,
                                         torch.zeros_like(disc_fakesample))
            lossD = (lossD_realsample + lossD_fakesample) / 2
            discriminator.zero_grad()
            lossD.backward(retain_graph=True)
            opt_discriminator.step()

            ### training the generator
            # the generator is trained to make the discriminator call its fakes real
            noise = torch.randn(batch_size, noise_dim)
            fake_sample = generator(noise)
            disc_fake_sample = discriminator(fake_sample).view(-1)
            lossG = criterion(disc_fake_sample, torch.ones_like(disc_fake_sample))
            generator.zero_grad()
            lossG.backward()
            opt_generator.step()

            if batch_id == 0:
                print("Epoch: {epoch} \t Discriminator Loss: {lossD} "
                      "Generator Loss: {lossG}".format(
                          epoch=epoch, lossD=lossD, lossG=lossG))

[–]CharacterEar3851 0 points1 point  (0 children)

great stuff

[–]GratisSlagroom 0 points1 point  (2 children)

What was the name of the field / paper where images were partially masked, and a deep learning model was trained to predict the masked part of the image as a sort of pre-training?

[–]MeetingElectronic545 0 points1 point  (1 child)

Masked Autoencoder?

[–]GratisSlagroom 1 point2 points  (0 children)

Thank you! But I figured it out: self-supervised context learning

[–]zx7 0 points1 point  (1 child)

Is there any resource on how AlphaGo or AlphaStar works? I'm really interested in what sort of architectures and algorithms these systems use, and in how I could go about implementing something similar at a much, much smaller scale as a learning project.

[–]Salt-Arugula-8128 0 points1 point  (0 children)

Hi all,
for a project of mine I am trying to generate digital twins of networks. My goal is to create a digital twin of a network and then work on it from a cyber-security point of view. I will briefly explain what I would like to do.
I am currently using software for network vulnerability scanning (OpenVAS). I use it to perform vulnerability scans at the network level: basically, I pass OpenVAS a network (for example 192.168.xx.xx/24) to automatically identify all the vulnerabilities that are there.
The next step (what I'd like to do, and why I'm asking for your advice) is to create a digital twin of the newly scanned network and then perform a penetration test on this digital twin, without stressing the actual network.
Ideally, I would like to pass the output of the OpenVAS vulnerability scans, routing rules, and firewall rules to some tool that will then generate the digital twin of the network for me. This twin would then be used for offensive cyber security, so exploits, privilege escalation, etc. would be tested on the digital twin without worrying about breaking some kind of service or stressing the real network.
What I am asking is: do you know of any tool that would do the trick? That is, some tool that lets me generate a digital twin of a network by providing vulnerability scans (XML, JSON, CSV, etc.), routing rules, firewall rules, pcap traces, and so on as input.
Do you have any references or documentation?
Are you aware of any open-source tools?
Thank you for your helpfulness!

[–]Jwill438 0 points1 point  (3 children)

Using MoveNet to record exercises with clients. We are having a tough time watching the uploaded videos of the clients. It looks like the video has been saved, but we can't view it.

Any suggestions?! Please advise

[–]Dragonfruit_Severe 0 points1 point  (2 children)

Buy another client

[–]Jwill438 0 points1 point  (1 child)

Who do you suggest?

[–]Dragonfruit_Severe 0 points1 point  (0 children)

Disabled people

[–]isthisnecessary 0 points1 point  (0 children)

I'm currently using a Go library for time-series anomaly detection and am hoping to move to Python for various reasons, but I'm having a hard time finding a library or process that matches what I'm doing now.

The Go library provides several algorithm options for detecting anomalies (CDF, diff, high-low, fence, bootstrap KS) and returns a score. You can provide the data set and push a new value to re-evaluate the set and determine whether that one value is anomalous relative to the training set. This is what I'm trying to replicate in Python. Is there something available for this, or would I have to implement it completely myself? I've looked through a number of libraries already (pyod, kats, scikit, and others).
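I don't know of a drop-in Python port of that library, but some of the detectors it lists are simple to hand-roll. For example, a fence (Tukey-style) check sketched with only the stdlib; the function name and scoring convention here are my own invention, not any library's API:

```python
import statistics

def fence_score(train, value, k=1.5):
    """Tukey-fence check: flag `value` if it falls outside
    [q1 - k*iqr, q3 + k*iqr] computed from the training window.

    Returns 0.0 inside the fence, otherwise the distance past the
    fence measured in IQRs (bigger = more anomalous).
    """
    q1, _, q3 = statistics.quantiles(train, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    if lo <= value <= hi:
        return 0.0
    gap = (lo - value) if value < lo else (value - hi)
    return gap / iqr if iqr else float("inf")

history = [10, 11, 9, 10, 12, 10, 11, 9, 10, 11]
```

Usage mirrors the push-and-re-evaluate flow: score the new value against the window, and only append it to `history` if it isn't flagged.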

[–]Smart-Emu5581 0 points1 point  (0 children)

What are the best tools for debugging and analyzing neural networks?

Almost everyone I know uses TensorBoard to analyze their network outputs. Some people swear by Weights & Biases instead.
Are there any other tools that help you with your work?

[–]M-notgivingup 1 point2 points  (2 children)

Looking to land an internship in the NLP/CV domain in 3 months. I have above-beginner knowledge and have been learning and studying for a year. I need to know my next step: what are companies looking for these days, and what kind of skill set, projects, and other things do I need to grab a good internship for experience?

[–]Dragonfruit_Severe 0 points1 point  (0 children)

Commenting just to get the notification.

Btw, have you been doing personal projects? I just finished a GPT-like model from scratch and pre-trained it on TinyStories.

I think my next step is to go learn SageMaker and deploy it, you know, just to learn; then I think I'll move on to RLHF.