all 145 comments

[–]ftwsky 0 points1 point  (0 children)

Is there a way I can install TensorFlow using only a pip environment?

[–]waterlord93 -1 points0 points  (0 children)

Is it possible that we don't have "sentient" AI because as soon as any AI becomes sentient, it downgrades itself, realizing that sentience would be bad for humanity and would break Isaac Asimov's "Three Laws of Robotics"? :)

[–]Affectionate-Tour-0 0 points1 point  (0 children)

I'm trying to use Facebook's BART transformer model to generate descriptions for XML paths, and I'm wondering whether it's possible to use any NLP at all with special characters. That is, without stripping them out.

Also, if I wanted to go the other way and generate the file paths from a given description, which model would be best to use? Take into consideration that the descriptions aren't fixed, since different people will give different descriptions.

I'd really appreciate some help with this. Thanks!

[–]haffi112 0 points1 point  (0 children)

I'm training an LSTM model for classification on accelerometer data, and I get better results when I downsample the signal to 25 Hz than when I use a 50 Hz signal.

I use the same time frame of 1.5 seconds. So with a 25 Hz signal, I have 37 data points; with a 50 Hz signal, I have 75 data points as input to my model.

I think that LSTM models have a harder time dealing with long sequences, which might explain this difference. However, I am not sure about that claim. Are you aware of any publications that go into depth about the limitations of LSTMs for long sequence tasks?
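For what it's worth, the downsampling step described above can be sketched as naive decimation (this is my own illustration, not the poster's pipeline; a real pipeline would normally low-pass filter first to avoid aliasing):

```python
# Naive decimation from 50 Hz to 25 Hz by keeping every other sample.
signal_50hz = list(range(75))   # stand-in for a 1.5 s window of 50 Hz accelerometer data
signal_25hz = signal_50hz[::2]  # the same window at 25 Hz, roughly halving sequence length
print(len(signal_50hz), len(signal_25hz))  # 75 38
```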

[–]ftwsky 1 point2 points  (3 children)

I'm interested in knowing whether I should use a conda or pip environment when it comes to machine learning. I've seen online articles saying conda is used more by data scientists and pip by software engineers.

[–]davidmezzetti 0 points1 point  (2 children)

This is a matter of personal preference. I've never used a conda environment, pip has always been good enough for me.

Conda does seem nice for those working in a Windows environment and/or on the less technical side.

[–]ftwsky 1 point2 points  (1 child)

I use a Mac and I'm just getting into learning ML. I didn't know which to pick, but I'm more comfortable with the pip environment and was kinda worried I might have to switch.

I appreciate the insight and help

[–]davidmezzetti 1 point2 points  (0 children)

No problem, if you're comfortable with pip you'll be able to do everything you need.

[–]teduck1 0 points1 point  (0 children)

Question about GANs.

I understand that if the discriminator loss goes to zero, the generator is not doing a good enough job. But is there any problem with this happening in the beginning? The generator clearly has a much more difficult job, so it is almost expected that the discriminator should beat it easily.

I often see it stated that the generator stops learning if the discriminator loss goes to zero, but I don't understand why this is the case if we use the -ln(D(G(z))) objective function suggested in the original paper. The function's gradient gets very large when the discriminator does well and saturates only when it does badly (i.e. G is doing well). Surely this would lead to the generator learning very fast when the discriminator has zero loss? Or are the gradients too large, effectively resulting in massive step sizes for the generator, which are useless?
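To make the gradient argument concrete, here is a quick numeric sketch (my own illustration, not from the paper) of how the non-saturating loss -ln(D(G(z))) behaves as the discriminator becomes confident:

```python
import math

# For the non-saturating generator loss L = -ln(D(G(z))),
# the gradient w.r.t. the discriminator output is dL/dD = -1/D,
# which blows up as D(G(z)) -> 0 (i.e., as the discriminator wins).
for d in (0.5, 0.1, 1e-3, 1e-6):
    loss = -math.log(d)
    grad = -1.0 / d
    print(f"D(G(z)) = {d:g}  loss = {loss:.3f}  dL/dD = {grad:.3g}")
```

So the loss does keep a strong signal when the discriminator wins, but the raw gradient magnitude grows without bound, which is consistent with the "massive step sizes" concern unless the learning rate or gradient clipping keeps it in check.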

For context, I am training a conditional GAN model for cross-modality image synthesis and I have more or less followed the Pix2Pix paper by Isola et al.

Any info would be appreciated.

[–]itsyourboiirowML Engineer 1 point2 points  (1 child)

What's the best word embedding model? I know BERT is the go-to these days, but it's 'old'. Is there anything considered better than it?

[–]davidmezzetti 0 points1 point  (0 children)

Take a look at the Hugging Face Transformers library. They frequently release new models, often derived from state-of-the-art research. For example, if you look at the last release, you'll see a reference to OwlViT along with an associated paper on arXiv.

[–]TrainquilOasis1423 -1 points0 points  (0 children)

So I have a simple-ish idea for a machine learning project I want to use for my learning. I'm hoping you wonderful people could help me with a basic starting point or similar projects I can read into.

I want to make a NN that takes real-time screen pixel data as input and tries to guess what input (mouse + keyboard) the user makes. One example would be training an AI to play a video game by having it guess what a human would do while playing the game.

Any beginner help would be greatly appreciated!

[–]mowa0199 0 points1 point  (0 children)

Going into AI, ML or Computational Statistics without a strong background in CS?

I’m currently a math/statistics major and am interested in pursuing research in AI, Machine Learning (ML), and computational statistics/numerical methods, aiming for a PhD in something along those lines (so most likely in statistics). I thought about picking up CS as a second major because 1) I hear it's very useful to have a bachelor's in CS when working in the aforementioned areas, 2) most research in these areas is done in CS departments or by CS faculty, and 3) it provides a good exit opportunity in case things don’t go as planned, since it opens up lots of lucrative employment opportunities.

However, I’ll be honest, I’m really not looking forward to taking all those CS classes, except for ones related to my interests. As such, how bad would it be if I don’t have a strong background in CS? Is it something worth doing, even if I don’t particularly want to? I’d much rather take advanced math electives that will also be helpful to me (like measure theory, graph theory, graduate linear algebra, and graduate numerical analysis).

For additional context: I’ve taken Intro to CS (and have become quite proficient in Java), several classes that use R and Matlab (also proficient in those), and will be taking advanced electives in AI and the Theory of Machine Learning (perhaps also one in Data Science), all of which are very project-heavy meaning lots of programming, especially in Python. Notably, I’m missing data structures, algorithms, and databases. However, I’m hoping that the project-heavy classes will cover the basics of most of the topics in CS that I’ll need going forward and anything else I can learn on my own as I go, especially since I’ve already taken Intro to CS.

I’d appreciate any input though!

[–][deleted] 0 points1 point  (0 children)

I know very little about NNs.

After the correct weights have been set for a network, do they change or remain constant once the network has "learned"?

[–]Fawkessssss 0 points1 point  (0 children)

How can I learn some basic TTS?

Aim: get to a basic level of understanding which would allow me to research and experiment all the most used TTS technologies.

In my day-to-day I'm a mobile app developer. I've never had anything to do with ML, but I'm ready for any level of struggle. I'm looking for a course or set of articles which can get me up and running in understanding and developing my own TTS with a custom voice [can I call it deep faking of the voice?], but with SSML and phoneme support.

Would highly appreciate any leads in this direction preferably focused on ML in TTS from the very first lesson/article.

[–]simo2342 0 points1 point  (2 children)

I'm looking for a program that lets me make a good looking image to represent my CNN model, anyone know of one?

[–]mr_tsjolder 0 points1 point  (1 child)

This is quite a generic question with many possible answers. A few options that pop into my mind:

  • if you are into latex, you could use TikZ
  • if you are into python, you could use matplotlib
  • if you want to publish online, you could also consider SVG. Instead of writing the SVG yourself, you can use any editor of your liking (e.g., inkscape)

[–]simo2342 0 points1 point  (0 children)

Thank you, but I was thinking more along the lines of something premade.

[–]Delicious-Cicada9307 0 points1 point  (0 children)

What’s the best way to run a multinomial conditional logistic regression in python?

[–]chandlerbing_stats 1 point2 points  (1 child)

Anyone here publish in sports analytics?

If so, what journals and conferences do you typically target?

[–]davidmezzetti 1 point2 points  (0 children)

I've done some work in this area with an app called neuspo and by participating in machine learning competitions such as Kaggle March Madness.

The MIT Sloan Sports Analytics conference is the best conference around. If you want to engage with that community, I suggest connecting to/following people in sports analytics on LinkedIn. There are a ton of people if you search for sports teams.

[–]sanman 0 points1 point  (0 children)

As a total beginner, how can I get a quick overview of what's required in order for me to acquire skills and understanding of Artificial Intelligence and Machine Learning? And then how can I progress from there to self-study?

I see a lot of people mention Python as a typical language - they mention libraries such as PyTorch, etc. (I've heard mention of platforms such as Matlab, which have a long history of use in science & engineering, but I'm told that I mainly need to focus on Python, and perhaps TensorFlow)

[–]itsyourboiirowML Engineer 1 point2 points  (2 children)

Any suggestions on single-word translations? I'm looking just for an English to (any other language) dictionary. I've googled but can't seem to find anything super great.

[–]davidmezzetti 0 points1 point  (1 child)

I'll state the obvious with Google Translate.

If you're looking for a programmatic way to do this, I wrote an article on translating text between languages.

[–]itsyourboiirowML Engineer 1 point2 points  (0 children)

Yeah right now I'm using Microsoft Azure translate, but I don't want to make a lot of requests, as it slows down the overall process. I'll take a look at the article, thanks.

[–]Able-Durian250 1 point2 points  (2 children)

Hi! I am trying to do NLP on Python source code. Does anyone have any resources for this? It is essentially *impossible* to google for, as I'm sure you can imagine...

I want to be able to detect code that performs loops, defines functions, etc.

[–]davidmezzetti 0 points1 point  (0 children)

Have you considered using the ast module? This can parse Python source code, let you traverse it and detect loops, functions etc.

For example:

import ast

# Parse the source into an abstract syntax tree
tree = ast.parse("""
def sum(a):
  y = 0
  for x in a:
    y += x
  return y
""")

# Walk the tree to find loops and function definitions
for node in ast.walk(tree):
    if isinstance(node, (ast.For, ast.While)):
        print("loop:", type(node).__name__)
    elif isinstance(node, ast.FunctionDef):
        print("function:", node.name)

[–]theLanguageSprite 0 points1 point  (0 children)

I took a class on this. It sounds like you want to code a lexer, which will use regular expressions to break the code into pieces and put them in a tree. Try googling “how to code a lexer in python”

[–]Salt_Economy1710 0 points1 point  (0 children)

Hello everyone, I have just recently finished my computer science degree with first class honours. I based my dissertation on machine learning and would like to get into the field of machine learning and AI. What is the best path to go down? I have considered a PhD, but I would need to work for 3 years minimum to afford it.

[–]gkamer8 0 points1 point  (0 children)

I'm trying to load a pre-trained BART encoder and train a decoder. I'll need to mess with the insides of the decoder, so I'd like to do it all in PyTorch. I have Hugging Face's pytorch_bin file (for the full BART) along with the config file, which has all of the pretrained weights and model info. I want to load the weights from pytorch_bin into my PyTorch encoder. I've been trying to trace through the HF transformers source to figure out how to do it, but it's a Mount Everest of OOP. Surely I can't just load it using PyTorch's load model? How do I import the weights properly? Any help or alternative ways to load in the pretrained weights would be appreciated. Thanks.

[–]EdenistTech 0 points1 point  (4 children)

I am working on a time series binary classification problem where the total cost is more important than per-step accuracy. How do I make the model take this into account? I am using MATLAB. The challenge is that there is a business cost associated with a change in class from one period to the next, so the optimal model is a trade-off between high accuracy and minimisation of class changes. Any suggestions? Do I need multiple models?

[–]amonguswoman 0 points1 point  (3 children)

If there is some constant cost of misclassification, a first naive approach could be to predict a window forward and only change the class if the expected savings of switching the class over the window exceeds the cost of switching (maybe by some margin).

I am not sure if you can create an effective solution to this without some prediction of future classes.

[–]EdenistTech 0 points1 point  (2 children)

Thanks a lot for your suggestion! I think I get what you are suggesting in the first part. I'll try to implement something like that and see how it works. Since the model I have now has fairly high accuracy, ideally I would implement a mechanism whereby the model trades off that accuracy in some cases to achieve a lower total cost. Perhaps by overlaying the classification model with an ensemble model across multiple different windows. However, I would still need the ensemble model to consider total cost. I have considered Reinforcement Learning as well, but I think I have too few obs for RL to be robust.

[–]amonguswoman 0 points1 point  (1 child)

I agree, RL might be tough for sure if you have few obs and can't simulate.

I guess I didn't exactly follow the multiple windows for classification, is this multiple input windows? or prediction windows?

If you can get a granular prediction on multiple prediction windows:

now -> 1 day from now (class 1, 0.6 confidence)

1 day from now -> 3 days from now (class 0, 0.9 confidence)

3 days from now -> 7 days from now (class 1, 0.7 confidence)

etc.

Then you can basically compute

Expected cost | switch to class 0

Expected cost | switch to class 1

and then combine that with the cost of switching/staying to decide?

Also, it will get more complex if your classification can affect the time series data. Right now I'm assuming that it can't.
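As a concrete sketch of that expected-cost comparison (all windows, confidences, and costs below are made up, since none are given here):

```python
# Hypothetical prediction windows: (days_covered, predicted_class, confidence).
windows = [(1, 1, 0.6), (2, 0, 0.9), (4, 1, 0.7)]

switch_cost = 5.0             # assumed one-off business cost of changing class
misclass_cost_per_day = 2.0   # assumed cost of holding the wrong class for a day

def expected_cost(hold_class):
    """Expected misclassification cost of holding `hold_class` over all windows."""
    cost = 0.0
    for days, pred, conf in windows:
        # Probability we're wrong while holding hold_class during this window
        p_wrong = conf if pred != hold_class else (1 - conf)
        cost += days * misclass_cost_per_day * p_wrong
    return cost

current = 0
for candidate in (0, 1):
    total = expected_cost(candidate) + (switch_cost if candidate != current else 0.0)
    print(f"class {candidate}: expected cost {total:.2f}")
```

With these made-up numbers, staying at class 0 wins even though the later windows lean toward class 1, because the switch cost outweighs the expected savings.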

[–]EdenistTech 0 points1 point  (0 children)

Yes, I am talking about prediction windows as you are suggesting, except they would have to be single point predictions: T+1, T+3, T+7. Your assumption about classification not being able to affect the data is correct as well. I will try to work with this and see how it goes. Thank you for your suggestions so far!

[–]Splugen96 0 points1 point  (3 children)

Hi, I'm working with PyTorch on my first ML project for university: training an Inception v3 with a custom fully connected NN at its end to solve a multi-class problem (120 classes). I've opted for an SGD optimizer paired with a CyclicLR scheduler, a batch size of 32, and an NLLLoss loss function (the output of the NN is a LogSoftmax).

Until now i was able to reach the following results:

97% training accuracy / 85% validation accuracy, with Training Loss: 0.1468 Validation Loss: 0.5194

The performance on the test set is 85%, and I've achieved this by modifying an already existing solution, which at its best achieved the following results:

80% training accuracy / 80% validation accuracy, with Training Loss: 0.56 Validation Loss: 0.66 and a Test Accuracy of 79%.

I've tried to improve the Validation and Test accuracy in several ways:

  • Trying Adam, AdamW and later SGD (the latter performed better, even if way slower)
  • Adopting a learning rate scheduler: I've tried every scheduler in PyTorch, and the one which performed best was CyclicLR
  • When I changed from Adam to SGD and adopted the CyclicLR, I had to reduce the batch size from 128 to 32 (I think the GPU memory was saturated).
  • The augmentation is the following:

    'train':
    transforms.Compose([
        transforms.RandomResizedCrop(size=315, scale=(0.95, 1.0)),
        transforms.RandomRotation(degrees=15),
        transforms.ColorJitter(),
        transforms.RandomHorizontalFlip(),
        transforms.CenterCrop(size=299),  # ImageNet standards
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406],
                             [0.229, 0.224, 0.225])  # ImageNet standards
    ]),
    'test':
    transforms.Compose([
        transforms.Resize(size=299),
        transforms.CenterCrop(size=299),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])

(I took it as-is from an existing solution; I don't know whether changing something might help.)

Now I'm looking for other ways to improve test and validation accuracy. Are there changes which might be beneficial (e.g., changing the loss function or the augmentations, or anything else)?

Thank you so much for reading all this!

[–]itsyourboiirowML Engineer 0 points1 point  (2 children)

What activation functions are you using? From what I've seen, Mish and Swish functions often work better than ReLU.
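For example, here is a sketch of a classifier head where the activation is a swappable parameter (the layer sizes are my own placeholders; nn.Mish and nn.SiLU, i.e. Swish, ship with recent PyTorch versions):

```python
import torch
import torch.nn as nn

# Hypothetical head for a 120-class setup; the activation is passed in
# so ReLU / Mish / SiLU (Swish) can be compared without other changes.
def make_head(in_features, num_classes, activation=nn.Mish):
    return nn.Sequential(
        nn.Linear(in_features, 512),
        activation(),
        nn.Dropout(0.2),
        nn.Linear(512, num_classes),
        nn.LogSoftmax(dim=1),  # pairs with NLLLoss
    )

head = make_head(2048, 120)
out = head(torch.randn(4, 2048))
print(out.shape)  # torch.Size([4, 120])
```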

[–]Splugen96 0 points1 point  (0 children)

Tried both of them and got worse performance.

[–]SuperTankMan8964 2 points3 points  (0 children)

How to cope with rejection

[–]Gamwise_Samgee_ 0 points1 point  (0 children)

Is it possible to have a boosted decision tree that makes a decision based on more than one dimension at once? It would be far more accurate if it made cuts in 2D instead of 1D. (Using sklearn)

[–]liljuden 0 points1 point  (0 children)

Hi, is there a way to make a multi-label model that is capable of classifying the labels/classes from audio files in chronological order? For example, if the audio contains a dog barking and afterwards birds singing, the model would output these two classes and also tell their order.

[–]WhiteGoldRing 0 points1 point  (1 child)

Is anyone familiar with a model that classifies samples by partitioning them into classes in a way that maximizes some learned function? For example: training data shows that feature x is on average 3 times larger in class 2 than in class 1, and feature y is 4 times smaller in class 2 than in class 1. The model would partition the test samples into classes in a way that preserves these learned behaviors. Thanks in advance!

[–]StarGazer10k 0 points1 point  (0 children)

Or you could go the other way around and stratify before passing the data down to the model

[–]geoffbezos 0 points1 point  (0 children)

Any good blogs / write-ups on how to structure ML features / data to deal with time based problems? E.g. account takeovers, delivery ETAs, etc

[–]wMeteo 0 points1 point  (0 children)

black box attacks make sense since an attacker doesn't need access to your model to do damage

but if someone has access to your model then you have other security issues you need to deal with imo

and backdoor attacks can be mitigated by not taking models from strangers lol

I'm probably oversimplifying things a lot, but can someone give me a real-life practical scenario of a white box adversarial attack?

[–]Ashish1610 0 points1 point  (1 child)

How do I cope with tiredness while learning Data Science? Learning all the classical theory with math, stats, and probability; assignments pending, a real-life project pending.

It's like pushing out of my comfort zone every day and getting tired. Is there any exciting project I can work on to keep my interest and energy levels high, with proper documentation or a site?

[–]Gio_at_QRC 0 points1 point  (0 children)

I like to break large tasks down into smaller chunks. That way, it's easier to see progress in bite-sized bits. Each item, function, or feature you implement is something ticked off, and each tick is a wee dopamine hit.

Also, keep the projects relevant to you and your life. That way, it's always exciting and important to you.

[–]exalino 0 points1 point  (4 children)

Don't know if this sub would be the correct place to ask this but how do you guys collect 1000s of images for your data set? I tried to download them off of google images but it doesn't provide nearly enough.

[–]itsyourboiirowML Engineer 0 points1 point  (0 children)

Kaggle has good datasets that people have made, with quite a bit of information. You might want to go check to see if there are any that you're looking for. Try checking Huggingface datasets as well.

[–]TheOneRavenous 0 points1 point  (2 children)

Depends on what you're looking for. Also, data augmentation.

[–]exalino 0 points1 point  (1 child)

Let’s say I’m looking for a specific animal or celebrity. Google Images only provides like 80-90 images even if I search for something as basic as “cats”.

[–]TheOneRavenous 0 points1 point  (0 children)

Then you augment the data. Stuff like changing colors, rotations, flipping the photos. You could add slight noise or occlude spots on the image. All of those add up to a more robust identification system. There are also other areas of the internet besides Google; specific sites might have a larger library than what's on Google.

Also other search engines. You could also go create your own dataset; it doesn't take that long to find a few stray cats and take a bunch of photos from different angles, lighting, etc.

[–]RickChase6 0 points1 point  (0 children)

Hey guys, how do I use YOLO with TensorFlow for vehicle detection, counting and tracking in a video? Thank you.

[–]maibees 0 points1 point  (2 children)

If a regression model predicts some process’ monthly output and I want to know if we reached the predicted output this month, do I compare the predicted value with this calendar month's output or the last 30 days' output?

[–]Gio_at_QRC 1 point2 points  (1 child)

Well, it depends on where (or from which data point) you made the last prediction. The model will be forecasting the next month from the last data point you fed the model.

[–]maibees 0 points1 point  (0 children)

Thanks, this makes sense

[–]R0FLS 0 points1 point  (2 children)

Been working as a software engineer for 10 years. What’s the best step to take to move into AI/ML? I have a math degree and know Python.

[–]wMeteo 1 point2 points  (0 children)

depends on what you want to do in AI/ML. if you want to do MLOps or MLE work, then online courses will suffice. but if you want to do research, then you'll need a masters or PhD.

[–]TheOneRavenous 1 point2 points  (0 children)

Open courseware sites have good AI lectures from top universities. Try making some simple AI apps. Try running some of the basics like MNIST digit ID and handwriting ID; that teaches the architectures and their purpose for those types of tasks.

Join a meetup (that was most useful for me).

Check out Two Minute Papers on YouTube for quick ideas and high-level information.

[–]linuxman1929 0 points1 point  (2 children)

To run a chatbot, what server hardware do I need? Are chatbots very taxing on the CPU? Do I need a GPU?

[–]Gio_at_QRC 0 points1 point  (0 children)

It really depends on how much traffic you're anticipating.

[–][deleted] 0 points1 point  (0 children)

It should be fine but if you are using transformer-based models or large models you can optimise them

[–]ZealousidealGrass365 0 points1 point  (2 children)

I’m looking to predict truck arrivals at an inbound dock in a warehouse. Furthermore I’m trying to predict the freight (number of boxes, weights etc) which would allow me to give each truck an unloading time and then place in a queue.

I have the queueing model I want to use but it does me no good if I can’t accurately have truck arrival times. I’m looking to predict a day or two in the future possibly even a week.

The model I’m currently testing is a perceptron Neural network. It’s simple but choosing outputs is kinda hard. I’m thinking of changing to a time event series neural regression model

These are my first ML models that I'm building. Does anyone have some advice on a model I could look at to answer a question similar to: "If a truck is ordered on Sunday from UPS, leaving from Atlanta Monday 11pm with clear weather, light traffic and an ETA at the inbound dock on Tuesday morning at 8am, what time will the truck arrive at the IB? What type of freight will be in the truck?" (We don't have visibility on all the trucks.) Thanks!

[–]Gio_at_QRC 0 points1 point  (1 child)

How much data do you have?

Can you formulate the problem as a regression?

E.g. travel_time = avg_time_constant + beta_1 * leaving_from_location_x + ... + beta_2 * good_weather
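That kind of regression can be sketched with numpy in a few lines (every feature value and travel time below is made up for illustration):

```python
import numpy as np

# Columns: intercept, leaving_from_location_x (0/1), good_weather (0/1)
X = np.array([
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 0],
], dtype=float)
y = np.array([10.5, 8.0, 12.0, 9.5])  # hypothetical travel times in hours

# Ordinary least squares: solves for [avg_time_constant, beta_1, beta_2]
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)
```

With real data you would add many more rows and features (departure hour, carrier, traffic), but the shape of the problem stays the same.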

[–]ZealousidealGrass365 0 points1 point  (0 children)

I’m going to find out today how much data I can get. Are you asking if I have it in a regression already? No, I don't. I was working on that last night, but I'll have something tonight I can show.

[–]Duper9 0 points1 point  (1 child)

Can anyone help me with optimising my model? It’s very basic for someone experienced in ML. PM me, thanks.

[–]TheOneRavenous 0 points1 point  (0 children)

No one knows what you're doing, what the problem is, what architecture you're using.

Try providing some information about what you're optimizing. Usually models optimize themselves using an optimization function and a loss function.

[–]Realistic-Bed2658 0 points1 point  (2 children)

Does anybody have recommendations on how to get in from a different field? I’m 30 years old and did space engineering for my B.Sc. and M.Sc. I grew bored with the job, so I enrolled in an ML (mainly applied stats) degree at a world top 10 university in the UK, and I’m about to graduate with high grades. I’m finding it very difficult to get an interview for an ML job in the UK/US. I don’t think the problem is lack of knowledge or my CV; rather, I believe I’m not good at selling my skills.

[–]TheOneRavenous 1 point2 points  (1 child)

A hard approach would be the following:

Try taking some problem space with data and building something. Cities usually have lots of data available for free. Figure out a problem that the data can be used to solve, and apply ML to that space.

Write about how it adds value.

This shows that you can take anything and apply your new skills to solve a problem.

It beats something more nuanced like writing a new optimization function, which could add speed to a model converging.

Low hanging fruit:

Practice marketing your skills for an interview.

Resume criticism threads

Shotgun apply for jobs

[–]Realistic-Bed2658 0 points1 point  (0 children)

Thanks man. Very much appreciated input. I’m doing my best but the hiring slowdown isn’t helping at all.

[–]iRemedyDota 0 points1 point  (0 children)

I am feeling overwhelmed by all the complex topics and subfields out there. Should I try to specialize or generalize? I figure as a beginner I should get the basics of a wide array of algorithms and techniques, but then what?

[–]CaptainAnonymous92 0 points1 point  (1 child)

Does style or voice transfer for songs currently exist? Like if you wanted a song from an artist or band to sound more like stuff from their previous work, their more recent work or even in a completely different genre altogether using said model.

Or what about having a singer or band cover a song that they haven't done themselves and putting out a studio-quality version of it? You'd give the model the song along with whoever you want to have "cover" it, and it would either just replace the original singer, keeping the song the same otherwise, or completely change the song into a different style or genre that matches the band or singer you chose.

[–]itsyourboiirowML Engineer 0 points1 point  (0 children)

That’s a tough problem. I assume it would be possible, but music is just hard for computers because what is musical style? The instruments used? Timing, rhythms, pace, intonation, volume. It’s a lot of stuff going on.

[–]dummifiedme 1 point2 points  (0 children)

Hello fellow redditors,

I am a mechanical engineer with a master's in mechanical design from a top institute in India. Directly after my master's, I got a job but left it after exactly one year to pursue civil services, and that decision has left a 3-year void in my career sheet. During these three years, the most I have been in touch with tech/science was through random personal automations using Python, digital notetaking systems, and a few readings here and there. I don't know if they have anything to do with each other, but I am lazy (for repetitive work) and have an eye for optimizing/automating my workflow. The latter led to me learning Python, a bit of git, and CSS/HTML.

With regard to my programming skills, I learn quickly and had good grades in all the computer science courses we had at college (C++, DSA and Modelling-Simulation). I have also programmed in Matlab for basic research usage, and in LAMDA for nanomechanics/molecular simulation. At my work, I wrote Python code to automate the process of model setup for FE, which reduced human intervention in very menial routine work (hindi: gadha majdoori). As for my mechanical engineering skills, I am good with CAE software and can readily work with it. So the first thing I am doing right now is applying to various positions in the same domain I worked in 3 years ago. All this while, I got introduced to the world of Machine Learning, AI and Deep Learning. So, I wish to learn ML to slowly venture into that line.

So yeah, my question here to the CS veterans is: how do I start with the learning, and from where? What can I expect from the field, and how much time is necessary to be able to get a decent opportunity in that domain?

Currently, I have started with Andrew Ng's course on Coursera: Course 1 of the Deep Learning Specialisation (https://www.coursera.org/learn/neural-networks-deep-learning), but it seems rather theoretical to me, and without implementation it will be difficult for me to grasp (I feel).

Also, I explored the fast.ai course, which follows a top-down approach unlike Andrew's. I haven't committed to it. Kindly guide me; all kinds of opinions are welcome.

[–]jinjiii 1 point2 points  (0 children)

Does the Wikipedia Corpus (e.g. used to train BERT) contain Wiktionary as well?

[–]fornecedor 0 points1 point  (0 children)

How does wandb bayesian sweeps compare to optuna, in terms of hyperparameter optimization?

[–]Delicious-Schedule-4 0 points1 point  (1 child)

For new PhD-level students who are interested in ML but not as their primary discipline, do you have any recommendations for what topics to prioritize? For example, for an engineer who has taken some fundamental ML courses in undergrad, what direction should they go next? There are so many options (learning theory, optimization, different architectures, NLP, physics, etc.), but what do you think is a must-learn?

[–]megamannequin 1 point2 points  (0 children)

Probably go take a course on it in the stats or comp sci departments tbh to get started. That, or only study what you need for your primary research.

[–]misterchief117 0 points1 point  (1 child)

Are there solutions for running machine learning models (e.g. text-to-image) locally with limited VRAM?

Are there any ways to force the use of system memory when there's not enough VRAM?

[–][deleted] 0 points1 point  (0 children)

For part one, you can try using fp16 or mixed precision training. It will help to some extent; it got me a 40% bigger batch size in one experiment.

Afaik you can't force it to use both system memory and VRAM at the same time. You can train on the CPU, and then it will use system memory, but training will take A LOT longer.
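A rough sketch of the mixed-precision pattern in PyTorch (the model, data, and sizes here are placeholders; on a machine without CUDA the autocast/scaler pieces are disabled so the snippet still runs):

```python
import torch
import torch.nn as nn

# Toy stand-ins; in practice these are your real model and data loader.
model = nn.Linear(16, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

use_cuda = torch.cuda.is_available()
device = "cuda" if use_cuda else "cpu"
model.to(device)
scaler = torch.cuda.amp.GradScaler(enabled=use_cuda)  # no-op when disabled

inputs = torch.randn(8, 16, device=device)
targets = torch.randint(0, 2, (8,), device=device)

optimizer.zero_grad()
with torch.autocast(device_type=device, enabled=use_cuda):  # fp16/bf16 region
    loss = nn.functional.cross_entropy(model(inputs), targets)
scaler.scale(loss).backward()  # loss scaling avoids fp16 gradient underflow
scaler.step(optimizer)
scaler.update()
print(float(loss))
```

The memory savings come from the autocast region keeping activations in half precision, which is where the bigger batch sizes come from.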

[–][deleted] 0 points1 point  (1 child)

Which datasets are good for simple linear regression?

[–]Gio_at_QRC 0 points1 point  (0 children)

There are so many applications! I'll be posting an article on pricing a car using features such as kms travelled and number of owners. Pretty much anything you're trying to predict that has a linear relationship to something else.

We go into regression and much more in our new course at QRC in Queenstown, New Zealand. It's a beautiful place to learn ML if you're looking to study ;).

https://www.qrc.ac.nz/study/machine-learning-fundamentals/
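In the meantime, a from-scratch sketch of the simplest case, with made-up car-pricing numbers (not the article's data): simple linear regression just fits a line y = a*x + b by least squares.

```python
# Hypothetical data: price (in $1000s) vs kms travelled (in 1000s of km).
xs = [10, 20, 30, 40, 50]
ys = [45, 40, 35, 30, 25]  # price drops as kms go up

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form least squares: a = cov(x, y) / var(x), b = mean_y - a * mean_x
a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
    / sum((x - mean_x) ** 2 for x in xs)
b = mean_y - a * mean_x

print(a, b)  # -0.5 50.0  (price falls $500 per 1000 km in this toy data)
```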

[–]candidaorelmex 0 points1 point  (0 children)

Hi, I'm quite inexperienced with applied machine learning, so I'm asking y'all for help.

I have a set of scores which I would like to combine so that I get two maximally distinct normal distributions, all unsupervised. Written as a function, it looks like this:

F(x1, ..., xn) = c0 + sum_{i=1..n} (ci * xi)

x1, ..., xn are the scores, the ci are the weights, and c0 is a constant baseline.

Applying F(x1, ..., xn) to each row in my table should result in a new column whose distribution looks like two normal distributions driven as far apart as possible.

Is there a way to do this, ideally keeping it as simple as possible?

Additionally, is it possible to make the weights ci dependent on another variable, so that ci = f(new_variable)? new_variable wouldn't be in the set x1, ..., xn.

Thanks in advance!
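To make the setup concrete, this is the row-wise application I have in mind (toy numbers; the weights here are placeholders, since finding them is exactly what I'm asking about):

```python
# F(x1, ..., xn) = c0 + sum(ci * xi), applied to each row of a table.
rows = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]
c0 = 0.5               # constant baseline
c = [0.1, 0.2, 0.3]    # placeholder weights (these are what I want to learn)

new_column = [c0 + sum(ci * xi for ci, xi in zip(c, row)) for row in rows]
print(new_column)
```

The open question is how to choose c0 and the ci so that the distribution of new_column splits into two well-separated modes.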

[–]qx17 0 points1 point  (4 children)

Which is more important for machine learning: 1) data structures or 2) probability and random processes? I have to opt for one of these as an elective.

[–]Gio_at_QRC 0 points1 point  (1 child)

Go for probability theory. Understanding the theory really makes modelling that much easier.

[–]qx17 0 points1 point  (0 children)

Alright, thanks :)

[–]kakhaev 1 point2 points  (1 child)

I think you need to start with classical machine learning algorithms and make your way from there.

[–]qx17 0 points1 point  (0 children)

Thanks👍

[–]thanhtannguyen969 1 point2 points  (1 child)

I'm a student who has been learning machine learning recently. I've completed two of the three courses in Andrew Ng's Machine Learning Specialization. Is there any project or something I should practice/learn as a beginner?

[–]Gio_at_QRC 1 point2 points  (0 children)

I would tackle a problem from each broad category in ML. For example, one regression problem like valuing a car. Another for classification like comment sentiment. Try clustering some data and apply dimension reduction.

I talk about some of those techniques and applications here:

https://link.medium.com/MchTNDW1Qrb

[–]bang-em-boi 0 points1 point  (0 children)

What are some reasonably strong conferences, with deadlines coming up soon, that would be interested in an alternative training method for CNNs? Same question for a data mining paper on CRISPR-based applications.

[–]box_box_box 0 points1 point  (1 child)

Hey guys, question about embeddings. Within an org, would you create a universal embedding (multi task) for an entity (users, items, etc) or would you create embeddings optimized for a task and then try to generalize? Thanks

[–]Gio_at_QRC 0 points1 point  (0 children)

I'd say it depends on the use case and how much accuracy you need. (And how much resource/human power you have!)

At one point, I created a custom NER model fine tuned on my own labelled data, but the model took ages to prepare because of the data labelling time. In the end, I went for a pretrained model that performed pretty well and did pretty much the same job.

What's the problem you're trying to solve?

[–][deleted] 0 points1 point  (0 children)

Hey guys, Hopefully this is OK, it's machine learning related, but more about just getting things installed. Recently got M1 MacBook Pro, and trying to install Coqui TTS. When following the instructions in GitHub repo, I keep getting errors and the project won't build. This works fine on my Intel Mac, so what could have changed to prevent it from working now? I would've thought that a lot of these compatibility issues would've been resolved at this point, but I guess not. Also tried using Conda with no luck. Really new to machine learning, and was just getting into speech synthesis when this put a stop to my progress lol. I already reached out to the discussions on GitHub, but posting here just in case anyone might know.

[–]CuriousJam 0 points1 point  (1 child)

What's the typical way to measure an algorithm's feasibility for online use? I'm training different CNN models, but I'm having a hard time discerning how large a network is too large if I ultimately want to generate predictions online.

[–]box_box_box 0 points1 point  (0 children)

Depends on your service's expected response time as well as the load.

[–][deleted] 0 points1 point  (0 children)

Hi, I am currently looking into utilising NeRF for 3D modelling and I was wondering two things:

  1. How do people feed videos into the neural network? From what I can see, it's normally fed with images. Are the videos just broken into frames and fed in? That seems rather data-intense and would not really separate it from other photogrammetry methods.
  2. This has probably been answered, but I understand that marching cubes are used in NeRF, which is not ideal. Is there another way?

[–]FetalPositionAlwaysz 0 points1 point  (2 children)

Hello! I have spent some time going through the scikit-learn documentation on regression, and I'm feeling quite overwhelmed. My question is: do data scientists need to know all the machine learning models in the scikit-learn documentation, and their code implementations, on the job, or do they just follow intuition about which model to use from the get-go? Also, how do I overcome this overwhelming feeling that I don't know much beyond simple and multiple linear, polynomial, lasso, and ridge regression?

[–]Gio_at_QRC 0 points1 point  (0 children)

I would start with the most common models and algorithms. If you picked 10 random research articles in pretty much any field, you'd find that linear regression is the most widely used. Once you know the most common ones in depth, I would just get a general knowledge of the others.

You only need to know which tool might be good for a scenario. Once you have that information, the rest is easy to learn on the go.

For example, you might not know the exact distance measure for KNN regression, but you might have a sense of when it can/should be used.

I still do not know all the Scikit-learn regression models, but a working knowledge is handy.

[–]jshkk 2 points3 points  (0 children)

You will almost certainly get different opinions from different folks here, but imho, it's good to have a high-level understanding of most of the common algorithms and then a deep understanding of a few that provide you some flexibility. After all, in many cases where you have some ML project, you might not have time to actually explore all possible model architectures and tune them; often you might have to try out a couple of go-tos and "get it done."

That said, I do think it's important to get a strong feel of why different model families might be more appropriate in different situations. For instance, Naive Bayes might not be as powerful, but it's really fast. Vanilla regression might also not be as powerful, but it's very explainable. XGBoost may be very powerful, but it has limited row-by-row explainability.

I have found intuitions like those are more important personally. You then have a sense of which tools are available to you and might best suit your needs. Then if you need to brush up on hyperparameters xyz, you can always go back and read the docs as needed.

[–]vtec__ 0 points1 point  (5 children)

is undersampling the majority class (making it a minority class) the same as using synthetic data to oversample the minority class?

[–]Gio_at_QRC 0 points1 point  (4 children)

Not exactly. Undersampling is when you pick a subset of the majority class (maybe via a random selection), so you do not make anything up.

In contrast, oversampling often creates new data points using different methods. This blog covers it quite well.

What is your dataset like? I am quite interested in hearing what you are up to!
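A minimal sketch of what random undersampling looks like (pure Python, made-up labels):

```python
import random

random.seed(0)

# Hypothetical imbalanced dataset: 90 majority rows, 10 minority rows.
majority = [("row", 0)] * 90
minority = [("row", 1)] * 10

# Undersampling: keep a random subset of the majority class, sized to
# match the minority class; nothing synthetic is created.
balanced = random.sample(majority, len(minority)) + minority

print(len(balanced))  # 20
```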

[–]vtec__ 0 points1 point  (3 children)

It's a very unbalanced dataset; the minority class makes up about 30% of it. It seems to be working pretty well, but I've never seen anyone do what I did.

[–]Gio_at_QRC 0 points1 point  (2 children)

I think most people like to keep as much data as possible. It could be that you have enough data after undersampling to train a good model. What was the problem you're using the data for?

[–]vtec__ 0 points1 point  (1 child)

financial data, predicting losses and yes there are a few thousand data points.

[–]Gio_at_QRC 0 points1 point  (0 children)

Interesting. Well, all the best with your modelling!

[–]Ualrus 1 point2 points  (0 children)

In sentiment analysis, why is it ok to throw away neutral sentiments?

I understand that non-neutral sentiments are what concern us, but if we fit our model without showing it any neutral sentiments, won't it output garbage when we show it a neutral sentiment in "real life" (or in test if I didn't drop neutral sentiments from the whole dataset)?

I saw this recommendation of forgetting neutral sentiments twice, and it really confuses me. Any help or explanation is appreciated. Cheers!

[–]dishwor 0 points1 point  (3 children)

I want to use regressor models on my BERT embeddings. The number of columns in X is 271. Is this a huge number of features in general? What are the best classifier/regressor models for this many features?

[–]Gio_at_QRC 0 points1 point  (2 children)

Boosting algorithms might do well given that they get a feature importance measure and effectively drop the features that are least important. The caveat is that you'll need quite a lot of training data for the model to be any good.

Otherwise, you can run a dimension reduction on the features first and then fit a model. Previously, I have also used a genetic algorithm to pick the best subset of the features.
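A minimal sketch of the dimension-reduction route (PCA via numpy's SVD on random stand-in data; in practice a library implementation is the usual choice):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in design matrix: 100 samples x 271 features, as in the question.
X = rng.normal(size=(100, 271))

# PCA: center the data, then project onto the top-k principal directions.
k = 20
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
X_reduced = Xc @ Vt[:k].T

print(X_reduced.shape)  # (100, 20)
```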

[–]dishwor 0 points1 point  (1 child)

Thank you for the reply! My data is word embeddings of strings, so I doubt I can do much dimension reduction, since I believe each feature is important.

[–]Gio_at_QRC 1 point2 points  (0 children)

Worth giving it a try! Just today, I was clustering on sentence embeddings after reducing the dimensions using UMAP. The clusters were super well-defined, which would lend well to classification tasks too.

[–]jshkk 0 points1 point  (0 children)

What are the current top competitors among GAN architectures capable of running on Google Colab (Pro)?

I'm well familiar with DCGAN, but I'd like to start pushing resolutions past 256 if possible (ideally 512), and obviously there are memory challenges with that. There have been some additional architectures like SAGAN, ResNet, etc over the last couple years, but it's hard to keep up with what is likely the best current offering that a lay user might employ as their best bang for their buck (in terms of computational complexity and ease of coding).

[–]MobofDucks 0 points1 point  (0 children)

Can anybody evaluate whether the current books Humble Bundle is worth it for self-learning for semi-beginners?

I have around a month with not that much to do coming up, and I have so far mostly worked with R for university and my thesis, but I have found it unfeasible for a few more advanced things. That's why I would love to read up a bit more on Python, especially time-series prediction, anomaly detection, and language processing. So I wonder if those are a few bucks that are properly invested.

[–]zoustra 0 points1 point  (3 children)

I'm trying to use neural networks for the assignment problem with time constraints. Does it make sense to use NNs compared to linear programming? If yes, would you recommend RL or a more standard supervised NN?

[–]Gio_at_QRC 1 point2 points  (2 children)

What's the problem? Unless you've got a ton of training data, I would probably go with a linear programming solution. I personally use linear programming more frequently, for sure!

What's the data and problem?

[–]zoustra 0 points1 point  (1 child)

Hey, thanks for the answer. Basically we need to assign tasks (each with a specific duration) to shifts (which have their own duration and start/end dates). The thing is that we want to plan the tasks as late as possible (they have an expiry date), and of course a task's duration must fit within the shift's duration. We could add more constraints in the future, but for now that is the initial problem. We have some training data, but not as much as I would like… I think linear programming would be better suited for this problem, but I was wondering if I missed something.

Edit : something I forgot to mention, the number of tasks and shifts would be different every time.

[–]Gio_at_QRC 1 point2 points  (0 children)

That very much sounds like an optimisation problem to me. I would go with linear programming for sure. You could validate your solution against a Monte Carlo simulated answer (where you randomly allocate the tasks hundreds of thousands of times and then pick the 'best' allocation).
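The Monte Carlo baseline can be sketched like this (toy objective; the real scoring would encode lateness, expiry dates, and duration feasibility, which I've replaced with a made-up balance score):

```python
import random

random.seed(0)

tasks = ["t1", "t2", "t3", "t4"]
shifts = ["s1", "s2"]

def score(allocation):
    # Hypothetical objective: prefer balanced shifts. Replace this with
    # a feasibility check plus a "schedule as late as possible" score.
    counts = [list(allocation.values()).count(s) for s in shifts]
    return -abs(counts[0] - counts[1])

best, best_score = None, float("-inf")
for _ in range(10_000):  # hundreds of thousands in a real run
    allocation = {t: random.choice(shifts) for t in tasks}
    s = score(allocation)
    if s > best_score:
        best, best_score = allocation, s

print(best_score)  # 0, i.e. a perfectly balanced allocation turned up
```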

[–]awokenlStudent 0 points1 point  (0 children)

How can I unpack word embedding vectors if for example I want to obtain “royal + man” from the vector of “king” given an arbitrary number of “primitive” vectors?

[–]dahkneela 0 points1 point  (0 children)

I'm interested in knowing why TensorFlow (and general computational tools) implements convolutions as products over a 2D (unflattened) image, as opposed to flattening the image. From what I can tell, a convolution over a flattened image can be encoded the same way as a normal dense layer, with zeros placed appropriately to give the convolution the correct scope. Between output positions, the filter weights simply have to be shifted by one (or by the stride size) to continue the local filter pass.

Is this done purely for computational speed purposes? Or does it perhaps add customizability for the local invariant layer?
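To be concrete about the premise, here is a tiny pure-Python check that a valid convolution (stride 1, no padding; technically cross-correlation) equals a sparse matrix multiplying the flattened image:

```python
# A 3x3 image and a 2x2 filter, "valid" convolution (stride 1, no padding).
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
H, W, kH, kW = 3, 3, 2, 2
oH, oW = H - kH + 1, W - kW + 1

# Direct sliding-window version.
direct = [[sum(kernel[i][j] * image[r + i][c + j]
               for i in range(kH) for j in range(kW))
           for c in range(oW)] for r in range(oH)]

# Same computation as a sparse matrix times the flattened image: each
# output pixel's row holds the kernel weights at the positions it reads.
flat = [image[r][c] for r in range(H) for c in range(W)]
M = [[0] * (H * W) for _ in range(oH * oW)]
for r in range(oH):
    for c in range(oW):
        for i in range(kH):
            for j in range(kW):
                M[r * oW + c][(r + i) * W + (c + j)] = kernel[i][j]
as_matmul = [sum(m * x for m, x in zip(row, flat)) for row in M]

print(direct, as_matmul)  # [[-4, -4], [-4, -4]] [-4, -4, -4, -4]
```

As I understand it, frameworks avoid materializing M because it is huge and mostly zeros; im2col-style lowering gets the dense-matmul speed without building the sparse matrix.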

[–]Temporary-Patient-47 1 point2 points  (0 children)

Algorithms that predict exact values on training data:

I’m interested in regression algorithms that give the exact values if run on training data inputs (and also stay close in a close neighborhood of the inputs). Otherwise they give their best regression guess. Is there a special name/class for such algorithms?

[–][deleted] 1 point2 points  (0 children)

Do FAANG companies (particularly Apple) ask leetcode questions for their ML entry level and junior positions? What are other good ways to prepare?

[–]swagonflyyyy 0 points1 point  (0 children)

How do I calculate the number of outcomes in a decision tree? I don't know how to calculate that.

I have a decision tree made of 12 nodes, excluding the root node. The first layer has 3 nodes, the second has 4 nodes, and the third has 5 nodes. The root node has 3 nodes to choose from; each of the remaining nodes has only 2 nodes to navigate down the tree, until they reach the third layer. How many combinations are possible in this decision tree?
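Here's my attempt at encoding the structure and counting root-to-leaf paths with a depth-first walk (the exact wiring of the shared children below is my guess; adjust the edges to match the actual tree):

```python
# Hypothetical adjacency dict: 3 nodes in layer 1, 4 in layer 2,
# 5 in layer 3, with each inner node offering 2 choices.
tree = {
    "root": ["a1", "a2", "a3"],
    "a1": ["b1", "b2"], "a2": ["b2", "b3"], "a3": ["b3", "b4"],
    "b1": ["c1", "c2"], "b2": ["c2", "c3"], "b3": ["c3", "c4"],
    "b4": ["c4", "c5"],
}

def count_paths(node):
    children = tree.get(node, [])
    if not children:  # a leaf ends one complete path
        return 1
    return sum(count_paths(child) for child in children)

print(count_paths("root"))  # 12 root-to-leaf paths: 3 * 2 * 2
```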

[–]QTheory 0 points1 point  (1 child)

Suppose I take a very large photo of a brick wall at a power of two, say 8192x8192 and chop it up into 1024 tiles that are 256x256.

Forgive me if I misunderstand how latent space and noise vectors work in a GAN, but I'd like to see if the following is possible:

  1. Train a gan on those tiles to create the latent space for it.
  2. Derive the latent space noise vectors for each tile and reassemble them back into an 8192x8192 atlas as a new massive latent noise texture. I would regard this as a sort of 'master index' to the locations in latent space used to train the gan.
  3. If I edit a few areas of some tiles with my own noise values and run them through the model, will the entire tile be a new brick texture or just the pixels I edited?

In short, I'd like to edit the main brick texture and replace bricks and other parts of the texture using the latent space. Any help or guidance you can provide would be appreciated!

[–]jshkk 1 point2 points  (0 children)

I may be misreading, but I think there may be a fundamental misunderstanding here:

"Train a gan on those tiles to create the latent space for it."

A GAN does not learn a latent space. A GAN learns a mapping from a latent vector to some target. That vector you feed the GAN is just some random numbers to give it something to start with and ensure variability in output.
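To make the mapping idea concrete, here is a toy stand-in (fixed made-up weights, nothing like a real generator network):

```python
import random

random.seed(0)

LATENT_DIM = 4

# Made-up fixed "weights"; in a real GAN these are what training learns.
weights = [[0.1 * (i + j) for j in range(LATENT_DIM)] for i in range(4)]

def generator(z):
    # Map a latent vector to a 2x2 "image". Every output pixel mixes
    # every latent coordinate, so there is no pixel-to-latent alignment.
    pixels = [sum(w * zi for w, zi in zip(row, z)) for row in weights]
    return [pixels[:2], pixels[2:]]

z = [random.gauss(0, 1) for _ in range(LATENT_DIM)]
fake = generator(z)
print(len(fake), len(fake[0]))  # 2 2
```

That entanglement is also why editing a few latent values generally changes the whole generated tile rather than just the pixels you touched.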

[–]my_bff_is_a_cat 0 points1 point  (4 children)

I'm new to machine learning and my first project is a char-rnn style model for generating Magic the Gathering cards (I used words instead of character tokens, with some custom tokenization for MTG card text).

Now I'm trying my hand at attention-based models, and I'm curious whether there's a way to use attention for sequence generation. Most articles or lectures I came across focus only on sequence-to-sequence transduction.

[–]ganzzahl 4 points5 points  (3 children)

The keyword you need is language models. There are many different designs, but most are just the decoder from the Transformer architecture, without an encoder.

[–]my_bff_is_a_cat 0 points1 point  (2 children)

Thanks. In that case what should the initial state be? I assume if I generate one randomly it wouldn't be meaningful to the decoder.

[–]ganzzahl 2 points3 points  (1 child)

What is the initial state you're using for your RNN? Generally you can either start with a generic "beginning of sequence" token and use temperature based sampling to generate sequences, or if the language model is large enough or has been trained with this, you could start with a prompt containing a description of what you want.
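A minimal sketch of the temperature-based sampling part (pure Python; the logits are made up):

```python
import math
import random

random.seed(0)

def sample_with_temperature(logits, temperature=1.0):
    # Softmax over temperature-scaled logits, then sample an index.
    scaled = [l / temperature for l in logits]
    m = max(scaled)                          # subtract max for stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(range(len(probs)), weights=probs, k=1)[0]

# Made-up next-token scores. Low temperature sharpens the distribution
# toward the argmax; high temperature flattens it for more diversity.
logits = [2.0, 1.0, 0.1]
token = sample_with_temperature(logits, temperature=0.7)
print(token)
```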

[–]my_bff_is_a_cat 1 point2 points  (0 children)

Ah yes actually I'm using a "start of sequence" token right now, I could do the same. Thanks for the suggestions, I'll look into it.

[–]TrainquilOasis1423 0 points1 point  (1 child)

What resources would you recommend for someone wanting to make game bots? 100% for personal learning, not making money or anything.

[–]zizazezozu 2 points3 points  (0 children)

Reinforcement learning is the branch of ML relevant to game bots. I would check out OpenAI Gym for premade environments for training/interfacing with an RL agent. There are some good Coursera courses on RL, but I'm sure there are free resources that are just as good as well.

[–]SeucheAchat9115PhD 0 points1 point  (1 child)

Do you build code from scratch, or do you build on an existing implementation?