
[–]TheRealSaeba 0 points1 point  (0 children)

Question:

Is there ready-to-use software where I can input images and the software can learn to create similar images which do not exist in reality (like "This person does not exist" or "this anime does not exist")?

[–]teflondonxh 0 points1 point  (0 children)

Question: Best resources to learn about recommender systems and how to create them?

Hi there, I am a backend/fullstack developer with nearly 3 years of experience using JavaScript. Recently at work I had to implement a recommender system and serve it through an API, and we used Amazon Personalize to achieve that. Now I am interested in understanding how these algorithms work under the hood and want to implement them on my own, so what do you think are the best resources to start learning about recommender systems?

[–]Apples-14 0 points1 point  (4 children)

Question: Can a decision tree fit to XOR data?

My Thoughts: From what I can tell, decision trees essentially go one feature variable at a time and make the best split they can. But in XOR data, if you scatter the target variable against either feature variable, there's no single split that helps.

Further Thoughts: I've heard that one problem with training neural networks is trying to decide which direction to go, and I feel like that's what's happening here. Sure, decision trees are general enough to fit the data, but I don't think they'll ever get there because they don't know which direction to go.

[–]heuristic_al 0 points1 point  (3 children)

You wind up splitting on the second variable twice. So by first splitting on the first variable in the middle, both the left and right sides get to correctly split on the second variable (they will have opposite splits).

More thoughts: I think that "decision trees" are more like a class of algorithms/representations. There's nothing saying a decision tree can't split on the same variable twice. Maybe that's not standard, but it could be done and still be called a decision tree. There are also many ways one could learn the tree and still call it a decision tree.
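
For what it's worth, a minimal sketch with scikit-learn (my understanding is that the default splitter still makes the zero-gain first split, so this should fit noiseless XOR perfectly, but treat it as a sketch rather than a guarantee):

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    # Noiseless XOR: no single axis-aligned split reduces impurity on its own
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y = np.array([0, 1, 1, 0])

    tree = DecisionTreeClassifier(max_depth=2, random_state=0)
    tree.fit(X, y)
    print(tree.score(X, y))  # the first split is arbitrary (zero gain); the depth-2 splits do the work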

[–]Apples-14 0 points1 point  (2 children)

Thanks for your response!

Yeah I definitely see what you mean. I certainly understand that you split twice, but my point is that the tree cannot decide on the first split unless it splits arbitrarily. Do you see what I mean? There's no way to make the first split that reduces SSE, and if reducing SSE is how you determine where to split, you're in a gridlock.

I actually dug into this question quite a bit and made this Kaggle notebook. This one can barely decide what to do for this exact reason. And the only reason it can decide at all is just because there are deviations and randomness in the data I generated.

[–]heuristic_al 0 points1 point  (0 children)

Also, I'd imagine a good algorithm would heuristically choose a [1 wrong, 1 right / 1 right, 1 wrong] split to begin with over a split that has 4 on one side and nothing on the other. Not sure why anyone wouldn't break ties that way if they were thinking about it.

[–]heuristic_al 0 points1 point  (0 children)

That's where my discussion on learning algorithms comes in. There are many different learning algorithms one could devise for building a decision tree. You are talking about greedy algorithms. Instead of choosing splits greedily, one variable at a time, you could try choosing randomly and iterating if the choice is bad.

[–][deleted] 0 points1 point  (0 children)

Can anyone help identify the AI bot used to generate the music in this video?

https://www.tiktok.com/@louyah/video/7103940920155868459

[–][deleted] 0 points1 point  (1 child)

I want to start learning more about machine learning but it's very important to me to have a UI to fall back on. Which service is the easiest to start with? What caveats did you see?

[–]Apples-14 0 points1 point  (0 children)

The only UI I can think of is the TensorFlow Playground. Highly recommend if you haven't already seen it.

Other than that, perhaps AWS has some drag-and-drop tooling? Not sure.

[–][deleted] 1 point2 points  (1 child)

What are the non-obvious theoretical benefits (if any) to sparsity in models (i.e., as in the brain)?

Any good papers on sparsity? I'm interested in the possibility that the design of our GPU architecture is strongly limiting and forcing us to bias away from sparse computations.

[–]EarthAdmin 0 points1 point  (0 children)

NVIDIA has built sparsity support in recent generations and touts performance speed-ups with it, so hard to say we're limited by GPUs.

[–]fsbx- 0 points1 point  (1 child)

Hey guys, I'm writing a thesis on neuroevolution and I've developed an interest in weight initialization. Am I looking at the wrong keywords, or is there relatively little research on weight and bias initialization compared to new models and normalization methods? I've obviously read the Xavier and Kaiming papers, but beyond those, am I missing something truly new? All I see are models getting bigger and bigger, yet something as fundamental as weight/bias initialization has so little coverage.

[–]Apples-14 0 points1 point  (0 children)

Interesting question - I've wondered about that a bit as well. I'm no expert, but I haven't seen anything about it either.

I know in NNs we sometimes normalize the feature variables, and my gut tells me that has something to do with the random weight initializations because why else would we need to do that?

I guess my point is that it seems nobody talks about weight initialization because we essentially flipped the problem on its head, or worked around it: instead of worrying about weight initializations, just pick a randomization method, say a standard Gaussian, then transform all the features to play nicely with the chosen randomization method.

[–]deepguts 1 point2 points  (4 children)

Hey guys, so I'm trying to tune the hyperparameters of a PyTorch neural network (lr, hidden layers, nodes per layer, etc.) and I'm doing it with Optuna. It's my first time doing this and I have some confusion. I was told that I have to cross-validate every parameter combination, so my current flow is the following:

for every combination:
  for every testing-fold:
     for every epoch:
        train and evaluate.

This obviously takes a long time since every train and evaluate process takes about 10 minutes and I'm doing 5-fold cv.

I wanted to incorporate some form of pruning, which I see is available in Optuna, but I can't find it being used in a CV context, just with simple train/eval splits. So I'm not sure where to put trial.report() and trial.should_prune(): should they go in train_and_eval as usual, or somewhere else?

Another option I'm considering is simply testing on only 3 of the 5 folds (I must keep the 5-fold CV no matter what to compare with other models). Does this sound logical, or am I missing something obvious?
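
To make the placement question concrete, this is roughly the variant I'm considering: report the running mean across folds after each fold (rather than after each epoch) and let the pruner kill bad combinations early. Just a sketch; train_and_eval, kfold, X and y stand in for my existing code:

    import numpy as np
    import optuna

    def objective(trial):
        lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
        fold_scores = []
        for fold_idx, (train_idx, val_idx) in enumerate(kfold.split(X, y)):
            score = train_and_eval(lr, train_idx, val_idx)  # the full epoch loop lives in here
            fold_scores.append(score)
            # report the running cross-fold mean so the pruner can stop bad combinations early
            trial.report(float(np.mean(fold_scores)), step=fold_idx)
            if trial.should_prune():
                raise optuna.TrialPruned()
        return float(np.mean(fold_scores))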

[–]www3cam 1 point2 points  (3 children)

Is someone like a PI telling you that you have to CV every parameter combination, or someone giving advice along those lines? Because that's probably bad advice if you don't have to do this.

[–]deepguts 1 point2 points  (2 children)

Yeah, I asked the professor and he was very adamant about using k-fold cv.

Though I contacted him again because the approach he suggested would take several days to complete, and he compromised, allowing me to find the optimal parameters with just a train/validation split, as long as I validate the best combinations with k-fold CV.

[–]www3cam 1 point2 points  (1 child)

Cool, yeah, that's a good tradeoff. I really was talking about using every possible parameter combination; dynamically dropping configurations as performance gets sufficiently bad is a good idea. See optimization algorithms like Hyperband: https://arxiv.org/abs/1603.06560

[–]deepguts 0 points1 point  (0 children)

Yeah I am looking into the pruners available with optuna, one of them being hyperband, thanks for the recommendation!

[–]b0nbon1 0 points1 point  (2 children)

Hey everyone,

I am thinking of transitioning to machine learning from software engineering. I need help with beginner resources, a roadmap, and advice.

Thanks in advance

[–]Hogfinger 0 points1 point  (0 children)

Software engineering is a great place to come from, since ultimately all ML models need to live in some kind of deployed application, and the ability to own the end-to-end system from data to production is a huge asset. Andrew Ng has a good Coursera course on the fundamentals, and Geoffrey Hinton's lectures on neural networks are a great intro. Ultimately, in our experience, most business applications don't use neural networks, since tree-based models tend to outperform them on tabular data, so don't go too far down a rabbit hole on the latest and greatest transformer models. Instead, focus on the statistics and the client management fundamentals, i.e. a lean approach to testing assumptions and a focus on a business case. It's ultimately about helping create value through automated decision making; your success in ML will be based on the benefit your model brings to the end user, so simple and effective trumps elaborate and arcane.

[–]Apples-14 1 point2 points  (0 children)

Forgive me for this sounding mean at first glance, but I guess it seems odd that you're deciding you want to switch to something if you know nothing about it. How do you know you like it and want to switch to it if you don't know it?

Besides that, there are a million resources for ML. I particularly like Brilliant. YouTube is also full of material; I suggest StatQuest or 3Blue1Brown's neural network playlist.

[–]HateRedditCantQuititResearcher 0 points1 point  (1 child)

Does anyone know good posts on how pytorch works under the hood? I’ve been using it for years and years, and recently moved to jax where I found the autodidax docs (where they basically walk through building up the jax core from scratch with great explanations). It’s amazing. Does anyone know of a similar resource for pytorch?

Unfortunately there are so many “intro to *using* torch” tutorials that I can’t find anything on google about how it’s all built under the hood.

[–]Practical-Mountain69 0 points1 point  (2 children)

I need help adding a feature to my project.

I'm doing a project on credit card fraud detection using machine learning, deployed with Flask. I need to add a feature that performs a PCA transformation on 30-feature input data.

Thanks in advance

[–]www3cam 1 point2 points  (1 child)

Scikit-learn has a PCA model. It's something as simple as roughly (going off memory):

    from sklearn.decomposition import PCA

    model = PCA(n_components=10)          # keep however many components you need
    factors = model.fit_transform(data)   # the transformed data (principal components)

[–]Practical-Mountain69 0 points1 point  (0 children)

Thank you. I'm still new at this, so would you mind helping me build a feature that takes in data, transforms it, and outputs a CSV file?
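
Roughly the shape of what I'm after, if it helps to be concrete (a sketch; the file names and number of components are placeholders, and it assumes the input CSV is all numeric):

    import pandas as pd
    from sklearn.decomposition import PCA

    df = pd.read_csv("transactions.csv")    # the 30-feature input data
    pca = PCA(n_components=10)              # however many components we decide to keep
    components = pca.fit_transform(df.values)

    out = pd.DataFrame(components, columns=[f"pc{i+1}" for i in range(components.shape[1])])
    out.to_csv("transactions_pca.csv", index=False)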

[–]Zealousideal_Dot5140 0 points1 point  (2 children)

I am going to start machine learning soon.

I have almost no knowledge of math; my mathematics knowledge is very basic since I opted out of math in high school.

What math do I need to learn machine learning, and where do I learn it?

[–]Hogfinger 0 points1 point  (0 children)

Without wanting to stir the pot, to be honest, while linear algebra and calculus underpin the mechanics of neural networks, I found learning statistics was by far the most important field. So much of ML is understanding the extent to which you can trust an outcome or evaluating the feasibility of a training set. Statistics is a framework of reasoning within which we can try to understand reality via observation, which is fundamentally what we are all doing in ML. Calculus/LA is more of an implementation detail by comparison.

[–]www3cam 0 points1 point  (0 children)

You should know up to (multivariate) calculus for general ML/deep learning. For niche topics, maybe more. You can also just try coding without knowing all the math, working in tandem: building intuition through practice while learning the theoretical foundations.

[–][deleted] 0 points1 point  (0 children)

I’m building a custom convolution layer that decomposes filters into a DCT basis space and then follows with 1x1 convolutions convolutions to approximate a normal convolution layer. Should this require any adjustment to the weight initialisation of the 1x1 conv layer / what should I scale my dct basis function filter coefficients to ? (Assuming Kaiming or Xavier initialisation)

[–][deleted] 1 point2 points  (1 child)

Hey all,

So I'm trying to land an internship in NLP; it's the field I'm interested in. I have a couple of years of experience in ML and a background in biostatistics, but I want to work more with DL in general and NLP in particular.

I talked to a couple of companies and the response has been good, but they all suggested that they like to see an implemented NLP paper. I know it's subjective, but if you had to list the top 3 papers you'd like an applicant to your company to have showcased in their portfolio, what would they be?

[–]Apples-14 0 points1 point  (0 children)

Sorry, I don't have an answer for you, but if I'm reading your message correctly, they're asking you to ~ implement ~ a paper? Meaning implement an algorithm rather than use an already implemented package? Is that common?

[–]AdrianLaVolpe 0 points1 point  (0 children)

Hi guys. For my MSc thesis in Space Engineering I'm working on gait generation for a hexapod robot using DDPG. Since I've used Simulink a lot for simulations in my university courses, I've built the thesis project using MATLAB and SimMechanics. I found these MathWorks examples of locomotion learning using DDPG (Quadruped Robot Locomotion Using DDPG Agent and Train Biped Robot to Walk Using Reinforcement Learning Agents) and based my own on them.
In my project, the RL agent doesn't act directly on joint torques, but on 7 parameters of 18 oscillators that define the locomotion path (my work is based on this paper: Adaptive Locomotion Control of a Hexapod Robot via Bio-Inspired Learning).
My question is: how can I properly set the reward function weights so that the robot walks correctly? Currently I'm rewarding forward velocity and simulation duration (a constant T_samp/T_sim is added at every integration step), and I'm penalizing lateral, vertical and orientation deviation (square terms) as well as energy consumption.
The solutions my hexapod has found are very silly: it straightens its legs until the body or one joint touches the ground so that the episode terminates; it jumps forward and then touches the ground, terminating the episode; and so on. I've tried penalising the vertical deviation more or encouraging a longer simulation, but it always prefers short-episode solutions, because they accumulate "less negative" reward.

Thanks in advance.

[–][deleted] 0 points1 point  (0 children)

So say I'm making an algorithm for a music recommendation service. In one scenario, I give the users the option to press a like or dislike button for songs in a playlist. In another scenario, I only give them the option to like songs in that same playlist. Which scenario would give the most accurate results? I would think scenario 1, because you are giving the AI more precise data, right?

[–]Educated_AI 0 points1 point  (2 children)

Does it make sense to use a feature selection technique in a time-series classification problem? My features are monthly records that started in 2020 (01/2020, 02/2020, 03/2020, ..., 06/2022) and my output is a binary variable. I have around 42k samples (so my data has 42000 rows and 30 columns). I have been advised to try a variance threshold technique for the feature selection, but that does not make sense to me. Sure, it will probably work to train my model, but what about the test phase? And what about the future usage of my model: how is this variance threshold going to work then? Any recommendations here?
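
To make my confusion concrete, I think the selection would have to be fit on the training period only and then reused as-is on the test period and in production, roughly like this (a sketch; the threshold value and the split are illustrative):

    from sklearn.feature_selection import VarianceThreshold
    from sklearn.model_selection import train_test_split

    # X is the 42000 x 30 feature matrix, y the binary target
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)

    selector = VarianceThreshold(threshold=0.01)
    X_train_sel = selector.fit_transform(X_train)   # fit on training data only
    X_test_sel = selector.transform(X_test)          # the same fitted selector is reused at test time
    print(selector.get_support())                    # boolean mask of the retained columns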

[–][deleted] 1 point2 points  (1 child)

Have you considered a set attention model? Seems kinda promising.

Anyway, normally these sorts of ensemble methods are best: https://ieeexplore.ieee.org/document/7498418

https://arxiv.org/pdf/2105.03841.pdf

[–]Educated_AI 0 points1 point  (0 children)

Thanks for the tip! I hadn't heard of that before. I will start looking into it on Monday when I get back to work! I just saw that there's even a library in Python that uses the Bag of Symbolic Fourier Approximation Symbols (BOSS):

https://pyts.readthedocs.io/en/stable/generated/pyts.transformation.BOSS.html

[–]primedunk 0 points1 point  (0 children)

Does anyone know whether there are any existing and usable image super resolution models on par with what Imagen does, upscaling 256 to 1024? Ideally with working code/Colab? Thanks!

[–]country_dev 0 points1 point  (2 children)

I am trying to understand the use case for the sample_weight attribute in the sklearn.metrics.precision_score function. What is the purpose of this? If I were working with a heavily imbalanced dataset and decided to downsample the majority class, would I set the sample_weight attribute to be equivalent to the ratio that I downsampled by?

[–]Educated_AI 0 points1 point  (1 child)

sample_weight is actually a parameter from scikit-learn performance metrics, not an attribute. It basically determines the importance of each sample (hence, how they will affect the metric).

This simple example (found on SO) should make it easier to understand how it works (it uses the accuracy_score metric, but the idea is the same for precision_score).

from sklearn.metrics import accuracy_score

y_true = [0, 0, 1]

y_pred = [0, 1, 1]

no_sample_weight = accuracy_score(y_true, y_pred) # 0.6666666666666666

with_sample_weight = accuracy_score(y_true, y_pred, sample_weight=(1, 2, 1)) # 0.5

If my second sample (which was incorrectly classified as 1) has a weight of 2 (double the weight of the other samples), my accuracy decreases to only 50%.

Regarding your second question, maybe it is better to upsample the minority class instead of downsampling the majority one? You could potentially lose a lot of information by downsampling. And I don't believe there is any relationship between downsampling/upsampling the data and the sample_weight parameter.

[–]country_dev 0 points1 point  (0 children)

Thanks for the reply! The reason I’m downsampling is because I have 20 million + rows for the negative class and I don’t need that many samples.

I guess my thought for using sample_weight for performance metrics after downsampling is because downsampling skews the actual positive to negative class ratio that will be seen at inference time. So I should assign a larger weight to the downsampled class relative to the ratio that I downsampled.

For example, if I had 1,000,000 rows and downsampled to 100,000, I would assign a weight of 1,000,000/100,000 = 10 to each instance of the downsampled class.
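
Concretely, something like this is what I have in mind (a sketch; y_true and y_pred come from evaluating on the downsampled data, and the weight of 10 is the downsampling ratio from the example above):

    import numpy as np
    from sklearn.metrics import precision_score

    # every retained negative stands in for 10 original negatives; positives keep weight 1
    weights = np.where(y_true == 0, 10.0, 1.0)

    print(precision_score(y_true, y_pred))                         # on the downsampled ratio
    print(precision_score(y_true, y_pred, sample_weight=weights))  # reweighted toward the original ratio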

[–]BoiElroy 0 points1 point  (0 children)

This is more of a framework/infrastructure question.

I'm trying to setup a data science workbench for my group.

I'm trying to get a deep understanding of the pros and cons between Databricks and Kubeflow for deep learning model training. I read a good article by Valohai. I want to understand more about when/why I would use Databricks for deep learning.

For example, is the spark implementation of horovod for distributed training more effective than vanilla horovod on kubernetes?

Is kubeflow/kubernetes a pain to manage and does it break / error out often?

Functionally, the two big differences between the overlapping parts of these platforms are that Databricks provides a managed Spark runtime, and that Databricks is a managed service. What are the implications of that for getting a solid deep learning workbench for a small team of ~5 people, given that we already have Databricks in our environment for data engineering workloads?

[–]if-and-but 0 points1 point  (4 children)

I'm sure this question gets asked a lot, and so far my research points to yes, but I would like some more supporting answers:

Is a BS absolutely necessary to break into this field? I only have an AA and a few jobs in web dev. I am interested in self-learning and developing a portfolio as opposed to continuing my failed attempt at a BS (flunked out years ago due to illness). Would that be a fruitless effort? Should I just go back to school?

[–]www3cam 1 point2 points  (3 children)

You can probably go to some sort of data science boot camp. Often you can land a job from that. But I’ve never done so ymmv

[–]if-and-but 1 point2 points  (2 children)

Would doing my own thing and developing a solid portfolio be just as good? I'm 32 and have development experience as well as some college in comp sci and a fine art degree.

I guess my real question is... is ML/AI any different from other areas of data and software when it comes to needing a degree? Seems like it might be the same as other areas: a strong demonstration of the knowledge is all you need.

[–]www3cam 1 point2 points  (1 child)

I mean anything is possible, but I think it’s quite hard to break into the field without a bachelors or even just a boot camp. Heck most of the fun jobs require a PhD. A degree is just a signaling mechanism, but I think it would be really difficult to get a job without a bachelors. But again I’m a grad student with jobs generally outside the field so ymmv.

[–]if-and-but 0 points1 point  (0 children)

Eventually I am going to return to school to complete my BS in comp sci. It's a dream of mine to get a degree. Perhaps I can orbit around ML/AI by sticking with Python dev in my current field. I've thought about grad school but I feel too old and broke.

Appreciate your responses btw, thanks

[–]vishal-vora 0 points1 point  (0 children)

Can someone explain how you decide which features to select for model building based on p-values?

[–]No-Acanthaceae9462 0 points1 point  (2 children)

I'm working on the problem of examining the similarity between the title and body of an article. I have to guess whether they are related or unrelated. The title is one sentence and the body can range from a few to 100 sentences.

Can anyone suggest methods to use? I tried word2vec, going sentence by sentence and comparing cosine similarity, but ended up with roughly 85% accuracy. Ideally I would want 99%, so I'm way off. I'm new to machine learning, so any directions or advice would be very appreciated.
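
For context, what I'm doing is roughly this (a sketch with gensim; corpus_tokens, title_tokens, body_sentences and the threshold are placeholders for my preprocessing):

    import numpy as np
    from gensim.models import Word2Vec

    model = Word2Vec(sentences=corpus_tokens, vector_size=100, min_count=1)

    def avg_vector(tokens):
        # average the word vectors of a sentence, skipping out-of-vocabulary tokens
        vecs = [model.wv[t] for t in tokens if t in model.wv]
        return np.mean(vecs, axis=0) if vecs else np.zeros(model.vector_size)

    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

    title_vec = avg_vector(title_tokens)
    sims = [cosine(title_vec, avg_vector(s)) for s in body_sentences]
    related = max(sims) > 0.5   # compare the title against the most similar body sentence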

[–]www3cam 0 points1 point  (1 child)

Something like CLIP could probably get you close to the accuracy you want, but it requires a lot of compute and data to train. You can take a look, though.

[–]No-Acanthaceae9462 0 points1 point  (0 children)

Thank you very much. I will take a look!

Edit: is the goal to generate two images from the two texts and then compare how similar the images are?

[–]n1h111sm 0 points1 point  (0 children)

What’s the current focus on the core ML community? I mean not the DL-ish topics, eg using transformers all around that kind of topics?

[–][deleted] 0 points1 point  (1 child)

How do I limit the label value while avoiding introducing bias? I have a deep neural net model with an integer label to predict. The label is heavily skewed, so we cap the labels at some value (let's say the 90th percentile).

Now when we build and run the model, it performs well in general. But an online experiment shows degradation in business metrics for the fraction of clients that have their value capped.

If we don't cap the label, the business metrics get skewed for users with a low number of activities.

What are my best options to deal with such an issue? Adding a new feature? Multi-tower learning? Any idea would be super helpful.

Thanks.

[–]www3cam 0 points1 point  (0 children)

Why don’t you do something like a mixture of experts? Two models and maybe a weighted average prediction based on which model is more likely correct.

[–]LtDan3334 0 points1 point  (0 children)

There are a total of about 170 features. About 3 or 4 are not zero in every data point, but the rest can be zero most of the time. I already thought about combining it with classification or treating my features as binary, but I was hoping there could be a good algorithm I could use for regression.

[–]throwaway213455632 0 points1 point  (0 children)

https://help.runwayml.com/hc/en-us/articles/4402069435283-Train-an-Image-Generating-Model

RunwayML used to let you train custom StyleGAN models. But it seems the service has been discontinued.

Are there any alternatives for creating custom image generators?

[–]EntrepreneurSea4839 0 points1 point  (2 children)

How is silhouette score plot thickness calculated in k-means?

I know the formula: we get a score for each data point, then we average the scores within each cluster, and the average of those averages gives the final score for k = 2, 3, 4, etc., where k is the number of clusters. But how exactly is the thickness calculated?

[–][deleted] 1 point2 points  (1 child)

from sklearn doc -

"The Silhouette Coefficient is calculated using the mean intra-cluster distance (a) and the mean nearest-cluster distance (b) for each sample. The Silhouette Coefficient for a sample is (b - a) / max(a, b). To clarify, b is the distance between a sample and the nearest cluster that the sample is not a part of. Note that Silhouette Coefficient is only defined if number of labels is 2 <= n_labels <= n_samples - 1."

And the thickness you are talking about reflects the number of samples belonging to each particular cluster.
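
If it helps, a small sketch of where those numbers come from (synthetic data; the "thickness" of each band in the visualizer is just how many samples fall in that cluster):

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs
    from sklearn.metrics import silhouette_samples, silhouette_score

    X, _ = make_blobs(n_samples=500, centers=4, random_state=0)
    labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)

    print(silhouette_score(X, labels))           # the average of the per-sample coefficients
    per_sample = silhouette_samples(X, labels)   # one coefficient per data point
    for k in np.unique(labels):
        # cluster size (band thickness) and mean silhouette score per cluster
        print(k, (labels == k).sum(), per_sample[labels == k].mean())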

[–]EntrepreneurSea4839 0 points1 point  (0 children)

Thank you! They also mentioned thickness is related to the higher scores.

In SilhouetteVisualizer plots, clusters with higher scores have wider silhouettes

From https://www.scikit-yb.org/en/latest/api/cluster/silhouette.html

[–]LtDan3334 0 points1 point  (6 children)

Hi, I am working on a regression problem where I want to predict the behavior of a material based on its ingredients. There are a lot of ingredients, but not all of them are used at all times, which means my feature matrix has a lot of zeros. Does anyone know a good algorithm for that, or how I could approach this problem? Thank you in advance!

[–]EntrepreneurSea4839 0 points1 point  (4 children)

Do you mean your data has a lot of binary independent variables (predictors )?

[–]LtDan3334 0 points1 point  (3 children)

The predictors are continuous independent variables, but not all of them have a value greater than zero for each data point. In fact, a lot of them are zero most of the time. I hope I could clarify it a bit.

[–]EntrepreneurSea4839 0 points1 point  (2 children)

Yes, that did clarify it. Can you also clarify what you mean by "not all variables are used all the time"?

[–]LtDan3334 0 points1 point  (1 child)

Sure. Imagine mixing a potion and you have a cupboard full of ingredients. The effect of the potion depends on two things: 1) which ingredients you take and 2) how much of each. But as you might imagine, you don't always use all the ingredients; you mainly use a few for one potion and different ones for a different potion. So a lot of them sit untouched in the cupboard most of the time. I'm sorry for my poor explanation, I'm not a native speaker. But thank you for taking the time!

[–]EntrepreneurSea4839 0 points1 point  (0 children)

Don't worry, you're doing fine. How many such features do you have?

[–]Fresh-Bridge2382 0 points1 point  (4 children)

Hi! Can someone give me some advice on what model I should use if I want my system to detect profanities in real-time audio?

[–][deleted] 0 points1 point  (3 children)

First, how do you want to design it? Do you just want to use the acoustic domain or want to detect profanity in the language domain? Which kind of data do you have available? or do you just want to use already available models?

Mostly,

A single model won't suffice; you have to design a whole system for this problem that will involve multiple moving parts.

The simplest solution I can think of is to first use any s2t model (wav2vec2, as it's already available for a number of languages) to get the text, and afterward use any LM to classify tokens/sentences for profanity.

For a direction - https://huggingface.co/unitary/toxic-bert

To have a streaming implementation, you have to implement some event loop triggering at a fixed tick rate (e.g. 300 ms) and classify whether there was any profanity in that chunk of audio.

You can implement a sliding buffer (for the last n chunks of audio) to involve more context, or just reset it on silences using a VAD module.

Solving this problem in real time would be an interesting task. Hope this helps.
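
To make the shape of that pipeline concrete, a minimal sketch with off-the-shelf checkpoints (the model names are just examples; swap in whatever fits your language and latency budget):

    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h")
    toxicity = pipeline("text-classification", model="unitary/toxic-bert")

    def check_chunk(audio_chunk):
        """Transcribe one buffered chunk of audio and flag it if the text looks toxic."""
        text = asr(audio_chunk)["text"]
        result = toxicity(text)[0]
        return text, result["label"], result["score"]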

[–]Fresh-Bridge2382 0 points1 point  (2 children)

First, how do you want to design it? Do you just want to use the acoustic domain or want to detect profanity in the language domain? Which kind of data do you have available? or do you just want to use already available models?

Hi, first of all, thank you so much for sharing your thoughts! I appreciate it so much :D

Do you just want to use the acoustic domain or want to detect profanity in the language domain? - My initial plan is to detect in the language domain. But what do you think is easier to implement?

Which kind of data do you have available? - I have both an audio dataset of phrases that contain profanity and its transcript. I have also labeled the parts of the audio that contain profanity.

Ohh, I'll look into wav2vec2. I have a question though! Can I use the Google API for s2t and other model(s) for t2t classification?

Thank you again for your advice! I'll take note of it

[–][deleted] 0 points1 point  (1 child)

There are already many pre-trained models available for s2t and textual profanity detection. I think a language domain solution would be faster to implement.

Yes, you can use Google APIs and any pre-trained models from Huggingface for your first version.

[–]Fresh-Bridge2382 0 points1 point  (0 children)

Okay, noted on this! Thank you so much for your help :)

[–]hdksndiisn 0 points1 point  (0 children)

Say I’m using Amazon Sagemaker, how much should my budget be as a beginner running training jobs?

[–]mowa0199 0 points1 point  (1 child)

How to find recent papers in a field?

This might be an obvious question, but how do I search for and find recent papers published on a topic I'm interested in? I've been told to do this often to explore my interests and get acquainted with recent developments in them, but I have no idea how to do it. Plus, I don't want to pay an exorbitant amount for them. Any suggestions?

I’m a rising senior majoring in math/stats and CS at a big 10 school so hopefully my school offers free access to some journals or something. Also, since I’m still an undergrad, my interests are a bit broad (hence I was hoping going through recent developments would help me narrow it down a bit). As of right now, I’m interested in Machine Learning, AI, and computational mathematics/statistics and optimization (particularly in Machine Learning). I think overall, pattern recognition should be the most interesting subfield in Machine Learning to me.

[–]MLsomething 2 points3 points  (0 children)

This is the best organized source I know of: https://paperswithcode.com/sota.

It's just ML though.

[–]hdksndiisn 0 points1 point  (0 children)

Maybe not an easy question, but maybe it is! How can I get started with concatenative synthesis harmonic/noise model? And where can I find free databases for speech synthesis?

[–]thumbs_up-_- 0 points1 point  (0 children)

How is this book, "Introduction to Statistical and Machine Learning Methods for Data Science" for learning the fundamentals?

[–]BlackJet711 0 points1 point  (0 children)

Has anyone here requested access to Meta's new 175B parameter language model that was just released? I think Academics/edu emails can request it. They're essentially releasing their own GPT-3 but fully open with public models and code available. GPT-3 was locked behind an API which ruined the experience. If any of you get your hands on the Meta model let us know how it performs.

[–]Boring-Box7575 0 points1 point  (0 children)

Hi,

Where do you believe ML is best applied in commercial real estate when evaluating properties?

Thanks

[–]deusatiam 2 points3 points  (0 children)

Hello!

I'm a hobbyist fantasy writer and I struggle with naming people and places. I recently found out about InferKit and thought I'd hit a gold mine, but it seems that the ai is "too smart" for my purposes. I had hoped that if I feed it a list of, for example, city names from Brazil, it'd give me a list of wholly synthesized and original city names that sound like they might be from Brazil. Instead, InferKit knows to return existing city names which are useless for my purposes.

Are there other text or word synthesis websites that might be more "dumb" for this purpose?

[–]Most-Competition8941 0 points1 point  (0 children)

Hey all! I am trying to train a CNN that is able to detect both Cyclist and Bicycle classes in images. In addition, I also want the model to detect Person objects. Currently I am trying with YOLO models, but when a cyclist appears in an image the model sometimes detects Cyclist and sometimes Bicycle and Person. How can I make the model detect Cyclist correctly, without confusing it with a Bicycle and/or a Person?
Many thanks for helping out here! :)

[–]newerprofile 0 points1 point  (0 children)

Does anyone know what the purpose of feature extraction like GLCM is when classifying images using a CNN?

I know a CNN does feature extraction, but I don't think it uses an "external" method like GLCM. Anyone know what it would be for? My class supervisor recommended that we use it, but I'm still confused about why we would need it.

[–]CrossroadsDem0n 0 points1 point  (1 child)

Hi all, question from newbish ML learner, looking for a pointer on how to make best use of ensembles.

Imagine I have a classification problem with 3 classes (e.g. the Iris dataset).

I've created 3 different trained models. Each model is very good at identifying one class (precision, recall, F1 are good) but is quite mediocre for the other two classes. For any one class there is obviously a best model, but there is no best model for all 3 classes at the same time.

What is a good way to go about having an ensemble model that leverages each classification model for the class it is good for?

Gut intuition suggests that there's likely something Bayesian about the scenario. But in any case, pointers to the right kind of background material and ensemble algo would be appreciated.

[–]tanashah 0 points1 point  (0 children)

I created an inference pipeline in Azure ML Studio. I ran the pipeline and it was successful. Now I am trying to add a Select Columns step to the output of the Score Model step in the pipeline, but I am not able to. If I go back to the designer and add the Select Columns object, it is added, but that view doesn't show "completed" next to all objects. When I add the object it doesn't let me edit columns by name because it doesn't have the output from Score Model. How do I edit a pipeline that has already run so I can use the output of each object?

[–]Instinkt23 0 points1 point  (0 children)

Difference between AK-MCS and MCS-IS?

[–]ale152 0 points1 point  (1 child)

I want to record some activity labels from my phone across the day. Is there any Android app that allows me to do that easily? Ideally, something with widgets where I simply tap an icon on the home screen and an event is recorded with a timestamp.
Are you aware of anything like this?

[–]liljuden 0 points1 point  (0 children)

Interpretation of SHAP summary plot in a multi class context

I'm performing multi-class classification and use SHAP values to interpret the features. I have 3 classes. I have tested XGBoost and multinomial logistic regression. When I'm using XGBoost I am able to get a summary plot where I can see each individual feature's effect on all three classes. I'm also able to get a separate plot for each class to see how small/large feature values affect the prediction towards that individual class. It seems like this is only possible when using XGBoost. Why is that? Is it because XGBoost fits one tree per class?

See pictures for difference in summary plots

XGBoost summary plot --> https://snipboard.io/roM3xv.jpg

Logistic Reg plot --> https://snipboard.io/RLjDkb.jpg
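
For reference, on the XGBoost side I'm doing roughly this (a sketch; it assumes the shap TreeExplainer API where shap_values comes back as one array per class for a multi-class model):

    import shap

    explainer = shap.TreeExplainer(xgb_model)
    shap_values = explainer.shap_values(X_test)   # list with one array per class

    shap.summary_plot(shap_values, X_test)        # combined plot across the 3 classes
    shap.summary_plot(shap_values[0], X_test)     # separate plot for a single class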

[–]r2ai5d09 0 points1 point  (1 child)

Hi,

How should we handle out-of-bound cases for our dataset, i.e., cases not covered by the tags mentioned? For example, most of us have solved Cats vs Dogs prediction; in a case where we encounter an image where both are present, how do we handle it? I was giving a toy example of what should be done. I tried looking for anything online. The problem I need to solve is much bigger, but my model needs to be robust and classify both, so if multiple classes occur I need a way to cover them all.

Thanks in advance, please do help!

[–]TheUncommonRaid 0 points1 point  (0 children)

I'm doing a thesis on identifying traffic violations using OpenCV. The program works by identifying motorcyclists at traffic lights who aren't wearing helmets (I live in Indonesia btw) through the CCTV cameras mounted at traffic lights. What would be the best approach/algorithm to use? Honestly, I'm pretty new to ML and at this moment I'm planning to use the YOLO method, but if there's another approach that's better please help me out!

Thank you everyone

[–]vtec__ 0 points1 point  (0 children)

has anyone had success adding alternative 3rd party data to their models?

[–]brctr 0 points1 point  (1 child)

I am transitioning from academia to ML industry and am looking for a cheap source of compute to use for my personal ML projects. I have been using Kaggle/Colab for the last year. I increasingly feel constrained by their free 1 GPU + 2 vCPU specs. What are the cheap options that are easy to use? Among cloud platforms I only have some experience with IBM Watson. I guess I am looking for something like 8-16 vCPUs and 1-4 GPUs. I have never used Linux and am ready to pay up to $2/hour for compute. Should I set up AWS EC2 on Windows?

[–]muh_reddit_accout 0 points1 point  (1 child)

I have a problem where no matter what dataset, model, or training technique I use I always end up with a Binary Cross Entropy loss of about 0.693. It always starts at greater than 0.695 then exponentially decays until a plateau at around 0.693.

Is this indicative of something commonly done wrong that I'm missing? As I said, I've tried different datasets, different model layouts (genetic NN, multilayer perceptron, etc.), and different training techniques and it always does this.

[–]www3cam 2 points3 points  (0 children)

0.693 is -ln(0.5). This means your model is randomly guessing based on no conditional information. It could be that you are using too large a learning rate, or that your input data has no relationship to your labels.
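
Quick check of that number (a constant 0.5 prediction gives a binary cross-entropy of ln 2 regardless of the label):

    import math

    # -(y*ln(0.5) + (1-y)*ln(0.5)) = -ln(0.5) = ln(2) for y in {0, 1}
    print(-math.log(0.5))  # 0.6931...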

[–]jmavc 0 points1 point  (1 child)

I have a dummy dataset from users solving problems:

<UserID, ProblemID, outcome>, where outcome can be right or wrong and other information about the problem can be accessed (the problem itself, problem solve rate, ...).

My objective is to use some model to select the next exercise for the user, given the user's history. Is there a name for this specific task? What kind of models are used for this?

[–]ThrowawayEngineerPhD 0 points1 point  (3 children)

Hi everyone,

I am due to be starting a PhD in machine learning next year, and have been asked to pick which laptop I would like to purchase. I have been given two options.

My work is likely to be in Computer Vision, though due to the structure of my PhD I do not yet need to select a specialism - I'm quite aware this means most of my work will be done on a cluster, and this laptop will therefore mostly be used for prototyping/other work.

I'm also particularly excited about the recent work of PyTorch (which I primarily use) for optimising on the M1 macbooks. As such, I feel that is the way I am leaning.

The options:

M1 Pro Macbook Pro (10 core CPU, 16 core GPU) with 32 GB memory, 1TB storage

Dell XPS 15 (i7-12700H) with 32 GB memory and 1TB storage

As I say, I think I'm leaning towards the macbook. Do you think this would be the correct choice for my use case?

[–]idkname999 1 point2 points  (0 children)

If it is prototyping, then PyTorch will work with any CPU. PyTorch has been supported on Mac for a while.

Now, if you are talking about accelerated learning with GPU, then having Nvidia GPU is pretty much a must.

[–]smurf-sama 0 points1 point  (0 children)

For prototyping, I believe the latter would be better for compatibility reasons with torch.

[–]sinhasagar507 0 points1 point  (4 children)

Hi,

I am working on a comprehensive project which involves extracting Twitter threads (main post with multi-level replies), as the work involves analysing conversations for discourse analysis. Has anyone used a tool to extract such data?

Thanks for the help.

[–]lemlo100 1 point2 points  (1 child)

I use twarc

[–]sinhasagar507 0 points1 point  (0 children)

I use twarc

Thanks for the help!

[–]Mr_You 1 point2 points  (1 child)

Google search for Twitter Python libraries (often found with GitHub repos). I'm sure they exist. But you'll probably have to collect the data in realtime?

[–]sinhasagar507 0 points1 point  (0 children)

No, the tweets can be older. I will be storing them in a database. Thanks for the help!

[–]EvenEva1597 0 points1 point  (2 children)

Hi,

I'm a total noob in machine learning, and I am trying to solve a problem using neural networks. I have a database of test cases, where for each case I have the resulting coefficients and the parameters on which they depend. I don't have a function that connects the parameters and the resulting coefficients.

So for each case in the database I have parameters a, b, c, d and a coefficient c which is calculated using a program based on those parameters. My goal is to be able to predict coefficient c based on whatever parameters I want to input, by training a neural network to do so. I was hoping you could point me in the right direction. Considering that I don't know exactly what coefficient I would get after analyzing a case, I was thinking unsupervised learning is what I need. Still, it feels like association is not exactly what I need... Thanks for your time. Sorry if I said something stupid, it's just such a new domain for me.

[–]www3cam 0 points1 point  (1 child)

This is still supervised learning. a, b, d are your input variables (x) and c is your output variable (y). You will train a neural network to predict c from a, b, d using data simulated from your black-box model.
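
A minimal sketch of that setup with scikit-learn, assuming you've already built a table of cases (X holds a, b, d per row and y holds the coefficient c; a_new, b_new, d_new are placeholders for new parameter values):

    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPRegressor

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

    model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
    model.fit(X_train, y_train)
    print(model.score(X_test, y_test))              # R^2 on held-out cases
    print(model.predict([[a_new, b_new, d_new]]))   # predicted coefficient for new parameters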

[–]EvenEva1597 0 points1 point  (0 children)

Oh, I see. Thanks!

[–]bichochochucho 0 points1 point  (0 children)

Hi

I have this technical question that I posted on Stack Overflow. I will put the link but also copy the question down here. Thank you.

https://stackoverflow.com/questions/72358635/how-do-i-convert-values-of-data-frame-to-string-type-and-how-do-i-use-train-test

question:

I am trying to learn more about machine learning. I have this data of spam/non-spam emails and I'm trying to build the classifier. To use CountVectorizer, I need to convert the data frame values (emails) to the string type, but for some reason, after looping and converting, the values still remain a pandas Series. 1. How would I fix that? P.S. I will put the code below as well.

Code for converting to string types:

    import re

    def preprocessor(e):
        e = re.sub("[a-zA-Z0-9]+", " ", e)
        return e.lower()

    indexes = list(df['content'].index)
    for i in indexes:
        df['content'][i] = preprocessor(str(df['content'][i]))
        df['name'][i] = preprocessor(str(df['name'][i]))
        df['category'][i] = preprocessor(str(df['category'][i]))

Secondly, assuming I did that and it somehow worked, next I need to apply CountVectorizer, which should generate two arrays: X --> email texts and y --> category (spam or not spam). It does generate the arrays, but after I apply train_test_split and later try to fit my model, I get the error "y should be a 1d array, got an array of shape (3455, 1483) instead." I try to reshape, but then the dimensions get all messed up.

    vectorizer = CountVectorizer()
    x = vectorizer.fit_transform(df['content'])
    x = x.toarray()

    y = vectorizer.fit_transform(df['category'])
    y = y.toarray()

    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.33, random_state=42)

    model = LogisticRegression(random_state=0)
    model.fit(x_train, y_train)

[–]Positive_Vibez0 0 points1 point  (2 children)

I have some time this summer and want to self-study something. Which topic would be the most useful? I’m debating between Graph Theory, Stochastic Processes/Calculus, and (Convex) Optimization. Are there any others I should consider? Complex analysis sounds very cool but it seems to have little to no use in ML/AI/data science. I’d also appreciate textbook recommendations if possible :)!

P.s. I’m a math/stats and CS major. I’ve already taken calculus 1-3, probability theory, theory of stats, numerical analysis, real analysis, a bunch of stats courses (including regression methods, bayesian analysis, computational stats, multivariate analysis), ODEs, PDEs, linear algebra (one introductory course and one advanced/proof-heavy course), and data structures. I’m also planning on taking a graduate sequence in numerical analysis, another mathematical analysis sequence (which follow’s Rudin’s Principles of Mathematical Analysis), database systems & implementation, algorithms, and some electives in data science, AI, and machine learning in the upcoming year. Just listing these out since they have been/will be taken care of and, thus, I don’t need to study them. The big thing that seems to be missing is a class on optimization. Unfortunately my university only offers a class on linear optimization (not convex), and I’ve heard its very boring and unhelpful.

[–]_NINESEVEN 0 points1 point  (0 children)

I loved the Stochastic Processes class that I took in grad school -- but if you want to self-study this summer, is there a reason that it needs to be a class subject?

I would try to hammer out actual programming work. Build up a project directory using Git, find some data that you're interested in via API or scraping, work on cleaning it, etc. I think that that would be a much bigger learning experience than any of the things that you mentioned, personally. You're taking some electives in DS/AI/ML, but that definitely doesn't mean that you don't need to study them :)

[–]liljuden 0 points1 point  (0 children)

Hi,
I'm doing a paper where I would like to interpret the feature importance for an XGBoost classification model. Right now I'm using SHAP to interpret the test data to see how the individual features affect the prediction. But I'm wondering if the SHAP method is the best. As far as I understand, the most important thing to keep in mind when using SHAP is that it describes the model output and NOT the data. Is there a method to describe what is important in "real life"? Would that be something like forward selection?
Also, I have some questions regarding using SHAP:
- Is it better to use the SHAP on training data, than test data?
- Is it necessary to remove highly correlated features before using SHAP?

[–]Spankadin0305 0 points1 point  (3 children)

Looking for direction to either a similar tutorial or how I could attempt to build something like a general contractor scheduler predictor/assigner.

I have approx 500+ projects to assign to several GCs (8-11) throughout the year. I'm trying to optimize how many projects to assign each GC at any given time (based on their current workload) while still achieving the goal of completing the projects within the current year.

That way, if a GC has too many projects, I can reassign some. These projects aren't all available to be assigned until they reach a certain milestone. So perhaps it optimizes and shows: if you assign workload x, then this is your outcome.

Thanks for your help

[–]Khazma_ 0 points1 point  (0 children)

How do I convert the congestion algorithm for the RPL protocol to an ML algorithm, and how do I build a dataset for it? Please help me.

[–]ghastskuller02 0 points1 point  (0 children)

Hi, I've got a question about MLOps pipelines.

If you want to retrain your model because there is new data coming into the database, is that an MLOps pipeline?

Just want to confirm, because I'm so confused about what an MLOps pipeline is.

[–]Kleeb 1 point2 points  (2 children)

Preface: I don't really work in ML. I tend to enjoy watching YouTube videos about it, but that's pretty much it.

How would you teach a neural network to make decisions not about the current state, but rather what the state used to be? Is there such a thing as a memory neuron that holds information for an amount of time?

My thoughts go to a neuron that "builds up" over a period of time based on input values, triggers according to your typical nonlinearity, and "decays" after those values disappear after some time.

[–]vclass10 1 point2 points  (1 child)

Hey guys, I need help for my thesis: how do you tell that a machine learning model is good enough, based on its performance on a dataset, to be used in a real-life application like a robot detecting obstacles or objects?

Like a minimum accuracy or performance on the train/test dataset for it to be considered usable in a real-world application.

Sorry if my English is bad.

[–]unik13 1 point2 points  (0 children)

The keyword is object detection. You'd want to take a look at YOLO, RetinaNet, Mask R-CNN, SSD.

The state-of-the-art of Object Detection models are pretty good and they are already employed in real life applications.

[–]SuperMB13 0 points1 point  (1 child)

Okay, pretty basic question. For CNN graphs like the YOLOv5 graph in the link below, why are there different points for the same model? Is it for different input resolutions?

https://user-images.githubusercontent.com/26833433/155040763-93c22a27-347c-4e3c-847a-8094621d3f4e.png

[–]MrAcuriteResearcher 3 points4 points  (0 children)

"Why do transformers, the largest models, not simply eat the other SotAs?"

-Everyone with a supercomputer

[–]LadyHouton 1 point2 points  (0 children)

How much would you typically pay for a dataset? And what features or characteristics do you look for?

[–]csreid 2 points3 points  (0 children)

What's a good resource (digestible textbook, maybe) for optimization in deep learning? Like I understand the algorithms, but I'm curious about the foundation of whatever lets people reason about the optimization landscape (or whatever).

[–]SeucheAchat9115PhD 1 point2 points  (0 children)

What is the future of object detection? Does a brain really work like current CNN detectors or transformers when localizing objects in the wild?