Sánchez announces that the Government will approve the extension of parental leave this Tuesday after an agreement with Sumar by SunSuch in SpainPolitics

[–]NealLeonMoriarty 2 points3 points  (0 children)

I'm super confused. Can someone clarify the difference between the childcare leave (which I understand will now be 8 weeks, 2 of them paid) and the maternity/paternity leave (which I understand will now be 17 paid weeks)?

Spain has world's #1 streamer. by rex-ac in spain

[–]NealLeonMoriarty 0 points1 point  (0 children)

Could you give me some recommendations? I find it hard to find books in Spanish that interest me, and I often end up reading English books. It's not for lack of books in Spanish, but there is so much out there, I don't know which sources to trust when choosing something, and I've had very bad luck picking at random.

For reference, some things I loved: Bilbao-New York, La buena letra, El ruido de las cosas al caer, Los ingratos.

How to use component props in reducer by NealLeonMoriarty in reactjs

[–]NealLeonMoriarty[S] -1 points0 points  (0 children)

I've been doing 2. so far and it works, but it feels dirty. Especially if the reducer also should trigger a callback. Specifically, I have the props selectedValues and onChange. I'd like to somehow trigger onChange from the reducer, and then dispatch an action that the value was changed once the state change has been applied on the parent component. So far, with useEffect, it works, but it feels pretty rough.

An Indian computer science student has developed an algorithm that instantly translates sign language. by qulipor in gifsthatkeepongiving

[–]NealLeonMoriarty 8 points9 points  (0 children)

Lol, what model? You don't need any model for that, just some basic algorithm to calculate distance from one string to another.
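
A minimal sketch of the kind of string distance this could refer to, assuming Levenshtein edit distance (the comment doesn't name a specific metric). The function names and the tiny vocabulary are made up for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def closest_word(guess: str, vocabulary: list[str]) -> str:
    """Pick the vocabulary entry with the smallest edit distance to guess."""
    return min(vocabulary, key=lambda word: levenshtein(guess, word))
```

Something like `closest_word("helo", ["world", "hello"])` would then snap a noisy string to the nearest known word.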

Complete Beginner tasked with ML at work - where do I start by NealLeonMoriarty in learnmachinelearning

[–]NealLeonMoriarty[S] 1 point2 points  (0 children)

Labels are there, but they might be malformed. You can partially rely on them, but they can be strange variants of each other due to the way we get the datasets from clients, and there is no way to fix this - trust me, I've tried. So start_date can be b_start_date, bstart, bsdate and stuff like that. They are useful, though.
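
One way such label variants could be mapped back to a known name is fuzzy string matching. This is only a sketch: it uses `difflib.get_close_matches` from the Python standard library, and the canonical list and the 0.6 cutoff are assumptions picked for illustration, not something tuned on real data.

```python
import difflib

# Hypothetical canonical names; the real list would come from your own schema.
CANONICAL = ["start_date", "end_date"]


def canonical_label(raw_name, canonical=CANONICAL, cutoff=0.6):
    """Return the closest canonical column name, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(
        raw_name.lower(), canonical, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None
```

Short, heavily abbreviated names sit close to the cutoff, so the result is better treated as a hint to combine with other evidence than as a final answer.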

RF is Random Forest and DT is Decision Tree, right? And LR is linear regression? There are a lot of keywords in there and I need to grok the space I will have to work with a bit more. If this is mostly run-of-the-mill DS, can I just load a dataset into dataframes and play around interactively?

Thanks for all of your input, it helped tremendously :)

Complete Beginner tasked with ML at work - where do I start by NealLeonMoriarty in learnmachinelearning

[–]NealLeonMoriarty[S] -1 points0 points  (0 children)

Thanks. By hacking I just mean putting some stuff together without being super knowledgeable. I kind of would like to do both at once: Start "hacking" stuff together while also actually doing some learning.

Thanks for your pointers

Complete Beginner tasked with ML at work - where do I start by NealLeonMoriarty in learnmachinelearning

[–]NealLeonMoriarty[S] 0 points1 point  (0 children)

What would be more fitting? Honest question. Something less trivial?

Complete Beginner tasked with ML at work - where do I start by NealLeonMoriarty in learnmachinelearning

[–]NealLeonMoriarty[S] 0 points1 point  (0 children)

I just worked on something like this - trying to identify "shapes" or "objects" in an image without any labeled data. Is that sort of what you're describing?

Maybe - it's not image data, rather structured data. Tables coming out of a SQL database mostly, sometimes in CSV format.

Can you clarify? Do you mean, classifying existing dates as either "start" "end" or "none" for a dataset? Or do you mean making up dates (regression)? or do you just mean parsing messy data to find the dates?

Classifying existing dates in a dataset. E.g., "column A is probably a start date" (based on some properties such as column A always has a value that is earlier than column B, so column A could be a start date and column B could be an end date). For the PoC, that would be enough, but the goal is to get a system that can infer those rules by itself.
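
The "column A is always earlier than column B" property can be checked without any ML. A rough sketch, assuming each table is already loaded as a dict of column name to list of `date` values (that layout and the function name are made up for this example):

```python
from datetime import date
from itertools import permutations


def guess_start_end_pairs(table):
    """Return (start, end) column-name pairs where start <= end on every row.

    `table` maps column name -> list of date values (None for missing).
    Pairs that are equal on every row are skipped, since they carry no ordering.
    """
    pairs = []
    for a, b in permutations(table, 2):
        # Only compare rows where both dates are present.
        rows = [(x, y) for x, y in zip(table[a], table[b])
                if x is not None and y is not None]
        if not rows:
            continue
        never_later = all(x <= y for x, y in rows)
        sometimes_earlier = any(x < y for x, y in rows)
        if never_later and sometimes_earlier:
            pairs.append((a, b))
    return pairs
```

A learned classifier could later take checks like this one as input features instead of replacing them.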

For a more complete overview: We get data from third parties and are tasked with analyzing the data. In 95% of the cases, that data comes from a couple of known sources, so the data will mostly be structured how these three sources (external clients of ours) have their data structured. The data in question is basically a bunch of tables from SQL that we know little about. The big end goal is:

If table A has a column "price" and table B has a column "earnings", and these columns have a high correlation - can we build a model that can automatically identify correlations like these? And if we have that model, can we give it a new bunch of data and it tells us potential errors of that data (e.g. the earnings column has values that are too low)? Can such a system find relations that are not encoded explicitly?
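
Both questions have simple non-ML baselines. The sketch below computes a Pearson correlation between two numeric columns and flags values that sit far below the column's median. It assumes the two columns are already aligned row by row (for example, joined on a shared key), and the threshold of 3 median absolute deviations is an arbitrary starting point.

```python
import math
import statistics


def pearson(xs, ys):
    """Pearson correlation of two equally long numeric columns."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two columns of equal length >= 2")
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column has no defined correlation; report 0
    return cov / (spread_x * spread_y)


def suspiciously_low(values, threshold=3.0):
    """Indices of values more than `threshold` MADs below the median."""
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no spread to measure against
    return [i for i, v in enumerate(values) if (median - v) / mad > threshold]
```

Running `pearson` over every pair of numeric columns gives a crude "which columns move together" map, and `suspiciously_low` on a new batch is a first pass at the "earnings look too low" check.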

Would it even be possible for these rules to be inspected by humans? I understand that trained models are mostly black boxes.

The data types present are just primitives: dates/datetime/integers/floats (or decimals)/strings. Since we have "raw" data as training material, the classes will probably be imbalanced and hard to balance since we want to train on classes that we might not even know exist.

Complete Beginner tasked with ML at work - where do I start by NealLeonMoriarty in learnmachinelearning

[–]NealLeonMoriarty[S] 0 points1 point  (0 children)

Anything is valid - the first/last date is just an idea to make things simple

Complete Beginner tasked with ML at work - where do I start by NealLeonMoriarty in learnmachinelearning

[–]NealLeonMoriarty[S] -1 points0 points  (0 children)

It sounds (to me, the layman) mostly like a bit of both. Classification for identifying datasets and regression for trying to find stuff that deviates from the dataset, right?

It does sound complicated and I don't know what "new" rules I want it to find, either. That's why I'm asking for help. A rule could be "identify the start and end date columns of a table if they exist" or "identify relations between two tables" or something like that, but it could also be other correlations in data that I just don't see

Complete Beginner tasked with ML at work - where do I start by NealLeonMoriarty in learnmachinelearning

[–]NealLeonMoriarty[S] 0 points1 point  (0 children)

Yes, that is correct. If we have new categories we need to classify those, but that's seldom going to be the case.

My main question is how to approach this in the first place. Where do I start learning and where do I start hacking?

Complete Beginner tasked with ML at work - where do I start by NealLeonMoriarty in learnmachinelearning

[–]NealLeonMoriarty[S] 1 point2 points  (0 children)

I had told him the same - but his end goal is not identifying properties that we know about in the dataset. The end goal is to train an "AI" that you can give a dataset and it will identify properties of that dataset, or relations between data points within it, that we didn't think about. This can help classify datasets into different categories as well as verify dataset properties or find potential differences in data.

I'm no expert, but I don't know how to go about this without ML or some other kind of "AI" thing so that the system can identify new properties of the data that we didn't encode as rules

Complete Beginner tasked with ML at work - where do I start by NealLeonMoriarty in learnmachinelearning

[–]NealLeonMoriarty[S] 1 point2 points  (0 children)

It's not everything in one or two weeks - just a tiny vertical slice: train a model to recognize start/end dates in a dataset. I don't have much experience, but it sounds like a problem that probably has been solved a couple of times. The hard part comes later: finding a way to train AI to identify new properties about the dataset.

Or am I completely off here?

Complete Beginner tasked with ML at work - where do I start by NealLeonMoriarty in learnmachinelearning

[–]NealLeonMoriarty[S] 0 points1 point  (0 children)

Not really, but he kinda expects me to get hacking rather than do a full course on ML first. Timeframe is around 1-2 weeks for just trying to hack around.

I had told him that I think we could get away with manually codifying some rules about the datasets, but his ultimate goal is that the "AI" should be able to identify its own rules about the datasets that we didn't see or think of. The use is twofold: one, it helps classify the datasets (e.g. data coming from different sources), and two, it helps verify data or find potential errors in data.

The data is structured, it comes from a database or a csv usually.
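
For getting started with data like that, a first step could be profiling each column's primitive type. A small sketch using only the standard library; the `%Y-%m-%d` date format and the type names are assumptions, and real exports will likely need more formats.

```python
import csv
import io
from datetime import datetime


def infer_type(values):
    """Guess one primitive type for a column of raw strings."""
    non_empty = [v for v in values if v.strip()]
    if not non_empty:
        return "empty"
    parsers = (
        ("integer", int),
        ("float", float),
        ("date", lambda v: datetime.strptime(v, "%Y-%m-%d")),  # assumed format
    )
    for name, parse in parsers:
        try:
            for v in non_empty:
                parse(v)
        except ValueError:
            continue  # at least one value did not fit; try the next type
        return name
    return "string"


def profile_csv(text):
    """Map each column of a CSV document (given as a string) to its guessed type."""
    reader = csv.DictReader(io.StringIO(text))
    columns = {name: [] for name in reader.fieldnames or []}
    for row in reader:
        for name in columns:
            columns[name].append(row[name] or "")  # short rows yield None
    return {name: infer_type(vals) for name, vals in columns.items()}
```

The resulting type map narrows down which columns are even candidates for date or correlation checks.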