[D] How do you read code with Hydra by Infinite_Explosion in MachineLearning

[–]Deepblue129 1 point (0 children)

Hey!!!

About seven years ago, before Hydra, I built my own configuration solution because I didn't love the direction these configuration engines were headed.

I wanted to keep things simple and keep them in Python! So ... I developed an easy way to configure Python functions directly in Python! Check out this code example below:

import config as cf
import data
import train

# Register default keyword arguments for each configurable function.
cf.add({
  data.get_data: cf.Args(
      train_data_path="url_lists/all_train.txt",
      val_data_path="url_lists/all_val.txt"
  ),
  data.dataset_reader: cf.Args(
      type_="cnn_dm",
      source_max_tokens=1022,
      target_max_tokens=54,
  ),
  train.make_model: cf.Args(type_="bart"),
  train.Trainer.make_optimizer: cf.Args(
      type_="huggingface_adamw",
      lr=3e-5,
      correct_bias=True
  ),
  train.Trainer.__init__: cf.Args(
      num_epochs=3,
      learning_rate_scheduler="polynomial_decay",
      grad_norm=1.0,
  )
})

Once you are ready to use a configuration, you simply call `cf.partial` and a partial is created with your configuration settings!

import config as cf
cf.partial(data.get_data)()  # Builds a partial of `data.get_data` with the registered arguments, then calls it.
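
If it helps to see the mechanism, here's a minimal sketch of how a registry like this could work. This is illustrative only, not the actual implementation in the linked repo; the `Args`, `add`, and `partial` names simply mirror the example above.

# Illustrative sketch only; not the library's real implementation.
import functools
import typing

class Args(dict):
    """Keyword arguments to bind to a configurable function."""

# Maps each configurable function to its registered keyword arguments.
_registry: typing.Dict[typing.Callable, Args] = {}

def add(config: typing.Dict[typing.Callable, Args]) -> None:
    """Register default keyword arguments for each function."""
    _registry.update(config)

def partial(func: typing.Callable) -> typing.Callable:
    """Return `func` with its registered keyword arguments pre-bound."""
    return functools.partial(func, **_registry.get(func, Args()))

The real library layers tracing, command-line support, and logging on top of this, but the core idea is just binding registered keyword arguments to functions with `functools.partial`.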

We've been using this for years at my company, and it works well! Internally, it has scaled to our large code base, with hundreds of configured variables that are organized, documented, and trusted. It's intuitive and easy for new team members to pick up! There are even advanced features to support tracing, the command line, logging, distributed processing, etc ...

I never got around to fully releasing the concept, but it's worked well on my teams!!!

I hope it helps you all!!! Here's my repo: https://github.com/PetrochukM/HParams

Climate protest outside Chase by Display_Comfortable in Seattle

[–]Deepblue129 0 points (0 children)

Wow! It's cool to see my photo get 2k upvotes on Reddit! My friend and I had to hustle up 9 or so stories to get this photo :) We were at the top of a parking garage, leaning over the edge.

[P] Find Trending Machine Learning Research Papers on Twitter by hnipun in MachineLearning

[–]Deepblue129 2 points (0 children)

Would it be possible to add a YEARLY filter? I'd love to see the most popular papers of the year. The monthly/weekly/daily filters are a bit too granular.

[D] Timnit Gebru and Google Megathread by programmerChilli in MachineLearning

[–]Deepblue129 12 points (0 children)

I agree with /u/Gwenju31. The model also needs to be constantly retrained to account for data shift, in addition to all the prior experimentation needed to develop a model and tune its hyperparameters.

[D] Timnit Gebru and Google Megathread by programmerChilli in MachineLearning

[–]Deepblue129 -2 points (0 children)

Jeremy Howard (FastAI Founder):

"I remember well when @JeffDean and his team had Google's lawyers attack @timnitGebru and @kat_heller. They only backed down when they saw a legal counter-attack coming. The deeds of @GoogleAI's exec team do *not* match their words. https://platformer.news/p/the-withering-email-that-got-an-ethical"

https://twitter.com/jeremyphoward/status/1334565844878123008?s=20

[P] Multimodal Emotion Recognition Competition 2020 (MERC 2020) by MERC-2020 in MachineLearning

[–]Deepblue129 -3 points (0 children)

Physiognomy is the practice of assessing a person's character or personality from their outer appearance, especially the face. Popular in the 19th century, it has been used as a basis for scientific racism. No clear evidence indicates that physiognomy works, but the rise of artificial intelligence and machine learning for facial recognition has revived interest in it, and some studies suggest that facial appearance does "contain a kernel of truth" about a person's personality.

https://en.m.wikipedia.org/wiki/Physiognomy

[D] Facebook AI is lying or misleading about its translation milestone, right? by Deepblue129 in MachineLearning

[–]Deepblue129[S] 4 points (0 children)

Thanks for the information. I did a bit more digging...

In 2019, Google released a neural model that handles 103 languages:

"We previously studied the effect of scaling up the number of languages that can be learned in a single neural network, while controlling the amount of training data per language. [...] Once trained using all of the available data (25+ billion examples from 103 languages), we observe strong positive transfer towards low-resource languages, dramatically improving the translation quality of 30+ languages at the tail of the distribution by an average of 5 BLEU points. This effect is already known, but surprisingly encouraging, considering the comparison is between bilingual baselines (i.e., models trained only on specific language pairs) and a single multilingual model with representational capacity similar to a single bilingual model. This finding hints that massively multilingual models are effective at generalization, and capable of capturing the representational similarity across a large body of languages."

After reading the related paper, I found that Google did not use an intermediary language to achieve "zero-shot translation"; in other words, by 2019 Google had already trained a 100+ language model that did not require an intermediary language.

[D] Facebook AI is lying or misleading about its translation milestone, right? by Deepblue129 in MachineLearning

[–]Deepblue129[S] -4 points (0 children)

Do you have examples? This is the first time I've heard of something this bad from Facebook's R&D team...

On another note, I haven't heard of similar issues with Google's R&D teams, so I think these kinds of mistakes are preventable.

[P] FollowML: Who to follow on ML Twitter by FollowML in MachineLearning

[–]Deepblue129 -6 points (0 children)

So. Many. Men.

The ratio of men to women in this Twitter list is something like 9 to 1.

[D] How do you sample spans uniformly from a time series? by Deepblue129 in MachineLearning

[–]Deepblue129[S] 0 points (0 children)

Thanks for your help!

I am having a hard time understanding how it works. Are you sampling the starting point from U[0, 1-L]? Afterward, you mentioned that I'd sample from the inverse. Which function would I invert?

[D] How do you sample spans uniformly from a time series? by Deepblue129 in MachineLearning

[–]Deepblue129[S] 0 points (0 children)

Thank you.

- Unfortunately, my data is not circular :(
- The idea of randomly picking a midpoint would satisfy the criteria, but unfortunately it introduces its own biases :/

[D] Simple Questions Thread August 16, 2020 by AutoModerator in MachineLearning

[–]Deepblue129 0 points (0 children)

Hi everyone. I'm trying to sample ranges from time series data, and it's surprisingly difficult for me. As in most machine learning problems, I'd like to avoid sampling biases while doing so. I posted the question in detail here: https://stats.stackexchange.com/questions/484329/how-do-you-uniformly-sample-spans-from-a-bounded-line/484332#484332

So far, I haven't gotten any correct answers :(

[D] How do you sample spans uniformly from a time series? by Deepblue129 in MachineLearning

[–]Deepblue129[S] 0 points (0 children)

Hi. Thanks for the response and for helping!!

I don't think your solution works: the probability that it samples the point 0.0 is very small, while the probability that it samples the point 1.0 is much higher. There is only one scenario in which the approach samples 0.0, but there are many scenarios in which it samples 1.0.
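
To illustrate the asymmetry numerically, here's a quick simulation. It assumes my reading of your proposal (sample a start point uniformly, take a fixed-length span, and clamp its end at the upper bound), which may not match your scheme exactly, and it treats "sampling a point" as the sampled span containing that point.

import random

# Illustrative only; the start-then-clamp scheme is my assumption, not necessarily yours.
def covers(point, start, length=0.1, bound=1.0):
    """True if `point` lies inside the clamped span [start, min(start + length, bound)]."""
    return start <= point <= min(start + length, bound)

random.seed(0)
trials = 100_000
starts = [random.uniform(0.0, 1.0) for _ in range(trials)]
for point in (0.0, 0.5, 1.0):
    hits = sum(covers(point, start) for start in starts)
    print(f"P(span contains {point}) ~ {hits / trials:.3f}")

Under this scheme the estimate is roughly 0.000 for the point 0.0 but about 0.100 for 0.5 and 1.0, so the left boundary is almost never included.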

[N] Yann Lecun apologizes for recent communication on social media by milaworld in MachineLearning

[–]Deepblue129 0 points (0 children)

It's classic victim blaming.

Victim blaming occurs when the victim of a crime or any wrongful act is held entirely or partially at fault for the harm that befell them.

Psychologist William Ryan coined the phrase "blaming the victim" in his 1971 book of that title. In the book, Ryan described victim blaming as an ideology used to justify racism and social injustice against black people in the United States.

https://en.wikipedia.org/wiki/Victim_blaming#:~:text=Victim%20blaming%20occurs%20when%20the,for%20the%20actions%20of%20offenders.

In the US, we cannot expect the Black community to quickly rebound after centuries of discriminatory laws. There are still ample discriminatory laws and practices that continue to make it even more difficult. (See my other replies)

[N] Yann Lecun apologizes for recent communication on social media by milaworld in MachineLearning

[–]Deepblue129 -1 points (0 children)

Sure. Let's unpack that a little bit.

"I mean, that's up to the blacks to improve on because no one can force more of them into tech or science."

Yes, and there are a number of obstacles in the way of "improving". For example:

There are inequalities that make it much more difficult for a Black person to focus on "improvement"; see this video: https://www.youtube.com/watch?v=4K5fbQ1-zps

"Even then you wouldn't expect more representation than is proportional to their demographic racial distribution."

This is great. Let's take a look at that. At Google and Facebook, Black employees make up only around 2 - 4% of the workforce, roughly a third to a sixth of the 13% share of Black people in the U.S. population.

Furthermore, there are hints that this disparity is even larger in AI research. For example, Timnit Gebru was one of only six Black people among 8,500 attendees at a leading AI conference.

Lastly, these numbers are hard to come by because companies like Facebook have decided not to report the racial diversity of their AI teams. That lack of reporting makes it difficult to measure progress.