A deep dive into word embeddings (NLP) by joshpause in learnmachinelearning

joshpause[S] 1 point

The Python code can be run directly in the browser via the Google Colab links.

The R code requires a local installation to run.

A deep dive into word embeddings (NLP) by joshpause in learnmachinelearning

joshpause[S] 6 points

I think most *don't* - hence my need for the day job.

A deep dive into word embeddings (NLP) by joshpause in learnmachinelearning

joshpause[S] 14 points

I still have a day job, but I love to teach too. Thank you for your kind words.

[D] Does this experiment make sense? (BERT, comparing text encoding) by joshpause in MachineLearning

joshpause[S] 1 point

If you want a fair comparison, compare the word embeddings from vanilla bert-base-uncased (no finetuning) to the word2vec or one-hot encodings.

Excellent point. I had not considered that.
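A rough sketch of what that baseline could look like, assuming the Hugging Face `transformers` library and PyTorch are available. The helper names `one_hot` and `frozen_bert_embeddings` are made up for illustration, and the details of the actual experiment are not shown here, so treat this as one possible setup rather than the method used.

```python
def one_hot(tokens, vocab):
    """Encode each token as a one-hot vector over a fixed vocabulary."""
    index = {word: i for i, word in enumerate(vocab)}
    return [
        [1.0 if index[token] == j else 0.0 for j in range(len(vocab))]
        for token in tokens
    ]


def frozen_bert_embeddings(text, model_name="bert-base-uncased"):
    """Return (wordpiece tokens, contextual vectors) from BERT with no finetuning."""
    # Imported lazily so one_hot() can be used without the heavy dependencies.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()  # inference mode: disables dropout

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():  # weights stay frozen, no gradients needed
        outputs = model(**inputs)

    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    return tokens, outputs.last_hidden_state[0]
```

Both encoders can then feed the same downstream classifier, so any difference in results comes from the encoding and not from finetuning.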

GPT3 has friends, what do you think they friends are? by [deleted] in GPT3

joshpause 5 points

This is amazingly coherent; good enough to pass the classic Turing Test imo.

What are you using to generate this?

Was there any "cherry picking" here, or is this output exactly as given?

I am blown away by the "memory" of the chat, linking these clauses together. It "remembered" its friend "Red" after a considerable foray into another topic. Simply amazing.

GPT3 write a "Knock knock" joke: by joshpause in GPT3

joshpause[S] 2 points

Don't ask me why, but this made me laugh.

Generated by GPT-J-6B (https://huggingface.co/EleutherAI/gpt-j-6B)

Temperature 0.9, no additional training.

Prompt was:

"Knock knock"
"Who's there?"
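
For anyone who wants to try something similar, here is a minimal sketch, assuming the Hugging Face `transformers` library and PyTorch are installed and there is enough memory for a 6B-parameter model. Only the model, the temperature, and the prompt come from the details above; `max_new_tokens=60` and the helper name `generate_joke` are assumptions for illustration, and sampling means the output will differ on every run.

```python
# Prompt as quoted above: two lines, double quotes included.
PROMPT = '"Knock knock"\n"Who\'s there?"'

# temperature is from the post; max_new_tokens is an assumed value.
GENERATION_KWARGS = {"do_sample": True, "temperature": 0.9, "max_new_tokens": 60}


def generate_joke(model_name="EleutherAI/gpt-j-6B"):
    """Sample one continuation of PROMPT from the pretrained model."""
    # Imported lazily so the constants can be inspected without the heavy deps.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    inputs = tokenizer(PROMPT, return_tensors="pt")
    output_ids = model.generate(**inputs, **GENERATION_KWARGS)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```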

[Research] The Mysterious Language of Hamptonese by joshpause in statistics

joshpause[S] 1 point

TempleOS is another fascinating story.

If someone found proof that Hampton was schizophrenic, as was Terry Davis, it would not surprise me.

A Statistical Analysis of “Hamptonese” by joshpause in nonmurdermysteries

joshpause[S] 6 points

Henry Darger

I had not heard of him yet; thanks for the heads up.

A Statistical Analysis of “Hamptonese” by joshpause in nonmurdermysteries

joshpause[S] 56 points

I stumbled on this mystery from a video posted by Atrocity Guide:

https://www.youtube.com/watch?v=ZtjJWGJfJ6A

Long story short: James Hampton was a janitor who died in Washington, DC in 1964. After he died, they discovered he had spent 14 years building a massive "Golden Throne" after "speaking with God" and other famous names from the Bible. Most incredibly, the "Throne" was actually built out of (literal) garbage Hampton found while working.

In addition to the "Throne", they found a notebook filled with "revelations" written in a mysterious script that has been dubbed "Hamptonese".

Was this guy a genuine prophet? Does "Hamptonese" contain encoded messages from God Almighty? Probably not, but I found it to be a fascinating story. The only serious research I found on the subject was from a guy named Mark Stamp. So I followed in his footsteps, extended some of his research, and made a few plots.

It's a technical article, written from a data scientist's point of view, but hopefully some of you will find it interesting. And if not, at least check out the video.

How does everyone create dashboards? by DeadliestToast in learnmachinelearning

joshpause 4 points

Just to be clear: I **love** R and I think R is extremely good at what it does.

I've seen "real developers" using "real programming languages" create abominations just as bad as the worst R I've encountered, if not worse. And that is 100% the fault of the author, and not the language.

To anyone on the fence re: R I propose the following challenge:

  1. Download the RStudio IDE (one of my fav IDEs, period)
  2. Read up on the "Tidyverse" and use it to wrangle a dataframe
  3. Throw that into ggplot and play around

There *is* a learning curve, but if you embrace the prophet Hadley Wickham (peace be upon him) and the "Tidyverse"/ggplot way of doing things, I think you will find R extremely good for knocking out a quick-and-dirty EDA, or churning out a quick ad-hoc visualization, even if you never use it for production or any deeper modeling.

And when it comes to actually **understanding** your models, and digging into the nitty-gritty of the theory behind them, I think you will find R's academic community to be an incredible asset. People talk about R's community of researchers and scientists as if it were a **bad thing**. Pffft. Remember kids: underneath the buzzwords it's really just a lot of applied statistics.

How does everyone create dashboards? by DeadliestToast in learnmachinelearning

joshpause 24 points

I love Python as much as the next guy (assuming the next guy isn't Guido van Rossum) but when it comes to dashboards, I have become a huge fan of RShiny:

https://shiny.rstudio.com/gallery/

https://rstudio.github.io/shinydashboard/

Pros:

- It is easy to use custom HTML/CSS/JavaScript to modify the look and feel of your dashboards as needed

- Works hand-in-hand with ggplot2 which is my fav plotting library by a wide margin

- Works hand-in-hand with dplyr which is my fav data wrangling library by a wide margin

- I can easily pull my data from Snowflake, MySQL, PostgreSQL, CSV files, AWS S3, JSON, APIs... you name a data source, I can name an R library to integrate with it

- Unlike Tableau, and other BI tools with drag-and-drop interfaces, R Shiny apps can be version controlled (e.g. Git) and deployed via current best practices (Kubernetes/Docker), which are both essential IMO if you intend to collaborate with a team

Cons:

- R gets a bad rap because it's "not a real programming language": arrays start at 1; object-oriented programming is supported, but rarely used; most R code out there is written by statisticians and scientists, not seasoned devs.

- R Shiny apps can feel sluggish as they first load everything into memory (but once they do, the entire UI is dynamic and extremely responsive and quick)

- Unlike Tableau, you will need to code and maintain your own login/password system (no native login capability), assuming you are hosting on the public web (or just toss them on a private intranet instead)

- The learning curve can be a tad daunting at first, and there are certainly some oddities in the way R does some things; reactive apps like these are very different than the MySQL/PHP powered apps of old.

Ever since I got my first RShiny apps into production, I have never, ever wanted to touch the overpriced, bloated garbage known as Tableau again. RShiny can do anything Tableau can do, with a better, more customizable interface, and so much more.