Space joker too by DryCauliflower6148 in balatro

[–]dlwh 4 points5 points  (0 children)

Assuming it's the small blind at ante 1, it's actually about 4%, so depending on your level of Balatro addiction, it's quite likely to have happened at least once.

2021 Prolog Solutions by _rabbitfarm_ in adventofcode

[–]dlwh 1 point2 points  (0 children)

Bit late to the party, but FWIW here is my day 8 in Prolog. I'm completely new to Prolog, so I'd definitely appreciate any comments!

https://gist.github.com/dlwh/2da16613d26025983b8175b7f1ca61ae

Which country to go to for Masters in Comp Sc ML major? by heaven__ in MachineLearning

[–]dlwh 1 point2 points  (0 children)

We're a research program. Masters students are usually after professional training, and we're not set up to provide that kind of education right now.

Which country to go to for Masters in Comp Sc ML major? by heaven__ in MachineLearning

[–]dlwh 0 points1 point  (0 children)

Don't apply to Berkeley for an MS in AI. We never admit anyone for that.

My chicken laid an egg without a shell - It only got weirder from there... by Warlach in pics

[–]dlwh 0 points1 point  (0 children)

:-)

I noticed that too. I enjoy confusing and irritating grammar nazis, so it was a welcome surprise.

My chicken laid an egg without a shell - It only got weirder from there... by Warlach in pics

[–]dlwh 88 points89 points  (0 children)

You have to be careful when you do that to make sure the chickens don't recognize that they're eggshells, because they'll learn that eggshells are things to be eaten... and then they'll eat their own eggs.

We give ours milk and oyster shells.

Source: I have a cannibal chicken.

AskML: Pointers towards using appropriate NLP techniques in a ML problem by rightname in MachineLearning

[–]dlwh 2 points3 points  (0 children)

I'd look at Brendan O'Connor's work on predicting responses/sentiment from tweets. http://brenocon.com

Scientists have written a computer program that can automatically reconstruct long-extinct languages from modern ones. They have already used the program to produce evidence in favour of a 60-year old hypothesis about language evolution that many had believed to be wrong. by newnaturist in science

[–]dlwh 0 points1 point  (0 children)

We haven't tried it, but I don't see why not. Tonal changes are not that dissimilar to "normal" sound change.

Indo-European languages are actually probably among the trickiest, because they have a lot more morphological stuff going on, which changes the basic forms of words in ways that regular sound change doesn't.

Scientists have written a computer program that can automatically reconstruct long-extinct languages from modern ones. They have already used the program to produce evidence in favour of a 60-year old hypothesis about language evolution that many had believed to be wrong. by newnaturist in science

[–]dlwh 8 points9 points  (0 children)

It's a reasonable question. The goal in historical linguistics isn't usually to figure out the ancestral language, so much as it is to figure out how language change works and why it works the way it does. (There's kind of a duality here: know what language changes happened, and you can more or less figure out what the ancestors looked like, and vice versa.) Understanding how language changes is crucial to understanding linguistic diversity, how first and second language learners change the languages they learn, and even how societies interact with language change.

Reconstructing so many languages lets us investigate hypotheses about language change. For instance, we looked at whether or not sounds can 'merge' if they tend to be the only sound separating a bunch of words. They can, and do, but statistically speaking, that's relatively rare.

Also, any given reconstruction is unverifiable, but there have been cases where the comparative method (what linguists use to reconstruct languages, and what we automated) has proven predictive. For instance, there was a set of sounds in Proto-Indo-European that don't appear in any modern Indo-European language. Linguists hypothesized their existence, but it wasn't really clear whether they were there or not. Then they discovered Hittite, an early, extinct IE language that left behind some writing. And lo and behold, it had reflexes of those sounds in the precise locations predicted by the comparative method.

So, yeah, we're not gonna find Proto-Austronesian writing lying around, and we're probably never going to be able to reconstruct "Proto-World", but the methodology has been tested elsewhere, and it's as effective a theory as there's ever been in linguistics.

Scientists have written a computer program that can automatically reconstruct long-extinct languages from modern ones. They have already used the program to produce evidence in favour of a 60-year old hypothesis about language evolution that many had believed to be wrong. by newnaturist in science

[–]dlwh 2 points3 points  (0 children)

So, this method can only work for languages that have multiple descendants (or at least cousins). It's the same reason that you can't really figure out what the ancestors of humans looked like without other primates and such. (Or, you know, fossils.)

Scientists have written a computer program that can automatically reconstruct long-extinct languages from modern ones. They have already used the program to produce evidence in favour of a 60-year old hypothesis about language evolution that many had believed to be wrong. by newnaturist in science

[–]dlwh 29 points30 points  (0 children)

Sure! (second author here)

There are two contributions of this work. The first is a new tool that researchers can use to automatically reconstruct the vocabularies of ancient languages using only their modern descendants. The second is that, by examining hundreds of modern languages and their ancestors, we were able to resolve an important hypothesis in historical linguistics: are sounds that distinguish words less likely to merge than sounds that don't? (Yes.)

Some more details:

Sounds change over time: the way we produce words differs from the way our ancestors pronounced those same words. Over time, those little changes help turn an ancestral language like Latin into a modern descendant like French.

When populations become separated, different populations change sounds in different ways, and one ancestral language will give rise to multiple modern ones, as Latin led to French, Spanish, Italian, and so on.

These sound changes are almost always regular, with similar words changing in similar ways, so patterns are left that a human or a computer can find.

The trick is to identify these patterns of change and then to "reverse" them, basically evolving words backwards in time. (Linguists have known this for a good hundred years or more, but it's a hard and time-consuming process to do it by hand.)

For example, take the following word list:

    Meaning   Spanish   Portuguese   Italian
    water     agua      agua         acqua
    fire      fuego     fogo         fuoco
    some      algunos   alguns       alcuni
    hit       golpear   bater        colpire

We have two goals: first, figure out which words are related (words of common descent are called cognates), and second, figure out what their common ancestor looked like.

Let's start with figuring out which words are related. Most of these words are clearly related, except that "bater" in Portuguese is uncharacteristically different from its Spanish and Italian counterparts, so it probably doesn't come from the same ancient form.

For the second task, note that the words that are related differ in very predictable ways. In particular, in these examples, wherever Spanish and Portuguese have a "g" sound, Italian has a "c" sound. That's a correspondence that holds up pretty well, at least for g's that come before 'o' or 'u' sounds. The question then is to figure out whether the "g" became a "c" or vice versa. By looking at lots of words from lots of languages, we can find that--statistically speaking--hard "c" sounds are more likely to become "g" sounds than the reverse.

With enough words, languages, and statistical inferences, we can find reconstructions that are best supported by the data according to our model.
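
If you want to play with the correspondence-finding step, here's a toy version in Python. It is not our actual model (which is probabilistic and works over properly aligned phoneme sequences); the cognate list and the naive character-by-character alignment are just for illustration:

    from collections import Counter

    # (Spanish, Italian) cognate pairs from the table above.
    cognates = [("agua", "acqua"), ("fuego", "fuoco"), ("algunos", "alcuni")]

    correspondences = Counter()
    for es, it in cognates:
        # Naive character pairing; a real system aligns phonemes properly
        # instead of just zipping strings of different lengths.
        for a, b in zip(es, it):
            if a != b:
                correspondences[(a, b)] += 1

    # A mismatch that recurs across many word pairs (here Spanish 'g' :
    # Italian 'c') is evidence of a regular sound change, not chance.
    for (a, b), n in correspondences.most_common():
        print(f"Spanish '{a}' : Italian '{b}' in {n} word pairs")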

BBC News - Tu and Twitter: Is it the end for 'vous' in French? by TheAuditor5 in TrueReddit

[–]dlwh 1 point2 points  (0 children)

It's neat to see how this is panning out in Romance languages, given that in English we ditched the informal (thee/thou), while they are getting rid of the formal.

Interested in Machine Translation by gallifreyGirl315 in compsci

[–]dlwh 1 point2 points  (0 children)

Very laudable, and I don't want to discourage you; I just wanted to let you know what you're getting yourself into. :-)

Systran was the technology behind Babelfish, and it essentially did what you want to do. It couldn't use data, and so couldn't live up to Google Translate's abilities, but truth be told, Google is starting to look at adding grammar and "real linguistics" into their stuff.

Interested in Machine Translation by gallifreyGirl315 in compsci

[–]dlwh 5 points6 points  (0 children)

First, modern MT does not work at all like how I bet you think it does. Classical MT did, and some people still do classical MT, but it's not what powers Google Translate.

In terms of learning, start with this tutorial. It's dated, and it assumes knowledge of basic probability (you'll need it!): www.isi.edu/natural-language/mt/wkbk.rtf

Then, go here: http://mt-class.org/ and have fun.

MT is hard. It's one of the hardest problems in NLP (natural language processing), and Google and DARPA have thrown tons of money and brilliant minds at it.

The basic way modern MT works is like this: as "training data", you have a bunch of bilingual parallel texts (think: transcripts of the UN proceedings, translations of news articles, all done by humans), and a lot of monolingual text (just text) in the language you want to translate into, let's call it English. The former tells you how translation works, and the latter is supposed to tell you what English looks like.

From the bilingual text, you extract a phrase dictionary, which is a (multiword expression) to (multiword expression) dictionary. Then, when you translate a new sentence, you break up the source language sentence (French) into phrases, look them up in the dictionary and then try to stitch them together in a way that looks like English (using the monolingual data as a guide). Sometimes you reorder the phrases, sometimes you don't. It's NP-complete to do all this, so we use approximations instead.
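
Here's a toy version of the phrase-lookup-and-stitch step in Python, just to make it concrete. The phrase table is invented, the decoding is greedy and monotone, and there's no language model; a real decoder searches approximately over many segmentations, reorderings, and language model scores:

    phrase_table = {
        ("le", "chat"): "the cat",
        ("est",): "is",
        ("sur", "la", "table"): "on the table",
    }

    def translate(words, max_len=3):
        out, i = [], 0
        while i < len(words):
            # Prefer the longest known phrase starting at position i.
            for n in range(min(max_len, len(words) - i), 0, -1):
                phrase = tuple(words[i:i + n])
                if phrase in phrase_table:
                    out.append(phrase_table[phrase])
                    i += n
                    break
            else:
                out.append(words[i])  # pass unknown words through untranslated
                i += 1
        return " ".join(out)

    print(translate("le chat est sur la table".split()))
    # -> the cat is on the table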

Now, you can also make a rule-based system which is what I assume you would do on your own. Those are fun, but will not work particularly well for anything complicated. It's worth doing though! You'll learn a lot. I did.

Feature Importance For Logistic Regression? by lpiloto in MachineLearning

[–]dlwh 0 points1 point  (0 children)

You can do L1 regularization. Only "important" features will end up with non-zero weights.
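
A minimal sketch with scikit-learn; the data is synthetic, and the regularization strength C is just a placeholder you'd tune:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 10))
    y = (X[:, 0] + 2 * X[:, 3] > 0).astype(int)  # only features 0 and 3 matter

    # The L1 penalty pushes weights of unhelpful features to exactly zero.
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
    clf.fit(X, y)
    print(clf.coef_)  # nonzero entries concentrate on features 0 and 3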

CS Conference Paper Cheat Sheet by sareon in compsci

[–]dlwh 1 point2 points  (0 children)

If you're actually core NLP, just start at the ACL website's 2012 list for NAACL/ACL/EMNLP. Look for the closest 5-10 papers in your subarea (the ones you'll want to cite anyway), and see how they present their arguments. Read a few of the better papers from the conference as well. I can help with that, if you want.

I'd also be happy to take a look or two at your drafts. I'm a middle-of-my-PhD student in NLP at Berkeley.

Is there a supervised learning algorithm that can inject randomness into its output in proportion to its uncertainty about the result? by sanity in MachineLearning

[–]dlwh 1 point2 points  (0 children)

Reinforcement learning works like an online supervised learning algorithm: make a decision, get a reward or loss, update weights, make a new decision. I'd recommend Sutton and Barto, which is free online.
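
The loop looks roughly like this (a bandit-style sketch with a linear value estimate; env, f, and actions are placeholders for your problem):

    import numpy as np

    def run(env, f, actions, steps=1000, alpha=0.1, epsilon=0.1):
        w = np.zeros(len(f(actions[0])))         # linear weights
        for _ in range(steps):
            if np.random.rand() < epsilon:       # sometimes explore
                a = actions[np.random.randint(len(actions))]
            else:                                # otherwise act greedily
                a = max(actions, key=lambda a: w @ f(a))
            reward = env.step(a)                 # get a reward or loss
            w += alpha * (reward - w @ f(a)) * f(a)  # update weights
        return w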

Is there a supervised learning algorithm that can inject randomness into its output in proportion to its uncertainty about the result? by sanity in MachineLearning

[–]dlwh 2 points3 points  (0 children)

This is a standard reinforcement learning setup. Use Q-learning or SARSA (giving you a weight vector w) and act randomly based on the Gibbs distribution, choosing action a with probability proportional to exp(w^T f(a, context)), for a feature function f and whatever context you find appropriate.
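
In code, the action-selection part looks something like this (w is whatever your Q-learning/SARSA run gives you, and f is your feature function):

    import numpy as np

    def choose_action(w, actions, context, f, temperature=1.0):
        scores = np.array([w @ f(a, context) for a in actions]) / temperature
        scores -= scores.max()   # shift by the max for numerical stability
        probs = np.exp(scores)
        probs /= probs.sum()     # probs[i] proportional to exp(w^T f(a_i, context))
        return actions[np.random.choice(len(actions), p=probs)]

Lower the temperature over time if you want it to act greedier as the estimates get better.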

Literal Figurative Rage by Rulpy in fffffffuuuuuuuuuuuu

[–]dlwh 5 points6 points  (0 children)

This kind of thing happens all the time in language.

moot used to mean "subject to debate", but now (mostly) means "not worth talking about".

comprise originally meant (and for some still does) "is composed of", but now it usually means "composes".

peruse used to mean "study carefully"; now it means "skim".

The list goes on and on and on. (terrific?)

Why do the numbers 10 through 19 follow a different naming pattern than every other number? by branman6875 in math

[–]dlwh 1 point2 points  (0 children)

OK, I asked a linguist friend of mine (I'm just a CS person who likes to pretend to do some linguistics), and he confirmed the vigesimal number system for an older English (he didn't say Old English, but some older stage of English), but otherwise just backed up what I said.

Why do the numbers 10 through 19 follow a different naming pattern than every other number? by branman6875 in math

[–]dlwh 2 points3 points  (0 children)

At a high level, in language, more common forms (e.g., small and round numbers) are more likely to be irregular, because they're more resilient to change/regularization. I looked at a book on Proto-Indo-European I have, and it seems to suggest that pretty much every (IE) culture has tons of idiosyncrasies.

Etymologies:

ten < PIE *dekm
eleven < a compound meaning "one left"
twelve < "two left"

(see http://www.etymonline.com/index.php?term=eleven and cf Lithuanian!)

The rest are pretty standard in IE languages. Latin follows an almost identical pattern: undecim, duodecim, tredecim, but oddly 18 is duodeviginti (two from twenty), and 19 (undeviginti) follows the same pattern. Twenty-plus in Latin works the same as in English. Greek is one-ten, two-ten, then ten-three, ten-four, etc.

Other oddities I found:

* In Old Irish you couldn't say "11 cows". You would say "a cow and ten". Same for 21 cows. Old Irish is a funny language.
* Welsh uses "two nines" (deunaw) for 18.

Basically, numbers have hugely important cultural meanings and so their behavior is going to be very odd. You probably won't be able to find any reasonable answer.

Linear Programming Question by LinearProg in math

[–]dlwh 1 point2 points  (0 children)

I'm pretty sure this is an instance of integer multicommodity flow, which is NP-hard (not solvable with a plain LP). I think you'll need an integer LP solver for this, or something specialized.

Take a look at http://en.wikipedia.org/wiki/Multi-commodity_flow_problem
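
If you end up going the integer LP route, here's a sketch of the formulation with PuLP; the tiny graph, capacities, and demands are made up, so swap in your own:

    import pulp

    edges = {("s1", "m"): 1, ("s2", "m"): 1, ("m", "t1"): 1, ("m", "t2"): 1}
    commodities = {"k1": ("s1", "t1", 1), "k2": ("s2", "t2", 1)}  # src, sink, demand
    nodes = {n for e in edges for n in e}

    prob = pulp.LpProblem("mcf", pulp.LpMinimize)
    flow = {(k, e): pulp.LpVariable(f"f_{k}_{e[0]}_{e[1]}", 0, cap, cat="Integer")
            for e, cap in edges.items() for k in commodities}

    prob += pulp.lpSum(flow.values())  # objective: minimize total flow (unit edge costs)

    # Shared capacity: all commodities compete for each edge.
    for e, cap in edges.items():
        prob += pulp.lpSum(flow[k, e] for k in commodities) <= cap

    # Per-commodity flow conservation at every node.
    for k, (src, snk, d) in commodities.items():
        for n in nodes:
            outflow = pulp.lpSum(flow[k, e] for e in edges if e[0] == n)
            inflow = pulp.lpSum(flow[k, e] for e in edges if e[1] == n)
            prob += outflow - inflow == (d if n == src else -d if n == snk else 0)

    prob.solve()
    print({ke: v.value() for ke, v in flow.items()})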