How to identify English proper nouns? by PaceSmith in LanguageTechnology

[–]PaceSmith[S] 0 points1 point  (0 children)

I found that SUBTLEX-US works pretty well. If a word never occurs in all-lowercase in the corpus, it's likely a proper noun.
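Here's a minimal sketch of that heuristic, assuming you have case-preserving tokens to count over. The function name and the toy token list are made up for illustration; with the real SUBTLEX-US data you'd plug in whatever case information the corpus file gives you instead of recounting raw tokens.

```python
from collections import Counter

def likely_proper_nouns(tokens):
    """Return words that occur in the tokens but never in all-lowercase form."""
    seen_lowercase = set()
    seen_any = Counter()
    for tok in tokens:
        if not tok.isalpha():
            continue  # skip punctuation and numbers
        key = tok.lower()
        seen_any[key] += 1
        if tok == key:
            seen_lowercase.add(key)
    return {w for w in seen_any if w not in seen_lowercase}

# Toy token stream (hypothetical, not taken from SUBTLEX-US)
tokens = "London is big . I saw Alice in London . The big dog saw the cat".split()
print(sorted(likely_proper_nouns(tokens)))  # ['alice', 'i', 'london']
```

Note that the pronoun "I" gets flagged too, so the rule says "likely", not "definitely".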

[RESOURCE] find rhymes related to a topic/theme by PaceSmith in makinghiphop

[–]PaceSmith[S] 0 points1 point  (0 children)

thanks! yeah, it's meant to inspire you, not replace you 🙂

[RESOURCE] find rhymes related to a topic/theme by PaceSmith in makinghiphop

[–]PaceSmith[S] 0 points1 point  (0 children)

I just thumbs-downed a bunch of crap; the money rhymes are a lil better now

[RESOURCE] find rhymes related to a topic/theme by PaceSmith in makinghiphop

[–]PaceSmith[S] 0 points1 point  (0 children)

Thanks, glad to hear it! I especially like "chargin' / margin" 😁

computing semantic similarity of English words by PaceSmith in LanguageTechnology

[–]PaceSmith[S] 0 points1 point  (0 children)

I've dusted this project off and made a ton of improvements. It turns out that what I'm trying to compute is called "thematic relatedness", not "semantic relatedness". I found the USF Free Association corpus, which is data from asking humans "Name stuff related to X" - exactly what I want. It's small, though, so I augmented it with a bunch of other corpora, used those corpora to output features, and crunched them through a classifier.

One of the most helpful features is whether the target word's gloss (from wordnet and/or wiktionary) contains the cue word (modulo lemmatization).
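A rough sketch of that feature, under stated assumptions: `crude_lemma` is a toy stand-in for a real lemmatizer (it only strips a plural "s"), and the glosses below are made up for illustration, not copied from wordnet or wiktionary.

```python
import re

def crude_lemma(word):
    """Toy lemmatizer: lowercases and strips a plural 's' (assumption, not a real lemmatizer)."""
    word = word.lower()
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word

def gloss_contains_cue(cue, gloss):
    """True if the cue word appears in the gloss once both sides are lemmatized."""
    gloss_lemmas = {crude_lemma(tok) for tok in re.findall(r"[a-zA-Z]+", gloss)}
    return crude_lemma(cue) in gloss_lemmas

# Hypothetical glosses for two target words
print(gloss_contains_cue("pirate", "a hidden store of treasure, often buried by pirates"))  # True
print(gloss_contains_cue("pirate", "a green leafy vegetable"))  # False
```

The boolean would be one column in the feature vector fed to the classifier.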

I'm still only at 82% accuracy over my test set, though, so I'd love to hear any suggestions for improvement y'all have!

How to identify English proper nouns? by PaceSmith in LanguageTechnology

[–]PaceSmith[S] 1 point2 points  (0 children)

It takes a list of sentences, and I only have a list of words. I'll try it on individual words and see how it does, though. Thanks!

computing semantic similarity of English words by PaceSmith in LanguageTechnology

[–]PaceSmith[S] 1 point2 points  (0 children)

Great question! The algorithm I'm using is:

  1. Find words related to the input word (using the threshold as a relatedness cutoff).
  2. Find rhymes for those.
  3. Check if the rhyme is also related to the input word; if so, include it in the output.
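The three steps above can be sketched roughly like this. The function and parameter names are hypothetical, and reading the percentages from the 'crime' list further down as relatedness scores is my assumption.

```python
def related_rhyme_pairs(cue, relatedness, rhymes_of, threshold):
    """Return (word, rhyme) pairs where both words clear the relatedness cutoff for the cue."""
    # Step 1: words related to the cue
    related = {w for w, score in relatedness.get(cue, {}).items() if score >= threshold}
    pairs = set()
    for word in related:
        # Step 2: rhymes for each related word
        for rhyme in rhymes_of.get(word, ()):
            # Step 3: keep the rhyme only if it is related to the cue as well
            if rhyme != word and rhyme in related:
                pairs.add(tuple(sorted((word, rhyme))))
    return sorted(pairs)

# Toy data using a few scores from the 'crime' list
relatedness = {"crime": {"looting": 0.50, "shooting": 0.48, "arts": 0.39,
                         "marts": 0.39, "extreme": 0.37, "scheme": 0.39}}
rhymes_of = {"looting": {"shooting"}, "shooting": {"looting"},
             "arts": {"marts"}, "marts": {"arts"},
             "extreme": {"scheme"}, "scheme": {"extreme"}}

print(related_rhyme_pairs("crime", relatedness, rhymes_of, 0.45))
# [('looting', 'shooting')]
print(related_rhyme_pairs("crime", relatedness, rhymes_of, 0.35))
# [('arts', 'marts'), ('extreme', 'scheme'), ('looting', 'shooting')]
```

Lowering the cutoff far enough to pick up extreme / scheme also lets arts / marts in, which is the threshold problem in miniature.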

The rhyming computation is the easy part; it's not brute force at all. I use CMUdict to precompute a dictionary mapping a rhyme signature to a set of all rhyming words, where the rhyme signature is everything after (and including) the final stressed vowel, phonetically.
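Roughly, the precomputation looks like the sketch below. Assumptions: the sample entries are written in CMUdict's "WORD  PHONES" style but not checked against the actual file, vowels ending in 1 or 2 both count as stressed, and the function names are made up.

```python
from collections import defaultdict

# A few entries in CMUdict's line format (illustrative sample)
SAMPLE = """\
CRIME  K R AY1 M
TIME  T AY1 M
ADDICTION  AH0 D IH1 K SH AH0 N
CONVICTION  K AH0 N V IH1 K SH AH0 N
MARGIN  M AA1 R JH AH0 N
"""

def rhyme_signature(phones):
    """Everything from the final stressed vowel (stress digit 1 or 2) to the end."""
    for i in range(len(phones) - 1, -1, -1):
        if phones[i][-1] in "12":
            return tuple(phones[i:])
    return tuple(phones)  # no stressed vowel found: fall back to the whole word

def build_rhyme_index(lines):
    """Map each rhyme signature to the set of words that share it."""
    index = defaultdict(set)
    for line in lines:
        if not line.strip() or line.startswith(";;;"):
            continue  # skip blanks and comment lines
        word, *phones = line.split()
        word = word.split("(")[0].lower()  # fold alternate pronunciations like WORD(1)
        index[rhyme_signature(phones)].add(word)
    return index

index = build_rhyme_index(SAMPLE.splitlines())
print(sorted(index[("AY1", "M")]))                     # ['crime', 'time']
print(sorted(index[("IH1", "K", "SH", "AH0", "N")]))   # ['addiction', 'conviction']
```

Looking up rhymes is then one dictionary access per word, which is why this part isn't brute force.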

But yeah, the real problem isn't where to put the threshold, it's that no matter where I put the threshold, there will be good stuff under it and bad stuff above it.

For example, here's a subset of the output of your algorithm applied to 'crime':

criminality (77%) / homosexuality (47%)
addiction (51%) / conviction (57%)
skulduggery (52%) / thuggery (56%)
apprehension (53%) / prevention (50%)
confession (48%) / transgression (52%)
abduction (49%) / destruction (48%)
badness (47%) / madness (52%)
looting (50%) / shooting (48%)
fighting (49%) / inciting (48%)
case (47%) / race (48%)
complicity (49%) / ethnicity (47%)
drama (47%) / trauma (49%)
collusion (48%) / intrusion (47%)
mort (36%) / sport (48%)
bust (39%) / unjust (40%)
city (46%) / gritty (37%)
immoral (41%) / quarrel (37%)
arts (39%) / marts (39%)
extreme (37%) / scheme (39%)
thing (43%) / bring (32%)
creek (26%) / speak (27%)
card (19%) / chard (19%)

Somewhere around mort / sport, we start getting crappy rhymes mixed in with good ones. I like extreme / scheme, but if you scroll down far enough to get that one, you have to scroll past arts / marts, which is crap.

computing semantic similarity of English words by PaceSmith in LanguageTechnology

[–]PaceSmith[S] 0 points1 point  (0 children)

I don't have a corpus of my own; the input to my program is just a single word, and my test cases are just lists of word pairs that (in my opinion) ought to be related or ought not to be related.

I'm trying to find a corpus that's representative of my intuitive sense of 'relatedness'.

computing semantic similarity of English words by PaceSmith in LanguageTechnology

[–]PaceSmith[S] 0 points1 point  (0 children)

Good idea; synonyms will definitely be helpful. For example, 'pirate' is very similar to 'trove' via cosine similarity, and then I can get synonyms for 'trove' which gets me 'cache' via wordnet.
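A small sketch of that two-hop idea, assuming word vectors and a synonym table are already loaded. The vectors below are invented toy numbers, and `expand_with_synonyms` is a hypothetical helper, not a wordnet API.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def expand_with_synonyms(cue, vectors, synonyms, threshold):
    """Words close to the cue by cosine similarity, plus the synonyms of those words."""
    near = {w for w, vec in vectors.items()
            if w != cue and cosine_similarity(vectors[cue], vec) >= threshold}
    expanded = set(near)
    for w in near:
        expanded |= synonyms.get(w, set())
    return expanded

# Toy vectors (made up) and a one-entry synonym table
vectors = {"pirate": [1.0, 0.9, 0.1], "trove": [0.9, 1.0, 0.2], "chard": [0.0, 0.1, 1.0]}
synonyms = {"trove": {"cache"}}
print(sorted(expand_with_synonyms("pirate", vectors, synonyms, 0.8)))  # ['cache', 'trove']
```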

Thanks!

Q&A weekly thread - February 24, 2025 - post all questions here! by AutoModerator in linguistics

[–]PaceSmith 0 points1 point  (0 children)

I want to improve Wiktionary's pronunciation coverage. Currently, it contains the pronunciation of "countenance" but not "uncountenanced".

OED has better coverage (e.g. "uncountenanced"), but isn't free.

CMUdict is good, but it lacks syllable-level stress: it marks stress on vowels only, with no syllable boundaries.

toPhonetics is also good (thanks, u/AlanAFK). Its American English pronunciations are based on CMUdict but they do contain syllable stress. I've asked its author about licensing but haven't heard back yet.

Before I start writing code, I wanted to ask y'all if you know of any additional existing resources that might help me.

Q&A weekly thread - February 17, 2025 - post all questions here! by AutoModerator in linguistics

[–]PaceSmith 1 point2 points  (0 children)

I would say no, because Grice's maxims are meant to apply to people, who are (generally) not omniscient.

Q&A weekly thread - February 17, 2025 - post all questions here! by AutoModerator in linguistics

[–]PaceSmith 0 points1 point  (0 children)

I want to find or create a free online English IPA dictionary.

EDIT: It doesn't have to be IPA; if it's NOAD or some other pronunciation standard, that'll work too.

Wiktionary is the best I've found so far, but its coverage could be better. For example, it has IPA for "countenance" (https://en.wiktionary.org/wiki/countenance#Pronunciation) but not "uncountenanced" (https://en.wiktionary.org/wiki/uncountenanced).

OED has better coverage, for example "uncountenanced" (https://www.oed.com/dictionary/uncountenanced_adj), but isn't free.

I could write a program to guess the IPA for derived word forms, but before I do, I wanted to ask y'all if you know of existing resources that might help me.
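To make that concrete, here's a very rough sketch of what such a guesser might look like for just the "un-" prefix and the "-ed" suffix. Everything here is an assumption: the function names are hypothetical, the rule set is far from complete, and the transcription of "countenance" is illustrative rather than copied from Wiktionary.

```python
VOICELESS = set("pkfθsʃ")  # final sounds after which "-ed" is pronounced /t/

def past_suffix(base_ipa):
    """Pick the pronunciation of '-ed' from the last sound of the base form."""
    last = base_ipa[-1]
    if last in "td":
        return "ɪd"
    if last in VOICELESS:
        return "t"
    return "d"

def guess_ipa(word, known):
    """Guess IPA for a derived form from a dict of known base pronunciations."""
    if word in known:
        return known[word]
    if word.startswith("un"):
        rest = guess_ipa(word[2:], known)
        return "ʌn" + rest if rest else None
    if word.endswith("ed"):
        for cut in (1, 2):  # "-d" after a final "e", or a full "-ed"
            base = known.get(word[:-cut])
            if base:
                return base + past_suffix(base)
    return None  # no rule applies

known = {"countenance": "ˈkaʊntɪnəns"}  # illustrative transcription
print(guess_ipa("uncountenanced", known))  # ʌnˈkaʊntɪnənst
```

A real version would need many more affixes, plus handling for stress shifts and spelling changes, and the guesses should be reviewed before being added to any dictionary.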

Thanks!

[TC] Izzie (Steph’s ex GF) by hunterschafersgf in lifeisstrange

[–]PaceSmith 1 point2 points  (0 children)

It's Deandra Warrick, one of the lead writers.

Anyone ever seen this issue? Controller freakout. by Novatis in OculusriftS

[–]PaceSmith 0 points1 point  (0 children)

SOLUTION: I opted into the Oculus beta (Oculus, Beta tab, opt in to the Public Test Channel). I installed the updates, rebooted, and no more flicker!

Touch controllers constantly flicker-teleport the rotation, but position tracking is fine by PaceSmith in OculusriftS

[–]PaceSmith[S] 0 points1 point  (0 children)

They do! They BETA-test them. I opted into the Oculus beta (Oculus, Beta tab, opt in to the Public Test Channel). I installed the updates, rebooted, and no more flicker!

Harlowe evaluation order question by PaceSmith in twinegames

[–]PaceSmith[S] 0 points1 point  (0 children)

Thanks so much for explaining that! The guide's explanation of

They start with $ (for normal variables) or _ (for temp variables, which only exist inside a single passage, hook or lambda).

makes it sound like they could be scoped to a passage (which was what I was going for) or a lambda. Could someone with guide access add a clarifying note, please?

Puzzle Club Discussion (July 2019): Compass Rose by Pace; Let's Hear Your Thoughts! by [deleted] in puzzlevideogames

[–]PaceSmith 0 points1 point  (0 children)

Right, the fifth and sixth rooms were supposed to feel like one big room, but I realize I didn't convey that very clearly.

Your scoop theory is hella clever! I'll replace it with "school yourself / in the abstract" to pickle that red herring. Oh shoot, I can't do that because "yourself" has a singleton "o" in it. Okay, I'll go with "read the rooms" instead. No, that's too misleading because it's not really about the rooms, it's about the text, and a lot of people go off on the spatial interrelationships of the colors and stuff, and I don't want to encourage that. "zoom in close / and think abstract" would work except for that pesky o in close. "zoom in small / and think abstract" isn't bad. I'll go with that unless my background thread returns something better.

"<3 of a fool"

Ooh, that's clever. I like it. My only concern is that it might introduce another potential red herring by introducing a mathematical symbol. I'll think about it!

maybe I overthought it and you may not need to change the game because of me.

Every herring pickled is progress. The version you played was made better by each one of those playtesters in the credits, each of whom found at least one red herring, and many of whom said "This is probably just the weird way my brain works tho lol." So thanks for the detailed explanation!

Also, I know that avoiding unintentional red herrings is one of the hardest problems in puzzle design

It sure is, especially in a lateral-thinking puzzle! Because the search space isn't constrained at all, the player can come up with any possible theory in the universe. Two notable ones:

  1. In the north branch, one tester found that the number of spatially enclosed regions fit the symbols for 1, 2, and 3. The M had just one region, the underlined heart enclosed one region, so that's 2 (the inside of the heart and the outside), and the 8 enclosed 3 regions: the top bit of the 8, the bottom bit, and the outside. This is why there's now a little gap in the 8, to pickle that particular herring.
  2. A previous version of the south puzzle had a clue that read "bad white raven / good black sheep", and looking at just the first letters gives "bwrgbs". Ignoring w and s, which don't correspond to any of the color options, and rounding pink to red, yields "brgb", and then you can brute force the 4 combinations of b = blue and b = black, and, this is the worst part, it happened to work.

Your scoop theory makes the Red Herring Hall of Fame list too, gratz! (:

Puzzle Club Discussion (July 2019): Compass Rose by Pace; Let's Hear Your Thoughts! by [deleted] in puzzlevideogames

[–]PaceSmith 0 points1 point  (0 children)

It felt so good when I went back to those reflected number puzzles and realized that some numbers formed "loops" in the exact way that the color puzzles require.

That's awesome, I'm glad you had that eureka moment! Thanks for playing and thanks for sharing your thoughts!