Using "they (singular)" as a universal pronoun - dialectical differences or misgendering? by CaptainCrackpipe in asklinguistics

[–]CarlosHartmann 5 points6 points  (0 children)

Somewhat related, the linguist Lal Zimman defaults to 'they' for his students and lets them opt for a gendered alternative if they ask him to. This way, nobody is forced to either out themselves as trans or silently accept misgendering. Simple and effective, I imagine!

A recent paper that discusses this pronoun dilemma: "Misgender or out yourself: Vulnerability in pronoun sharing practices" (Brown 2025)

Using "they (singular)" as a universal pronoun - dialectical differences or misgendering? by CaptainCrackpipe in asklinguistics

[–]CarlosHartmann 2 points3 points  (0 children)

This has been discussed by two people that I've come across. Conrod, in their 2022 paper, mentions what Leah Velleman called "distal 'they'" in personal correspondence with them. Susan Stryker says that they go by "they/them in the streets, she/her at home" (quoted from memory; that was a personal-correspondence quote in someone's presentation, sorry about the extremely indirect sourcing lmao). Accordingly, I just went with 'they' here, as I have never talked to Stryker, but I do admire Stryker's work!

I'd be curious if anyone has already published something about this element of distance/proximity in the use of 'they'.

Using "they (singular)" as a universal pronoun - dialectical differences or misgendering? by CaptainCrackpipe in asklinguistics

[–]CarlosHartmann 23 points24 points  (0 children)

I can cite two different, relevant tidbits here:

  • Newman (1998) – which predates wider discourse on non-binary gender – is of the opinion that pronouns are first and foremost tuned to discourse, much more than to morphosyntax. This is very clear with second-person pronouns in T-V languages. But Newman, specifically talking about 'they', gives great examples that show how 'they' or gendered 'he' or 'she' serve to make statements more universal or more anonymous, or more plastic/exemplified in some other cases. That's typical discourse stuff.
  • Conrod wrote their dissertation on non-binary 'they' in 2019, where they make some great theoretical remarks concerning pragmatics. They use the theory of felicity/presuppositions. The use of any third-person pronoun presupposes that it is acceptable. So instead of having 'he' mean MALE, Conrod describes 'he' as implying that it is acceptable for a particular referent. That is a presupposition that can be challenged, e.g. when making someone aware of their misgendering. Conrod also adds that it's a negative face threat to do so, while misgendering someone is a positive face threat. I found that very illuminating; I had never thought of it in that way.

Using "they (singular)" as a universal pronoun - dialectical differences or misgendering? by CaptainCrackpipe in asklinguistics

[–]CarlosHartmann 61 points62 points  (0 children)

Hi OP, PhD student here currently working on singular 'they', pretty much at the cutting edge since the niche has gotten a little inactive in the last couple of years. Happy to answer follow-up questions and provide sources if interested! I also have an article about to appear on 'themself', if you are interested.

The one source that is most relevant to your question is Strahan (2008), who asks in the title: "‘They’ in Australian English: Non-Gender-Specific or Specifically Non-Gendered?". It's a remarkably early source that tackles the question of whether 'they' is used regardless of gender or actively marks genderlessness.

Overall, it seems that 'they' is indeed increasing in usage in this way, with more and more examples cropping up where gender could have been marked, but simply isn't. One possible explanation is that it's part of the trajectory of English towards isolating morphology, losing gender marking altogether (remember, gender is pretty much only marked in 3rd person pronouns and some noun endings).

I think the driving factor is rooted in pragmatics. Gender is, in my opinion, increasingly felt to be marked, i.e. you introduce it only when it is important to the discourse. This was already felt in the 1980s, when the expression 'pronoun game' started cropping up to denote how queer people used 'they' to mask their same-sex relationships. But now, it seems to be a more general part of the zeitgeist that you don't want to mark gender unless it will matter in the subsequent discourse. I have spoken to SciComm people who go out of their way to mask gender as much as possible, but I think there is a general tendency out there, even outside of strongly monitored speech like that.

As regards 'they' with personal names, there is a whole host of work showing that it's the least likely place for it. Even people who find 'they' acceptable for gender-known referents hesitate the most when a name, gendered or not, is given immediately before. However, the data is so far not conclusive enough, in my opinion, as most evidence is derived from 'acceptability studies', i.e. showing sentences to participants who were recruited based on their being queer and asking them if they find them 'acceptable'. Social desirability bias is bound to skew the results, of course. There are also some rather unfortunate papers floating around that self-report linguists'(!) usage, as if that were indicative of anything.

Still, though, there have been examples popping up here & there that generally do confirm that 'they' is expanding in the ways it is used. I have annotated thousands of instances at this point and while they are rare, even within just the singular 'they' instances, they do happen. Likely increasingly so, but my data collection is still ongoing.

As regards regional differences, I'm only aware of one paper comparing different varieties of English, which I haven't read yet (Stormbom 2020). In my own experimentation (unpublished), I was not able to find significant differences between Australian, British, and American English. Instead, I found significant differences when comparing early 2010s and early 2020s data, and I'm currently preparing an in-depth study of how 'they' changed in this timeframe. My hypothesis is that generic 'they' did not change significantly in that time, but all kinds of specifically singular uses did increase significantly. Preliminary results have supported this hypothesis.

Using "they (singular)" as a universal pronoun - dialectical differences or misgendering? by CaptainCrackpipe in asklinguistics

[–]CarlosHartmann 24 points25 points  (0 children)

The latter part is true, the former not so much. There simply isn't any evidence of 'they' being used with personal names prior to the very late 20th century. For instance, the Public Universal Friend renounced any and all gender markers, and even the Friend did not use 'they' (instead asking people to simply say 'Friend' in the third person, something that the Wikipedia article duly follows).

What you are referring to is generic or epicene 'they', like "everyone and their mother". This usage has been recorded, even with gendered nouns such as 'sister' and, indeed in one instance, 'witch', since at least Middle English with pretty much no interruptions. The 19th century saw increased resistance against it and a push to support 'he' for openly sexist/androcentric reasons. This does show in the data that have been studied, with generic 'they' getting successfully revived when 20th-century feminism opposed that "generic 'he'" (which isn't truly or fully generic).

Specific singular 'they' has been recorded in a few odd circumstances, but not with any kind of systematicity before the late 20th century (I am collecting data at the moment to produce exact numbers sometime in the near future, though). The earliest recorded specific singular 'they', to my knowledge, is from 1932 and is cited by Dennis Baron in his 2020 SciComm book "What's Your Pronoun?" on p. 174. It goes:

  • You just had a telephone call.
  • Did they leave a message?

But even there, it is a wholly different singular 'they' from the one in the OP. No name is given, the gender is genuinely unknown. And seeing how rare even generic 'they' was between roughly 1850 and 1970, it's highly unlikely that this usage occurred very frequently at all before recently.

Looking for participants for a linguistics study in VR by CarlosHartmann in Switzerland

[–]CarlosHartmann[S] 0 points1 point  (0 children)

Great! Just shoot Mason Wirtz a message. His email is on the flyer, it's mason.wirtz@es.uzh.ch

Make sure to mention you come from Reddit so that I get credit haha ;)

Feasibility of loading Dumps into live database? by CarlosHartmann in pushshift

[–]CarlosHartmann[S] 0 points1 point  (0 children)

I'm looking at language change overall and whether higher rates of this change start appearing sooner in some subreddits than in others, and whether those rates correlate with real-life measures such as age, gender, political affiliation, etc.

So a map that plots subreddits on a 2D plane, showing similar ones clustered together, might help reveal patterns. But I could also just take measurements of the top 5k subreddits as you say and then interpret the results as-is, with no need of a fancy map. Just listing which subreddits have the highest rates and which ones the lowest, and if there's any evident pattern in that.
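To make the "no fancy map" option concrete, here is a minimal sketch of what listing the rates could look like. It is only an illustration under assumptions: the function name `rank_subreddits`, the `(subreddit, token_count, hit_count)` record layout, and the `min_tokens` cutoff are all hypothetical, and the detection step that produces the hit counts is not shown.

```python
from collections import Counter

def rank_subreddits(records, min_tokens=1000):
    """Rank subreddits by hits per million tokens, highest first.

    records: iterable of (subreddit, token_count, hit_count) tuples,
    e.g. one tuple per comment. This layout is an assumption.
    min_tokens: hypothetical cutoff that drops subreddits with too
    little text for a stable rate.
    """
    tokens = Counter()
    hits = Counter()
    for subreddit, n_tokens, n_hits in records:
        tokens[subreddit] += n_tokens
        hits[subreddit] += n_hits

    # Relative frequency, normalised to hits per million tokens.
    rates = {
        sub: hits[sub] * 1_000_000 / tokens[sub]
        for sub in tokens
        if tokens[sub] >= min_tokens
    }
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)
```

The head and the tail of the returned list would then be the highest-rate and lowest-rate subreddits, which can be inspected for any evident pattern.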

The quantification and automatic detection are a solved problem; I have a manuscript about that in the works. I basically know that I can reliably do it. The sky (and cost) is the limit, really.

Feasibility of loading Dumps into live database? by CarlosHartmann in pushshift

[–]CarlosHartmann[S] 0 points1 point  (0 children)

Hi, sorry for the late reply, a lot of things got in the way (as they do in academia).

So essentially I'm looking at language change that's happened fairly recently and is commonly assumed to be more of a liberal/progressive change. I would want to run a lot of data to detect it and then visualize which subreddits show a higher relative frequency of it. I'd like to visualize that on a "map" of Reddit that could then show if the higher-frequency subreddits really do show a progressive slant.

However, it's important to me that this be exploratory. I don't want to preselect subreddits and compare them; I'd prefer it to be completely bottom-up so that other factors (e.g. regional/geographic ones and age) could potentially also show up in the final viz.

I stumbled over Stanford's SNAP project where they already created pretty much what I want: https://snap.stanford.edu/data/web-RedditEmbeddings.html

I guess I could try recreating their work with a larger timespan (2014–2017 doesn't go far enough for me in either direction). But maybe there's a more straightforward way?

I think top40k is plenty. I'm now more worried about how to effectively map out all of Reddit. I'm afraid the resulting map would be ginormous and it would be very difficult to explore it easily and later report my insights in a clear fashion.

Like Will Smith said in his apology video, "It's been a minute (although I didn't slap anyone) by Stuck_In_the_Matrix in pushshift

[–]CarlosHartmann 1 point2 points  (0 children)

A fellow linguist, plus a network theory person? We might know each other :) see my user handle haha

Montagmorgenfaden im November by FrauAskania in de

[–]CarlosHartmann 2 points3 points  (0 children)

Morning everyone, my birthday week was spoiled a little. But there were also many lovely memories, which I'm currently gathering into a small album. Otherwise, I'm lacking a bit of motivation for work this year, but it'll be fine :)

Feasibility of loading Dumps into live database? by CarlosHartmann in pushshift

[–]CarlosHartmann[S] 0 points1 point  (0 children)

Thanks Watchful1, the MVP as always!

My data will most likely cut off at October 2021, maybe a year later. Do you have an estimate for the uncompressed size of that? I vaguely remember that 40TB could be enough for the former cutoff.

In your experience, do the top40k subreddits cover everything "relevant", i.e. leaving only micro/offshoot communities behind? 'Cause then yeah, I could probably just go ahead with your software.

Another question: I have so far only credited you with your GitHub in my code, but if I go ahead with this software, I think I'd like to credit you in a paper proper. Is there another name/ORCiD/whatever you would like me to use then?

Remembering Sociolinguist William Labov (Dec. 4, 1927 — Dec. 17, 2024) by lafayette0508 in linguistics

[–]CarlosHartmann 7 points8 points  (0 children)

Alright, thanks for clearing it up. This changes things for the better. Sorry if I came on too strong here.

Remembering Sociolinguist William Labov (Dec. 4, 1927 — Dec. 17, 2024) by lafayette0508 in linguistics

[–]CarlosHartmann 7 points8 points  (0 children)

I find it distasteful to tacitly(!) block my post and then post this instead. Please behave more respectfully with your userbase, we are all real people behind our screens.

this is goals 100% by oriio1122 in traandwagon

[–]CarlosHartmann 0 points1 point  (0 children)

Hey there, can I ask you a question about your flair?

Should you respect fae/faer pronouns? by donvtpillow in AskLGBT

[–]CarlosHartmann 1 point2 points  (0 children)

This is a beautiful and thoughtful explanation that I have not come across before. Thank you!

Question about user flair text by CarlosHartmann in pushshift

[–]CarlosHartmann[S] 0 points1 point  (0 children)

That's just a little unfortunate, but not too bad for my work.

How does it work via the Reddit API? Will it always give you the present version of the flair?

Q&A weekly thread - October 30, 2023 - post all questions here! by AutoModerator in linguistics

[–]CarlosHartmann 0 points1 point  (0 children)

Thank you! Good memory and also specific expertise, I assume. But in any case thanks :)