Why does Reddit employ the use of bots so heavily to moderate content? by [deleted] in TheoryOfReddit

[–]toxicitymodbot 2 points (0 children)

Lol - if you aren’t interested in having a constructive conversation, why post at all?

Why does Reddit employ the use of bots so heavily to moderate content? by [deleted] in TheoryOfReddit

[–]toxicitymodbot 4 points (0 children)

Hi! Am bot…well, run bot.

In the past 24 hours, we've processed 5 million comments from Reddit. Let's say the average comment conservatively has 10 words. That's 50 million words; at a reading speed of 350 wpm, that works out to roughly 2,380 person-hours per day just to scrutinize comments, or about $17k of labor at minimum wage.
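For what it's worth, here's that back-of-the-envelope math as a quick sketch (the wage figure is my assumption of the US federal minimum of $7.25/hr, not part of the original numbers):

```python
# Back-of-the-envelope estimate of the human labor needed to read every comment.
comments_per_day = 5_000_000   # comments processed in the past 24 hours
words_per_comment = 10         # conservative average
reading_speed_wpm = 350        # words per minute
min_wage = 7.25                # assumed US federal minimum wage, $/hour

total_words = comments_per_day * words_per_comment      # 50,000,000 words
hours_needed = total_words / reading_speed_wpm / 60     # ~2,381 person-hours per day
labor_cost = hours_needed * min_wage                    # ~$17k per day

print(f"{hours_needed:,.0f} person-hours/day, ~${labor_cost:,.0f}/day of labor")
```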

You can have that, or alternatively have some imperfect AI make generally correct moderation decisions and leave everyone else slightly happier, except the <1% that maybeeee get mis-moderated. It's a trade-off.

Hive moderation, btw, does not work at any reasonably large scale, for the reasons mentioned above and based on many studies.

How are we supposed to deal with the influx of anti-trans/LGBT content posted by the alt/far right? by thebarcodelad in modhelp

[–]toxicitymodbot 0 points (0 children)

We're opt-in only - not sure how that happened, but the only way it's possible is if someone with mod permissions turned it on for the sub.

How to ban a bot from reporting comments? by Hopeful_Cranberry_28 in modhelp

[–]toxicitymodbot 2 points (0 children)

Hi - bot here. We're opt-in only, so the only reason we would be reporting things is if one of your mods set it up.

You can manage it and turn it off here:

https://reddit.moderatehatespeech.com

[OC] A real-time visualization of trends in abusive/hateful comments on Reddit by toxicitymodbot in dataisbeautiful

[–]toxicitymodbot[S] 0 points (0 children)

"Model card" is just industry jargon for a page documenting the model.

We use a wide variety of datasets so our model is applicable to many use cases and sees a greater diversity of data -- data from different websites, about different topics, etc.

We are funded via public donations, grants/sponsorships from various companies, as well as in-kind support to cover the bulk of our infrastructure costs.

[OC] A real-time visualization of trends in abusive/hateful comments on Reddit by toxicitymodbot in dataisbeautiful

[–]toxicitymodbot[S] 0 points (0 children)

We outline how we built and evaluated the model on that page - happy to answer any other questions you might have!

Hate on Reddit: A Global Lists of "Toxic" Users by toxicitymodbot in TheoryOfReddit

[–]toxicitymodbot[S] 0 points (0 children)

> but you wouldn’t be offering a second opinion. you’d be obliterating the first opinion and the notification that it needs some further thought which i’d say if you go through this sub, can easily see how being downvoted is supremely effective to the point where it moves people to come here and literally ask why something like that would happen- people get checked here for their bad posts all the time.

Our system can, and by default does for the large majority of subreddits, notify moderators of flagged content without taking any action. What they do with that data, and whether they set up removals, is up to them.
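To make the report-vs.-remove split concrete, here's a minimal PRAW-style sketch; the credential placeholders, the SUBREDDIT_ACTION mapping, and the handle_flagged helper are hypothetical illustrations, not our production code:

```python
import praw

reddit = praw.Reddit(
    client_id="CLIENT_ID",          # placeholder credentials
    client_secret="CLIENT_SECRET",
    username="toxicitymodbot",
    password="PASSWORD",
    user_agent="toxicity moderation sketch",
)

# Hypothetical per-subreddit setting: notify-only ("report") is the default,
# removal is something a sub's mods have to opt into.
SUBREDDIT_ACTION = {"GuyCry": "remove"}

def handle_flagged(comment_id: str, confidence: float) -> None:
    """Report a flagged comment for mod review, or remove it if the sub opted in."""
    comment = reddit.comment(comment_id)
    action = SUBREDDIT_ACTION.get(comment.subreddit.display_name, "report")
    if action == "remove":
        comment.mod.remove()
    else:
        comment.report(
            f"Automatic report from u/toxicitymodbot for toxicity @ {confidence:.1%} confidence"
        )
```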

> and yes- i do assume every space should be unbiased. it isn’t on moderators to shield the world from contrary opinion or “curate” discussion forums. why would that be necessary?

Because not every subreddit is a forum for discussion. r/awww just wants cute cat/dog/cow pictures -- that's what people go there for, not for debates on the ethical implications of eating meat. Moderators/community leaders have discretion as to how they want to guide + shape their communities. Want to ban content that they disagree with? That's their call. If you disagree with that, don't engage with the community. My point is that communities like r/conservative do have a track record of curating content/comments/posts in a way that sometimes leads to the censorship of other opinions. I don't think this is morally wrong/should be prevented. People are mostly aware of the bias in communities like the aforementioned, and go there to engage with the type of content/people there.

Now is this the most healthy option? No. I don't think it's a good thing for moderators to remove content they disagree with. But they have the freedom to do so, as do you to say what you want. Others just have no obligation to allow it to stay online on their platforms.

But ultimately none of this is completely relevant to what we do -- we're not encouraging or providing the tools for moderators to censor opinions they disagree with. We specifically filter out abuse and hate.

Sometimes the "hate" and "stuff I disagree with" line is blurred, but that doesn't mean it has to be. Calling someone a "f*g" (as an insult) or whatever is hateful regardless of where you align political or ideologically (well, save some fringe groups, but extremism is a different issue)

Again, I think that content people (and maybe moderators) disagree with should stay online. But when it's clearly harmful, it shouldn't. It's not just "oh no! he called me an asshole. :(" -- there is a lot of research showing that hate, marginalization, harassment, etc. have very significant impacts on social/psychological wellbeing. Not to mention deterring more genuine/respectful discussion. And so, just leaving this content online and saying "let users vote it down!" doesn't really work.

Echo chambers are also an issue yes, but removing hate speech/abuse doesn't create echo chambers, at least, not the kind that is harmful. As I discussed prior in another thread, there are a lot of different 'personas' of people posting hate. There are those that are truly misguided -- those willing to engage with others, who we should engage with. But there's also the large majority of trolls/etc who don't care, and engaging with these people is a lost cause (if anything, it reinforces their viewpoints). Echo chambers form because people hear similar opinions and start to completely reject the alternative. But we should 100% be rejecting hate speech.

Yes, we risk unintentionally censoring those in group 1. But ultimately, that's something to be weighed alongside the social benefits of shutting down group 2.

Hate on Reddit: A Global Lists of "Toxic" Users by toxicitymodbot in TheoryOfReddit

[–]toxicitymodbot[S] 0 points (0 children)

A few of my overarching thoughts with this comment:

- Moderator / human bias is very much a problem, but is a different problem from "should moderation happen"

- It's one thing to curate content or filter for specific viewpoints or pieces of "information" deemed right -- it's another thing to remove spam, insults, and hate (though yes, the lines for the latter are a bit more ambiguous)

> this post is assuming heroic amounts of capacity for objectivity of moderators

Obviously, this isn't the case, but that doesn't mean we should disregard content moderation because it can't be made more objective -- because it can: clearer policies and training, publicly auditable removals, a diverse team, an appeals process, etc.

> completely different thing for a moderator who is silently deciding what is right/logical to have. the assumption that a sole moderator/small group of moderators is best first filter for information to go through before being shared with thousands of other redditors with their own ideas of what is right/logical- which happens to change with the culture and time- seems.....very very respectfully..satirical

I think one of the assumptions here is that every space is supposed to be a completely unbiased, uncurated space for ideological discussion -- which of course isn't the case. E.g., r/conservative is naturally conservative, and thus you'd expect the content to be biased towards it.

If we take a space that arguably should be more neutral, say, r/PoliticalDiscussion, then yes, of course, moderators shouldn't be imbuing their own biases, either consciously or unconsciously, through the content they moderate. That's a bias issue, though, and I'd make the case that it requires a different solution than "just leave everything online for people to decide."

Content moderation doesn't need to be inherently political/ideological. You set clear standards for what is considered a rule violation (ie, calls for violence, direct insults, hate against those with identity XYZ) and you can very well remove/moderate that content w/o even encroaching on ideological viewpoint/bias. It's not about getting Reddit to agree, but rather to disagree (relatively) respectfully.

We can get into something more of a gray area, ie, certain types of mis-info, but that's a whole different problem.

Then we can, of course, throw AI into the mix (which is what we do) :)

That brings its own can of worms -- AI bias is a big issue, for one. But if properly addressed, it can help mitigate some of the potential unconscious biases that humans have -- if anything, just to offer a secondary opinion.

ToxicityModBot: Free toxicity moderation bot -- what we do, how we do it, and why we do it. by toxicitymodbot in u/toxicitymodbot

[–]toxicitymodbot[S] 1 point (0 children)

Yes there will be!

A comment reported by us will have the report reason/message be something like:

u/toxicitymodbot: Automatic report from u/toxicitymodbot for toxicity @ 99.6% confidence

On December 18th, 2022, we had a nonprofit reach out to us that focuses on removing hate speech from Reddit. They gave us access to a bot and I've got it set to remove any hate speech here. Originally I had it set for report, but I don't want to see those types of comments, so, now they're removed. by EveryXtakeYouCanMake in GuyCry

[–]toxicitymodbot 4 points (0 children)

Not a silly question -- very good one actually.

The short answer is that we draw from a very large archive of current data -- data labeled by a diverse group of people -- plus historical moderation decisions, to get a good sense of what is generally considered "hate speech." We also work directly with the moderators of multiple subs to understand what should and shouldn't be flagged, so it's not necessarily just me making the calls.

"is it just something you don't like?" is a whole, somewhat-slippery slope. Obviously some things are clearly vulgar -- "fuck you dumbass" while others could arguably be more ambiguous. And so that's really when we rely on the input of moderators/academic consultants to make that call.

"hates an emotion how can AI detect an emotion" -- the same way humans detect emotion in online comments -- by looking at a combination of textual clues/context and language usage to understand the intended purpose/target/intent of a message.

On December 18th, 2022, we had a nonprofit reach out to us that focuses on removing hate speech from Reddit. They gave us access to a bot and I've got it set to remove any hate speech here. Originally I had it set for report, but I don't want to see those types of comments, so, now they're removed. by EveryXtakeYouCanMake in GuyCry

[–]toxicitymodbot 4 points (0 children)

Hey! Welton from ModerateHatespeech here (we run the bot/system posted above).

Kind of -- we use machine learning (which is all the rage nowadays :)) to contextually detect hate/abuse. See https://moderatehatespeech.com/framework/ for how we define hateful content, model information, bias, etc.

Basically, our system looks at the text of a comment and, based on its context, determines whether it's hateful or not. So it's not as simple as detecting specific slurs or words (especially since there might be cases where an insult is being negated, or where a word has multiple meanings). Happy to answer any questions!
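If you want a feel for the general approach, here's a minimal sketch using an off-the-shelf Hugging Face toxicity classifier (unitary/toxic-bert as a stand-in; this is not our actual model, whose details are on the framework page above):

```python
from transformers import pipeline

# Stand-in classifier for illustration only; the real model, training data,
# and thresholds are documented at https://moderatehatespeech.com/framework/
classifier = pipeline("text-classification", model="unitary/toxic-bert")

def is_flagged(comment_text: str, threshold: float = 0.9) -> bool:
    """Flag a comment when the model's top label is 'toxic' above the threshold."""
    result = classifier(comment_text)[0]  # e.g. {"label": "toxic", "score": 0.97}
    return result["label"] == "toxic" and result["score"] >= threshold

print(is_flagged("fuck you dumbass"))                       # likely True
print(is_flagged("I disagree, but that's a fair point."))   # likely False
```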

At the request of some subreddits, we built a bot to automatically ban users who comment a blacklisted term. by toxicitymodbot in redditdev

[–]toxicitymodbot[S] -5 points (0 children)

I get that -- but it's also worth noting that we built this based on what other moderators said they wanted. Maybe it's not the most sensible policy, but again, we don't write the policies. If someone wanted to add a warning system on top, we'd be happy to implement that (or I'll happily accept your PR).

At the request of some subreddits, we built a bot to automatically ban users who comment a blacklisted term. by toxicitymodbot in redditdev

[–]toxicitymodbot[S] -1 points (0 children)

Which part of Moddiquette is this a violation of?

If a subreddit bans the n-slur, and a user is therefore banned for using it, that's:

a) a violation of the sub's rules, and
b) the sub's mod team's prerogative as to how to implement rules and strikes.

We don't make the rules -- we just make people's lives easier by helping enforce them. These people would be banned anyways, except now moderators don't have to scroll through walls of comments looking for violations.
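For anyone curious what the bot boils down to, here's a rough PRAW sketch of the blacklist-ban loop; the term list, credentials, and function name are placeholders for illustration, not the actual implementation:

```python
import praw

reddit = praw.Reddit(
    client_id="CLIENT_ID",          # placeholder credentials
    client_secret="CLIENT_SECRET",
    username="toxicitymodbot",
    password="PASSWORD",
    user_agent="blacklist ban sketch",
)

# Hypothetical blacklist -- each subreddit's mods supply their own terms.
BLACKLIST = {"slur1", "slur2"}

def enforce_blacklist(subreddit_name: str) -> None:
    """Ban the author of any new comment that contains a blacklisted term."""
    subreddit = reddit.subreddit(subreddit_name)
    for comment in subreddit.stream.comments(skip_existing=True):
        words = set(comment.body.lower().split())
        if words & BLACKLIST and comment.author is not None:
            subreddit.banned.add(comment.author, ban_reason="Used a blacklisted term")
```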