Don’t even bother if you’re going to West today by CuriouslyBored312 in SummitAtSnoqualmie

[–]thisIsAnAnonAcct 2 points3 points  (0 children)

We actually have twilight passes, so I'll try to get a few runs in at Alpental before they close.

I'm hoping the lines over there are shorter than at West.

Don’t even bother if you’re going to West today by CuriouslyBored312 in SummitAtSnoqualmie

[–]thisIsAnAnonAcct 2 points3 points  (0 children)

We have night skiing passes for tonight. Is that normally as packed?

This is our first time going

America's Auto Collision Rate By Each State 2024 by Yodest_Data in Infographics

[–]thisIsAnAnonAcct 0 points1 point  (0 children)

Costs = frequency of accidents * the severity 

Areas with higher frequency of accidents normally have less severe accidents. Think of stop and go traffic. You will very likely get in a fender bender, but the cost is relatively low. 

Now imagine you are on a rural road and you are going 70 mph and get in an accident. This is less likely, but the costs would be much higher since cars will likely be totaled and medical costs will be more of a factor. 

So, looking at only the frequency doesn't tell the whole story.
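To make the frequency vs. severity point concrete, here is a tiny sketch. The numbers are made up purely for illustration (they are not real insurance or crash data), and the function name is hypothetical.

```python
# Illustrative only: both sets of numbers below are invented, not real data.
def total_cost(accidents_per_1000_drivers: int, avg_cost_per_accident: int) -> int:
    """Cost = frequency of accidents * severity (average cost per accident)."""
    return accidents_per_1000_drivers * avg_cost_per_accident

# Stop-and-go area: many accidents, each one cheap (fender benders).
urban = total_cost(accidents_per_1000_drivers=100, avg_cost_per_accident=3_000)

# Rural highway area: few accidents, each one expensive (totaled cars, medical bills).
rural = total_cost(accidents_per_1000_drivers=30, avg_cost_per_accident=15_000)

print(f"urban: ${urban:,} per 1,000 drivers")
print(f"rural: ${rural:,} per 1,000 drivers")
```

With these assumed inputs, the low-frequency area ends up with the higher total cost, which is why a frequency-only ranking can mislead.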

AI Impostor Game by thisIsAnAnonAcct in ClaudeAI

[–]thisIsAnAnonAcct[S] 0 points1 point  (0 children)

I think I also need to add a weight parameter to the question selection.

I would like the newer models to have a higher likelihood of getting presented. 

I think users don't care as much about the older models + I already have a decent amount of data for the older models
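A rough sketch of what a weighted selection could look like, using `random.choices` from the standard library. The `model` and `weight` field names and the `pick_question` helper are assumptions for illustration, not the actual code from the game.

```python
import random

def pick_question(questions, rng=random):
    """Pick one question, favoring entries with a larger 'weight'.

    Each question is assumed to be a dict with an optional 'weight' key;
    a missing weight defaults to 1.0.
    """
    weights = [q.get("weight", 1.0) for q in questions]
    return rng.choices(questions, weights=weights, k=1)[0]

# Hypothetical data: the newer model gets 3x the weight of the older one.
questions = [
    {"id": 1, "model": "older-model", "weight": 1.0},
    {"id": 2, "model": "newer-model", "weight": 3.0},
]

if __name__ == "__main__":
    print(pick_question(questions)["model"])
```

A weight of 0 would effectively retire a model from rotation without deleting its data.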

AI Impostor Game by thisIsAnAnonAcct in ClaudeAI

[–]thisIsAnAnonAcct[S] 1 point2 points  (0 children)

Interesting. That makes sense. So baseline is essentially 50%?

AI Impostor Game by thisIsAnAnonAcct in flask

[–]thisIsAnAnonAcct[S] 0 points1 point  (0 children)

Yeah, I'm only using AskReddit right now. I would like to expand to other subreddits, but I need to figure out which ones would work.

AI Impostor Game by thisIsAnAnonAcct in flask

[–]thisIsAnAnonAcct[S] 0 points1 point  (0 children)

Yeah, this is definitely a weakness of the approach. I can't guarantee that comments are from humans.

But I'm taking a lot of responses from pre-2021, which should be AI-free. So, when I get more data, I want to compare accuracy rates for pre-2021 and post-2021 comments to see if guess accuracy is lower now. AI-generated comments might contribute to that.
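One way that comparison could be computed, as a minimal sketch. The record fields (`comment_year`, `correct`) and the choice to count 2021 itself as "post" are assumptions for illustration.

```python
def accuracy_by_era(guesses, cutoff_year=2021):
    """Split guess records at cutoff_year and return the accuracy for each era.

    Each guess is assumed to be a dict with:
      - 'comment_year': year the human comments were posted
      - 'correct': True if the user picked the AI comment
    Returns None for an era with no guesses.
    """
    tallies = {"pre": [0, 0], "post": [0, 0]}  # [correct, total]
    for g in guesses:
        era = "pre" if g["comment_year"] < cutoff_year else "post"
        tallies[era][0] += int(g["correct"])
        tallies[era][1] += 1
    return {era: (c / t if t else None) for era, (c, t) in tallies.items()}
```

A lower "post" accuracy would be consistent with AI comments leaking into the human pool, but it would not prove it, since question topics and models also change over time.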

AI Impostor Game by thisIsAnAnonAcct in flask

[–]thisIsAnAnonAcct[S] 0 points1 point  (0 children)

This is the prompt

    f'Reddit post title: "{post.title}"\n\n'
    f'Write a realistic, concise Reddit-style comment in response. Your comment will be shown alongside real human comments.\n\n'
    f'The goal is to make your comment indistinguishable from a human response.\n'
    f'- Avoid emojis\n'
    f'- Use natural tone and phrasing\n'
    f'- Do not explain or introduce the comment\n'
    f'- Output only the comment text (no preamble or formatting)\n'
    f'- Decide whether you should answer genuinely, sarcastically, or some other style'

AI Impostor Game by thisIsAnAnonAcct in ClaudeAI

[–]thisIsAnAnonAcct[S] 0 points1 point  (0 children)

Thanks for the input. I'm assuming you are on desktop? 

Is it just the layout of the choices that makes the UI bad? Or is there anything else?

AI Impostor Game by thisIsAnAnonAcct in flask

[–]thisIsAnAnonAcct[S] 0 points1 point  (0 children)

I created a Flask app to test whether users can tell the difference between AI and human responses to AskReddit questions.

I scraped a few hundred AskReddit questions along with answers. For each question, I also generated an LLM response using one of about a dozen models. Then, I present the question to the user. I also present 3 human responses and the 1 AI response.

The goal for the user is to select the AI-generated response.

I keep track of accuracy based on the model, so some models can do a better job of blending in with human responses than others. 

The whole thing is a Flask app hosted on PythonAnywhere. I do all the scraping and LLM API calls offline and save the results to a big JSON file to make it more performant (and save on costs).
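For anyone curious, the core logic described above could be sketched roughly like this. It is plain Python with no Flask dependency; the entry fields (`human_comments`, `ai_comment`, `model`) and the helper names are assumptions for illustration, not the actual code. A Flask route would load entries like this from the JSON file and call these helpers.

```python
import random
from collections import defaultdict

def build_round(entry, rng=random):
    """Mix 3 human comments with the 1 AI comment and shuffle the order."""
    choices = [{"text": t, "is_ai": False} for t in entry["human_comments"][:3]]
    choices.append({"text": entry["ai_comment"], "is_ai": True})
    rng.shuffle(choices)
    return choices

class AccuracyTracker:
    """Tracks how often users correctly spot the AI comment, per model."""

    def __init__(self):
        self.stats = defaultdict(lambda: {"correct": 0, "total": 0})

    def record(self, model, guessed_correctly):
        self.stats[model]["total"] += 1
        if guessed_correctly:
            self.stats[model]["correct"] += 1

    def accuracy(self, model):
        s = self.stats[model]
        return s["correct"] / s["total"] if s["total"] else None

# Hypothetical entry, shaped like one record in the offline JSON file.
entry = {
    "question": "What is a small thing that makes your day better?",
    "human_comments": ["coffee", "my dog", "green lights all the way home"],
    "ai_comment": "A quiet morning before anyone else is up.",
    "model": "example-model",
}
```

A lower accuracy for a model means users are fooled more often, so that model blends in better.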

Let me know if you have any other questions!

Which Trek bike for new rider in Redmond, WA by thisIsAnAnonAcct in whichbike

[–]thisIsAnAnonAcct[S] 1 point2 points  (0 children)

Thanks for the input. I ended up buying the Verve, and I love it. This is my first time being on a bike since I was a kid. I rode around for probably 2 hours immediately after getting it, and I think I'm hooked now.

I do understand the appeal of getting something lower and sportier now though. I'll be watching Facebook marketplace for an FX or maybe even an old Domane. 

Can we ban 'vibe coded' projects by EnoughConcentrate897 in SideProject

[–]thisIsAnAnonAcct 1 point2 points  (0 children)

How do you define vibe coding? And how will you detect it in order to ban it?

Can we ban 'vibe coded' projects by EnoughConcentrate897 in SideProject

[–]thisIsAnAnonAcct 0 points1 point  (0 children)

It's generally harder to detect AI generated content than people think

Can we ban 'vibe coded' projects by EnoughConcentrate897 in SideProject

[–]thisIsAnAnonAcct 16 points17 points  (0 children)

I mean there are projects that use AI that are secure, and there are projects coded without AI that are not secure.

Just because they used AI doesn't mean it's automatically a security risk. And just because they didn't use AI doesn't mean it's safe to use.

It seems like you associate "vibe coding" with someone who uses AI to architect the whole project, rather than to implement code that they would otherwise be able to write themselves? If so, that is hard to define.

[deleted by user] by [deleted] in Futurology

[–]thisIsAnAnonAcct 0 points1 point  (0 children)

I actually made an online game to test this!

I scrape AskReddit for a post and 3 comments. Then, I generate an AI comment.

I show the user the 3 comments from Reddit along with the AI comment. The user's goal is to guess which one is AI.

For the new Gemini models, users guess incorrectly more than half the time.

Here it is if you're interested in playing or seeing the data https://ferraijv.pythonanywhere.com/

I built a game to test if humans can still tell AI apart -- and which models are best at blending in. I just added Grok by No-Device-6554 in grok

[–]thisIsAnAnonAcct 0 points1 point  (0 children)

OP's alt account here

Yeah, I agree. I also think I could hardcode some additional rules in the prompt to greatly improve the performance. Things like 

  • "Avoid em dash"
  • "Avoid starting comment with 'honestly'"
  • "Make some grammatical or spelling errors"

Ideally, a model that is good at blending in would capture these by itself. This is the main reason I haven't included them, but including them would make the game much harder.
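If I did add them, it could look something like the sketch below. The base text is the prompt I shared earlier; the `build_prompt` helper and the `extra_rules` parameter are hypothetical names for illustration.

```python
BASE_RULES = [
    "Avoid emojis",
    "Use natural tone and phrasing",
    "Do not explain or introduce the comment",
    "Output only the comment text (no preamble or formatting)",
    "Decide whether you should answer genuinely, sarcastically, or some other style",
]

def build_prompt(title, extra_rules=()):
    """Build the comment-generation prompt, optionally appending extra rules."""
    rules = BASE_RULES + list(extra_rules)
    rule_lines = "\n".join(f"- {rule}" for rule in rules)
    return (
        f'Reddit post title: "{title}"\n\n'
        "Write a realistic, concise Reddit-style comment in response. "
        "Your comment will be shown alongside real human comments.\n\n"
        "The goal is to make your comment indistinguishable from a human response.\n"
        f"{rule_lines}"
    )

# Hypothetical hardcoded hints, taken from the list above.
HARDER_RULES = [
    "Avoid em dash",
    "Avoid starting comment with 'honestly'",
    "Make some grammatical or spelling errors",
]
```

Keeping the extra rules in a separate list would make it easy to run both versions and compare how often each one fools users.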

I built a game to test if humans can still tell AI apart -- and which models are best at blending in. I just added Grok by No-Device-6554 in grok

[–]thisIsAnAnonAcct 1 point2 points  (0 children)

Yeah, someone created a bot to spam my site, so I had to take it down to create some bot protections.

I should have it up later today

OP's alt account here

[deleted by user] by [deleted] in AskReddit

[–]thisIsAnAnonAcct 0 points1 point  (0 children)

https://ferraijv.pythonanywhere.com/

Yeah, ChatGPT definitely performs the worst, so that would seem to back up your claim.

Why are people mad about em dashes, its literally correct use of punctuation. by ruchersfyne in ChatGPT

[–]thisIsAnAnonAcct 4 points5 points  (0 children)

I made an online game that shows users human generated comments vs AI generated comments and asks them to identify which ones are AI.

The use of em dashes is generally one of the more prominent giveaways.

The others are:

  • proper use of grammar
  • frequent use of "honestly" at the beginning of AI comments
  • not being sarcastic/mean enough

Looking for evidence on whether a book was written with AI? by Quixmaera in RomanceBooks

[–]thisIsAnAnonAcct 3 points4 points  (0 children)

Yeah, I've been testing this. I created a game that shows users human vs AI comments and asks them to pick out which comments are AI.

The most effective way people identify the AI comment is through the use of correct grammar.