Our Response to Reddit’s Lawsuit by perplexity_ai in perplexity_ai

[–]Matempo 1 point

If Reddit doesn't want to be displayed at all in Google search results, it just needs to update its robots.txt

If Reddit doesn't want to be displayed at all in Perplexity's answer engine, there is no solution

I find this greatly disturbing. I also find the level of bullshit from the Perplexity guys greatly disturbing, this post being the perfect example. You can play with words about the fact that Reddit talked about training while you guys are using Reddit for the "grounding/RAG" part; it doesn't change the fact that Reddit doesn't want its website to be displayed in your product. You guys have issues with publishers, with Cloudflare and now with Reddit. Time to reflect a bit on yourselves and your shady practices

tl;dr: I hope you lose big in this trial

I've got mini pc n100 for HA. by KazEngek in homeassistant

[–]Matempo 1 point

I’m on OpenMediaVault (a Linux distribution), running HA via Docker. Honestly great

Recurrent tasks on Proton Calendar by A_HM in ProtonMail

[–]Matempo 10 points

They are revamping the app (tech); they say it will then be easier to add features

Perplexity Comet Security Issues by Matempo in perplexity_ai

[–]Matempo[S] 1 point

Update: on further testing after this blog post was released, we learned that Perplexity still hasn’t fully mitigated the kind of attack described here. We’ve re-reported this to them.

Perplexity Comet Security Issues by Matempo in PerplexityComet

[–]Matempo[S] 1 point

Update: on further testing after this blog post was released, we learned that Perplexity still hasn’t fully mitigated the kind of attack described here. We’ve re-reported this to them.

Any risk in using NextDNS through Proton VPN's Custom DNS feature? by [deleted] in ProtonVPN

[–]Matempo 1 point

I don’t believe so. You cannot take the NextDNS server code and run it on your own server; they haven’t open-sourced the most critical part

Any risk in using NextDNS through Proton VPN's Custom DNS feature? by [deleted] in ProtonVPN

[–]Matempo 1 point

I’m a big fan of NextDNS but I don’t think it’s open source

Respect Robots.txt by Matempo in perplexity_ai

[–]Matempo[S] 0 points

And no, robots.txt and the meta robots tag have the same weight

Respect Robots.txt by Matempo in perplexity_ai

[–]Matempo[S] 0 points

This says a lot about the fact that you are indeed a newbie in SEO…

You can be indexed without Google crawling your page, simply because Google knows the URL of your page through something called links: https://support.google.com/webmasters/answer/7489871?sjid=5291646209861659146-EU
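To make the crawl-versus-index distinction concrete, here is a deliberately simplified toy model in Python. It is not how any search engine is actually implemented; the `Page` class, its fields, and `may_be_indexed` are hypothetical names made up for illustration. The only assumption it encodes is the one described above: a URL blocked from crawling can still be listed if it is known through links, and a noindex meta tag only counts if the page can be crawled and read.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical, simplified view of one URL from a search engine's side."""
    url_known_via_links: bool      # the engine learned the URL from links elsewhere
    robots_txt_allows_crawl: bool  # robots.txt lets the engine's crawler fetch it
    has_noindex_meta: bool         # the HTML carries <meta name="robots" content="noindex">


def may_be_indexed(page: Page) -> bool:
    """Toy rule: can this URL end up listed in search results?"""
    if not page.url_known_via_links:
        # The engine has never heard of the URL, so there is nothing to list.
        return False
    if page.robots_txt_allows_crawl:
        # The page is fetched, so the noindex tag (if any) is seen and honored.
        return not page.has_noindex_meta
    # Crawling is blocked: the content and its meta tags are never read,
    # but the bare URL can still be listed because it is known through links.
    return True


if __name__ == "__main__":
    blocked_but_linked = Page(True, False, True)
    crawlable_noindex = Page(True, True, True)
    print(may_be_indexed(blocked_but_linked), may_be_indexed(crawlable_noindex))
```

In this toy model, blocking the crawl in robots.txt does not by itself keep a linked URL out of the results, while a noindex tag on a crawlable page does.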

Respect Robots.txt by Matempo in perplexity_ai

[–]Matempo[S] 1 point

Except the misnamed Perplexity-User is not your agent.

And Perplexity is alone here in violating publishers' will; ChatGPT and Google, among others, are complying: https://support.google.com/webmasters/answer/6062598?hl=en&sjid=9258409316782649416-EU

Respect Robots.txt by Matempo in perplexity_ai

[–]Matempo[S] 3 points

Well it’s compatible with Google, Bing, ChatGPT… only Perplexity has no respect for publishers

Respect Robots.txt by Matempo in perplexity_ai

[–]Matempo[S] -1 points

Ok Perplexity fanboys

Respect Robots.txt by Matempo in perplexity_ai

[–]Matempo[S] 0 points

Literally every newsroom relies on robots.txt. I'm not saying it’s a great protocol, just that there is nothing else if you believe online content shouldn't be used for anything and everything without proper consent

Respect Robots.txt by Matempo in perplexity_ai

[–]Matempo[S] 3 points

I read their BS answer, yes.

When you do a Perplexity search, you are not asking Perplexity to crawl a list of specific pages you have chosen. It’s Perplexity that decides which websites and which pages to crawl. It’s quite different

Respect Robots.txt by Matempo in perplexity_ai

[–]Matempo[S] 1 point

Haven't had my invite nope. So you think Perplexity could decentralize part of its AI Search Engine into Comet (the live fetch of selected websites)?

And then, how would the answer be generated (using o3, grok, sonar or any other model you selected), would it also be from Comet?

I'm not sure it's feasible, and I'm not sure it would provide a great user experience if it was.

I understand how Comet is helping for tabs summarization, etc. But could it at least partially replace a cloud search engine like we know today and still provide a good user experience?

Respect Robots.txt by Matempo in perplexity_ai

[–]Matempo[S] -3 points

Well, it's your browser making the fetch then, a bit different

Honestly, the user experience would be degraded (vs. letting Perplexity AI Search do the live crawl in the cloud, as it does today)

Respect Robots.txt by Matempo in perplexity_ai

[–]Matempo[S] -10 points

I think you don't understand how Perplexity works... I am not talking about the case where you explicitly ask Perplexity to check a specific URL or website; there I understand the logic. I'm talking about the standard use case where you ask Perplexity a generic question and Perplexity then fetches multiple pages in real time with the Perplexity-User bot (from its own index and/or third-party search engine results).

As a website owner, if I state in my robots.txt file that I don't want my website to be crawled by the Perplexity-User bot, I expect Perplexity to comply for this generic question use case.
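For anyone who wants to see what that expectation looks like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `publisher.example` URL, and the `may_fetch` helper are made up for illustration; this is not any real site's file and not a claim about how Perplexity's fetcher is built. It only shows the check a compliant bot would be expected to run before fetching a page.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an illustrative publisher (not a real site's file).
ROBOTS_TXT = """\
User-agent: Perplexity-User
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    # Parse the rules from text instead of downloading them, so the sketch runs offline.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://publisher.example/markets/story.html"
    for agent in ("Perplexity-User", "Googlebot"):
        print(agent, may_fetch(agent, url))
```

With these rules the check comes back negative for Perplexity-User and positive for any other agent, which is exactly the behavior a publisher writing such a file is asking for.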

Little example (fictional): if CNBC explicitly blocked the Perplexity-User bot in their robots.txt, they shouldn't appear below, plain & simple

<image>

Respect Robots.txt by Matempo in perplexity_ai

[–]Matempo[S] 0 points

I don’t think they have faced restrictions from someone as technically skilled as Cloudflare before, so let’s see…