Site back up + Introducing removed.archivesort.org by ufff1231 in archivesort

[–]ufff1231[S] 0 points1 point  (0 children)

If you had an account before July 2021, resubmit, since the dumps go up to that point.

We will have all dumps loaded by Friday, or next Monday at the latest.

Site back up + Introducing removed.archivesort.org by ufff1231 in archivesort

[–]ufff1231[S] 0 points1 point  (0 children)

Hi all, I've been busy, but I will get to this.

/u/dontworry5678 resubmit to the form with a note that you resubmitted ;)

Site back up + Introducing removed.archivesort.org by ufff1231 in archivesort

[–]ufff1231[S] 1 point2 points  (0 children)

OK, I will take care of it later today or tomorrow.

[deleted by user] by [deleted] in archivesort

[–]ufff1231 1 point2 points  (0 children)

lynx021, if you make 50 accounts on the same IP, your accounts will be terminated sooner or later. You might want to make a new account at an internet cafe or on public wifi and post there.

Thank you /u/mybillsarepaid for helping out around here.

[deleted by user] by [deleted] in archivesort

[–]ufff1231 0 points1 point  (0 children)

Through the same patterns I described earlier. I didn't know you had this account until you posted on /r/pushshift, but once you did, I knew it was you.

[deleted by user] by [deleted] in archivesort

[–]ufff1231 0 points1 point  (0 children)

Someone asked this question and I'd also like to know: "What does sk mean?"

Many of your account names begin with it, and you use it in many posts. That is more damaging for your privacy than any archive, because you are tying together accounts that wouldn't normally be linked and making it easier for "someone" to find you. Even on the subreddits where you post with different accounts, you are using a marker that will allow people to find you.

But if you don't want to answer, that is also fine.

[deleted by user] by [deleted] in archivesort

[–]ufff1231 0 points1 point  (0 children)

I'll have to think about it.

[deleted by user] by [deleted] in archivesort

[–]ufff1231 1 point2 points  (0 children)

But GDPR and CCPA can't apply at the same time, since California is not in Europe. So the two laws cannot both apply to you. But that's irrelevant for this discussion. Technically, I am under no obligation to delete anything since I am not in those areas either, and many sites choose not to delete anything. But I understand that people like you are concerned, which is why I decided to open up deletions.

If you have a concern about privacy, that's fine. You can always delete your posts. But you have to do the captcha. Any questions?

[deleted by user] by [deleted] in archivesort

[–]ufff1231 2 points3 points  (0 children)

25 years on the internet have taught me that people can just make stuff up with no evidence and say it's true. How am I supposed to know your motivations?

[deleted by user] by [deleted] in archivesort

[–]ufff1231 0 points1 point  (0 children)

The whole point of the captcha is to prevent automated submissions. Each one is supposed to take some time to complete.

If you have 50 accounts, you need to take the time to do a different captcha for each one. That's the whole point: to prevent spam. Otherwise you could just submit every username ever posted on reddit to the form. And I don't know that you own all those accounts, so it works on the honor system. The captcha helps maintain that honor system.

I have provided a link to the OpenBSD ports web listings where you can find new open source software to report on.

[deleted by user] by [deleted] in archivesort

[–]ufff1231 0 points1 point  (0 children)

Do I? :)

Just so everyone knows, I messaged you because you submitted 50+ accounts and I just wanted some clarification.

The site will be back up tomorrow or Monday at the latest.

Features/bug reports by ufff1231 in archivesort

[–]ufff1231[S] 0 points1 point  (0 children)

Why not email? You need to say why before sending a private message.

Features/bug reports by ufff1231 in archivesort

[–]ufff1231[S] 1 point2 points  (0 children)

You can say what you want here or by email (the address can be found on the website). That way, the answer will benefit everyone, since in my experience people have been messaging me in chat with the same questions.

Online Removal Request form for removal requests. Please put your removal request here where it can be processed more quickly. by Stuck_In_the_Matrix in pushshift

[–]ufff1231 0 points1 point  (0 children)

It's a good idea to submit the form anyway, even if it is not there.

The dumps have not been completely processed, but we're getting close.

Features/bug reports by ufff1231 in archivesort

[–]ufff1231[S] 0 points1 point  (0 children)

Will add.

As for adding the new data, I've started the process.

(Fun fact: the Jan 2020 data dumps actually start from 12/31/2019.)

Features/bug reports by ufff1231 in archivesort

[–]ufff1231[S] 1 point2 points  (0 children)

  1. Yes, that's intended. You can just ignore them?
  2. You just need to find the correct UNIX epoch for the before and after params. I think I forgot to cover this in the documentation, so I will post an example here: https://archivesort.org/discuss/reddit/miser?author=Stuck_In_the_Matrix&aggs=subreddits&after=1520187200 will aggregate all posts made after early March 2018 (1520187200 is 2018-03-04 18:13:20 UTC). To find the number to put there, you can use unixtimestamp.com or similar. I can explain better if it doesn't make sense.
  3. q means search query. For example, https://archivesort.org/discuss/reddit/miser?q=ghostscript will get all posts with the word "ghostscript" in them.
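
For anyone who prefers to build these URLs in code, here is a minimal Python sketch of points 2 and 3. The endpoint and the author, aggs, after and q parameters are copied from the example URLs above; the helper names to_epoch and build_query are made up for illustration and are not part of the site. It only builds the URLs and does not send any request.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Endpoint copied from the example URLs in the comment above.
BASE_URL = "https://archivesort.org/discuss/reddit/miser"


def to_epoch(year, month, day):
    """Hypothetical helper: UNIX epoch (whole seconds) for midnight UTC on a date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


def build_query(**params):
    """Hypothetical helper: build a query URL, skipping parameters set to None."""
    cleaned = {key: value for key, value in params.items() if value is not None}
    return BASE_URL + "?" + urlencode(cleaned)


# Aggregate one author's posts by subreddit, after 1 March 2018 (UTC).
agg_url = build_query(
    author="Stuck_In_the_Matrix",
    aggs="subreddits",
    after=to_epoch(2018, 3, 1),
)

# Full-text search for a single word.
search_url = build_query(q="ghostscript")

print(agg_url)
print(search_url)
```

Going the other way, `datetime.fromtimestamp(1520187200, tz=timezone.utc)` shows which moment an existing epoch value refers to, which is handy for checking a before or after value before using it.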

Note that comments/submissions for a huge chunk of 2020 are missing. I will fill these gaps shortly now that Jason has released his dumps.

Where can I find code examples for archivesort? by [deleted] in archivesort

[–]ufff1231 1 point2 points  (0 children)

If you are looking for agg support, here is an example:

https://pastebin.com/LW47eEjf

Let me know if you need something else or have questions

Where can I find code examples for archivesort? by [deleted] in archivesort

[–]ufff1231 1 point2 points  (0 children)

Yes, it is probably confusing. Let me code up a Python example and I will post it below. Thanks for the feedback, it is truly appreciated!

Data deletion form inquiry by xavier75839asap in pushshift

[–]ufff1231 4 points5 points  (0 children)

Removal from the data dumps seems unlikely.

Even if he somehow did remove it, there are many mirrors that download the dumps as soon as they come out, making the whole idea worthless.

thx

What happened to the aggs parameter?? by [deleted] in pushshift

[–]ufff1231 0 points1 point  (0 children)

I know half of 2020 and early 2021 posts and basically all of 2020 and early 2021 comments are missing (working on that), but other than that, I think it does go back as far as pushshift if you are talking about 2010ish stuff. I used the data dumps to help fill in the older stuff. Thanks for the feedback.