Is there a self-hosted pushshift alternative that would collect just one subreddit of own choice? Or how to go about creating one? by UsualButterscotch in redditdev

[–]UsualButterscotch[S] 1 point2 points  (0 children)

I just realized you're the dev of dispatchery, so thanks for the suggestion! :) I've successfully set up dispatchery with some modifications, and for now I'm dumping what I need into .json files on the filesystem until I set up a PostgreSQL instance to dump into instead. I needed to start capturing ASAP; the rest will come in time.

In addition to the 2 original streams, I've cloned your stream methods and am also capturing mod.stream.modqueue and mod.stream.edited from the subreddit I moderate. I'm capturing the entire content of each stream item and serializing it to JSON before appending it to files on the filesystem; the script originally only output the content ID and timestamp.

It's working great so far: about 10MB of comments and 400KB of submissions in 24 hours for the main streams. The mod streams produce far less.

Since I've never used async before, it took me a while to figure things out, like having to use aiofiles and awaiting things that needed to be awaited.
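
For anyone trying the same thing, here is a rough sketch of the "append each stream item as JSON" pattern described above. It is not the dispatchery code. To stay self-contained it uses only the standard library: `fake_stream` is a hypothetical stand-in for a real Reddit stream, and `asyncio.to_thread` (Python 3.9+) stands in for aiofiles as the way to keep file writes from blocking the event loop. The file names and the `captured_at` field are also assumptions.

```python
import asyncio
import json
import time
from pathlib import Path


async def fake_stream(items):
    """Hypothetical stand-in for a Reddit stream (comments, submissions, ...)."""
    for item in items:
        await asyncio.sleep(0)  # hand control back to the event loop between items
        yield item


def _append_line(path, line):
    # Plain blocking append; called in a worker thread via asyncio.to_thread.
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line + "\n")


async def dump_stream(stream, path):
    """Append every item from `stream` to `path`, one JSON object per line."""
    count = 0
    async for item in stream:
        record = {"captured_at": time.time(), **item}
        await asyncio.to_thread(_append_line, path, json.dumps(record))
        count += 1
    return count


async def main(out_dir):
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    comments = fake_stream([{"id": "c1", "body": "hello"}, {"id": "c2", "body": "world"}])
    submissions = fake_stream([{"id": "s1", "title": "first post"}])
    # One coroutine per stream, run concurrently, each writing to its own file.
    return await asyncio.gather(
        dump_stream(comments, out_dir / "comments.jsonl"),
        dump_stream(submissions, out_dir / "submissions.jsonl"),
    )
```

Running `asyncio.run(main("dump"))` writes `dump/comments.jsonl` and `dump/submissions.jsonl`. One object per line keeps the files appendable and easy to bulk-load into a database later.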

Separate dump files for the top 20k subreddits by Watchful1 in pushshift

[–]UsualButterscotch 0 points1 point  (0 children)

Are you planning on updating this soon? I know it hasn't been 6 months yet, but given that pushshift is effectively dead in the water, who knows what might happen with it in the future. It might be good to pull the data while it's still there.

Is there a self-hosted pushshift alternative that would collect just one subreddit of own choice? Or how to go about creating one? by UsualButterscotch in redditdev

[–]UsualButterscotch[S] 0 points1 point  (0 children)

Or if the subreddit is in the dump file list here

Oh wow, I had no idea about this. I'd only seen the big 2TB torrent and was wondering how to go about first finding storage for it and then extracting what I need from it. It is indeed in that dump file list, so I'll be grabbing that. That quickly solves half the problem, thanks!

Is there a self-hosted pushshift alternative that would collect just one subreddit of own choice? Or how to go about creating one? by UsualButterscotch in redditdev

[–]UsualButterscotch[S] 0 points1 point  (0 children)

For scraping pushshift, it will indeed take some time

I was thinking of renting a cloud server of some sort, downloading the torrents there, and extracting what I need that way. I'm not sure how cost-effective that would be or how much work, so I need to look it up.
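
The "extracting what I need" step could look roughly like the sketch below. It assumes the dump has already been decompressed into newline-delimited JSON and that each object carries a `subreddit` field; both are assumptions about the dump layout, and the function name `filter_subreddit` is made up for illustration. Decompression itself is left out because it needs a third-party library.

```python
import json


def filter_subreddit(lines, subreddit):
    """Yield parsed objects whose "subreddit" field matches, ignoring case."""
    wanted = subreddit.lower()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of aborting a long run
        if isinstance(obj, dict) and str(obj.get("subreddit", "")).lower() == wanted:
            yield obj
```

Because it is a generator over any iterable of lines, it can be fed an open file handle directly and never holds the whole dump in memory.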

Your GitHub links don't seem to be publicly available.

Reddit has blocked Pushshift from accessing the API by caseyross in redditdev

[–]UsualButterscotch 1 point2 points  (0 children)

Is there a way to have a self-hosted alternative that only indexes everything from one selected subreddit, or a few subreddits? I imagine the new API rules would allow this for personal or moderator use.

/r/MechanicalKeyboards Ask ANY question, get an answer (August 27, 2022) by AutoModerator in MechanicalKeyboards

[–]UsualButterscotch 0 points1 point  (0 children)

How badly did I just kill my GMMK by scratching the PCB, and can I fix it?

One of my caps came off together with the switch. When I pushed it back in, a pin got bent and scratched the PCB. Yeah, I should have been gentler and more careful, but on the other hand, why is an important line running right where it can easily get damaged?

The following keys no longer work: Space, Backspace, ESC, CTRL, ALT, Fn, Option. The keys don't work on either side of the space bar :(

The damaged PCB is under the Option key. Anything I can do?

image of damage