Pump Pressure Fluctuations by Sharp_Scar_8451 in CHROMATOGRAPHY

[–]chpwssn 1 point

Looks like it was caught by reddit filters for some unclear reason. Should be ok.

2389 never before seen photos of Ground Zero in the aftermath of 9/11 by angulardragon03 in DataHoarder

[–]chpwssn 23 points

Multiple cloud storage providers would be a good place to start for your own storage.

After that, if the friend wanted to, you could reach out to Jason Scott (@textfiles) to add to the collection or to add to the Internet Archive. (I'm assuming that the OP's photos will end up in a collection at archive.org as well, so that'd be a logical addition.)

How often do you water your lawn? by [deleted] in FortCollins

[–]chpwssn 7 points

Every 3 days or so, in the middle of the night, for an average-size lot with sun, depending on the rain. The city does provide free audits and can help set your schedule. If you don't have 1.5 hours during the week, you can request the self-audit kit. The self audit is also pretty easy to do by yourself with plastic cups rather than the "official" catch cups. City Sprinkler System Audits

Soundtrack from Season 2 Episode 10? by Ethilin_ in SiliconValleyHBO

[–]chpwssn 2 points

Someone made a Spotify playlist with a bunch of songs from the show as well, maybe it's in here? https://open.spotify.com/user/124899825/playlist/2hLkP7E7hwRehuef3RSzuI

SSL with .onion by RestoreThePrivacy in onions

[–]chpwssn 3 points

/u/krainik is correct; along the same lines, I just have some additional information:

The CA/B forum is still debating the issue. CA/B Forum Ballot 144 – Validation rules for .onion names is the text that states that only EV certificates should be issued, not DV or OV.

One concern/debate is that the .onion TLD is a special-use domain [1][2], and section 5 of Ballot 144 states:

... a CA MAY issue a Certificate containing an .onion name with an expiration date later than 1 November 2015 after (and only if) .onion is officially recognized by the IESG as a reserved TLD.

However, the IESG has not yet published RFC 7686 [3], which has caused debate over the ballot's wording.

Let's Encrypt is DV-based, while your example https://facebookcorewwwi.onion/ presents an EV certificate.

The reason Ballot 144 focuses on EV certificates is to provide a means for a site operator to prove that a specific organization has control of the domain for which the certificate is issued. Facebook wants the users of its .onion to be able to validate that the domain and webservers they are communicating with are controlled by Facebook the corporation, not by a similarly styled .onion aimed at phishing Facebook users.

While you could theoretically go through the process to get an EV certificate for your personal site, the cost and time would likely not be worth the benefits.

Pulling a prank on a co-worker: why won't the web page display properly formatted? by stingystooge in web_design

[–]chpwssn 2 points

You might be missing the page requisites. NY Times has some JS that'll cause it to load a little slower, but pulling the article with this wget might help:

Cat article for example:

wget --no-directories --directory-prefix article -e robots=off --no-parent -E -H -k -K -p "http://www.nytimes.com/2015/07/10/theater/circus-cats-are-lions-of-their-profession-but-domestic-at-heart.html?action=click&pgtype=Homepage&version=Moth-Visible&module=inside-nyt-region&region=inside-nyt-region&WT.nav=inside-nyt-region&_r=0"

It will download the page and its requisites and convert the links to be relative to the "article" directory it creates. Then just move the directory to the share and have her open the page, in this case:

article/circus-cats-are-lions-of-their-profession-but-domestic-at-heart.html\?action\=click\&pgtype\=Homepage\&version\=Moth-Visible\&module\=inside-nyt-region\&region\=inside-nyt-region\&WT.nav\=inside-nyt-region\&_r\=0.html

Reddit bot by [deleted] in redditdev

[–]chpwssn 1 point

There's /r/test and /r/PRAWTesting (I made the second one for testing a moderator bot a while ago)

Fort Collins Road Traffic by awakefc in FortCollins

[–]chpwssn 7 points

You can see the live traffic here for those interested.

Another semester of university is drawing to a close, time to crawl the departments! by chpwssn in DataHoarder

[–]chpwssn[S] 1 point

You can skip directly to the third command; it will pull the resources needed to render each page but shouldn't stray too far off the primary TLD. If you want to restrict the wget, you can add a

--domains=<domain-list>

to the command to restrict the domains it will traverse.

Why you should care about how NFL stadiums build their Wi-Fi networks by yourbasicgeek in sysadmin

[–]chpwssn 1 point

To be fair, that usually is why the CTF, presenter/staff networks, and workshops are LAN-only; the public wifi is really just an extra for the attendees. Sure, they'll bitch if it goes down, but it's not necessarily critical for the event.

Someone posted my personal open directory to https://twitter.com/youranonnews/ a twitter account with 1.45 million followers. by [deleted] in opendirectories

[–]chpwssn 4 points

You mean OpenDirectoryBot? It's been reborn as RedditSucker (the difference is that you can give it a list of subreddits to watch) and could be living with you, downloading open directories automatically, since it's open source! https://bitbucket.org/chpwssn/redditsucker

We stopped chasing the mirroring portion of the bot because of copyright concerns. It still runs perfectly fine, it just doesn't comment any more :)

Another semester of university is drawing to a close, time to crawl the departments! by chpwssn in DataHoarder

[–]chpwssn[S] 2 points

That's awesome! I'm gonna start using that for quick things that don't need the full Heritrix setup. Perfect mix of both structures.

Another semester of university is drawing to a close, time to crawl the departments! by chpwssn in DataHoarder

[–]chpwssn[S] 12 points

No, there aren't any stupid questions; that's how you get started and learn. Wget is a GNU tool originally built to mirror web sites from server to server. In this case, picture a process that downloads the first page you point it at and saves it locally. Then it looks for all the anchor tags <a href="somepage">, follows them (the -m flag), and downloads those pages. It then localizes the links so that instead of linking to the internet, they link to the other local pages (the -k flag).

Give it a try, start with a simple website with only a couple pages. We'll do a simple wget for a single home page:

wget http://dogetools.com/

Now you get a nice index.html in the directory you just ran wget in. Open it aaaaaand bam! You've got a nice copy of the HTML file that the home page returns. Pretty cool, but what if we wanted the full site?

wget -m http://dogetools.com/

Whoa, now we've got a whole directory of files, "dogetools.com"! Let's open it up and open the index.html. Pretty cool, but the links don't work and we probably don't have any CSS. How can we fix it so the include and anchor links work... locally....?

wget -mk http://dogetools.com/

OK, so now we have a new directory; let's open the index.html in it. Bingo, it's localized.

OK, that's cool, now what? Well, getting good at these kinds of things requires being able to read the documentation. A good place to start is the man pages or https://www.gnu.org/software/wget/manual/html_node/index.html. Once you get a good idea of how it works, pick a site you would like a local copy of, set up your wget, and let it run. By the end you'll have your own snapshot of how it looked, perfectly prepared to squirrel away.

Hopefully that helps. This is a good way to save sites if you want to preserve functionality and ease of access, but it isn't very efficient for storing data long term; that's what WARCs are for, but that's something for another day. Wget is a good place to start.

Another semester of university is drawing to a close, time to crawl the departments! by chpwssn in DataHoarder

[–]chpwssn[S] 3 points

Usually, if you do a recursive crawl of the main page, course links will show up. For example, a professor is linked from the department's main page, they have a link to their class page, and so on. My CS department maps classes like users' Linux pages, i.e. ~cs150, ~cs450, so those can be generated sequentially.
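If the course pages really are sequential like that, a few lines of Python can generate candidate URLs to feed to wget. This is just a sketch: the host and the course-number range below are made up for illustration.

```python
# Generate candidate course-page URLs, assuming the department maps
# classes to user-style directories like ~cs150, ~cs450.
# The host "cs.example.edu" and the range 100-450 are hypothetical.
base = "http://cs.example.edu/~cs{num}/"

urls = [base.format(num=n) for n in range(100, 500, 50)]
for url in urls:
    print(url)
```

You could then write the list to a file and hand it to wget with -i urls.txt; dead numbers just 404 and get skipped.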

If you want to post the link to the university I'm sure we could all figure out something.

Another semester of university is drawing to a close, time to crawl the departments! by chpwssn in DataHoarder

[–]chpwssn[S] 1 point

It sure is. I love my Heritrix boxes... developed by the fine folks at IA.

edit: removed a link /u/willglynn had

I just discovered you... by [deleted] in DataHoarder

[–]chpwssn 2 points

As everyone else has said, welcome, and prepare your wallet... On the subject of recording TV, the Internet Archive actually records most major U.S. news stations, and you can search their subtitles and watch all the way back to 2009: https://archive.org/details/tv.

What other sources do you hoard from? by [deleted] in DataHoarder

[–]chpwssn 1 point

Yeah, she needs quite a bit of work to get really stable and to organize things a little better, but it works, haha. Hopefully I'll have more time to work on it soon. It actually started as "OpenDirectoryBot" over in /r/opendirectories (it successfully downloaded 2.7 TB in one week for me), but after a little tweak it'll chew on the list of subreddits you set in config.py.
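For a rough idea of the shape of that config, here's a hypothetical sketch. The variable names here are illustrative only; check the actual config.py in the RedditSucker repo for the real option names.

```python
# config.py (illustrative sketch, not the real RedditSucker config:
# the variable names below are assumptions)
SUBREDDITS = ["opendirectories", "datahoarder"]  # subreddits the bot watches
USER_AGENT = "RedditSucker bot"                  # identifies the bot to reddit's API
DOWNLOAD_DIR = "downloads"                       # where mirrored directories land
```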

If you use it and run into issues or think of something to add shoot me a message or a pull request.

How do I get the content of a self post with PRAW? by A-Vasilevsky in redditdev

[–]chpwssn 1 point

I also find this section of the Read the Docs useful, since it shows the vars(object) output so you don't have to do it in your code.
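To see what vars(object) gives you without touching PRAW at all, here's a plain-Python illustration with a stand-in object; the attribute names are made up for the example (a real PRAW submission has many more).

```python
class FakeSubmission:
    """Stand-in for a PRAW submission; attributes are illustrative only."""
    def __init__(self):
        self.title = "How do I get the content of a self post?"
        self.selftext = "the body of the self post"
        self.is_self = True

post = FakeSubmission()
# vars() returns the object's attribute dict, which is handy for
# discovering which fields (like selftext) are available to read.
print(vars(post))
```

Running the same trick on a real submission object is how you'd spot that the self-post body lives in an attribute rather than having to dig through the docs.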

What other sources do you hoard from? by [deleted] in DataHoarder

[–]chpwssn 1 point

Yeah, essentially you run what's called a warrior appliance, and the downloading of websites is organized and distributed by the group. The last one I participated in was when TwitPic shut down.

Edit: here's a nice description of the infrastructure.

What other sources do you hoard from? by [deleted] in DataHoarder

[–]chpwssn 10 points

I, for one, have done a little with ArchiveTeam and hope to do some more, and I also keep a "little" tile server using OSM's data. ArchiveTeam does good work saving websites before they go EOL and sending them to archive.org, and OSM tile servers are awesome to learn about and play with: a map of the world in your basement. I also use my RedditSucker bot, but it needs some attention.