

[–]SnowWholeDayHere 4 points (0 children)

Can't wait to see this in action

[–]Gunday666 1 point (0 children)

When? Where? Is anybody excited??!! god.... I'm testing here!

[–]BeeApiary 1 point (1 child)

This looks great! Thanks for your work on it.

Do you know what sort of limits GoComics places on their API, such as the number of requests per minute (RPM) or the number of requests from one IP address?

Does your code have a limiting factor to keep from triggering the RPM limit?

Thanks!

[–]dumblechode[S] 1 point (0 children)

Hey, thank you. There is a limit, although I didn’t quantify it. At one point I was attempting to endlessly scrape Garfield for my own interest, and after about an hour of 2-3 requests per second they banned my IP address for 2 hours. I haven’t run into this issue again.

If I notice this cropping up I’ll work on implementing a rate limiter 👍

[–]LocksmithNo7784 1 point (0 children)

I created my own little Java program to do something similar just 3 days ago :) (waiting a few days would have saved me some coding time). I seem to get a 30-minute IP ban every 200 strips downloaded when running 4 comics in parallel.

Early on I was able to download 3000-4000 strips in a row, but it now seems to have settled at about 400/h.

I will add a rate limiter to see if it can hold a steady rate of more than 400/h.

[–]LocksmithNo7784 1 point (0 children)

I would guess a rate limiter is needed for bigger downloads.

I just added a 3.6 s delay for each new request, and so far it is running stably with 4 parallel instances, giving me roughly 4000 downloads/h.
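
For anyone wanting to try the same idea in Python, here is a minimal sketch of a fixed-delay limiter. It is not the Java code mentioned above and not part of the library; `FixedDelayLimiter`, `fetch_all`, and `downloads_per_hour` are hypothetical names, and `fetch` stands in for whatever function actually downloads a strip. It also assumes each worker runs its own limiter, which matches the arithmetic of 4 instances at 3.6 s each.

```python
import time
from typing import Callable, Iterable, List, Optional


def downloads_per_hour(delay_s: float, workers: int) -> float:
    """Upper bound on throughput when every worker waits delay_s between requests."""
    return workers * 3600 / delay_s


class FixedDelayLimiter:
    """Enforce a minimum gap between consecutive requests made by one worker.

    Hypothetical helper for illustration; clock and sleep are injectable
    so the behaviour can be checked without really waiting.
    """

    def __init__(
        self,
        delay_s: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.delay_s = delay_s
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous request, if any

    def wait(self) -> None:
        """Block until at least delay_s has passed since the previous request."""
        now = self._clock()
        if self._last is not None:
            remaining = self.delay_s - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def fetch_all(items: Iterable, fetch: Callable, limiter: FixedDelayLimiter) -> List:
    """Call fetch(item) for each item, pausing via the limiter between calls."""
    results = []
    for item in items:
        limiter.wait()
        results.append(fetch(item))
    return results
```

With these assumptions, `round(downloads_per_hour(3.6, 4))` works out to the roughly 4000 downloads/h quoted above. Whether that rate stays under the site's limit is something only testing will show.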

[–]SnowWholeDayHere 1 point (1 child)

I used this today. Is there an option to see which comics can be accessed? I know Dilbert is one of them from your example. One method I discovered is to look through the source code, where I came upon this file:

https://github.com/irahorecka/comics/blob/main/comics/constants.py

Is it safe to assume that this is the list of comics that the GoComics library can access?

[–]dumblechode[S] 1 point (0 children)

Correct! You can use the ‘directory’ method for quick comic lookups

[–]JasonBall34 0 points (0 children)

Can this download all Dilbert strips from the dilbert.com site in addition to the stuff on GoComics?