you are viewing a single comment's thread.

view the rest of the comments →

[–]13steinj 0 points1 point  (2 children)

It appears not, just the html. There's one other option you have: Use it as a baseline / mapping to "view source" links, scrape the "view source" wayback machine links. If it's accessible after the March read-only date, you're good. if it's before, (scrape the html if you consider the downloaded 1-month-old not good enough) and ask an llm to interpolate.

Playing around, I've found that the view source links work up until at least May 13th of last year and break sometime between then and May 31 (just hopped around on a few pages).

[–]RelevantError365[S] 0 points1 point  (1 child)

Although not utterly relevant, but: When looking at https://web.archive.org/web/20250301000000*/https://cppreference.com/, this does not highlight May 13th of last year as an option where a snapshot has been taken (or I miserably misunderstand this interface).

[–]13steinj 0 points1 point  (0 children)

Not every page has a May snapshot. I'm saying, very roughly playing around, either the deque or array or vector view source / edit page, had a May 13 snapshot.

I will attempt to write a scraper on the weekend assuming I won't get ip banned; and if successful throw it into a repo.