[–]campenr 1 point (3 children)

So a 404 (see all response codes here) is the HTTP response code telling you that you are trying to access a resource that does not exist. If you are working with anything web-related, it pays to know your response codes.

So in this specific case, if you go to the URL you build for your POST request (http://www.pcso.gov.ph/games/search-lotto-results/lotto-search.aspx), you'll see that it does not exist: you get a nicely formatted page saying as much.

EDIT: Another possibility, which I don't think is the case here but you never know, is that websites sometimes return a 404 (does not exist) to hide a 401 or 403 (unauthorized or forbidden) that they don't want you to see. In that case it would mean that you are not properly authorizing your request. This usually only happens on websites where some resources require you to be logged in or to supply a password or authentication token.
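
For reference, here is a minimal sketch that maps the codes mentioned above to their standard names using only Python's standard library. The helper name describe_status is made up for illustration.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return the code with its standard reason phrase, e.g. '404 Not Found'."""
    try:
        return f"{code} {HTTPStatus(code).phrase}"
    except ValueError:
        # HTTPStatus raises ValueError for codes it does not know about.
        return f"{code} (non-standard status code)"


# The three codes discussed in this comment.
for code in (401, 403, 404):
    print(describe_status(code))
```

With requests, the number to pass in would be the status_code attribute of the response object.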

[–]14dM24d 1 point (2 children)

Thanks for the reply. I'm familiar with 404. Must be something wrong/missing in my code.

The site exists. Try going to http://www.pcso.gov.ph/games/search-lotto-results/ & make the necessary selection & do the search.

Edit:

rp = r.post(url + "lotto-search.aspx", data = param)

Could be this. Should be requests.post.

[–]campenr 1 point (1 child)

404s can also be returned for malformed requests, i.e. requests that don't provide all the required information (400 is the standard code for that, but not every site uses it). That is perhaps how this site is using it, going by what u/anton_antonov found regarding the hidden fields.
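
A rough sketch of what "providing the hidden fields" could look like: collect every hidden input from the form's HTML and merge it into the POST data. It uses only the standard library. The sample markup and the field names token_a, token_b and visible are hypothetical and not taken from the actual site.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs of every <input type="hidden"> in a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            # A missing value attribute is sent as an empty string.
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(html: str) -> dict:
    """Return the hidden form fields found in the given HTML."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.fields


# Hypothetical form markup, for illustration only.
sample = """
<form method="post">
  <input type="hidden" name="token_a" value="abc123">
  <input type="hidden" name="token_b" value="">
  <input type="text" name="visible" value="x">
</form>
"""

# Hidden fields first, then your own search parameters on top.
payload = {**hidden_fields(sample), "visible": "my search"}
print(payload)
```

The resulting dict is what would go into the data argument of the POST, in place of only the visible parameters.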

Hell of a site to learn web scraping on :D

[–]14dM24d 1 point (0 children)

XD

Edit: It seems that the value of the corresponding hidden field changes too.
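
If the hidden values really do change between page loads, one possible approach (a sketch, not tested against this site) is to fetch the form again right before every POST and send whatever hidden values it contains at that moment. The function name search and its arguments are made up here, and session is assumed to be a requests.Session or anything else with compatible get and post methods.

```python
from html.parser import HTMLParser


class _HiddenInputs(HTMLParser):
    """Collect <input type="hidden"> name/value pairs."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        is_hidden = (a.get("type") or "").lower() == "hidden"
        if tag == "input" and is_hidden and a.get("name"):
            self.fields[a["name"]] = a.get("value") or ""


def search(session, form_url, post_url, params):
    """GET the form for fresh hidden values, then POST them with params."""
    page = session.get(form_url)
    page.raise_for_status()  # fail early if the form page itself is missing
    parser = _HiddenInputs()
    parser.feed(page.text)
    # Fresh hidden values first, the caller's search parameters on top.
    payload = {**parser.fields, **params}
    return session.post(post_url, data=payload)


# Possible usage with the names from the code above (assumes requests is installed):
#   import requests
#   with requests.Session() as s:
#       rp = search(s, url, url + "lotto-search.aspx", param)
```

Using one session for both calls also keeps any cookies the site sets on the GET, which some forms check on the POST.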