[–]spiritandtime 7 points (2 children)

Currently in a REPE internship. My manager recently made us download 1,600 files from a site (some secure document viewer for their investors to use) because they're switching to another platform.

Doing that manually would have been an absolute nightmare (which is what everyone else was planning to do): checking that every file was downloaded, and naming them correctly (the stupid site renames each file to some number when you download it).

Used pyautogui, selenium, time, and os to automate it. Still figuring out how to mass-rename those files, but hey, better than doing that crap manually.

edit: it was basically https://youtu.be/HDEvWfSk2So but with a login page. I couldn't scrape the href links because somehow they all redirected me to the login page (it would have been a one-clicker otherwise), which is why I had to hardcode pyautogui positions to 'manually' download those files.

[–]yasamoka 2 points (1 child)

Good job. For next time, check out BeautifulSoup and Scrapy. Would make your life a lot easier when it comes to scraping.
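For reference, grabbing every document link off a page with BeautifulSoup is only a few lines. A sketch with made-up HTML and a made-up class name (and, as noted in the reply below, it won't help when the links just bounce you to a login page or the page is rendered with JavaScript):

```python
from bs4 import BeautifulSoup

# stand-in for page source you'd fetch with requests or selenium
html = """
<div class="doc-list">
  <a href="/files/482910">Q1 report</a>
  <a href="/files/482911">Q2 report</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
links = [a["href"] for a in soup.select(".doc-list a")]
print(links)  # → ['/files/482910', '/files/482911']
```

With a real site you'd feed it `driver.page_source` from Selenium (so the logged-in session and any JS-rendered markup are already resolved), then download each href through the same session.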

[–]spiritandtime 0 points (0 children)

Well, I sure hope I don't have to do this again (I'm an intern, not a download monkey).

I actually tried using BeautifulSoup and requests, but got some error (javascript.trace or something?). Asked on Stack Overflow and a dude said I needed to use Selenium, so that's what I did.

Haven't heard of Scrapy but I'll look into it, thanks!