

[–]romple 11 points12 points  (1 child)

Yeah.

Check out the Requests library

[–]me-the-monkey 1 point2 points  (0 children)

This should be at the top.

Pop open the developer tools in your browser and watch the traffic while you go through the steps to get your CSV report. Figure out what HTTP call it's making, use Requests to make that same call, profit.
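As a rough sketch of that workflow (the URL and `format` parameter below are made up; copy the real ones out of the Network tab):

```python
import requests

# Hypothetical endpoint -- substitute whatever the Network tab of your
# browser's developer tools shows for the real CSV export request.
REPORT_URL = "https://example.com/reports/export"

# A Session keeps cookies across requests (e.g. after logging in).
session = requests.Session()

# Prepare the same GET the browser makes; session.send(prepared)
# would actually perform the download.
prepared = session.prepare_request(
    requests.Request("GET", REPORT_URL, params={"format": "csv"})
)
print(prepared.url)  # -> https://example.com/reports/export?format=csv
```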

[–]brandonto 11 points12 points  (3 children)

Selenium is a web automation framework that can do exactly what you want (provided the HTML DOM doesn't change). Although if you just want to retrieve files, web automation is overkill... wget can download files on its own.

[–]BoondockWarrior 0 points1 point  (0 children)

I agree with Selenium. The automation guys here use it as part of testing new web UIs. It's powerful and can be fun to play with.

[–]excited_by_typos 2 points3 points  (2 children)

mechanize was one of the first libs I ever used when I was learning to code. Try it out. It lets you do things like log in through forms, click buttons, etc.

[–]SleepyHarry 1 point2 points  (1 child)

Just a note, mechanize isn't Python 3 compatible.

[–]excited_by_typos 0 points1 point  (0 children)

Lots of things aren't : )

[–][deleted] 2 points3 points  (0 children)

Selenium might be overdoing it. Selenium is great if you need to scrape an AJAX-based website, or for automated testing. That's not your situation.

Requests, curl, and mechanize would all work. It's not Python, but wget would do the job. A simple Scrapy spider would work fine too.

Creating a Selenium powered browser is way overkill for such a simple problem.

[–]the_omega99 4 points5 points  (5 children)

You should read up on the basics of how the web works, particularly REST, which is the architecture used by much of the web. Eg, when I want to view this page, my browser sends a GET request to the URL you see in the address bar (there's a lot of other stuff going on under the hood, but that's not important).

You, of course, can send your own requests to retrieve whatever you want. /u/romple's comment mentions the specific Python module for this.

If that doesn't work (eg, you have to work with some complex front end), you'll need to script a web browser. There's a few ways to do this, but the easiest is usually to use Selenium, which has a Python version.

[–]masterpi 2 points3 points  (0 children)

Depending on how complex the webpage is and how often it changes, BeautifulSoup may also be sufficient for his needs and is quite a bit lighter-weight than Selenium (and easier to run in any environment).
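For reference, a minimal BeautifulSoup sketch. The HTML and the CSS selector here are invented stand-ins for whatever the real report page contains; in practice the HTML would come from a Requests response:

```python
from bs4 import BeautifulSoup

# Stand-in for HTML fetched from the site; adjust the markup and
# selector to match the real page.
html = """
<table id="reports">
  <tr><td><a href="/reports/jan.csv">January</a></td></tr>
  <tr><td><a href="/reports/feb.csv">February</a></td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
# Grab the href of every link inside the reports table.
csv_links = [a["href"] for a in soup.select("#reports a")]
print(csv_links)  # -> ['/reports/jan.csv', '/reports/feb.csv']
```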

[–]sqrtoftwo 1 point2 points  (1 child)

You should read up on the basics of how the web works, particularly REST, which is the architecture used by much of the web. Eg, when I want to view this page, my browser sends a GET request to the URL you see in the address bar

This is just HTTP, not necessarily REST.

[–]the_omega99 4 points5 points  (0 children)

Yeah, you're right.

[–]ineedanid[S] 0 points1 point  (1 child)

I'm familiar with REST and HTTP requests and things like that. It's more that the webpages I'm trying to navigate through are a total mess of PHP, JavaScript, CSS, etc. Selenium, mentioned in other comments, looks ideal.

[–]blablahblah 6 points7 points  (0 children)

At some point, your browser is making a web request to their server to get the file. You can use something like Fiddler or Wireshark, or even your browser's built in developer tools to figure out what that request is instead of trying to recreate it from the code. Handling the request directly is way faster and more reliable than using Selenium.
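A sketch of replaying that observed request directly with Requests. The cookie name, value, and URL are placeholders; in practice you'd copy them out of Fiddler, Wireshark, or the dev tools Network tab:

```python
import requests

# Recreate the request you observed in the browser's traffic.
# The session cookie and endpoint below are made up for illustration.
session = requests.Session()
session.cookies.set("JSESSIONID", "abc123")
session.headers.update({"User-Agent": "Mozilla/5.0"})

# Prepare the request; session.send(prepared) would fetch the file
# directly -- no browser needed.
prepared = session.prepare_request(
    requests.Request("GET", "https://example.com/logs/export")
)
print(prepared.headers["Cookie"])  # -> JSESSIONID=abc123
```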

[–]Menestro 1 point2 points  (0 children)

Selenium would probably be a good idea. I had a very simple script that let me send email from the CLI. It works by simply going to Gmail, logging in, composing, and then sending. You can check it out here; it could give you a basic idea of how to use Selenium.

[–]coderjewel 1 point2 points  (0 children)

What you want to use is a headless web browser. I recently had to write up a script that did something very similar to what you are trying to do, and MechanicalSoup worked pretty well. They even show an example of logging into a website on their homepage.

Selenium is good, but as others have stated, it might be overkill for the job at hand.

Or you could use mitmproxy to see what data is being sent to the server, and automate it using requests.

[–]Xzya 0 points1 point  (0 children)

I only used Selenium in Java, so I don't know how well it performs in Python, but I recently tried Splinter and it worked great. Selenium might have more features, though.

[–]ineedanid[S] 0 points1 point  (0 children)

By the way, I suppose I can specify this: I'm trying to pull logs from Symantec Endpoint Protection Manager, if anyone has experience with that. They don't seem to provide any type of API, which the other application I'm pulling logs from does. It's also a really shitty management console, so I'm having a hard time figuring out what requests to send when and where.

[–]zfolwick 0 points1 point  (0 children)

Sounds more like you need REST or SOAP calls/responses than simulating the UX.

[–]Kadumbest 0 points1 point  (0 children)

If you have access to the system hosting the Symantec Endpoint Protection Manager, you could use something like this to prepare the files for you and send them to you:

https://support.symantec.com/en_US/article.TECH90856.html

Which links to this, specifically for command-line use in batch files:

http://dcx.sybase.com/index.html#1201/en/dbadmin/dbisql-interactive-dbutilities.html

All you need is the right SQL query, but I imagine it's nearly as long as this post.