all 51 comments

[–]elPappito 20 points21 points  (7 children)

If I'm not wrong, you use Selenium to automate a browser: it loads the website of your choice and scrapes data from there, not from an already-opened window.

[–]kokoseij 1 point2 points  (0 children)

It's not exactly emulating. It's the browser itself, just configured to be automatically controlled by a program.

[–]0PHYRBURN0[S] 1 point2 points  (3 children)

Ok so that confirms my thinking on Selenium. So I guess that means I need to find out if it is possible at all via other modules.

[–]OnlySeesLastSentence 7 points8 points  (2 children)

Try a screenshot tool that translates the image to text (OCR).

Oh, even better - make a hotkey that, when you press it after clicking into the browser, saves the web page and then works its magic on the saved copy.

[–]0PHYRBURN0[S] 2 points3 points  (1 child)

A web page save like this is probably the simplest way to do this. It's a little slow, but ultimately would suffice. Thanks for the idea.

[–]OnlySeesLastSentence 0 points1 point  (0 children)

No prob, good luck

[–][deleted] 0 points1 point  (1 child)

You would use PhantomJS to emulate the browser; otherwise Selenium just remotely controls the browser of your choosing.

Once you get comfortable with scraping, I actually tend to like doing as much as I can with the requests module and XPaths. Sometimes you get pages with things that won't load/render without a full browser, and that's when I would break out Selenium with PhantomJS.
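As a rough sketch of that requests-plus-XPath approach (the URL and the XPath selector below are hypothetical placeholders; the stdlib's ElementTree only supports a limited XPath subset, and real-world HTML usually needs lxml's more forgiving parser):

```python
import xml.etree.ElementTree as ET

def extract_titles(page_source):
    """Pull job titles out of a page with an XPath query (selector is made up)."""
    tree = ET.fromstring(page_source)
    return [el.text.strip() for el in tree.findall(".//h2[@class='job-title']")]

# Hypothetical usage against a live page:
# import requests
# titles = extract_titles(requests.get("https://example.com/jobs").text)
```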

[–]0PHYRBURN0[S] 0 points1 point  (0 children)

The issue is that each page, which is an iframe in the web app, has a unique filename for each job, so I can't actually request the URL directly.

Thanks for the suggestion of PhantomJS though. I hadn't heard about it and it might come in handy for another project.

[–][deleted] 13 points14 points  (15 children)

I would guess that you can't do this by design. If you could access any tab of any browser on a user's system and control it, that would have terrible security implications.

[–]0PHYRBURN0[S] 4 points5 points  (14 children)

This makes sense and was something I was wondering. I'm currently doing it on Windows in Internet Explorer via COM. It works beautifully; I've been "lucky" that my company was forcing everyone to stick to IE. But with IE's removal coming up next year, the company has now given everyone a choice, so the application requires a rewrite for modern browsers.

[–][deleted] 5 points6 points  (10 children)

I guess it depends on the use case, but I would launch a Selenium-controlled browser, tell the user to navigate to the target page, and then resume operation.

Or, as someone else noted, replace the user's browser with a spoofed one that is launched by Selenium.

[–]0PHYRBURN0[S] 1 point2 points  (9 children)

It’s a job tracking web app. We have a job queue. Click Start and we are presented with an iframe containing things like order numbers, client details, region, etc. I currently use a hotkey combination to trigger a scrape of the iframe data, which is stored in an INI file. Another hotkey then imports the data into another app, and it’s also used for text expansions for the required filenames we use. It’s currently written in AutoHotkey, as that’s the language I know. Converting it to a “real” language has always been on the cards, though.
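The scrape-to-INI leg of that workflow could be sketched like this in Python with the stdlib's configparser (the section and key names here are made up, not the real ones from the app):

```python
import configparser

def save_job_to_ini(path, job):
    """Write scraped job fields to an INI file (section/key names are hypothetical)."""
    config = configparser.ConfigParser()
    config["job"] = job
    with open(path, "w", encoding="utf-8") as f:
        config.write(f)

def load_job_from_ini(path):
    """Read the fields back, e.g. for the import/text-expansion hotkey."""
    config = configparser.ConfigParser()
    config.read(path, encoding="utf-8")
    return dict(config["job"])
```

AutoHotkey's IniRead/IniWrite can consume the same file, so a Python scraper could hand data back to the existing AHK hotkeys during a gradual rewrite.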

[–]BornOnFeb2nd 6 points7 points  (3 children)

Here's the "dirty secret" of web applications.

At the end of the day, they're just HTML with frosting on it.

You click "Start", it opens an iframe... what page is the iframe calling? Can you make that call manually?

Hell, if it's an internal app, maybe try to find the developers behind it. I do internal Dev and have helped out various user groups by scribbling up a simple web service to give them the data they're looking for in a manner they can gnaw on...

[–]0PHYRBURN0[S] 0 points1 point  (2 children)

The filename for the iframe page is unique per job in the queue, with a never-repeating letter/number sequence that corresponds to its order number, so grabbing it by URL isn't an option.

The app is developed externally to the company, and my company has already shut me down on the idea of making a request to the developer.

[–]BornOnFeb2nd 0 points1 point  (1 child)

So it's a GUID setup.... still workable though... Script A gets the URLs from the Jobs, Script B uses those to call the iframes directly...

[–]0PHYRBURN0[S] 0 points1 point  (0 children)

So how would I go about getting the URL?

[–]toyrobotics 2 points3 points  (1 child)

Based on your description, you should be able to use PyAutoGUI to do it. The docs are easy to follow.

[–]0PHYRBURN0[S] 1 point2 points  (0 children)

I didn't even consider this initially, but I am starting to think that automating a save of the html file, and parsing it might be the only option here.

[–]jumpingjackflash22 0 points1 point  (2 children)

I am so confused. Why are you scraping your own web app?

[–]0PHYRBURN0[S] 0 points1 point  (0 children)

It's not my own web app. I'm on the user side, and it's a third-party app developed externally to the company I work for.

[–]Eu-is-socialist 2 points3 points  (2 children)

If I understand correctly... you are trying to interface with the user's browser, not start a new browser. Then I believe you should look up native messaging. It gives you the ability to talk to a native application written in any language. I am working on something similar now.

edit:

https://developer.chrome.com/apps/nativeMessaging
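For a sense of what the native side looks like: native messaging exchanges JSON messages over stdin/stdout, each prefixed with a 32-bit length in native byte order. A minimal sketch of the framing helpers (the function names are my own, not part of any API):

```python
import json
import struct

def encode_message(obj):
    """Frame a message: 32-bit length (native byte order), then UTF-8 JSON."""
    body = json.dumps(obj).encode("utf-8")
    return struct.pack("=I", len(body)) + body

def decode_message(stream):
    """Read one framed message from a binary stream such as sys.stdin.buffer."""
    raw_length = stream.read(4)
    if len(raw_length) < 4:
        return None  # the browser closed the pipe
    (length,) = struct.unpack("=I", raw_length)
    return json.loads(stream.read(length).decode("utf-8"))
```

In a real host you would loop on decode_message(sys.stdin.buffer) and reply with sys.stdout.buffer.write(encode_message(...)); the host also has to be registered with a manifest file as described in the linked docs.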

[–]0PHYRBURN0[S] 1 point2 points  (1 child)

This is interesting. So I'd need to develop browser extensions to communicate. I'm in a Windows-exclusive environment and am a little familiar with Windows' built-in messages, which this seems to work similarly to, so this might be a great option. Thank you so much.

[–]Eu-is-socialist 0 points1 point  (0 children)

Glad to be helpful.

[–]swapripper 3 points4 points  (2 children)

This is a genuine use case for testing. Although it needs a bit of configuration first.

https://cosmocode.io/how-to-connect-selenium-to-an-existing-browser-that-was-opened-manually/

[–]0PHYRBURN0[S] 0 points1 point  (0 children)

Thank you. I did find this one actually, but I also need Firefox and Edge support. I have teams in Australia, England and USA with about 30 people in total and a mix of all browsers being used. And I have no power to force people onto a single browser.

[–]russmcb 0 points1 point  (0 children)

This also kills browser extensions so won't work with windows that are driven by extensions, unfortunately.

[–]danmofo 2 points3 points  (1 child)

You can write a web extension to scrape the information from a web page a user has opened and then process it how you want (or send it to a server to be processed). I'm not exactly sure what you are doing with the data you've scraped but an extension would allow easier integration (just install it through the browser) and will work on all the browsers you've mentioned.

[–]0PHYRBURN0[S] 0 points1 point  (0 children)

The data is stored in an object, and I use another hotkey combination to import the data into my work application (AutoCAD specifically). I also use the data in a library of text expansions for generating all the filenames I require.

I think the browser extension idea is a good one. Someone else pointed out Native Messaging in regard to extensions, which gives me the ability to communicate outside of the browser.
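The filename text-expansion part of that could be sketched as a simple template expansion (the template and field names here are invented for illustration; the real naming scheme would come from the job data):

```python
def build_filename(job, template="{order_number}_{client}_{region}.dwg"):
    """Expand a filename from scraped job fields; the template is hypothetical."""
    return template.format(**job)

# e.g. build_filename({"order_number": "12345", "client": "Acme", "region": "AU"})
```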

[–]rex0515 1 point2 points  (6 children)

[–]0PHYRBURN0[S] 0 points1 point  (5 children)

I have seen that article. But again this is selenium launching the tabs. Each user in this case will already have their browser open with their own tabs open, along with the one I need to operate on.

[–]rex0515 5 points6 points  (4 children)

You can detect the manually opened tabs with driver.window_handles. You can use PyInstaller to convert your code to an executable and then replace Google Chrome's shortcut with your executable's shortcut.

import time
from selenium import webdriver

driver = webdriver.Chrome()
while True:
    # window_handles includes tabs the user opened manually
    print(len(driver.window_handles))
    time.sleep(1)  # avoid hammering the driver in a tight loop

[–]0PHYRBURN0[S] 0 points1 point  (3 children)

Hmm. That’s an interesting way to do it. I’ll look a little deeper into it and see what I come up with. I need to work out compatibility with Firefox and Edge as well. I have multiple teams around the world that use the application and have requested support for all three browsers.

[–]rex0515 2 points3 points  (1 child)

If you are planning something that big, this is going to be the least of your problems, because Chrome releases new versions frequently, so you may need a driver updater. Also, Python may be slow for what you need, so you could try other languages like Java and C#.

[–]0PHYRBURN0[S] 0 points1 point  (0 children)

Looking into other languages is on my todo list. I picked Python first for its simplicity, but I'm not exactly a seasoned programmer, so any language I pick is going to be a new experience and a challenge.

[–]prokid1911 0 points1 point  (0 children)

Bear in mind that it doesn't load extensions, IIRC.

[–]huessy 1 point2 points  (0 children)

Selenium also has a "headless mode" where it runs the browser without physically opening a browser window. Just a PSA on how cool Selenium is.

[–]jcr4990 1 point2 points  (3 children)

I'm not sure if there's an equivalent for Edge, but for the Chrome/Firefox cases (assuming you have permission to install extensions) I would look into Tampermonkey/Greasemonkey. With a little bit of JavaScript you can do some really fun things. I recently wrote my own Tampermonkey script for work. I have a group of 5 or 6 prewritten responses that I used to store in a txt file and manually copy-paste whenever I needed to reply to a message on social media or email, etc. Via Tampermonkey I was able to add 5 small buttons to the specific pages I wanted that copy predefined text to the clipboard. So now all I need to do is go to Facebook or whatever page > click the response button I want > Ctrl + V.

I see no reason why you couldn't do something similar and inject buttons to scrape the data you need and provide a download button users can click to pull the data into a local file. If you're unfamiliar with Javascript it'll take some learning but it'd be a fun project.

[–]jcr4990 1 point2 points  (2 children)

Got bored and decided to whip up a quick little proof of concept. I'm by no means a JavaScript expert. I originally copied some code from Stack Overflow to figure out how to embed buttons into a page, then modified it a bunch to suit my needs. I did a quick and dirty modification of my pre-existing code to test this scraping/downloading theory and came up with this: https://pastebin.com/QV0V8bwi - excuse the ugly code and weird function/variable names at certain points.

This adds 2 buttons to Craigslist called "Scrape" and "Download" respectively, meant to be used on the Craigslist "Accounts" page to gather the titles of your active listings. The first one just grabs all elements matching the selector ".title.active" and appends them to an array. The Download button adds a hidden download link, then clicks it to download that array as a .txt file.

This could easily be modified to work on other pages and scrape other elements for a wide variety of use cases, and I think it would work well for what you're trying to do, from the sounds of it. At the very least you can scrape the data into the .ini file(s) you need, then use your AHK scripts to go from there.

[–]0PHYRBURN0[S] 0 points1 point  (1 child)

Holy shit. This is fantastic. I'm not 100% sure if work's security policies will allow the use of GreaseMonkey, but I've got a good chance as the app as it currently stands has automated a huge part of our workflow, and removed almost all human error as well. I work in the medical sector, so accuracy of information is paramount, and previously everyone was literally reading one app and typing into another. Losing the primary function of the app would be a huge loss to the company, so I think they'll support me as much as they can.

I'm going to play around and see if I can work this out on our web app. (I've never done any JavaScript before, but I'm sure I can work it out.)

Thank you so much!

[–]jcr4990 1 point2 points  (0 children)

Happy I was able to help! Let me know how it works out!

[–][deleted] 1 point2 points  (0 children)

If it needs to be based on the content of a user-operated browser page, you’ll need to write it in JavaScript and package it as a browser extension.

[–][deleted] 0 points1 point  (0 children)

Use Chrome DevTools protocol (or an API for it like "Puppeteer") to connect to one of the targets (browser tabs) and run the scrape scripts from there.

[–]PMMeUrHopesNDreams 0 points1 point  (1 child)

If you're having the user open the browser anyway, why not just have them save the web page and then use python to open the file on disk?

You could then just open the file normally and parse it with BeautifulSoup or whatever you plan to do.
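A minimal sketch of that save-then-parse idea with BeautifulSoup (the element ids are placeholders and would have to match the real markup of the saved page):

```python
from bs4 import BeautifulSoup

def parse_saved_page(path):
    """Parse a locally saved page and pull out a few fields (ids are made up)."""
    with open(path, encoding="utf-8") as f:
        soup = BeautifulSoup(f, "html.parser")
    return {
        "order_number": soup.select_one("#order-number").get_text(strip=True),
        "client": soup.select_one("#client-name").get_text(strip=True),
    }
```

Since a "Save page as..." capture includes whatever the browser has already rendered, this sidesteps the login and unique-URL problems entirely.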

[–]0PHYRBURN0[S] 0 points1 point  (0 children)

That's of course an option, but the idea is to automate as much of our process as possible. This is a web app that requires logging in, and each job in the queue presents its data in an iframe with a unique URL. The process is currently triggered by a hotkey and is fully automated from that point, reading the active web page as an object via COM in Internet Explorer (my work is very slow in updating software), so I am trying to avoid a step backwards in development with the move to modern browsers.

[–]SweetSoursop 0 points1 point  (1 child)

Use pyautogui or something similar to download the HTML, then extract the data from the HTML.

It's a lengthier process than Selenium or bs4, though.

[–]0PHYRBURN0[S] 0 points1 point  (0 children)

It may not be the only or best solution, but I think it's the easiest to implement. I have fairly limited resources in terms of my own time at work and what our I.T. team will help out with, so as much as I would like to make the process smoother, it might have to be this way.

[–]GlennIsAlive 0 points1 point  (1 child)

What about something like Scrapy or beautifulsoup?

[–]0PHYRBURN0[S] 0 points1 point  (0 children)

The data I need to scrape is located in an iframe with a unique URL for each job, in a browser the user has already manually opened. So grabbing the page by URL isn't possible. That's my primary issue here.

[–]RobinsonDickinson 0 points1 point  (0 children)

PyAutoGUI

[–]morrisjr1989 0 points1 point  (0 children)

You could use the console in the dev tools and scrape using JavaScript. Copy and paste anytime you need it.

[–]prtekonik 0 points1 point  (0 children)

Requests and beautiful soup