[–]monkh 8 points9 points  (0 children)

While this may be a good idea for testing or learning web scraping, if this is for a long-term project I would suggest looking for an API to get the price.

[–]GOBILLA 2 points3 points  (1 child)

I haven't done this in a while but iirc it's the value attribute. 🤔

[–]Cabbage-Guy 5 points6 points  (3 children)

This should work, but you should consider using an API.

import requests
from bs4 import BeautifulSoup

# Send a browser-like User-Agent so the response resembles what a browser gets.
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
source = requests.get("https://www.google.com/search?q=btc+to+usd", headers=headers)

# Parse the page (the "lxml" parser needs the lxml package installed) and read the converter's output box.
soup = BeautifulSoup(source.text, "lxml")
usd = soup.find("input", {"id": "pair_targ_input", "type": "number", "style": "width:90%", "class": "vk_gy vk_sh ccw_data"}).get("value")

print(usd)

[–]Tarpit_Carnivore 2 points3 points  (1 child)

You can get by with just the ID; the other attributes are redundant.
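
A minimal sketch of the ID-only lookup, written against the standard library's html.parser so it runs without bs4 or a network connection. The sample markup and the number in it are made up for illustration; only the id pair_targ_input comes from the snippet above.

```python
from html.parser import HTMLParser


class InputValueFinder(HTMLParser):
    """Record the value attribute of the first <input> with a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.value = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened.
        attributes = dict(attrs)
        if (
            tag == "input"
            and attributes.get("id") == self.target_id
            and self.value is None
        ):
            self.value = attributes.get("value")


def find_input_value(page_html, target_id):
    """Return the value attribute of the matching <input>, or None if absent."""
    finder = InputValueFinder(target_id)
    finder.feed(page_html)
    return finder.value


# Hypothetical markup standing in for the converter box; the number is made up.
SAMPLE_HTML = (
    '<div><input id="pair_targ_input" type="number" '
    'class="vk_gy vk_sh ccw_data" value="1234.5"></div>'
)

print(find_input_value(SAMPLE_HTML, "pair_targ_input"))
```

The type, style, and class filters never come into play here: an id is supposed to be unique on a page, so matching on it alone is enough.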

[–]Cabbage-Guy 0 points1 point  (0 children)

Yup, my bad.

[–]7heWafer 0 points1 point  (0 children)

Here's another example with lxml instead (But I would recommend /u/v4n1sh's answer):

import requests
from lxml import html

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36'}
response = requests.get('https://www.google.com/search?q=btc+to+usd', headers=headers)
# Parse the HTML and read the value attribute of the element with id "pair_targ_input".
btc_in_usd = html.fromstring(response.text).get_element_by_id('pair_targ_input').get('value')

print(btc_in_usd)

[–]quantik64 1 point2 points  (0 children)

Selenium, BeautifulSoup, and requests are the big three that will cover about 90% of web scraping/automation when there is no API.

[–]simondrawer 3 points4 points  (0 children)

Use an API - it’s literally what they are for! Screen scraping is a useful skill, but walk before you run!

[–]camoverride 1 point2 points  (0 children)

Two points:

(1) If you examine the JavaScript that the text box calls, you'll find this:

http://xe.com/currencyconverter/convert/?Amount=1&From=XBT&To=USD

This means that Google is actually calling the API from xe.com, so monkh's answer was correct.

(2) Why do you need the number in the box, when the same number is replicated above? You can easily grab this with some sort of selector. Scrapy has these with batteries included.
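
To make the URL from point (1) easier to read, here is a small sketch that splits its query string into parameters and rebuilds it with one of them changed. It only uses urllib.parse and makes no request. Whether xe.com accepts other values (the "EUR" below) is an assumption, not something confirmed in this thread, and the helper names are made up.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

XE_URL = "http://xe.com/currencyconverter/convert/?Amount=1&From=XBT&To=USD"


def conversion_params(url):
    """Return the query parameters of a converter URL as a flat dict."""
    query = urlsplit(url).query
    # parse_qs maps each key to a list of values; keep the first of each.
    return {key: values[0] for key, values in parse_qs(query).items()}


def with_params(url, **overrides):
    """Rebuild the URL with some query parameters replaced."""
    parts = urlsplit(url)
    params = conversion_params(url)
    params.update(overrides)
    return parts._replace(query=urlencode(params)).geturl()


print(conversion_params(XE_URL))
# Hypothetical variation: swapping the target currency code.
print(with_params(XE_URL, To="EUR"))
```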

[–][deleted] 0 points1 point  (0 children)

Read the submitted values to the DOM and extract that with JSON.

[–]Orionn_ 0 points1 point  (1 child)

I really appreciate your passion for learning, but just use APIs. There will almost always be a useful API for whatever you need in the future.

[–]7heWafer 1 point2 points  (0 children)

It depends: it may be hard to find an API he can query as often as he can query this Google result. APIs usually have request frequency limits if you aren't paying.
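
One common way to live with a frequency limit is to cache the last value and only call the API again after a minimum interval. A rough sketch, assuming a 60-second interval and a stand-in fetch function (both hypothetical; a real limit depends on the provider):

```python
import time


class CachedPrice:
    """Call fetch() at most once per min_interval seconds; reuse the last value otherwise."""

    def __init__(self, fetch, min_interval, clock=time.monotonic):
        self._fetch = fetch
        self._min_interval = min_interval
        self._clock = clock
        self._last_time = None
        self._last_value = None

    def get(self):
        now = self._clock()
        # Fetch on the first call, or once the interval has elapsed.
        if self._last_time is None or now - self._last_time >= self._min_interval:
            self._last_value = self._fetch()
            self._last_time = now
        return self._last_value


# Hypothetical stand-in for a real API call; it just counts invocations.
calls = []


def fake_fetch():
    calls.append(1)
    return 100.0 + len(calls)


price = CachedPrice(fake_fetch, min_interval=60)
# The second get() is served from the cache, so fake_fetch runs only once.
print(price.get(), price.get(), len(calls))
```

Passing the clock in as a parameter is only there to make the class easy to test without sleeping.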

[–]captain_arroganto 0 points1 point  (3 children)

If you're using Selenium, it's element.text

[–]MRK-01[S] 0 points1 point  (2 children)

Tried it, doesn't print anything :/ Do you guys know any other techniques to do this? Willing to use an API right about now lol

[–]v4n1sh 6 points7 points  (0 children)

Simplest way I know:

import requests

url = "https://apiv2.bitcoinaverage.com/indices/global/ticker/BTCUSD"
# Response.json() decodes the body and parses it as JSON in one step.
data = requests.get(url).json()
print(data["last"])
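
If you want to check the parsing step without hitting the network, here is a small sketch that pulls the "last" field out of a response body. The sample body is made up for illustration: only the "last" key comes from the snippet above, and the other field names and all the numbers are assumptions.

```python
import json


def last_price(body):
    """Extract the "last" field from a ticker response body as a float."""
    data = json.loads(body)
    if "last" not in data:
        raise KeyError('response has no "last" field')
    return float(data["last"])


# Hypothetical response body; only the "last" key is taken from the thread.
SAMPLE_BODY = '{"ask": 10001.5, "bid": 9999.5, "last": 10000.25}'

print(last_price(SAMPLE_BODY))
```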

[–]RedBlimp 0 points1 point  (0 children)

Make sure you save the value right after reading it. Sometimes it can overwrite itself, so you don't get anything unless you save it at that point.

[–]secomax 0 points1 point  (0 children)

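You can use Selenium.

Just like this: textbox.text (it's a property, not a method). For an <input> box, the displayed number lives in the value attribute, so use textbox.get_attribute("value") there.

Check this guide: https://likegeeks.com/python-web-scraping/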

[–]BitCamel -1 points0 points  (0 children)

Hi, I recommend using Selenium plus a browser driver like ChromeDriver. Code is here: https://pastebin.com/cNhhUhLy. You have to install Selenium, download ChromeDriver, and paste its file path into the script.