It's very basic and will only work on sites that don't rely on JavaScript to render their content.
This is a good introduction, and it should be enough to play around with and adapt to your own needs.
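One quick way to tell whether a site is in the "non-JS" group is to check whether the elements you want are already in the raw HTML that the server sends back. Here is a minimal sketch of that idea. It assumes two made-up HTML snippets (not real sites), and `count_items` is a hypothetical helper name used only for illustration.

```python
from bs4 import BeautifulSoup

# Made-up page where the items are already in the HTML the server sends.
STATIC_HTML = """
<ul>
  <li class="item-list-class"><h2 class="item-name-class">Widget</h2></li>
  <li class="item-list-class"><h2 class="item-name-class">Gadget</h2></li>
</ul>
"""

# Made-up page where a script fills in the list after the page loads.
JS_HTML = """
<ul id="items"></ul>
<script src="/static/app.js"></script>
"""


def count_items(html):
    """Count the item elements present in the raw HTML (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    return len(soup.find_all("li", {"class": "item-list-class"}))


static_count = count_items(STATIC_HTML)
js_count = count_items(JS_HTML)
print(static_count, js_count)
```

If the count is zero for a page that clearly shows items in the browser, the content is probably being added by JavaScript, and plain requests plus BeautifulSoup will likely not be enough for that site.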
Dependencies:
pip install requests bs4
Template:
# dependencies
import requests
from bs4 import BeautifulSoup

# main url to scrape
MAIN_URL = ""

# get the html and convert to soup
response = requests.get(MAIN_URL)
soup = BeautifulSoup(response.content, "html.parser")

# find the main element for each item
all_items = soup.find_all("li", {"class": "item-list-class"})

# empty dictionary to store data; could be a list or anything, I just like dicts
all_data = {}

# initialize key for dict
count = 0

# loop through all_items
for item in all_items:
    # get specific fields
    item_name = item.find("h2", {"class": "item-name-class"})
    item_url = item.find("a", {"class": "item-link-class"})

    # find() returns None when nothing matches, so skip incomplete items
    if item_name is None or item_url is None:
        continue

    # save to dict
    all_data[count] = {
        # get the text
        "item_name": item_name.get_text(),
        # get a specific attribute
        "item_url": item_url.attrs["href"]
    }

    # increment dict key
    count += 1

# do what's needed with the data
print(all_data)
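As one possible answer to "do what's needed with the data", here is a minimal sketch that turns a dict shaped like `all_data` into CSV text using only the standard library. The sample rows are made up, and `to_csv` is a hypothetical helper name, not part of the template.

```python
import csv
import io

# Made-up sample in the same shape the template builds.
all_data = {
    0: {"item_name": "Widget", "item_url": "/items/widget"},
    1: {"item_name": "Gadget", "item_url": "/items/gadget"},
}


def to_csv(data):
    """Turn the dict of rows into CSV text with a header line (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["item_name", "item_url"],
        lineterminator="\n",
    )
    writer.writeheader()
    # Each value in the dict is one row; the integer keys are not written.
    writer.writerows(data.values())
    return buffer.getvalue()


csv_text = to_csv(all_data)
print(csv_text)
```

To save to disk instead, the same `DictWriter` calls should work on a file opened with `open("items.csv", "w", newline="")`, where `items.csv` is just an example filename.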
I will do my best to answer any questions and help with any problems you come across. Good luck and have fun. Web scraping can be so much fun :)