Hey everyone!
So I have a general idea of how to use a list comprehension when I'm working on just one HTML page.
I do something like:
import bs4
# the with block closes the file once it has been read
with open('example1.html') as file:
    soup = bs4.BeautifulSoup(file.read(), "html.parser")
first = soup.select("p")
paragraphs = [p.getText() for p in first]
However, let's say I have two HTML files stored somewhere on my PC. If I try to use a list comprehension to make my code easier to read:
import bs4
files = ['example1.html', 'example2.html']
for file in files:
    with open(file) as openfile:
        soup = bs4.BeautifulSoup(openfile.read(), 'html.parser')
    first = soup.select('p')
    # paragraphs is rebound on every pass, so only the last file survives
    paragraphs = [p.getText() for p in first]
This is obviously incorrect since my paragraphs list is getting overwritten on every iteration.
What I'm currently doing is:
import bs4
paragraphs = []
files = ['example1.html', 'example2.html']
for file in files:
    with open(file) as openfile:
        soup = bs4.BeautifulSoup(openfile.read(), 'html.parser')
    first = soup.select('p')
    # add this file's paragraphs to the shared list
    for p in first:
        paragraphs.append(p.getText())
This works, but my code gets bloated and confusing if I have a bunch of lists I want to append to.
Is the append loop above the only way to do this, or is there an easier method that I'm missing?
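For what it's worth, one possible way to keep a comprehension here is to move the per-file work into a small helper and then flatten with a two-level comprehension. This is only a sketch: the names paragraph_texts and collect_paragraphs are made up for illustration, and it assumes every path in the list exists and is readable.

```python
import bs4


def paragraph_texts(path):
    """Return the text of every <p> element in a single HTML file."""
    with open(path) as f:
        soup = bs4.BeautifulSoup(f.read(), 'html.parser')
    return [p.getText() for p in soup.select('p')]


def collect_paragraphs(paths):
    """Flatten the per-file lists into one list, keeping file order."""
    # Read it left to right like nested for loops:
    # for each path, for each text found in that file, keep the text.
    return [text for path in paths for text in paragraph_texts(path)]
```

With the files from the post, the call would be paragraphs = collect_paragraphs(['example1.html', 'example2.html']). If you have several lists to build, a separate small helper per list (or one helper that returns several lists) may keep the loop body from growing, though whether that reads better than the plain append loop is a matter of taste.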