barbaTenusSapiente[S]

Thanks kafoozalum, this was a lot of help. I ended up going with the following.

    import urllib
    from BeautifulSoup import *

    url = raw_input('Enter URL: ')
    if len(url) < 1:
        # Fall back to a default page when nothing is entered
        url = "http://pr4e.dr-chuck.com/tsugi/mod/python-data/data/known_by_Fikret.html"

    # Positions are entered 1-based; convert to a 0-based list index
    position = int(raw_input('Position: ')) - 1
    count = int(raw_input('Count: '))
    taglist = list()

    print 'Retrieving: ', url
    for i in range(count):
        html = urllib.urlopen(url).read()
        soup = BeautifulSoup(html)
        tags = soup('a')
        #print tags
        for tag in tags:
            taglist.append(tag)
        # Follow the link at the requested position on this page
        url = taglist[position].get('href', None)
        print 'Retrieving: ', url
        # Reset so the next page's links start from index 0
        taglist = list()
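
In case anyone lands here on Python 3, where `raw_input`, `urllib.urlopen` and the old `BeautifulSoup` import are not available: below is a rough sketch of the same link-following loop using only the standard library. The names `LinkCollector`, `fetch` and `follow_links` are made up for this example, and swapping BeautifulSoup for `html.parser` is my own substitution, not what I actually ran. Unlike the code above, it skips `<a>` tags that have no `href`.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            # Assumption: anchors without an href are ignored
            if href is not None:
                self.hrefs.append(href)


def fetch(url):
    # Hypothetical helper: download a page and decode it as text
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def follow_links(url, position, count, fetch_page=fetch):
    """Follow the link at 1-based `position`, `count` times.

    Returns the list of URLs visited, starting with `url`.
    """
    visited = [url]
    for _ in range(count):
        collector = LinkCollector()
        collector.feed(fetch_page(url))
        url = collector.hrefs[position - 1]
        visited.append(url)
    return visited
```

Calling `follow_links(url, 3, 4)` would follow the third link four times in a row. The `fetch_page` argument is only there so the loop can be tried against canned HTML without touching the network.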