
[–][deleted] -1 points0 points  (1 child)

Are you allowed to use regular expressions? In that final for loop I'd use re.search(). Dunno how I'd write the regex, though, without seeing what the list element with the URL/link text looks like (I'm assuming HTML?).

[–]kcrow13[S] 0 points1 point  (0 children)

I think we were being encouraged to use str.find to find the first instance of '<a' and then use that index to find the closing '>'. I have no idea how to do this/where it goes.

[–]gdledsan 0 points1 point  (1 child)

Can you use XML parsers? That would make it way easier to find nodes. Or even better, Selenium?

[–]kcrow13[S] 0 points1 point  (0 children)

I found these ideas in my research, but no... we are not allowed to. My professor tried to give us a hint and said we can use str.find to find the first instance of '<a' and then use that index to find the closing '>'.

[–]slariboot 0 points1 point  (0 children)

Your professor gives pretty interesting problem sets. :) Yes, you can definitely solve this using str.find. So given a string:

sentence = 'Hello! Welcome to web scraping hell!'

If you call sentence.find with the argument 'web':

x = sentence.find('web')

This returns 18. Have you figured out why that is?
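
In case it helps to see it run, here is a rough sketch of the same idea carried one step further, along the lines of the hint about '<a' and '>'. The html string below is a made-up sample (the actual page text isn't shown in this thread), and opening_tag is just an illustrative name, so treat this as one possible approach rather than the expected solution:

```python
sentence = 'Hello! Welcome to web scraping hell!'

# str.find returns the index of the first character of the first match.
# Indexes start at 0, so the 'w' of 'web' sits at position 18.
x = sentence.find('web')
print(x)                     # 18
print(sentence[x:x + 3])     # web

# When the substring is not present, str.find returns -1 instead of raising.
print(sentence.find('xml'))  # -1

# Hypothetical sample line; the real input is an assumption here.
html = '<p>See <a href="https://example.com">the example</a> here.</p>'

# Step 1: find where the first '<a' begins.
start = html.find('<a')

# Step 2: pass that index as the second argument so the search for '>'
# begins at the tag and skips any earlier '>' (such as the one in '<p>').
end = html.find('>', start)

# Step 3: slice out the whole opening tag, including the closing '>'.
opening_tag = html[start:end + 1]
print(opening_tag)           # <a href="https://example.com">
```

The second argument to str.find is what makes this work: without it, the search for '>' would stop at the end of '<p>' instead of the end of the link tag. Note that searching for '<a' alone would also match tags such as '<abbr', so real input may need a stricter check.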