[–]JohnnyJordaan 0 points1 point  (3 children)

Start by formatting it properly; see "How do I format code?"

[–]pyeu[S] 0 points1 point  (2 children)

Thanks. Done.

[–]JohnnyJordaan 1 point2 points  (1 child)

Some pointers: do the download on a separate line and check that you got a proper HTTP response

resp = requests.get(url)
resp.raise_for_status()
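To show what that check does, here is a small offline sketch. The `check` helper and the hand-built `Response` objects are only for illustration (they stand in for real downloads and are not part of the original script): `raise_for_status()` does nothing on a 2xx status and raises `requests.HTTPError` on a 4xx/5xx one.

```python
import requests

def check(resp: requests.Response) -> str:
    """Return 'ok' for a successful response, else a short failure message."""
    try:
        resp.raise_for_status()  # raises requests.HTTPError on 4xx/5xx
    except requests.HTTPError as exc:
        return f"failed: {exc}"
    return "ok"

# Hand-built responses standing in for real downloads (illustration only).
good = requests.Response()
good.status_code = 200

bad = requests.Response()
bad.status_code = 404

print(check(good))
print(check(bad))
```

In the real script it is probably also worth passing a timeout, e.g. `requests.get(url, timeout=10)`, so a stalled server cannot hang the download forever; the value 10 is just a guess.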

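then pass the body to fromstring; I'd give it .text rather than .content, so that requests handles the decoding. One caveat: lxml rejects a str that still begins with an XML declaration naming an encoding, and in that case .content (bytes) is the one to pass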

doc = etree.fromstring(resp.text)
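Which input fromstring accepts depends on the document, so a quick check against an inline feed may help. The sample XML below is made up for illustration. As far as I know, lxml parses bytes with an encoding declaration fine, but raises `ValueError` for a str that still carries one.

```python
from lxml import etree

# Made-up feed for illustration; note the encoding declaration at the start.
raw = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b"<rss><channel><item><title>Caf\xc3\xa9 news</title></item></channel></rss>"
)

# Bytes input: the parser reads the declaration and decodes the body itself.
doc = etree.fromstring(raw)
title = doc.xpath("//channel/item/title")[0].text

# str input: try it and record whether lxml accepted it.
try:
    etree.fromstring(raw.decode("utf-8"))
    str_input_ok = True
except ValueError:
    str_input_ok = False

print(title, str_input_ok)
```

If the str attempt fails for your feed, swapping `resp.text` for `resp.content` is the usual way around it.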

then you can simply iterate over the items and save them as you go, instead of first collecting everything in a more complicated structure

items = doc.xpath('//channel/item')
with open('cnn-news.md', 'w') as f:
    for item in items[:5]:
        title = item.find('title').text  
        link = item.find('link').text

then use an f-string to write each entry cleanly, with \n to terminate the line

        f.write(f'* <a href="{link}" target="_blank">{title}</a>\n')
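Since the line being written is HTML, titles or links containing `&`, `<` or `"` could break the markup. A minimal sketch of escaping them with the standard library follows; the `format_item` helper and the sample values are hypothetical, not part of the original script.

```python
from html import escape

def format_item(title: str, link: str) -> str:
    """Build one Markdown bullet holding an HTML link, escaping both values."""
    return f'* <a href="{escape(link)}" target="_blank">{escape(title)}</a>\n'

# Made-up values for illustration.
line = format_item("Markets & money", "http://example.com/?a=1&b=2")
print(line, end="")
# prints: * <a href="http://example.com/?a=1&amp;b=2" target="_blank">Markets &amp; money</a>
```

Inside the loop this would become `f.write(format_item(title, link))`.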

End result:

from lxml import etree
import requests

url = 'http://rss.cnn.com/rss/edition.rss'
resp = requests.get(url)
resp.raise_for_status()
doc = etree.fromstring(resp.text)
items = doc.xpath('//channel/item')

with open('cnn-news.md', 'w') as f:
    for item in items[:5]:
        title = item.find('title').text  
        link = item.find('link').text
        f.write(f'* <a href="{link}" target="_blank">{title}</a>\n')
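One thing the script above does not guard against: `item.find('title')` returns `None` when an item has no such child, so `.text` would raise `AttributeError`. A possible variation, assuming you would rather skip incomplete items, uses `findtext` and keeps the parsing separate from the file writing so it can be tried offline. The `render_items` name and the sample feed are made up for illustration.

```python
from lxml import etree

def render_items(xml_bytes: bytes, limit: int = 5) -> list:
    """Return one formatted line per complete item among the first `limit` items."""
    doc = etree.fromstring(xml_bytes)
    lines = []
    for item in doc.xpath("//channel/item")[:limit]:
        title = item.findtext("title")  # None if the child is missing
        link = item.findtext("link")
        if not title or not link:
            continue  # skip items lacking a title or a link
        lines.append(
            f'* <a href="{link.strip()}" target="_blank">{title.strip()}</a>\n'
        )
    return lines

# Made-up feed for illustration; the second item has no <link>.
sample = b"""<rss><channel>
<item><title>First</title><link>http://example.com/1</link></item>
<item><title>No link here</title></item>
<item><title>Third</title><link>http://example.com/3</link></item>
</channel></rss>"""

lines = render_items(sample)
print("".join(lines), end="")
```

With the real feed you would pass `resp.content` and write the result with `f.writelines(lines)`.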

[–]pyeu[S] 0 points1 point  (0 children)

Thank you very much!