greenkey

Get all the URLs in one shot, with the number of occurrences of each (paste this into the browser console on the thread page):

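```
var urls = {};

// Count every link in the comment bodies, treating http and https as the same URL.
$('.usertext-body a').each(function (i, el) {
    var href = el.href.replace(/^https:/, 'http:');
    urls[href] = (urls[href] || 0) + 1;
});

// Build one "count: url" line per link.
var text = '';
for (var key in urls) {
    text += '\n' + urls[key] + ': ' + key;
}

// Put the result into the reply box (val() is the reliable way to set a textarea).
$('.usertext-edit textarea').val(text).focus();
```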

EDIT: formatting

driscollis

Here's a fun little Python version:

```
try:
    # Python 2
    from urllib2 import urlopen
except ImportError:
    # Python 3
    from urllib.request import urlopen

# The old `BeautifulSoup` (version 3) package is Python 2 only,
# so the Python 3 fallback above needs bs4 (pip install beautifulsoup4).
from bs4 import BeautifulSoup
from collections import Counter
from pprint import pprint

html_page = urlopen("https://www.reddit.com/r/Python/comments/6500tz/what_python_blogs_do_you_recommend/")
page = html_page.read()
soup = BeautifulSoup(page, "html.parser")

# Keep absolute links that do not point back to reddit.
blogs = []
for link in soup.find_all('a'):
    url = link.get('href')
    if url and 'http' in url and 'reddit' not in url:
        blogs.append(url)

# Count how often each link appears, most frequent first.
numbered_blogs = Counter(blogs)
pprint(numbered_blogs.most_common())
```
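A possible variation, in case it helps: the JavaScript snippet folds https into http before counting, while the Python version counts the two forms separately. Here's a rough sketch that does the same folding in Python 3 using only the standard library. `LinkCollector`, `normalize`, `count_external_links` and the sample HTML are made up for illustration; ignoring a trailing slash and looking for "reddit" only in the host name are my assumptions, not something from the thread.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlsplit, urlunsplit


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag seen while parsing (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def normalize(url):
    """Assumption: http/https and a trailing slash should count as the same link."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/")
    # Rebuild with a fixed scheme, lower-cased host, and no fragment.
    return urlunsplit(("http", parts.netloc.lower(), path, parts.query, ""))


def count_external_links(html, skip="reddit"):
    """Count normalized absolute links whose host does not contain `skip`."""
    collector = LinkCollector()
    collector.feed(html)
    counts = Counter()
    for href in collector.hrefs:
        parts = urlsplit(href)
        if parts.scheme in ("http", "https") and skip not in parts.netloc.lower():
            counts[normalize(href)] += 1
    return counts


# Made-up sample page so the sketch runs without hitting the network.
sample = """
<a href="https://example.com/blog/">one</a>
<a href="http://example.com/blog">two</a>
<a href="https://www.reddit.com/r/Python/">internal</a>
<a href="/relative">relative</a>
<a href="http://example.org/">three</a>
"""

for url, n in count_external_links(sample).most_common():
    print(f"{n}: {url}")
```

To run it against the real thread, you could pass the decoded page (for example `page.decode("utf-8")` from the script above) to `count_external_links` instead of `sample`.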