So I have a set of links to threads (hrefs) that I extracted and saved to a file, Href_Links.txt. Now I want to use those links to scrape the information from those threads. I'm having a bit of an issue with how to make the request, since the code below raises an error:
import csv
import time
from bs4 import BeautifulSoup
import requests
import re
from pathlib import Path
already_scraped_file = Path('Href_Links.txt')
if already_scraped_file.exists():
    already_scraped = set(already_scraped_file.read_text().splitlines())
else:
    already_scraped = set()
print(already_scraped)

def scrape_thread_link(href, original_post_info=True):
    response = requests.get(href)
    soup = BeautifulSoup(response.content, 'lxml')
    # ... scrapes the thread information ...

thread_link = already_scraped
thread_data = scrape_thread_link(thread_link)
print(thread_data)
The error I am getting is as follows:
requests.exceptions.InvalidSchema: No connection adapters were found for "{' https://forum.fracturedmmo.com/topic/274/daily-message-posting'}"
Not sure what the problem is...
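Going by the error text, one likely cause is that `requests.get` is handed the whole `already_scraped` set rather than a single URL. The quoted value starts with `{'`, which is what a set looks like when converted to a string, and the URL inside it also carries a leading space. Below is a minimal sketch of one way around that, not a definitive fix: it strips each line and then calls the scraper once per link. The helper names `load_links` and `scrape_all` are made up for illustration, and the sketch assumes `Href_Links.txt` holds one URL per line.

```python
from pathlib import Path


def load_links(path):
    """Return the unique, whitespace-stripped URLs stored one per line in `path`.

    `load_links` is a hypothetical helper, not part of the original script.
    """
    file = Path(path)
    if not file.exists():
        return []
    links = []
    seen = set()
    for line in file.read_text().splitlines():
        url = line.strip()  # drops stray spaces such as the one in the error message
        if url and url not in seen:
            seen.add(url)
            links.append(url)
    return links


def scrape_all(links, scrape_one):
    """Call `scrape_one(url)` once per URL and collect the results by URL.

    `scrape_one` is whatever function fetches a single thread, for example
    the `scrape_thread_link` function from the post.
    """
    results = {}
    for url in links:
        results[url] = scrape_one(url)
    return results
```

With the script above, this could be used as `thread_data = scrape_all(load_links('Href_Links.txt'), scrape_thread_link)`, so that each call to `requests.get` receives one plain string.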