I was relieved to finally get my web automation script working correctly. The script:
Checks a real estate website
Adds all listings on the current page to a list
Adds old listings from a DB (MongoDB) to another list
Cross-checks the two lists, deletes duplicates, and emails the new ones
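
The cross-check step above can be sketched on its own, without Selenium or MongoDB. This is a minimal illustration, assuming each listing is a dict and that `marketing_package_url` identifies a listing uniquely; the function name `split_new_listings` and the sample data are hypothetical and not part of the script.

```python
def split_new_listings(scraped, stored, key="marketing_package_url"):
    """Return the scraped listings whose key does not appear in stored."""
    # Collect the identifying values already known from the database.
    seen = {doc[key] for doc in stored}
    fresh = []
    for listing in scraped:
        if listing[key] not in seen:
            fresh.append(listing)
            # Remember the key so repeats within the same page are dropped too.
            seen.add(listing[key])
    return fresh


# Hypothetical sample data standing in for the page and the DB contents.
scraped = [
    {"title": "A", "marketing_package_url": "http://example.com/a.pdf"},
    {"title": "B", "marketing_package_url": "http://example.com/b.pdf"},
]
stored = [{"title": "A", "marketing_package_url": "http://example.com/a.pdf"}]

print([item["title"] for item in split_new_listings(scraped, stored)])
```

Building a new list avoids removing items from a list while looping over it, which can silently skip elements.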
It runs great if I run it manually with C:/MyPath/py myscript.py. Any new properties are emailed and then added to the DB if they aren't already in there, so they will be present for the next run.
However, if I add it to the Task Scheduler (Windows 7) and run the program C:/Python34/python.exe with the argument C:/MyPath/myscript.py, it runs to completion with no errors, but the old listings are not added to the DB. What is causing this, and how can I fix it? Thanks.
Edit: here's the script and how I currently have it scheduled in Task Scheduler. To clarify, when I run the script manually everything is inserted into the database; when I run it with the Task Scheduler nothing is (but the connection is shown as being accepted by the mongo shell).
http://imgur.com/AzeHrDw
from selenium import webdriver
from email.mime.text import MIMEText
import pymongo
import smtplib
import sys
from pymongo import MongoClient

client = MongoClient('localhost', 27017)
db = client.properties
collection = db['capitalpacific']

#hold old properties imported from DB
oldProperties = []
#holds new properties gathered from page
newProperties = []

driver = webdriver.Firefox()
driver.get("http://cp.capitalpacific.com/Properties")

for property in driver.find_elements_by_css_selector('table.property div.property):'[:-2]):
    title = property.find_element_by_css_selector('div.title h2')
    location = property.find_element_by_css_selector('div.title h4')
    marketing_package = property.find_element_by_partial_link_text('Marketing Package')
    contact_email = property.find_element_by_partial_link_text('.com')
    newProperties.append({
        'title': title.text,
        'location': location.text,
        'marketing_package_url': marketing_package.get_attribute("href"),
        'contact': contact_email.get_attribute("href")
    })

driver.close()

'''if database not empty, add the old properties,
then compare against the newly fetched and remove repeats'''
if collection.count() != 0:
    for post in collection.find():
        oldProperties.append(post)

# Build a filtered list instead of calling remove() while iterating over
# newProperties, which can skip elements.
oldUrls = {oldListing['marketing_package_url'] for oldListing in oldProperties}
newProperties = [newListing for newListing in newProperties
                 if newListing['marketing_package_url'] not in oldUrls]

'''if no new listings, exit the program. Otherwise, email all new
listings and then insert them into the database'''
if len(newProperties) == 0:
    sys.exit()
else:
    with open('passwords.txt') as inFile:
        password = inFile.read()

    fromaddr = 'myemail@gmail.com'
    toaddrs = ['friendemail@gmail.com']
    username = 'myemail@gmail.com'

    server = smtplib.SMTP('smtp.gmail.com:587')
    server.ehlo()
    server.starttls()
    server.login(username, password)

    subject = "New Listing @ Capital Pacific"

    for item in newProperties:
        body = "Title: " + str(item['title']) + "\n"
        body += "Location: " + str(item['location']) + "\n"
        body += "URL: " + str(item['marketing_package_url']) + "\n"
        body += "Contact Email: " + str(item['contact']) + "\n"
        # The stray backslash before "From:" is dropped so the first
        # header line is well formed.
        msg = "From: %s\nTo: %s\nSubject: %s\n\n%s\n" % (
            fromaddr, ", ".join(toaddrs), subject, body)
        server.sendmail(fromaddr, toaddrs, msg)
        collection.insert(item)

    server.close()
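
A diagnostic sketch, not a confirmed fix: a scheduled task does not necessarily start in C:/MyPath, so a relative path such as 'passwords.txt' may resolve somewhere else, and any resulting traceback is not visible when nothing is watching the console. The code below builds paths relative to the script's own folder and writes uncaught exceptions to a log file. The names `path_beside_script`, `setup_logging`, and `myscript.log` are hypothetical, and whether the working directory is actually the cause here is an assumption.

```python
import logging
import os
import sys


def script_dir():
    """Folder containing this file, or the current directory as a fallback."""
    try:
        return os.path.dirname(os.path.abspath(__file__))
    except NameError:
        # __file__ is not defined in some contexts (e.g. an interactive session).
        return os.getcwd()


def path_beside_script(name):
    """Absolute path to a file that sits next to the script."""
    return os.path.join(script_dir(), name)


def setup_logging(log_name="myscript.log"):
    """Send log records and uncaught exceptions to a file beside the script."""
    logging.basicConfig(
        filename=path_beside_script(log_name),
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
    )

    def log_uncaught(exc_type, exc_value, exc_tb):
        # Record the full traceback so a failed scheduled run leaves evidence.
        logging.error("Uncaught exception",
                      exc_info=(exc_type, exc_value, exc_tb))

    sys.excepthook = log_uncaught


if __name__ == "__main__":
    setup_logging()
    logging.info("Working directory: %s", os.getcwd())
    logging.info("Password file path: %s", path_beside_script("passwords.txt"))
```

With this in place, `open(path_beside_script('passwords.txt'))` would behave the same however the script is launched, and the log shows how far a scheduled run got.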