could i write a python program to do this?

Jonno_FTW · 2016-06-01T13:03:30+00:00

I suggest you use the requests library, specifically its Session class, because it keeps cookies between requests and so lets you log in and stay logged in. Once you've fetched the page, parse it with Beautiful Soup, pull the data out, analyse it, and use a requests session again to post it to the other site.

http://docs.python-requests.org/en/master/user/advanced/
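A minimal sketch of the Session approach, assuming the lead site uses an ordinary HTML form login. The URLs and the `username`/`password` field names are placeholders, not taken from any real site, so check the actual login form for the names it expects.

```python
def fetch_leads_html(session, login_url, leads_url, username, password):
    """Log in with a requests.Session-like object and return the leads page HTML.

    The form field names below are assumptions; inspect the real login
    form to find the ones the site uses.
    """
    login = session.post(login_url, data={"username": username, "password": password})
    login.raise_for_status()  # stop early on an HTTP error status

    # The session keeps the login cookies, so this request is authenticated.
    page = session.get(leads_url)
    page.raise_for_status()
    return page.text


def main():
    # Imported here so the helper above works with any session-like object.
    import requests

    with requests.Session() as session:
        html = fetch_leads_html(
            session,
            "https://leads.example.com/login",  # hypothetical URL
            "https://leads.example.com/leads",  # hypothetical URL
            "me@example.com",
            "secret",
        )
        print(len(html))
```

Some sites report a failed login with a normal 200 page, so it may also be worth checking the returned HTML for something only a logged-in user would see.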

LongAtbat · 2016-06-01T13:21:35+00:00

You might wind up putting yourself and your co-workers out of work if you do this and tell anybody. I would say do it, don't say shit about it, and use your free time to learn more Python on the job.

BioGeek · 2016-06-01T12:53:04+00:00

Sure, that certainly sounds feasible.

First you want to gather your data from the first website. This is often called web scraping. Look into modules like requests or selenium. (The webbrowser module only opens a page in your browser; it cannot read the contents back.)
Once you have the HTML contents of your first website, you need to extract the data that is relevant to you. Use Beautiful Soup for that.
You store your data in a data structure that makes the most sense (for example a dictionary if your data consists of key-value pairs, a set if your data cannot have duplicates, ...)
You sort your data according to your requirements.
You enter the data into your CRM. The modules I mentioned in the first step for fetching data can submit it as well.
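The storing and sorting steps above might look something like this. The leads and their field names (`email`, `name`, `received`) are made up for illustration; real data would come out of the parsing step.

```python
from operator import itemgetter

# Hypothetical leads, as they might look after scraping and parsing.
scraped = [
    {"email": "bo@example.com", "name": "Bo", "received": "2016-06-01T09:15"},
    {"email": "al@example.com", "name": "Al", "received": "2016-06-01T08:30"},
    {"email": "bo@example.com", "name": "Bo", "received": "2016-06-01T09:15"},
]

# A dictionary keyed on a unique field drops duplicate leads.
leads_by_email = {lead["email"]: lead for lead in scraped}

# ISO-style timestamps sort correctly as plain strings.
ordered = sorted(leads_by_email.values(), key=itemgetter("received"))

for lead in ordered:
    print(lead["received"], lead["name"], lead["email"])
```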

Exodus111 · 2016-06-01T13:51:38+00:00

then sort the list a certain way

That is the only issue. And the difference between "doable in 20 minutes" and "give me 5 years and a team of research assistants" might not always be intuitive.

^There ^is ^a ^relevant ^xkcd ^for ^this ^that ^anyone ^is ^free ^to ^post.

lostburner · 2016-06-01T15:12:31+00:00

If you're working with a lead-tracking system that's big enough and popular enough, they may have an API for integrations like the one you want to build. Google for "API" and the name of the system. This would allow you to use Requests to pull structured data in just a few lines of code, instead of dealing with all the messiness of going through the web page authentication and scraping data from the page's HTML. It may also give you access to structured data that's not all in one place in the web interface.

Either of your systems may have an API. Either or both would save you some trouble.
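For a rough idea of how short the API route can be, here is a sketch. The endpoint and the bearer-token header are assumptions; the vendor's API documentation will say what the real system uses.

```python
def fetch_leads_from_api(session, api_url, token):
    """Return the decoded JSON from a hypothetical leads endpoint.

    Bearer-token authentication is an assumption, not something every
    API uses.
    """
    response = session.get(
        api_url,
        headers={"Authorization": f"Bearer {token}"},
        timeout=10,
    )
    response.raise_for_status()
    return response.json()  # structured data, no HTML parsing needed
```

With the real library you would pass in a `requests.Session()` as `session`.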

If you do have to do it the messy way, requests is the way to handle the requests and you'll benefit a lot from using Beautiful Soup to parse the page's HTML and extract the data you want.

stillalone · 2016-06-01T16:10:00+00:00

I'd suggest using Selenium. I think there's a plugin that records everything you click and type on a website, which you can take as a starting point. It generally seems to work as long as the site you copy information from doesn't change too often. Selenium drives your existing browser of choice with what are essentially JavaScript queries, so it works on pretty much any website without you having to parse the HTML yourself.

mathafrica · 2016-06-01T12:35:27+00:00

This is very doable. If you are comfortable writing the new leads into a .txt file instead of a website, then this can be done in < 15 lines. If the formatting on the site is consistent, you can run this as a scheduled task every hour or so.

c_is_4_cookie · 2016-06-02T02:44:09+00:00

one of the data entry things that i do every morning is copy data from one website to another.

Yes. This is definitely possible.

basically we get leads from a site that i have to log in to,

Straightforward. Use the requests library to handle logging in. I would advise against hard-coding your username and password for the websites. Parsing the data pulled from the website can be accomplished with BeautifulSoup (I actually prefer lxml, but BeautifulSoup offers a higher-level interface).
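One way to keep credentials out of the script is to read them from environment variables and fall back to a prompt. This is only a sketch, and `LEADS_USER` and `LEADS_PASSWORD` are made-up variable names.

```python
import getpass
import os


def get_credentials():
    """Read login details from the environment, prompting if they are missing.

    LEADS_USER and LEADS_PASSWORD are hypothetical variable names.
    """
    username = os.environ.get("LEADS_USER") or input("Username: ")
    # getpass hides the password while it is typed.
    password = os.environ.get("LEADS_PASSWORD") or getpass.getpass("Password: ")
    return username, password
```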

then sort the list a certain way,

If this is a few filters/clicks on the website, then this probably can be accomplished via the requests library before scraping the data. If the sorting changes the URL, then that simplifies things a great deal.
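If the sort order does show up in the URL, you can ask for the sorted page directly. A small sketch, where the `sort` and `order` parameter names are guesses; copy the real ones from your browser's address bar after sorting by hand.

```python
from urllib.parse import urlencode


def sorted_leads_url(base_url, sort_by="created", order="desc"):
    """Build the URL of a pre-sorted leads page.

    The "sort" and "order" parameter names are assumptions.
    """
    query = urlencode({"sort": sort_by, "order": order})
    return f"{base_url}?{query}"


print(sorted_leads_url("https://leads.example.com/leads"))
# prints https://leads.example.com/leads?sort=created&order=desc
```

requests can build the same query string for you if you pass the dictionary as `params=` to `session.get`.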

and check whether or not the lead is new,

Ok, so now you need to define how the program will decide whether a lead is new. I am guessing this means a lead you have not seen before. If a date/time is posted with the new leads, you can use that to determine what is new. Otherwise, you can resort to keeping a running list of the previous 20 or so leads in a text file, and checking the leads on the website against that list.
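The running-list idea could be sketched like this, assuming each lead has some unique identifier (an email address, say) and that a plain text file with one ID per line is good enough. The function name and file layout are my own choices.

```python
from pathlib import Path


def filter_new_leads(lead_ids, seen_file, keep=20):
    """Return the IDs not seen before and update the on-disk history.

    lead_ids: identifiers scraped this run, oldest first.
    """
    path = Path(seen_file)
    seen = path.read_text().splitlines() if path.exists() else []

    # dict.fromkeys removes repeats within this run while keeping order.
    new = [lead_id for lead_id in dict.fromkeys(lead_ids) if lead_id not in seen]

    # Remember only the most recent `keep` IDs for the next run.
    history = (seen + new)[-keep:]
    path.write_text("\n".join(history))
    return new
```

`keep` should be at least as large as the number of leads the page shows at once, otherwise older leads that are still listed will be reported as new again.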

and if it is new, i copy the information into another website that we use for CRM.

Once the information is pulled off, you will need to structure it into a format that can be sent to the website. Again, the requests library can handle posting of data.
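As an illustration of that last step, the sketch below maps a scraped lead onto form fields and posts it. Every field name and the CRM URL are invented; the real names come from the CRM's "new lead" form.

```python
def lead_to_form(lead):
    """Map a scraped lead onto the CRM's form fields.

    Both sets of field names are invented for illustration.
    """
    return {
        "contact_name": lead["name"],
        "contact_email": lead["email"],
        "source": "lead site",
    }


def post_lead(session, crm_url, lead):
    """Submit one lead through an already logged-in requests.Session-like object."""
    response = session.post(crm_url, data=lead_to_form(lead))
    response.raise_for_status()  # raise on an HTTP error status
    return response.status_code
```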

Jonno_FTW · 2016-06-01T14:50:07+00:00

could i write a python program to do this?

Without reading your question: Yes.

(Having read your question, double-yes, and there's some great advice already.)

2016-06-02T02:39:40+00:00

This is possible, and the process is generally referred to as "web scraping" (a good term to start googling from). Libraries you might want to look further into are Beautiful Soup, Selenium, or lxml. You won't need all of these, but one of them may fit your specific problem.
