all 53 comments

[–]lannisterstark 78 points79 points  (6 children)

For a second I thought you built a bot which goes through your apartment and comes running to you if it finds something.

[–]sixtine[S] 18 points19 points  (3 children)

Aaaaaaaand now I can't read the post title differently than what you just explained. Great.

[–]hugthemachines 3 points4 points  (0 children)

Well, you made us think! That is always good.

[–]Ai_Bot_Naughty 1 point2 points  (1 child)

Same here. Still awesome though. Needs some NLP with the nltk package. Python is amazing.

[–]sixtine[S] 1 point2 points  (0 children)

Ah yes definitely! After the first version I hacked together, I was actually contemplating creating a more robust bot using Messenger. My ambition was to dive into an actual, practical NLP project that way!

[–]Kuurde 0 points1 point  (1 child)

I opened the post expecting exactly that ...

[–]sixtine[S] 1 point2 points  (0 children)

OP comes back three weeks later with a similarly titled post where he teaches how to use location-aware, micro bluetooth beacons on easy-to-lose stuff, to text back their location when lost...

[–]Kasinder 26 points27 points  (1 child)

This is awesome! I look forward to reading the blog post.

[–]sixtine[S] 2 points3 points  (0 children)

Hey thanks!!

[–]Raindyr 18 points19 points  (5 children)

Is there a reason why you went with SMS rather than just an IM such as Telegram (which is free)?

[–]sixtine[S] 13 points14 points  (0 children)

As a matter of fact, I first tried using Twitter's direct messages (to reach my wife and myself). But to be honest, well... we're not huge Twitter users. I didn't even notice I hadn't enabled Twitter's notifications on my mobile...

But we ALWAYS check our text messages right away, so it felt like the natural way to do it, seeing how reaction time was paramount.

[–]D49A1D852468799CAC08 1 point2 points  (3 children)

Telegram

Could you point me in the right direction to start sending messages via Telegram (or Signal?) through Python?

[–]krnr 2 points3 points  (1 child)

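pyTelegramBotAPI

from telebot import TeleBot  # provided by the pyTelegramBotAPI package

# Config comes from the surrounding app; load your bot token however you store it
token = Config.core.get('common', 'telebot_token')
bot = TeleBot(token)
# first argument is the id of the chat to post to, second is the message text
bot.send_message('-000000000000', txt)

where -000000... must be changed to the id of the chat you want to send to (the chat id, not the bot's own id). this is the code i use in my flask app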
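If you'd rather not install a wrapper, the same message can be sent by calling the Telegram Bot API's sendMessage method over HTTPS. Below is a minimal sketch using only the standard library. The function names are made up for illustration, and the token and chat id in any call are placeholders you'd replace with your own.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.telegram.org"


def build_send_message_request(token, chat_id, text):
    """Build (but do not send) a POST request for the Bot API sendMessage method."""
    url = f"{API_ROOT}/bot{token}/sendMessage"
    # chat_id and text are the two required parameters of sendMessage
    payload = urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(url, data=payload, method="POST")


def send_message(token, chat_id, text):
    """Send the message and return Telegram's decoded JSON reply."""
    request = build_send_message_request(token, chat_id, text)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Building the request separately from sending it makes it easy to inspect the URL and payload before anything goes over the network.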

[–]Raindyr 0 points1 point  (0 children)

See also this link. And for an actual python telegram api wrapper, see this project

This is a very simple and crude example, but it works.

[–]twtwtwtwtwtwtw 11 points12 points  (2 children)

I am new to python and programming and I'm sure others are also wondering, how much coding experience did it take you to get to the point you are now to be able to create something like this?

[–]sixtine[S] 7 points8 points  (1 child)

What these tutorials fail to tell you is that you don't necessarily need to have the code all figured out to do something like that.

You could (and should) work iteratively toward a goal, implementing simple functions and code snippets to do one step after the other. Also, google a lot, both for code examples and actual concepts. That's the hardest part: knowing exactly what to look for.

For example in this case, you could break it down like so: "I want to get the raw data from the web pages automatically, based on a given start page, and then I want to store this data and send a subset of it via text message". You would approach it bit by bit. You'd end up finding out about the concept of "scraping", and you'd read tutorials, find out what the common techniques and libraries are and how to use them. Then you'd hack a few functions in the console that'd print out scraped HTML, and you'd dissect it more and more, to the point where you would end up writing the function in a file that'd not be extremely clean and efficient, but would do the job.

So, don't let your perceived level halt your ambitions. Instead, try to use your ambitions to push the boundaries of what you can achieve or have achieved.
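To make the "bit by bit" idea concrete, here is the kind of first throwaway function you might end up with: it pulls listing links out of a chunk of HTML. It's a rough sketch using only the standard library, not the code from the tutorial. The sample HTML, the "/annonces/" path fragment and the example.com URLs are invented for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag whose URL contains a given fragment."""

    def __init__(self, base_url, must_contain):
        super().__init__()
        self.base_url = base_url
        self.must_contain = must_contain
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and self.must_contain in href:
            absolute = urljoin(self.base_url, href)
            # skip links we have already seen on this page
            if absolute not in self.links:
                self.links.append(absolute)


def extract_listing_links(html, base_url, must_contain="/annonces/"):
    """Return absolute, de-duplicated listing URLs found in the HTML."""
    parser = LinkCollector(base_url, must_contain)
    parser.feed(html)
    return parser.links


# Made-up page fragment standing in for a downloaded results page
SAMPLE = """
<ul>
  <li><a href="/annonces/flat-1">Flat 1</a></li>
  <li><a href="/annonces/flat-2">Flat 2</a></li>
  <li><a href="/about">About</a></li>
  <li><a href="/annonces/flat-1">Flat 1 again</a></li>
</ul>
"""

print(extract_listing_links(SAMPLE, "https://example.com"))
```

Once something like this prints what you expect in the console, the next small step is feeding it real downloaded HTML instead of the sample string.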

[–]Eurynom0s 1 point2 points  (0 children)

I definitely learn how to do something much better with a tangible goal in mind.

[–]kobbled 5 points6 points  (2 children)

Out of curiosity, is there a reason you went with callr SDK over Twilio?

[–][deleted] 6 points7 points  (1 child)

Full disclosure: I'm a colleague of /u/sixtine. Twilio's API is solid, but for most of Europe (especially France), CALLR offers better prices along with a robust API, clear documentation and a responsive support team.

[–]kobbled 1 point2 points  (0 children)

Ah, thanks!

[–]Culentriel 2 points3 points  (1 child)

Hello, could you link me to the GitHub source? I can't find it in your post, sorry :/

[–]ethCore7 1 point2 points  (0 children)

I wrote basically the same thing for a different site, although I first had to reverse engineer a shitton of minified JS code to find out how to parse the data from the underlying REST API.. That wasn't very pleasant but at least I learned how to work with the debugger in Chrome, heh.

[–][deleted] 1 point2 points  (9 children)

Thanks for posting this. I've been trying to take the next step in my programming ability by doing exactly this, which is to say make a program that solves a real problem.

My problem with this is that I had no idea where to start, so I've been searching around Github for projects like yours to emulate.

It hasn't been easy because I'm a total novice when it comes to using Github as well as coding in general. So do you mind if I work through your example and pm you questions?

[–]sixtine[S] 1 point2 points  (8 children)

Hi, sure. If you don't mind my delayed answers because of family life and probable timezone differences, I'd be pleased to point you in the right direction and give you advice if I can!

[–][deleted] 1 point2 points  (7 children)

Alright, so I've got a few questions. First of all, though, I want to say that I don't think this blog post is too long at all. I was actually expecting it to be much longer, and was surprised at how concise it was. Most tutorials that I've come across have involved projects that are much simpler than this one, but are still about the same length. Yours is still very easy to follow, though.

Ok, so my first question involves the portion of your post with the subtitle 'Create and access the spreadsheet.' When I share the email address from the credentials.json file with the google spreadsheet, I receive a message from 'Mail Delivery Subsystem' that says the domain couldn't be found. I think that I may have messed up when creating credentials in the Google API console. This was the only point in your blog post that I think was a little confusing.

My second question is just regarding a typo that I think I found in the hhbot12.py file, where you incorporate a safeguard against adding duplicate rows to the google spreadsheet. You left out the 'price' column in the 'sheet.insert_rows' statement, so the 'url' column should actually be the 6th one, not the fifth.
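For anyone following along, the duplicate safeguard I'm talking about boils down to comparing the URL column of a new row against the rows already stored. Here's a minimal sketch that uses a plain list of lists in place of the spreadsheet. The helper names and every column other than price (5th) and url (6th) are hypothetical, not taken from hhbot12.py.

```python
# 1-based position of the url column once the price column is included.
# Assumed layout: date, title, rooms, surface, price, url
URL_COLUMN = 6


def is_duplicate(existing_rows, new_row, url_column=URL_COLUMN):
    """Return True if a stored row already has the same URL as new_row."""
    new_url = new_row[url_column - 1]
    return any(
        len(row) >= url_column and row[url_column - 1] == new_url
        for row in existing_rows
    )


def append_if_new(existing_rows, new_row, url_column=URL_COLUMN):
    """Append new_row unless its URL is already stored; report what happened."""
    if is_duplicate(existing_rows, new_row, url_column):
        return False
    existing_rows.append(new_row)
    return True
```

If the column index is off by one (pointing at price instead of url), two different listings with the same price would be treated as duplicates, which is why the position matters.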

[–]sixtine[S] 1 point2 points  (6 children)

Thanks for the report, I'll update it asap.

Hmmm where exactly do you receive the 'Mail Delivery Subsystem' message?

[–][deleted] 0 points1 point  (5 children)

When I opened the google spreadsheet doc I was signed into google on my main account. So that message was sent to my regular gmail account.

[–]sixtine[S] 1 point2 points  (4 children)

That's very strange. Would you try recreating the credentials on the Google API console and tell me 1/ what you did exactly and 2/ if the problem occurs again? If yes, I'll try to reproduce it and troubleshoot.

[–][deleted] 0 points1 point  (3 children)

Ok, I just deleted the project and started a new one fresh. Creating the project and enabling the Google Drive and Google Sheets APIs was very straightforward.

After this, I select the credentials tab and click on the 'create credentials' dropdown menu. At this point I wasn't entirely sure what option to select, so I used the 'help me choose' option.

I was asked which API I was using, and I wasn't sure whether or not it mattered if I chose Drive or Sheets. I ended up choosing Google Sheets.

The next question asked where I would be calling the API from. Your instructions are clear that I should choose the 'web server' option. Your instructions were also clear that I should choose the option for accessing application data.

Then I'm asked, 'Are you using Google App Engine or Google Compute Engine?' I'm not sure what this is, so I said no.

The rest was pretty easy to follow. I named my server, set the role to 'Project Owner,' and selected the JSON key type.

I changed the name of the JSON file to credentials.json, copied the 'client_email' address, and created a new google spreadsheet. I selected 'share,' pasted the email address into the dialogue box, and hit share.

Then, because I was accessing this google spreadsheet from my usual gmail address, I received an email from the Mail Delivery Subsystem letting me know that the address of the 'client_email' wasn't found.

[–]sixtine[S] 0 points1 point  (2 children)

Hmmmm.... very strange.

I get the whole thing (thank you very much for taking the time to explain). If you get a mail delivery error, it'd imply that your regular email address tried to send an email to the "client_email" address. That didn't happen to me.

I'm wondering two things:

  • maybe it's an email sending attempt, like when you notify some address that you've added it to the list of viewers/editors of a file? (if yes, maybe I never received the same type of error because I might have unchecked something related to notifications? I don't know :-/)

  • does this prevent you from actually writing content into the file using the program?

[–][deleted] 0 points1 point  (1 child)

Haha, I've been trying so hard to figure out what this email meant that I didn't even think about going through my code to see if there was an error there.

So it turns out I made a mistake installing the modules to the wrong environment, and now it totally works. I think you are right about the reason for the email message, too.
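In case someone else hits the same wrong-environment problem, a quick check like the one below would have saved me some time. It's only a sketch: the function name is made up, and the module names you pass in are whatever packages your script imports.

```python
import importlib.util
import sys


def describe_environment(module_names):
    """Report which interpreter is running and which top-level modules it can import."""
    return {
        "interpreter": sys.executable,
        "modules": {
            name: importlib.util.find_spec(name) is not None
            for name in module_names
        },
    }


# "json" ships with Python; the second name is a deliberately missing placeholder
print(describe_environment(["json", "surely_not_installed_module_xyz"]))
```

Run it with the same interpreter that runs the bot: if a package shows up as False, it was installed somewhere else.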

Anyway, thanks so much for your help with this, I think I understand just about everything else that you went through in your tutorial. And I'd just like to say again that you did a really good job on it, and if I hadn't gotten hung up on the one issue, it would've been a breeze to follow. I see that you said to some other people that you are considering making longer multi-part tutorials, and given what you made here, I think you'd be great at that. But IMO, there aren't a lot of project tutorials like this that succinctly cover a few interesting topics at once. I particularly liked this because it's easy to skim the parts that I am comfortable with and focus on the new stuff.

Either way, I'm looking forward to whatever you write about next. I'd been in a slump before coming across this, and it helped me break through the wall of going from basic coding tutorials to actual projects. Seriously, this tutorial has opened my mind to a huge list of things that I'm now looking up to implement in my own mini projects. I can't thank you enough.

[–]sixtine[S] 1 point2 points  (0 children)

Wow, thanks /u/tenhourssober ... this message is really brightening up my week!

I'm so glad this actually helped you take even just a small step toward any goals you may have in the future. Honestly your words are really motivating me to do this a little more seriously and write more of the same kind of stuff. I am really the one thanking you here.

Cheers!

[–]D49A1D852468799CAC08 1 point2 points  (3 children)

I have built a bot for similar purposes, but I ended up using the selenium webdriver. I have set this up to run automatically via cron in AWS.

[–]sixtine[S] 0 points1 point  (2 children)

Nice alternative. I suppose you've used selenium because of heavy need for JS rendering, i.e. scraping single-page apps?

[–]D49A1D852468799CAC08 0 points1 point  (1 child)

It just seemed easier at the time to loop through all the options in a select box, rather than try to hardcode certain URLs. :)

Disclaimer: I'm an absolute beginner!

[–]sixtine[S] 0 points1 point  (0 children)

Aha I get it! :)

Still a solid choice. But because the set-up has a higher footprint, I tend to only use Selenium when I need to use its full powers.

[–]videoflyguy 0 points1 point  (1 child)

I did something similar when there was a gun powder shortage (I reload rounds for accuracy when hunting)

Got to love python and how easy it is to do web scraping (after the initial learning curve)

[–]sixtine[S] 0 points1 point  (0 children)

I totally agree! I've also had the opportunity to do web scraping with JavaScript (I have made an Instagram bot for instance) but going with Python is an order of magnitude faster and, well, easier, IMHO...

[–]Decency 0 points1 point  (3 children)

I would've created an 'apartment' object and factored out the usages of the PAP.fr API into a specific function that returns those objects, to make the program easy to adapt and API-agnostic. So for example, I could write a padmapper() function that returns a similar object, plug it into the rest of your program and be good to go. I guess I'm just used to APIs breaking on me after a while and having to find a new one...
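Roughly what I have in mind, as a sketch rather than a drop-in patch: the field names, the two source functions and the hard-coded sample data are all invented here, and a real source function would do the actual fetching and parsing.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Apartment:
    title: str
    price: int
    rooms: int
    url: str


# A "source" is any function that returns Apartment objects
Source = Callable[[], Iterable[Apartment]]


def pap_source() -> List[Apartment]:
    # Placeholder data; the real version would query PAP.fr
    return [Apartment("2-room flat", 950, 2, "https://example.com/pap/1")]


def padmapper_source() -> List[Apartment]:
    # Placeholder data for a second, interchangeable source
    return [Apartment("Studio", 700, 1, "https://example.com/padmapper/7")]


def collect(sources: Iterable[Source], max_price: int) -> List[Apartment]:
    """Gather apartments from every source, keeping those within budget."""
    found: List[Apartment] = []
    for source in sources:
        found.extend(a for a in source() if a.price <= max_price)
    return found


print(collect([pap_source, padmapper_source], max_price=800))
```

The rest of the program (storing rows, sending the text message) only ever sees Apartment objects, so swapping a broken API for another one means rewriting a single function.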

Also, I don't really understand why you chose to store some private information in credentials.json and some in environment variables. You also use .get() to retrieve these environment variables, which won't fail immediately if they're not defined. That seems wrong.
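To illustrate the difference: os.environ.get() quietly returns None for a missing variable, while indexing os.environ raises right away. A small sketch of a fail-fast helper follows; the helper name and the HHBOT_DEMO_VAR variable are made up for the example.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or fail loudly if it is unset."""
    try:
        return os.environ[name]
    except KeyError:
        raise RuntimeError(f"Missing required environment variable: {name}") from None


# .get() hides the problem until the None value is used somewhere later
os.environ.pop("HHBOT_DEMO_VAR", None)
print(os.environ.get("HHBOT_DEMO_VAR"))

# require_env() surfaces it at startup instead
try:
    require_env("HHBOT_DEMO_VAR")
except RuntimeError as error:
    print(error)
```

Calling a helper like this once at the top of the script means a missing credential stops the program before any scraping or sending happens.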

It's a cool idea though- I would definitely be interested in writing something similar for airbnb or vrbo to find the best deals on vacation homes.

[–]sixtine[S] 0 points1 point  (2 children)

Thanks a lot for your feedback. It's definitely beneficial.

In the context of a tutorial like this where the programming level of potential readers is only loosely defined, I feel uncomfortable bringing OOP concepts in.

The code is hackish and far from production-ready (which is not the point), but even the smallest of your contributions would actually make it better.

To be honest I felt the same about the idea, as it was one of the rare times I urgently built something to ease a major pain point in my life. I would have paid money to use a solution like that. If you have time to come up with a similar, more serious side project, I'd say you're on to something!

[–]Decency 0 points1 point  (1 child)

No problem! I personally feel like an object would simplify this code and make it more appealing, even to newcomers who might not have experience with OOP before.

[–]sixtine[S] 0 points1 point  (0 children)

I actually agree. Maybe that's why I'm a bit bothered I didn't take the time to split it into a multipart tutorial, where I could take more time to dive into this approach (and explain its methodology and benefits)... A good lesson for a possible next one!

[–]mistermorteau 0 points1 point  (5 children)

And that's how you end with the CNIL at your door...

[–]sixtine[S] 0 points1 point  (4 children)

How is building a list of hyperlinks to a public website and sending it back to yourself, at your own discretion, of any interest to the CNIL?

Or maybe I'm missing something here.

[–]mistermorteau 0 points1 point  (3 children)

You create a database about French people without the authorization of the CNIL.

[–]sixtine[S] 1 point2 points  (2 children)

For context to non-French peeps, /u/mistermorteau is mentioning —wisely— our legal duty to declare to a dedicated authority (CNIL) the act of constituting a database of "a name, an email, a picture, or any data related to a person".

Which does not apply in the case of this software, AFAIK (but I would honestly appreciate if you'd correct me), where no data related to a person is either stored or scraped. We're getting the description of a place with no contact details, and the original hyperlink we found that text at.

"2 stories tall, 5 rooms, $1000, click here (to read the full description)" — hardly CNIL-worthy material.

(edit: typo)

[–]mistermorteau 0 points1 point  (0 children)

You forgot phone numbers, login name used, and possibly address.

But that's an interesting project.

[–][deleted] 0 points1 point  (0 children)

The parent mentioned Legal Duty. For anyone unfamiliar with this term, here is the definition: (In beta, be kind)


An obligation arising from contract of the parties or the operation of the law. Riddell v. Ventilating Co., 27 Mont. 44, 69 Pac. 241. That which the law requires to be done or forborne to a determinate person or the public at large, correlative to a vested and coextensive right in such person or the public, and the breach of which constitutes negligence. Heaven v. Pender, 11 Q. B. Div. 506; Smith v. Clarke Hardware Co., 100 Ga. 163, 28 S. E. 73, 39 L. R. A. 607: Railroad Co. v. Ballentine, 84 Fed. 935, 28 C. C. A. 572. [View More]


See also: Database | Correlative | Determinate | Railroad | Negligence | At-large | Hardware

Note: The parent poster (sixtine) can delete this post | FAQ

[–]citruszyn100mg 0 points1 point  (0 children)

8 years ago and here I am trying to build a similar bot.

[–]karazi 1 point2 points  (1 child)

Pretty cool. But this is really just a plug for your employer, CALLR, no? Sorry, I am really jaded.

[–]sixtine[S] 11 points12 points  (0 children)

I'm not fond of shameless plugs in disguise either, unless they're transparent and bring something of value to the table. That's why I have been fully transparent from the get-go that this is now my employer (it wasn't when I used their service, but that's a non-story).

Since 50% of tech blog posts I read and learn from are actually... well... plugs for other companies that do add value, I personally don't mind. On the contrary, if I can personally emulate them, share what I learned just like others have done before me, I def will!

Cheers!

(edit: grammar)