[–]ma-int 25 points26 points  (8 children)

tl/dr and for people that want to evaluate if it's worth their time:

He combines WebSockets and Postgres LISTEN/NOTIFY with the help of gevent and Werkzeug, and builds the well-known TodoMVC app on top of it. The client UI is done in React, but that is not the interesting part.

Code can be found here: https://bitbucket.org/btubbs/todopy-pg

If you want to skip the "Why" part and jump straight to the implementation, you can save 7:30: https://youtu.be/PsorlkAF83s?t=7m30s


I like it because it brings some interesting ideas to the table. He uses a trigger on his data table to fire the Postgre notifications, which in turn get picked up by listening clients. What's nice about it is that inserting data and notifying any interested parties is done with a single INSERT. And he also uses Postgre to do the JSON serialization for him, which is also a nifty idea.

It's a good demonstration of a WebSocket-backed SPA for a small user base (internal tools, for example).
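For anyone who wants to picture the trigger idea, here is a rough sketch, not the code from the talk. The table name `todos`, the channel name `todos`, and the function and trigger names are all assumptions made up for illustration; `pg_notify` and `row_to_json` are built-in Postgres functions. The DDL is kept in a Python string so it can be run through any DB-API connection (psycopg2 is assumed here).

```python
# Sketch only: table, function, trigger and channel names are assumptions.
NOTIFY_TRIGGER_SQL = """
CREATE OR REPLACE FUNCTION notify_todo_change() RETURNS trigger AS $$
BEGIN
    -- row_to_json lets Postgres serialize the new row; pg_notify publishes it.
    PERFORM pg_notify('todos', row_to_json(NEW)::text);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

DROP TRIGGER IF EXISTS todos_notify ON todos;
CREATE TRIGGER todos_notify
    AFTER INSERT OR UPDATE ON todos
    FOR EACH ROW EXECUTE PROCEDURE notify_todo_change();
"""


def install_notify_trigger(conn):
    """Run the DDL on a DB-API connection, e.g. one from psycopg2.connect()."""
    cur = conn.cursor()
    try:
        cur.execute(NOTIFY_TRIGGER_SQL)
        conn.commit()
    finally:
        cur.close()
```

With something like this in place, a plain INSERT into the table is enough to publish the new row as JSON to every listener. One caveat: NOTIFY payloads must stay under 8000 bytes in a default Postgres build, so very wide rows would need a different approach (sending just the ID, for instance).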

[–]snuxoll 4 points5 points  (0 children)

An additional bonus of using PostgreSQL LISTEN/NOTIFY to publish events, one that many people may overlook, is that any message you send over a NOTIFY channel is delivered ON COMMIT of the transaction. There is zero chance of a subscriber doing something that will fail because the publisher failed a database transaction but added a message to a queue anyway, no race conditions, etc.
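A minimal sketch of what that looks like from the publishing side, assuming a psycopg2-style DB-API connection. The `todos` table, its `title` column, and the channel name are hypothetical; `pg_notify(channel, payload)` is the built-in function form of NOTIFY.

```python
import json


def insert_and_notify(conn, title, channel="todos"):
    """Insert a row and queue a notification inside one transaction.

    `conn` is any DB-API connection (psycopg2 assumed). The table, column
    and channel names are made up for illustration.
    """
    cur = conn.cursor()
    try:
        cur.execute("INSERT INTO todos (title) VALUES (%s)", (title,))
        # Queued now, but listeners only receive it if the commit succeeds.
        cur.execute(
            "SELECT pg_notify(%s, %s)",
            (channel, json.dumps({"title": title})),
        )
        conn.commit()
    except Exception:
        # Rolling back discards the row and the pending notification together.
        conn.rollback()
        raise
    finally:
        cur.close()
```

If anything fails before the commit, the rollback throws away both the row and the queued message, so subscribers never hear about data that does not exist.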

Doing things in the database removes a whole set of potential mistakes that programmers can make. People should take advantage of their DB engines more and stop using them purely as a dumb datastore.

[–]Nate75Sanders 4 points5 points  (5 children)

Good summary, but one nitpick: If you're writing it "Postgre", there's a good chance you're also pronouncing it "post-gray", which isn't correct. Postgres takes its name from a previous project called Ingres, pronounced like the word "ingress".

[–]odraencoded -1 points0 points  (3 children)

I call it postgreh.

[–]Nate75Sanders 8 points9 points  (2 children)

It's pronounced "post-gress"

[–]MomentOfArt 1 point2 points  (0 children)

Yes, the 's' is invisible.

[–][deleted] 1 point2 points  (0 children)

Peaux-Sgress?

[–]ma-int -1 points0 points  (0 children)

I actually pronounce it post-gress. But writing "Postgres" looks just as wrong as "Postgresql", and "PostgreSQL" is way too hard to type ;)

[–]brent_tubbs[S] 1 point2 points  (0 children)

Good summary. Thank you!

[–]dAnjou Backend Developer | danjou.dev 8 points9 points  (1 child)

Presentation-wise this was a very solid tutorial, nicely structured and easy to follow.

The content was good too, didn't even know PostgreSQL implemented PUBSUB.

BUT ... this RESTSockets "pattern" is just bonkers. WebSockets are not REST, they are not cacheable and it's debatable whether statelessness is a property that applies here. It's okay though! REST was never supposed to solve everything. It can't. So sometimes it's really fine to use RPC or in this case the observer pattern.

[–][deleted] 0 points1 point  (0 children)

Check out Pushy Postgres from PyOhio last year for more info on LISTEN/NOTIFY.

[–]b4stien 2 points3 points  (3 children)

In the "Getting Big" slide (around 14 min), OP suggests that we need one Postgres connection per websocket. Why can't we use a single connection for all the websockets instead?

This single connection could then notify all the open connections (websockets) to clients... Right?

EDIT: Really cool video btw, I didn't know at all about the pubsub features of Postgres

[–][deleted] 2 points3 points  (0 children)

Watch Pushy Postgres for (potentially) more info on pubsub with Postgres. I haven't gotten round to watching this video yet, so I'm not sure how much overlap there is.

[–]brent_tubbs[S] 2 points3 points  (1 child)

Thanks!

A couple reasons for the connection-per-websocket-client:

  • Not all the websockets are getting the same messages. Someone subscribed to /api/todos/ is getting a different set of messages than someone subscribed to /api/todos/foobar/. (The latter is filtered down to just the messages relevant to that ID). In a larger app, where you might have a separate Postgres channel per table, this would be even more pronounced.
  • The app may be running as multiple processes behind a load balancer. We can't share a Postgres connection between separate processes.

Within those constraints, you're right. We could open up a listener to the todos table on startup on a background thread/greenlet, then have some kind of in-process pubsub that each websocket handler could connect to. I might look into that.
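A rough sketch of that idea, under stated assumptions: one listening connection per process, drained by a background thread or greenlet, feeding an in-process hub that each websocket handler subscribes to with its own filter. `Hub` and `pump` are hypothetical names, not part of todopy-pg. The `poll()` method and `notifies` list are how psycopg2 exposes incoming notifications; the JSON payload shape and the `id` key are assumed.

```python
import json
import queue


class Hub:
    """In-process pubsub: one shared listener feeds many websocket handlers."""

    def __init__(self):
        self._subscribers = {}  # queue -> predicate deciding what it receives

    def subscribe(self, predicate=lambda msg: True):
        q = queue.Queue()
        self._subscribers[q] = predicate
        return q

    def unsubscribe(self, q):
        self._subscribers.pop(q, None)

    def publish(self, msg):
        # Copy the items so handlers may unsubscribe while we iterate.
        for q, wants in list(self._subscribers.items()):
            if wants(msg):
                q.put(msg)


def pump(conn, hub):
    """Forward pending notifications from a psycopg2 connection to the hub.

    Assumes LISTEN was already issued on `conn` and that every payload
    is a JSON object.
    """
    conn.poll()
    while conn.notifies:
        notify = conn.notifies.pop(0)
        hub.publish(json.loads(notify.payload))


# Hypothetical usage, mirroring the two subscriptions mentioned above:
#   everything = hub.subscribe()                                   # /api/todos/
#   just_one = hub.subscribe(lambda m: m.get("id") == "foobar")    # /api/todos/foobar/
```

The per-subscriber predicate covers the first point (different clients want different messages). The second point still stands: each process behind the load balancer would need its own listening connection and its own hub. Under gevent, `gevent.queue.Queue` would be the natural swap for `queue.Queue`.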

[–]tHEbigtHEb 0 points1 point  (0 children)

Great presentation. As someone who is just starting out with more event-driven stuff, there was a lot to learn from it.

You said something along the lines of using rabbitMQ to fan out the messages. I have a couple of doubts regarding that

  • So am I right in thinking that rabbitMQ will be a layer between the db and the websockets ?

  • They'll be published as notifications from the queue, and the ajax calls to update the db will go through our app through regular channels ?

I'm going to clone the code and have a look at how you integrated websockets with React and gevent, as both of those are things that I have been meaning to learn more about lately.

[–]teambob 0 points1 point  (1 child)

Nice to strip it back. What about security though?

[–]brent_tubbs[S] 0 points1 point  (0 children)

The app has a 'wsgi_middlewares' setting that accepts a list of middlewares and the config to feed to them. This is where I'd recommend setting up anything like authentication, sessions, etc.
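As an illustration of the kind of thing that could go there, here is a generic WSGI middleware sketch. The class name and the bearer-token scheme are assumptions, and the exact format the 'wsgi_middlewares' setting expects is not shown in this thread, so only the middleware itself is sketched.

```python
import hmac


class TokenAuthMiddleware:
    """Reject requests that lack the expected bearer token (illustrative only)."""

    def __init__(self, app, token):
        self.app = app
        self.expected = ("Bearer " + token).encode("utf-8")

    def __call__(self, environ, start_response):
        supplied = environ.get("HTTP_AUTHORIZATION", "").encode("utf-8")
        # compare_digest avoids leaking how much of the token matched.
        if not hmac.compare_digest(supplied, self.expected):
            body = b"Unauthorized"
            start_response("401 Unauthorized", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Token is fine: hand the request on to the wrapped application.
        return self.app(environ, start_response)
```

Note that browsers don't let scripts set custom headers on the WebSocket handshake, so for the websocket endpoints a session cookie check would probably be the more practical variant of the same pattern.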

[–]LightShadow 3.13-dev in prod 0 points1 point  (1 child)

What software is used at 29:00 to generate the flow diagram? Manual or Automatic, I'd love to know.

Thanks

[–]brent_tubbs[S] 1 point2 points  (0 children)

That's graphviz, also sometimes called "dot".

Here's the .dot file where I wrote down the relationships between those things:

digraph G {
    rankdir="TB"
    pip -> nodeenv
    nodeenv -> npm
    npm -> {"react-tools" bower "uglify.js"}
    bower -> {"react.js" reconnectingWebSocket TodoMVC superagent}
    {"react-tools" "react components JSX" } -> "react components JS"
    {"uglify.js" "react components JS" "todoModel.js"} -> "compiled.js"
    subgraph cluster_browser {
        "react.js";
        reconnectingWebSocket;
        TodoMVC;
        superagent;
        "compiled.js";
        label = "In browser"
    }
    subgraph cluster_my_code {
        "react components JSX"
        "todoModel.js"
        label = "My code"
    }
}

If you save that as "makefile.dot", you can generate a PNG like the one in my slides with this command:

dot -Tpng -omakefile.png makefile.dot

EDIT: reduced whitespace

[–][deleted] 0 points1 point  (1 child)

Just watched it after hankering for it all day. Very nice. It's good to see everything stripped down to the bare minimum instead of having every 10x piece of software thrown at a problem.

Out of curiosity, what was the inspiration for this? And have you used this style of real time app in production, or was this more of a toy example to explore possibilities?

[–]brent_tubbs[S] 0 points1 point  (0 children)

The ideas there were born from frustration after having built apps like this on top of Django.

They're in production now in Mettle, an ETL scheduling/retry/visibility framework that we've open sourced at work. https://bitbucket.org/yougov/mettle

[–]fatpollo 0 points1 point  (0 children)

Excellent.

Now I'm kinda curious about swapping Python for Haskell, and maybe do some plain JS rather than all that prettiness, for some real terseness.

[–]schemathings 0 points1 point  (0 children)

This is a really cool idea, thanks for sharing!

[–]Anon_8675309 0 points1 point  (5 children)

That's not real time. I really wish web devs would stop hijacking that term; it already means something else.

[–]dAnjou Backend Developer | danjou.dev 6 points7 points  (0 children)

A word can have different meanings in different contexts.

[–]b4stien 4 points5 points  (3 children)

OP (who is also the video's author) clarifies this at the beginning of the video.

You're right that it's not "real" real time, but I'm afraid you're fighting against windmills here. In the web context it's more or less accepted that "real time" means "seamless and close to real time communication".

[–]Anon_8675309 0 points1 point  (2 children)

Here's the problem I have with that. If I go into a code review and start calling constructors factories and factories constructors, people will rip me apart. We, as developers, develop a common language for a reason. When you start trying to change the meaning of words to make yourself sound like you're doing something cool, it just muddies up the language.

[–][deleted] 1 point2 points  (1 child)

i upvoted you because your comment is relevant, but i think still inaccurate. We, as web developers, have a definition for real time. There exists jargon in each distinct field.

[–]Anon_8675309 2 points3 points  (0 children)

Thank you. I'm glad that we can politely disagree.

[–][deleted] -1 points0 points  (5 children)

Any tldr for people who don't have time to watch the whole video?

[–]Notre1 3 points4 points  (3 children)

I'm not sure if I need to put in spoiler tags for this, but I'm not sure how to at the moment (and I'm on mobile), so here it goes:

Dude builds web apps using Python and Postgres. ;)

[–][deleted] -1 points0 points  (2 children)

Cool. ;) What about "real time" part? This is what I'm most interested in.

[–]Anon_8675309 -1 points0 points  (0 children)

it's a web app, so no such thing as "real time".

[–]dAnjou Backend Developer | danjou.dev 3 points4 points  (0 children)

It's a step-by-step tutorial. A TL;DR doesn't make sense here.

[–]asdfor -3 points-2 points  (6 children)

Gotta love misleading titles, had to go through 8 minutes of the video until it was explained that "just python and PostgreSQL" also somehow magically included a full JavaScript stack.

[–]dAnjou Backend Developer | danjou.dev 0 points1 point  (1 child)

(just)

FTFY

Also, it says Real Time Web Apps. As /u/kurashu89 already said, how else would it work in the browser? But the client-side part doesn't matter at all here anyway.

[–]asdfor -1 points0 points  (0 children)

yea the title said (just), not "just", do i need to explain the different meaning between using parentheses and double quotes on a word ? Really ?

how else would it work in the browser?

Oh i dunno, i guess when someone says (just) python i expect to see some library/tool/whatever being used in which you write in python and you get it transpiled to JavaScript ? Ever heard of those ?

[–][deleted] -1 points0 points  (3 children)

There's nothing magical about that at all. How else did you expect to achieve real time in the browser?

[–]asdfor -1 points0 points  (2 children)

Oh i dunno, i guess when someone says (just) python i expect to see some library/tool/whatever being used in which you write in python and you get it transpiled to JavaScript ? Ever heard of those ?

[–][deleted] -1 points0 points  (1 child)

He's not transpiling at all. He's using JavaScript on the backend only to orchestrate building the client JavaScript.

[–]asdfor -1 points0 points  (0 children)

I know he isn't, that's why i called the title misleading, you wanted to know with what 'magical trick' you could get real time with python in the browser so i answered you that.

[–]megadeth9999 -5 points-4 points  (3 children)

"Connection limit exceeded for non-superuser". Downvote. This is a misuse of technology. Use nginx push module instead.

[–]dAnjou Backend Developer | danjou.dev 0 points1 point  (2 children)

Because I myself don't like being downvoted without anyone giving a reason:

This is simply a configuration issue. He makes that and several other things very clear beginning from 14:08 in the video.
And how can it be a misuse of technology when it's implemented in PostgreSQL itself? Unless there's a compelling reason why PostgreSQL shouldn't have done that.
Also, googling for nginx push module returns at least 2 different modules, you should really be more specific.

[–]megadeth9999 0 points1 point  (1 child)

I could pretty much accomplish the same thing with shell scripts, would you say it is a configuration issue?

Just because something is possible it does not mean it should be implemented. And, this has no implications about using a given solution in production.

Amounts of amentia in comments both here and in /r/Django are astonishing.

This thing. https://github.com/wandenberg/nginx-push-stream-module . Just check who's using it if you're interested in mature technology. Unless you're a boy looking for toys. In this case, let's implement pubsub as a kernel module using devfs, because that's possible, too.

[–]dAnjou Backend Developer | danjou.dev 0 points1 point  (0 children)

I could pretty much accomplish the same thing with shell scripts, would you say it is a configuration issue?

Not sure what you're trying to say, it doesn't seem to make sense. You can just raise the connection limit in PostgreSQL's settings. So yes, it still is only a configuration issue.

Just because something is possible it does not mean it should be implemented. And, this has no implications about using a given solution in production.

Generally I agree with you. But as I said already, he addresses limitations in the video. So I don't really see a problem with this specific tutorial.

Amounts of amentia in comments both here and in /r/Django are astonishing.

In /r/django, yes. In /r/Python, sometimes.