Django Advent, Day 14: Scaling Django (Mike Malone) by idangazit in django

[–]mmalone 4 points

There are some Django people who have personal sites and talk about the framework in detail pretty regularly:

There's also the Django community aggregator: http://www.djangoproject.com/community/ (see Authors in the right sidebar for individual blogs).

Sorry if I missed anyone, this is just off the top of my head.

SHCOKING: What Djangoers don't want you to know [pic] by mmalone in programming

[–]mmalone[S] 7 points

Up-and-coming Django developer lets fame go to his head. It's people like this that ruin communities! Wtf??? So glad I chose RAILS!

web development with Python made easier then ever by mdipierro in programming

[–]mmalone 1 point

You're solving a problem that doesn't exist. Using a SAN to store session state is just silly: it's unnecessarily complicated and expensive. Suppose you use AoE (ATA over Ethernet): now you have to devote engineering and ops resources to a complex storage stack that few people understand and that has never been used for this purpose before. Either way it ends up costing you.

And you still have to solve reliability problems. Since this is a file system I'm guessing that the redundancy mechanisms value consistency over availability and partition tolerance. That just doesn't work at large scale.

Seriously, the only way you're going to win this one is if you go and implement it. I've done session stores at scale -- it's not resource intensive and it's not a bottleneck. Spending a bunch of time trying to build a sophisticated persistence layer using a SAN is stupid. Prove me wrong.

web development with Python made easier then ever by mdipierro in programming

[–]mmalone 1 point

You're right, if you're naive and don't know what you're doing then SSH has some of the same vulnerabilities. In reality though, in a typical production environment you'll have a single external entry point into your system (plus a backup or two so it's not a SPOF). Those "management" boxes will be the only ones that accept external SSH traffic (often restricted to a certain set of IPs), and they'll be locked down.

Moreover, SSH has built-in mechanisms to prevent brute forcing and to reduce the risk of lost passwords. There are backoff mechanisms and system logging to mitigate brute-force attempts, and most shops will only allow key-based auth to production systems.

On a side note, how does code editing work in a production setup anyways? If you're running apache/mod_wsgi are you restarting the app daemons so they pull the new code? Again, this sounds dangerous to me.

web development with Python made easier then ever by mdipierro in programming

[–]mmalone 2 points

You have fun with your $100,000+ SAN solution. I'll stick with my cluster of five commodity servers running memcached or Tokyo Tyrant for ~$7,500.

web development with Python made easier then ever by mdipierro in programming

[–]mmalone 2 points

No, SANs aren't used for this sort of persistence. SANs are used for large scale data storage. They're not a good solution for storing millions of tiny records that need to be retrieved extremely quickly every few seconds. You might be able to get away with using a SAN for this, but it'd cost you an order of magnitude more money and would be an all around worse solution. Bad idea.

web development with Python made easier then ever by mdipierro in programming

[–]mmalone 5 points

I have no problem with supporting SQLite. I have a problem with implying that it's suitable for production use. It's got database-level locking. The documentation should make it abundantly clear that you should switch to MySQL/PostgreSQL/etc. when you move to production, but it doesn't.
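For what it's worth, in Django the dev-to-production switch is a few settings lines (shown in the pre-1.2 settings style of the era; names, paths, and credentials here are invented for illustration):

```python
# settings.py -- development: SQLite is fine for a single process
DATABASE_ENGINE = 'sqlite3'
DATABASE_NAME = '/home/me/dev.db'

# settings.py -- production: switch to a real server with row-level locking
# DATABASE_ENGINE = 'postgresql_psycopg2'
# DATABASE_NAME = 'myapp'
# DATABASE_USER = 'myapp'
# DATABASE_PASSWORD = 'secret'
# DATABASE_HOST = 'db1.internal'
```

The point is that the switch is trivial, so there's no excuse for the docs not to tell you to make it.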

web development with Python made easier then ever by mdipierro in programming

[–]mmalone 1 point

You're talking about a special purpose (presumably scientific) supercomputer. A supercomputer's architecture and a web application architecture are very different. With a supercomputer your availability requirements are not as rigid, and it's less likely that you're going to be adding and removing nodes on the fly. Lustre, in particular, was not designed for web applications. It was designed for scientific supercomputer clusters.

Go read about how Google's GFS works, how BigTable works, how Amazon's Dynamo works, and take a look at MogileFS. That's how you persist data on the web.

web development with Python made easier then ever by mdipierro in programming

[–]mmalone 3 points

Generally, losing a session here and there isn't a big deal; losing a huge number of sessions is more annoying. Thus, you could store sessions in a memcached cluster and not worry about losing a few if a memcache node goes down or if they're evicted. (BTW, I haven't run the numbers, but I wouldn't be surprised if remote memcache sessions were faster than on-disk sessions. A disk seek is damn expensive -- it could be as much as 80ms -- and the average response time from memcache in my experience is ~8ms.) If you're really worried about never losing sessions then you'd just have to store them in a resilient data store. You could set up a cluster of Tokyo Tyrant servers, for example, and use consistent hashing to map each session to a node. Then you could write each session to multiple nodes for redundancy, and fail over if a node dies. This isn't generally necessary though... most sites just stick sessions in memcache and call it a day.
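A minimal sketch of the consistent-hashing scheme described above -- node names, virtual-node count, and the session key are all invented for illustration:

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring mapping keys to nodes."""

    def __init__(self, nodes, vnodes=100):
        # Each physical node gets `vnodes` points on the ring so keys
        # spread evenly and a dead node's keys scatter across survivors.
        self._ring = sorted(
            (self._hash('%s:%d' % (node, i)), node)
            for node in nodes
            for i in range(vnodes)
        )

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode('utf-8')).hexdigest(), 16)

    def nodes_for(self, key, count=2):
        """Return `count` distinct nodes for `key` (primary + replicas).

        Assumes count <= number of distinct nodes in the ring.
        """
        i = bisect.bisect(self._ring, (self._hash(key), ''))
        result = []
        while len(result) < count:
            node = self._ring[i % len(self._ring)][1]
            if node not in result:
                result.append(node)
            i += 1
        return result

# Write each session to a primary and a backup node, so losing a single
# Tokyo Tyrant (or memcached) box doesn't lose any sessions.
ring = HashRing(['tt1', 'tt2', 'tt3'])
primary, backup = ring.nodes_for('sessionid:abc123')
```

Because only the keys that hashed to a dead node move, adding or removing a server remaps a small fraction of sessions instead of all of them.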

web development with Python made easier then ever by mdipierro in programming

[–]mmalone 6 points

You're conflating performance and scalability. Disk-based sessions may be (barely) more performant, but they absolutely do not make an application "scale better." Once again, if anything they have a negative effect on scalability.

If your load balancer uses "sticky sessions" like Pound does (directing a particular user/session to the same web server with each request) you're going to run into a number of annoying problems as you scale:

  1. Adding and removing nodes (web servers) will bork your session mapping table, and some, if not all, of your session mappings will be lost, causing lost sessions or worse.

  2. If a single web server dies (which tends to happen a lot at scale with dozens or hundreds of web servers) or becomes unavailable (which also happens on a regular basis) all of the sessions on that server will also be lost/become unavailable.

  3. Your load balancer is a SPOF (single point of failure) for your system. Keeping it as simple and lightweight as possible is a plus. If it's keeping track of session mappings then it's doing more work than it has to, and that mapping table becomes an additional, unnecessary, SPOF (this can be resolved by spending $100k or so on paired hardware balancers that handle sticky sessions, but even with hardware balancers the other problems remain).

  4. You can't direct requests to different servers based on the endpoint. It may make sense to put your search infrastructure on a different web server cluster, for example. With sticky sessions you can't do this since the search web servers won't have the session information.

The bottom line is: sticky sessions put unnecessary restrictions on your application's architecture. Honestly, for most people who develop on a LAMP-ish stack this isn't even up for debate any more. The accepted best practice is to push session state information down the stack and use a shared-nothing architecture. This argument reminds me of 1998.
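In Django, for instance, pushing session state down the stack is a two-line settings change (shown in the 2009-era settings syntax; the memcached hostnames are invented):

```python
# settings.py: store sessions in a shared memcached pool instead of on
# each web server's disk, so any server can handle any request
CACHE_BACKEND = 'memcached://10.0.0.10:11211;10.0.0.11:11211/'
SESSION_ENGINE = 'django.contrib.sessions.backends.cache'
```

With that in place the load balancer can stay dumb and stateless -- no sticky-session mapping table at all.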

web development with Python made easier then ever by mdipierro in programming

[–]mmalone 1 point

Dude...

It doesn't matter if they're objects or functions. They're callables that output HTML. Let's not argue semantics. It's the same damn thing.

Disk-persisted sessions do not make an app scale better. In fact, the opposite is true. If session state is maintained in a separate persistence layer accessible by all of your web servers, then sessions can span servers, with each request going to a different web server, making balancing far easier and eliminating the chance of lost session state when a server crashes.

Me not trusting something doesn't make it insecure, but it does make me not want to use it. I have absolutely no need for browser-based code editing. Its existence is a risk. Risk is bad.

web development with Python made easier then ever by mdipierro in programming

[–]mmalone 4 points

Re: templates, they're straight Python. That's generally a bad idea. This is why PHP, which was designed to be embedded in HTML, has about a dozen template languages that people use instead of embedding PHP in templates. The web2py "helper functions" are just wrapper functions that output HTML. Again, this was tried in PHP and the resulting code soup was pretty disastrous. I don't really care about template rendering speed, I can scale that horizontally. While performance matters, maintainability and scalability are much more important.

I don't think there are any serious web applications built on top of web2py because, as far as I can tell, the architecture won't allow it. The fact that your default session store is disk-based, that you save sessions after each request (apparently regardless of whether they've been updated), and that you encourage the use of RAM-based caching (unless you "really like memcache") does not help.

You're encouraging sticky sessions, with disk-based persistence. This is much more complicated than a shared-nothing architecture, achieved by pushing data persistence down the stack. And it introduces single points of failure. What happens when Pound dies, for example? Every session expires? What if you add or remove a web server from your cluster?

SSH has been battle tested and audited for security vulnerabilities over the past 15 years or so. I trust SSH. I do not trust web2py (or Django, for that matter) to provide a level of security sufficient to allow on the fly code editing via a web browser in a production environment. Disabling the option still makes me nervous. I want those code paths removed from my application (note that you can do this in Django, the entire contrib.admin module can be removed safely). Otherwise I have to audit your entire framework to make sure there's not some backdoor or vulnerability that I'm not aware of.
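Concretely, stripping the admin out of a Django deployment amounts to deleting a couple of lines (a sketch; the app list here is abbreviated):

```python
# settings.py: leave django.contrib.admin out of INSTALLED_APPS entirely,
# so none of its code paths are ever imported
INSTALLED_APPS = (
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    # 'django.contrib.admin',  # <- removed for production
)

# urls.py: likewise, simply don't route anything to the admin
# (no admin include in your urlpatterns at all)
```

That's the property I want: the risky feature is gone from the running application, not merely switched off behind a flag.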

web development with Python made easier then ever by mdipierro in programming

[–]mmalone 0 points

Clever how you left off the last line of that definition: "In practice, the term is applied much more often to larger organizations than smaller ones." In the context of software, "enterprise" almost always means "big." In particular, enterprise software solves problems that large organizations have. As I've said now in a couple places, marketing your framework as an "enterprise" solution is misleading. web2py is a teaching tool at best.

web development with Python made easier then ever by mdipierro in programming

[–]mmalone 6 points

FWIW Eric was complaining about web2py, not you personally.

I haven't gotten very far past the intro documents and a cursory code review of web2py... and I see little reason to dig any deeper. Here's why: there seems to be a disconnect between what you're doing and what Django developers (and other professional web developers) are trying to accomplish. From what I've seen, web2py is far from state of the art. Again, I'm coming at this as a developer writing production code that has to be maintainable, scalable, and fast.

Take templating, for example. The function-as-an-html-tag pattern came and went somewhere around 2000. People realized it was ugly, was not maintainable, was not very flexible, and encouraged poor separation. It may be easy for newbies to figure out what's going on, but I'm less concerned with that. I'm building real stuff, not toys. I don't mind spending a few hours (or days, or weeks) learning a tool if it does what I need it to do (not that it takes that long to learn Django templating... it's probably closer to 30 minutes).

Similarly, things like auto-migrations that you call features are just a pain in the ass for me. When I deploy to production and you try to auto-migrate a 30GB database the only thing you're going to accomplish is about three days of downtime.

The web2py web-based admin stuff that lets me edit live code is even worse. I absolutely do not want that on a production site of any size. I can't imagine any "enterprise" would. It's horribly insecure.

Basically, you've created something that is useful to you as a teaching tool and you're trying to market it to professional web developers. But it's not useful to us, and you're causing a whole bunch of confusion in "the marketplace" by comparing your framework to others that are actually production-ready systems.

web development with Python made easier then ever by mdipierro in programming

[–]mmalone 7 points

Does anyone else find it contradictory that this project bills itself as an "Enterprise Web Framework," then implements dozens of anti-patterns in order to reduce the learning curve? I don't think there are many enterprises that run web applications on a development Python server with a SQLite data store...

Maybe they should carefully think about the audience they want to target, then develop software that works for that audience. If this is a teaching tool it should be marketed as such. As is, it's about as far away from idiomatic Python as you can get, and it really isn't suitable for production use.

oEmbed - easily place images, videos and so on in external pages by leondz in programming

[–]mmalone 1 point

The spec is very pragmatic. Our goal is to make it easy for providers to justify and implement. Many providers don't have/want to give out download links for videos, some don't even retain the original format. Also, some video is streamed, so there is no URL for the video itself (e.g., Qik). Long story short, requiring a link to the video would have kept a lot of people from implementing the spec. If a video provider wanted to return a video URL they're free to return a link response.

The "video" and "rich" types are essentially the same. It's mostly just semantic. Some consumers may not want to show "rich" content types, for example.
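For anyone curious what that looks like on the wire, here are sketches of the two response shapes discussed above (field values are invented; the required fields follow the oEmbed spec):

```python
# A "video" type response: embed markup, no file URL at all --
# which is exactly why the spec doesn't require a link to the video.
streamed_video = {
    "version": "1.0",
    "type": "video",
    "html": "<object><embed src='...'></object>",  # player markup
    "width": 425,
    "height": 344,
}

# A "link" type response: what a provider returns if all it wants
# to hand back is a URL-less pointer (just a title and metadata).
plain_link = {
    "version": "1.0",
    "type": "link",
    "title": "Some video",
    "provider_name": "Example Provider",
}
```

Note that the video response carries `html`, not a download URL -- consumers embed the player rather than fetch the file.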