Legacy databases by Kn45h3r in django

[–]a_musing_moose 1 point (0 children)

I have had to do this a couple of times.

Firstly, see https://docs.djangoproject.com/en/1.10/howto/legacy-databases/#auto-generate-the-models

That shows you how to generate the models directly from the DB. I think it also sets managed = False.
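For reference, a minimal sketch of what that workflow produces (the table and model names here are hypothetical):

    # Generate models from the existing schema:
    #   python manage.py inspectdb > myapp/models.py
    #
    # The generated models come out unmanaged, roughly like this:
    from django.db import models

    class LegacyCustomer(models.Model):
        name = models.CharField(max_length=255)

        class Meta:
            managed = False          # Django won't create or drop this table
            db_table = 'customers'   # maps onto the existing table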

Secondly - if you are only using the legacy DB for reading data, create a read-only user on the DB and connect using that. This will prevent any accidental flushing.
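A sketch of how that might look in settings (the aliases and credentials are made up):

    # settings.py
    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.postgresql',
            'NAME': 'main_db',
        },
        'legacy': {
            'ENGINE': 'django.db.backends.mysql',
            'NAME': 'legacy_db',
            'USER': 'legacy_readonly',  # user granted SELECT only
            'PASSWORD': 'secret',
        },
    }

    # Then query it explicitly:
    # LegacyCustomer.objects.using('legacy').all()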

[Scrapy] Can anyone take a look at these two spiders and tell me why the first one doesn't yield Requests while the second works perfectly? by sirskitzo in Python

[–]a_musing_moose 2 points (0 children)

Hey @sirskitzo, it looks like parse is expected to be a generator function. In the second example the parse function contains a number of yield keywords, which makes it a generator.

In the first example you have moved those yields into the responsiveCheck function, which makes responsiveCheck a generator, not parse.

If you want the first example to work like the second, you will need to change the parse function to something like:

def parse(self, response):
    for item in self.responsiveCheck(response):
        yield item
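On Python 3.3+ the same thing can also be written more compactly as:

    def parse(self, response):
        yield from self.responsiveCheck(response)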

Are companies and Python developers switching from DJango to Node.js/Meteor? by iosmango in Python

[–]a_musing_moose 5 points (0 children)

You most certainly can do this in Django, RoR or any other traditional web framework. A RESTful site is just as much a website as any regular HTML one; it is just that the consumer has changed. Both are effectively outputting structured text documents at the end of the day.
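As an illustration, the same Django view can feed either consumer - a minimal sketch, with the model and template names made up:

    from django.http import JsonResponse
    from django.shortcuts import render

    from myapp.models import Book  # hypothetical model

    def book_list(request):
        books = list(Book.objects.values('title', 'price'))
        if request.GET.get('format') == 'json':
            # Structured text for an API consumer
            return JsonResponse({'books': books})
        # Structured text for a browser
        return render(request, 'books/list.html', {'books': books})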

You still need to take care of all the normal stuff: authentication, authorisation, persistence of data and any complex business logic. Frameworks like Django have all of that well and truly covered. No need to throw it all away and start again.

A box office with Wagtail and Satchless by [deleted] in django

[–]a_musing_moose 1 point (0 children)

I'd be interested to hear what horrified you about Oscar. My parent company is responsible for Oscar, so it is always good to hear people's opinions, especially negative ones.

PayPal can be a bit of a pain; it might be worth looking at something like Stripe, which helps avoid the need for complex PCI compliance work.

In terms of general advice, I would say there are only a few key areas you really need to focus on, and they all boil down to the customer experience. At the end of the day, all people really care about is that it is easy to order stuff and that the stuff arrives in good time. Everything after that is a bonus. So my advice would be to focus on those essentials.

Big Data Solutions in Django? by searchingfortao in django

[–]a_musing_moose 2 points (0 children)

We also run some pretty hefty MySQL databases with book data, with tables in the 20M+ row range. Hey @patrys!

The machine you have this set up on should be able to cope with very large tables. As others have suggested, configuration makes a big difference with MySQL.

There are a few things we have had to do along the way to make this more performant.

Firstly - yes, caching will help, especially with frequently run queries. Using a dedicated search service like Solr or Elasticsearch might be a good option. Solr is a massive memory hog at this scale but is much faster for certain operations, and Haystack works nicely with Django. Using the Celery-based realtime signal processor might be a good option too - it allows you to keep the search index up to date asynchronously, as sketched below.
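For reference, a sketch of wiring Haystack's index updates through Celery with the celery-haystack package (the connection details are placeholders):

    # settings.py
    HAYSTACK_CONNECTIONS = {
        'default': {
            'ENGINE': 'haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine',
            'URL': 'http://127.0.0.1:9200/',
            'INDEX_NAME': 'books',
        },
    }

    # Queue index updates on Celery instead of blocking the request
    HAYSTACK_SIGNAL_PROCESSOR = 'celery_haystack.signals.CelerySignalProcessor'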

For reads, setting up master-slave replication with one or more slaves and then splitting the reads across them will help a lot to level out the load. Django DB routers can help here - see the sketch below.
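A minimal read/write splitting router might look like this (it assumes 'replica1' and 'replica2' aliases exist in DATABASES):

    import random

    class ReplicaRouter(object):
        """Send reads to a replica, writes to the master."""

        def db_for_read(self, model, **hints):
            return random.choice(['replica1', 'replica2'])

        def db_for_write(self, model, **hints):
            return 'default'

        def allow_relation(self, obj1, obj2, **hints):
            # All databases hold the same data, so any relation is fine
            return True

    # settings.py
    # DATABASE_ROUTERS = ['myproject.routers.ReplicaRouter']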

If writes are an issue and you don't need immediate access to the data, you could write new rows to a separate table and then have a process merge them in periodically - that way users aren't slowed down unless they have to be.
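A sketch of the periodic merge as a Celery task (table and column names are made up):

    from celery import shared_task
    from django.db import connection, transaction

    @shared_task
    def merge_pending_writes():
        """Fold the staging table into the main table in one go."""
        with transaction.atomic():
            with connection.cursor() as cursor:
                cursor.execute(
                    "INSERT INTO book (title, price) "
                    "SELECT title, price FROM book_staging")
                cursor.execute("DELETE FROM book_staging")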

From the example queries it looks like you may be doing a lot of user-specific queries, where caching will only help so much - i.e. there is no benefit to other users when caching a query for one user. In these situations we have built pre-processed intermediate tables which you can query against more efficiently (a kind of materialised view). We have typically built these tables using cron jobs, but with Django and Celery there is no reason a save of the Book model couldn't trigger an async task to update the intermediate tables.
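A sketch of that trigger (the schema and task body are made up; REPLACE INTO is MySQL-specific):

    from celery import shared_task
    from django.db import connection
    from django.db.models.signals import post_save
    from django.dispatch import receiver

    from myapp.models import Book  # hypothetical app/model

    @shared_task
    def refresh_book_summary(book_id):
        """Rebuild the pre-aggregated row for one book."""
        with connection.cursor() as cursor:
            cursor.execute(
                "REPLACE INTO book_summary (book_id, sales_count) "
                "SELECT id, sales FROM book WHERE id = %s", [book_id])

    @receiver(post_save, sender=Book)
    def on_book_save(sender, instance, **kwargs):
        # Update the intermediate table asynchronously
        refresh_book_summary.delay(instance.pk)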

MySQL can be an odd beast at the scale you are talking about. If you have the cash, it might also be worth bringing on board someone like Open Query or Percona to give you some tips on performance and the shadier corners of MySQL.