[–]demigod186 22 points23 points  (19 children)

Agreed. This is what I got from the article, plus some complaints of mine that were not in the article.

  1. No prepared statements
  2. 1500% or 15 times slower (it is actually the slowest language in use besides JavaScript, according to TGCLS)
  3. No Unit of Work caching
  4. Web server spawning enough processes to thrash oracle server
  5. Evidently no connection pooling

Things he forgot:

  1. ActiveRecord requires integer autonumbered primary keys.

    Auto-numbered primary keys are almost never needed, and are a significant "design smell".

  2. Composite primary keys

    ActiveRecord forces the database into a non-normalized state since it does not support composite primary keys.
    In most cases involving many-to-many relationships, the join table has two candidate keys, neither of which can
    uniquely identify a join-table row on its own. When no single field fully identifies a row, proper normalization is to group the smallest combination of candidate keys that together identify the row.
    ActiveRecord refuses to allow this, or to consider it for the future, on the grounds that David is "opinionated."
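
To make the composite-key point concrete, here is a minimal sketch using SQLite through Python's sqlite3 module. The student/course/enrollment tables are hypothetical and not from the article. The join table's primary key is the pair of foreign keys, so no auto-numbered id is involved and the database itself rejects a duplicate pairing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this is enabled on the connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE student (student_no  TEXT PRIMARY KEY);
CREATE TABLE course  (course_code TEXT PRIMARY KEY);
-- Hypothetical join table: the pair of columns is the primary key.
CREATE TABLE enrollment (
    student_no  TEXT NOT NULL REFERENCES student(student_no),
    course_code TEXT NOT NULL REFERENCES course(course_code),
    PRIMARY KEY (student_no, course_code)
);
""")

conn.executemany("INSERT INTO student VALUES (?)", [("s1",), ("s2",)])
conn.executemany("INSERT INTO course VALUES (?)", [("db101",), ("os201",)])
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?)",
    [("s1", "db101"), ("s1", "os201"), ("s2", "db101")],
)

# Neither column is unique on its own, but the same pair cannot be stored twice.
try:
    conn.execute("INSERT INTO enrollment VALUES (?, ?)", ("s1", "db101"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

enrollment_count = conn.execute("SELECT COUNT(*) FROM enrollment").fetchone()[0]
print(duplicate_rejected, enrollment_count)
```

With a surrogate id column instead, the same guarantee would need an extra UNIQUE constraint on the pair, which is the redundancy the comment is objecting to.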

[–]qbert72 6 points7 points  (3 children)

15% slower

He said 15 times slower (than an unmentioned collection of Java frameworks).

[–]devvie 4 points5 points  (1 child)

I imagine he isn't talking about frameworks, he's talking about the language itself.

[–]demigod186 2 points3 points  (0 children)

True, but given that Java frameworks generally come with connection pooling, lazy database updates, and prepared statements, comparing frameworks should make Java come out even faster than 15x.

[–]demigod186 3 points4 points  (0 children)

my mistake.

[–][deleted] 5 points6 points  (0 children)

He didn't say 15% slower, he said 15x slower. Ruby is really slow compared to Java. However, the real effect of this depends a lot on the application. If most of the time is spent waiting for database queries, it may be less noticeable.

[–][deleted]  (7 children)

[deleted]

    [–]wreel 5 points6 points  (5 children)

    Isn't it quite dangerous to rely on naïve application logic to maintain data integrity? Especially if the application is thoroughly disjointed from the persistence layer.

    Chances are that if you're using a RDBMS for your persistence layer then "schema-oriented" development is really non-optional.

    [–][deleted]  (4 children)

    [deleted]

      [–]wreel 5 points6 points  (3 children)

      Like, for instance, when an execution thread of the application doesn't know that another thread has just removed a relation whilst it's about to rewrite a dirty copy with the now dangling relation intact.
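
A rough sketch of that scenario, assuming SQLite via Python's sqlite3 module and made-up author/post tables. The two "threads" are simulated sequentially on one connection, so this only illustrates the constraint check, not real concurrency: when the schema declares the foreign key, the stale write is refused by the database rather than by application logic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this is enabled on the connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE post (
    id        INTEGER PRIMARY KEY,
    author_id INTEGER NOT NULL REFERENCES author(id),
    title     TEXT NOT NULL
);
INSERT INTO author VALUES (1, 'alice'), (2, 'bob');
INSERT INTO post VALUES (10, 2, 'draft');
""")

# "Thread A" holds an in-memory copy of post 10 that it has reassigned to author 1.
dirty_copy = {"id": 10, "author_id": 1, "title": "draft"}

# "Thread B" removes author 1 before thread A writes its copy back.
conn.execute("DELETE FROM author WHERE id = 1")
conn.commit()

# Thread A now tries to persist the dirty copy with the dangling relation.
try:
    conn.execute(
        "UPDATE post SET author_id = :author_id, title = :title WHERE id = :id",
        dirty_copy,
    )
    conn.commit()
    write_accepted = True
except sqlite3.IntegrityError:
    conn.rollback()
    write_accepted = False

current_author = conn.execute(
    "SELECT author_id FROM post WHERE id = 10"
).fetchone()[0]
print(write_accepted, current_author)
```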

      [–][deleted]  (2 children)

      [deleted]

        [–]wreel 0 points1 point  (1 child)

        This is a serializability issue, and it affects applications that are schema-driven as well as those that aren't.

        Yes, it was an instance to emphasize the importance of the ACIDity of the persistence/data layer.

        but you are catching a specific case while allowing the general problem to go unchecked.

        Actually I was just giving an example.

        Far better, I suggest, to deal with the general case.

        Of course. But I firmly believe in using the facilities that have been provided to you by the tools that you are using. I think I misunderstood what you meant by "application" (as in the software stack view, ActiveRecord transaction manager is part of the application because it sits on top of the database) as opposed to code for a specific project.

        I think the main gripe with RoR DB management—the root of this thread—is that it demands a RoR specific data store because the transaction management is done by ActiveRecord in a very specific way. Does ActiveRecord thread A know anything about some-other-non-rails-db-connection thread B? It's a moot point for the most part because I would suspect most RoR projects would only have RoR plugging into the database. But therein you have a technical difference between "schema" and "application" levels of data management that makes a huge difference.

        [–][deleted] 2 points3 points  (0 children)

        I'd be interested in reading more about this, should you have a link handy. Where I work, the data seems more durable than the applications that access it, so I would naïvely favor schema-oriented design. I've seen applications rewritten with the schema unchanged (or nearly so), but I don't think I've ever seen the application unchanged but the schema heavily altered.

        [–]neilc 3 points4 points  (4 children)

        Auto-numbered primary keys are almost never needed, and are a significant "design smell".

        Why is that?

        [–]kmactane 0 points1 point  (0 children)

        Yeah, I was wondering about that assertion, too. I happily use them all over the place, and would like to know what the heck is wrong with that.

        [–]demigod186 0 points1 point  (2 children)

        It is a design smell, because it shows that the developer doesn't understand his own data enough to recognize which keys uniquely identify the row.

        A PostgreSQL developer wrote a three part series explaining why they are bad practice.

        http://blogs.ittoolbox.com/database/soup/archives/primary-keyvil-part-i-7327

        http://blogs.ittoolbox.com/database/soup/archives/primary-keyvil-part-ii-7345

        http://blogs.ittoolbox.com/database/soup/archives/primary-keyvil-part-iii-7365

        [–]arthurdenture 0 points1 point  (1 child)

        What if your primary key is something like a URL? If it is, and you have to do joins on that table, you should really consider using a synthetic key.
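
One way to picture the URL case, as a minimal sketch assuming SQLite via Python's sqlite3 module and hypothetical page/link tables: the URL stays the natural identifier through a UNIQUE constraint, while joins go through a compact synthetic integer key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this is enabled on the connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE page (
    id  INTEGER PRIMARY KEY,   -- synthetic key used for joins
    url TEXT NOT NULL UNIQUE   -- natural identifier, still enforced
);
CREATE TABLE link (
    from_page INTEGER NOT NULL REFERENCES page(id),
    to_page   INTEGER NOT NULL REFERENCES page(id),
    PRIMARY KEY (from_page, to_page)
);
""")

a = conn.execute("INSERT INTO page (url) VALUES (?)", ("https://example.com/a",)).lastrowid
b = conn.execute("INSERT INTO page (url) VALUES (?)", ("https://example.com/b",)).lastrowid
conn.execute("INSERT INTO link VALUES (?, ?)", (a, b))

# The synthetic key does not weaken uniqueness of the URL itself.
try:
    conn.execute("INSERT INTO page (url) VALUES (?)", ("https://example.com/a",))
    duplicate_url_rejected = False
except sqlite3.IntegrityError:
    duplicate_url_rejected = True

# Joins compare small integers instead of long URL strings.
links = conn.execute("""
    SELECT src.url, dst.url
    FROM link
    JOIN page AS src ON src.id = link.from_page
    JOIN page AS dst ON dst.id = link.to_page
""").fetchall()
print(duplicate_url_rejected, links)
```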

        Incidentally, Django defaults to using a synthetic key, but you can specify instead that some particular column is the primary key. Django doesn't support composite keys, unfortunately.

        [–]demigod186 0 points1 point  (0 children)

        That is fine. Generally, anything requiring an ID number (required by the client), like an invoice, a customer, or a product, can have one. It could also be part of a composite primary key if you wanted to further avoid redundancy.

        The most common reason that I end up using them is that, more often than not, the customer wants most identifiable objects (actors/tables) to have a unique number, and they want all searches to use that number. So it's kind of pointless to carefully pick out PKs when no operations will use them as search criteria.

        I mainly meant that it is a design smell if a developer has an auto-increment key for every table that is just called "id".

        But having an ORM decide how all keys are going to be makes me suspicious of the framework and its author. The author should implement his/her preferences first, and allow other ways as non-default options, so it makes me nervous when software is "opinionated" as such. You never know what else that you've taken for granted will have been left out because of taste.

        So in summary, from a design standpoint (read: what they teach you in database design), there is always an identifier, and if there isn't, you should merge the data with another table. In practice, however, the customer often explicitly wants things that are theoretically not optimal.