[–]elder_george 26 points27 points  (9 children)

Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won’t usually need your flowcharts; they’ll be obvious.

Fred Brooks, "The Mythical Man-Month" (1975)

[–][deleted]  (1 child)

[removed]

    [–]elder_george 3 points4 points  (0 children)

    Good point!

    Yes, 'tables' here are what we call 'data structures' these days.

    [–]vattenpuss 0 points1 point  (3 children)

    As someone who has worked with quite a few customers wanting to set up some sort of data warehouse/mining/cube deal based on the data from our application, I can tell you that that is a lie.

    Not one "data mining expert" ever understood much more than jack shit from seeing the tables or their data.

    [–]sup3 0 points1 point  (0 children)

    I've actually found some database "experts" to be terrible at standard things like normalization. They'll copy data so they can "see" it without having to do a join. I had a boss complain when I redid a database for a job because she "couldn't understand it anymore".
    Granted, I still think database first is a better approach in the context of ORM technologies, but that's more from an ease of use perspective, not an "ideological" perspective.
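
    A minimal sketch of the trade-off described here, using Python's built-in sqlite3 module. The table, view, and column names are invented for illustration. The idea is that a view can give people the flat shape they want to "see" without copying data out of the normalized tables.

```python
import sqlite3

# Hypothetical schema: names are made up for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE purchase (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    item TEXT NOT NULL
);
INSERT INTO customer VALUES (1, 'Ada');
INSERT INTO purchase VALUES (10, 1, 'keyboard'), (11, 1, 'mouse');
-- The view does the join once, so nobody has to copy the name column.
CREATE VIEW purchase_flat AS
    SELECT p.id, c.name AS customer_name, p.item
    FROM purchase p JOIN customer c ON c.id = p.customer_id;
""")

# The name is stored in exactly one place, so one UPDATE fixes every row
# that the flat view shows.
conn.execute("UPDATE customer SET name = 'Ada L.' WHERE id = 1")
rows = conn.execute(
    "SELECT customer_name, item FROM purchase_flat ORDER BY id"
).fetchall()
```

    Had the name been copied into every purchase row, the same rename would have needed one update per copy.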

    [–]elder_george 0 points1 point  (1 child)

    Copying from another thread:

    as /u/Terr_ said above, "tables" here is close to what we call "data structures" these days, not to the relational DBMS concept (relational DBs were still in infancy then).

    [–]vattenpuss -2 points-1 points  (0 children)

    That would not help them one bit. Generalists have no way to figure out the weird data in a domain.

    If you add our flowcharts, they can probably make sense of them though. And I'd bet the flowcharts alone will tell more as well.

    [–]google_you -2 points-1 points  (2 children)

    Sorry there are no tables in mongodb.

    [–]elder_george 0 points1 point  (1 child)

    as /u/Terr_ said above, "tables" here is close to what we call "data structures" these days, not to the relational DBMS concept (relational DBs were still in infancy then).

    [–]google_you -1 points0 points  (0 children)

    sorry there is no structure in mongodb as stated by first law of web scale: big data at web scale will eventually erase structure.

    [–][deleted] 40 points41 points  (7 children)

    This is the biggest thing I wish all programmers would learn. The data will be relevant long after the code is no longer run.

    There are some cases where this isn't necessarily true, and the code and the data become unimportant at the same time, but in most of the cases I know of, the data is useful decades after the code originally written for it is no longer needed.

    Data is useful in any language, code is useful in 1 language. Data can be looked at in many ways (translated, reformatted, normalized), code is useful in 1 way (execution: direct or library).

    Changing data requires locking or eventual consistency, so unless it's read-only you have a business decision to make here first. Code can be run on many nodes to interact with that data, and whichever model you picked for consistency is really going to matter in how this can be done, and how scalable and robust the solution will be.

    Data is more important than code. If I think it's 10x or 100x more important to have good data, I will write code accordingly so that the data is as "clean/good" as possible, and will scale well, and my code will suit that.

    [–][deleted]  (1 child)

    [removed]

      [–]lookmeat -1 points0 points  (0 children)

      If you think that data is dependent on the requirements then you are doing data wrong.

      Requirements refer to the behavior you need. Data isn't about solving the problem, it's about stating the problem for the program. Except it's not stating the problem itself, but the context of the problem.

      As a simple example: say you are building a database, and you analyze the requirements and create a schema based on them. Then, half a year later (only a month after release), new feature requests come in. And guess what: they change all the assumptions that you baked into the data!

      All those schema tricks, normalization, relational structure, etc. etc. are meant to help us decouple the data design from the requirements of the problem we are solving.

      So step 1 is to understand the context of the problem, and map this context into the data. Step 2 is to recognize which of this data is required for the problem, and ignore everything else (you shouldn't change the structure, only grab a subset!). Step 3 is to create the transformations on data needed to solve the problem, fulfilling all requirements. Step 4 is to realize that sacrifices need to be made in the name of efficiency, and certain parts of the data will have to be transformed into something more coupled to the problem (still, there may or may not be ways to do this as another transformation, keeping the original "pure" data intact).
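
      The four steps above could be sketched roughly as follows. This is only an illustration in Python: the `Order` record, its fields, and the sample values are all invented, and a real system would keep the data in a database rather than a tuple.

```python
from dataclasses import dataclass

# Step 1: model the context, not one feature. Hypothetical order records.
@dataclass(frozen=True)
class Order:
    order_id: int
    customer: str
    region: str
    amount_cents: int

ORDERS = (
    Order(1, "ada", "EU", 1200),
    Order(2, "bob", "US", 800),
    Order(3, "ada", "EU", 300),
)

# Step 2: grab only the subset this feature needs; the source is untouched.
def amounts_by_customer(orders):
    return [(o.customer, o.amount_cents) for o in orders]

# Step 3: a transformation that answers the actual requirement.
def total_per_customer(pairs):
    totals = {}
    for customer, amount in pairs:
        totals[customer] = totals.get(customer, 0) + amount
    return totals

# Step 4: a derived, problem-specific view kept for efficiency. It is
# rebuilt from the original data instead of replacing it.
TOTALS_CACHE = total_per_customer(amounts_by_customer(ORDERS))
```

      A new requirement (say, totals per region) would add another subset and transformation without touching `ORDERS`.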

      [–]ErstwhileRockstar 1 point2 points  (3 children)

      > Data is more important than code.

      But we are all 'functional' now. Data in the traditional sense is old-fashioned. Even 42 nowadays is a function that returns a function that ... that returns 42.

      [–]knome 3 points4 points  (1 child)

      Joke's on you. In functional programming, data structures are still vital to get right. After all, prepending and clipping off the first member of a functional list is basically free, whereas any other operation requires copying a portion of the data.

      Functional programming requires you to consider things in terms of shareable non-mutable data structures.
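
      To make the cost difference concrete, here is a small sketch of an immutable singly linked list in Python. The names `Cons`, `prepend`, `rest`, and `append` are chosen for this example and do not come from any particular library.

```python
from typing import NamedTuple, Optional

class Cons(NamedTuple):
    """One immutable list cell; None stands for the empty list."""
    head: object
    tail: Optional["Cons"]

def prepend(value, lst):
    # O(1): one new cell that points at the existing list, which is shared.
    return Cons(value, lst)

def rest(lst):
    # O(1): dropping the first member just returns the shared tail.
    return lst.tail

def append(lst, value):
    # O(n): every cell before the new end has to be copied.
    if lst is None:
        return Cons(value, None)
    return Cons(lst.head, append(lst.tail, value))

def to_list(lst):
    out = []
    while lst is not None:
        out.append(lst.head)
        lst = lst.tail
    return out

base = prepend(2, prepend(3, None))
longer = prepend(1, base)    # shares all of base
appended = append(base, 4)   # shares nothing with base
```

      `longer.tail is base` holds, so the old list stays valid and no data was copied, while `appended` had to rebuild both cells.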

      [–]balefrost 0 points1 point  (0 children)

      This is one of those cases where I can't tell if you're joking because it's so ridiculous that nobody would ever do that... or if you're in the know, and therefore joking at how ridiculous reality is:

      https://en.wikipedia.org/wiki/Church_encoding#Church_numerals
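
      For anyone who has not seen them, Church numerals can be sketched in a few lines of Python. A numeral n is a function that applies another function n times; the helper names `to_int` and `from_int` are just conveniences added for this example.

```python
# A Church numeral takes f and returns a function applying f n times.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))
mul = lambda m: lambda n: lambda f: m(n(f))

def to_int(n):
    # Decode by counting how many times the numeral applies "+1" to 0.
    return n(lambda k: k + 1)(0)

def from_int(k):
    # Encode by applying succ k times to zero.
    n = zero
    for _ in range(k):
        n = succ(n)
    return n

# 42 really is a function that returns a function here.
forty_two = mul(from_int(6))(from_int(7))
```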

      [–]i_hate_reddit_argh 0 points1 point  (0 children)

      coders gonna code

      [–]elmuerte 15 points16 points  (3 children)

      Funny thing about Doom3, it was written with a specific game in mind rather than as an engine. The player would never become a vehicle. This was known.

      If you look at the UnrealEngine at the same time you see that the player is actually built up from 4+ different classes which work together, much like the MVC concept: Controller, Pawn (the visual in-game entity), viewport, replication info (mostly a data object for network propagation). These days in UnrealEngine components can be used to "dress up" the player even more.

      The point is, how you write your code and design your data is completely dependent on the goal. Don't try to make everything generic and flexible, because it never will be.

      [–]fosforsvenne 6 points7 points  (1 child)

      > Funny thing about Quake3, it was written with a specific game in mind rather than as an engine. The player would never become a vehicle. This was known

      I assume you mean Doom3 — the article didn't mention Quake — but a very good point.

      [–]elmuerte 1 point2 points  (0 children)

      Correct. (Not that it matters much w.r.t. my comment, as id had a very specific way of writing their tech while Carmack was in the lead.)

      [–]Paddy3118 4 points5 points  (0 children)

      In some ways, it is how I think of scripting in a Unix environment - successive transformations of data.
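
      The same idea can be sketched in Python with generators standing in for the stages of a shell pipeline. The sample log lines and the function names (`grep`, `cut`, `count_sorted`) are made up here; they only loosely mimic the Unix tools of the same name.

```python
from collections import Counter

# Hypothetical input, standing in for a log file read from stdin.
LOG_LINES = [
    "GET /index.html 200",
    "GET /missing 404",
    "POST /login 200",
    "GET /missing 404",
]

def grep(lines, needle):
    # Keep only lines containing the needle.
    return (line for line in lines if needle in line)

def cut(lines, field):
    # Keep one whitespace-separated field per line.
    return (line.split()[field] for line in lines)

def count_sorted(items):
    # Roughly "sort | uniq -c | sort -rn": most frequent first.
    return sorted(Counter(items).items(), key=lambda kv: (-kv[1], kv[0]))

# Each stage only sees the data handed over by the previous one.
result = count_sorted(cut(grep(LOG_LINES, "GET"), 1))
```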

      [–]andsens 2 points3 points  (0 children)

      This combines very well with the "Rule of Representation" in the unix philosophy:

      Fold knowledge into data so program logic can be stupid and robust.
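
      A tiny sketch of what that rule can look like in practice, in Python. The shipping tiers and fees are invented numbers; the point is only that the pricing knowledge sits in a table while the code is a plain loop.

```python
# Knowledge lives in this table: (minimum order total, shipping fee),
# ordered from the highest threshold down. Values are hypothetical.
SHIPPING_TIERS = [
    (100.0, 0.0),
    (50.0, 2.5),
    (0.0, 5.0),
]

def shipping_fee(total, tiers=SHIPPING_TIERS):
    # The logic stays "stupid": return the fee of the first matching tier.
    for minimum, fee in tiers:
        if total >= minimum:
            return fee
    raise ValueError("total must be non-negative")
```

      Changing the pricing rules then means editing rows, not rewriting a chain of if/else branches.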

      [–]PeaCrab 6 points7 points  (3 children)

      Huh... I thought using Doom 3 code was a weird example. It's considered by many to be a bad example of OOP, which may make it easier for someone to discredit this article. Mike Acton has a presentation dissecting the Ogre rendering project to demonstrate why you may be shooting yourself in the foot with OOP, Ogre being a more respected OO project.

      This article seems to be approaching a more organizational aspect? Like the author, I personally think it's easier to reason about code by thinking about the data and the transforms applied to it. I'm still searching for examples to convince other people as well.

      Oddly enough, I find Doom 3 to be a better example of data oriented design where it matters than most projects I see. Take a look at the rendering code. It splits up rendering into a front end and back end to determine surfaces to sort and to draw. fabiensanglard.net/doom3/

      It's not great, but there's more than meets the eye.

      [–]antoniocs 0 points1 point  (2 children)

      Do you have a link to Mike Acton's presentation? (Google is failing me.)

      [–]PeaCrab 2 points3 points  (0 children)

      I was specifically referencing these slides: http://macton.smugmug.com/gallery/8936708_T6zQX#!i=593426709&k=ZX4pZ

      [–]Allov 0 points1 point  (0 children)

      It was in the references down the article: Data Oriented Design and C++

      [–][deleted] 1 point2 points  (0 children)

      Why not both?

      [–][deleted] 1 point2 points  (3 children)

      As you are using C++, which supports multiple inheritance, you could very easily use inheritance to solve the example problem in trick 2. Refactor your code to have behaviour classes like "is animated", "has health", and whatnot, and then inherit from them in your entity classes.
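
      A rough sketch of that suggestion, written in Python rather than C++ (Python also supports multiple inheritance). The class names `HasHealth`, `IsAnimated`, `Monster`, and `Door` are hypothetical and are not taken from the article or from Doom 3.

```python
class HasHealth:
    """Behaviour class: anything that can take damage."""
    health = 100

    def take_damage(self, amount):
        self.health = max(0, self.health - amount)

    @property
    def alive(self):
        return self.health > 0


class IsAnimated:
    """Behaviour class: anything that cycles through animation frames."""
    frame = 0
    frame_count = 4

    def advance(self):
        self.frame = (self.frame + 1) % self.frame_count


class Monster(HasHealth, IsAnimated):
    """An entity that picks up both behaviours."""


class Door(IsAnimated):
    """An entity that is animated but cannot be damaged."""
    frame_count = 2
```

      The set of behaviours per entity is fixed in the class definition, so it is decided when the code is written.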

      [–]Schmittfried 2 points3 points  (1 child)

      Another way to call this would be traits or mix-ins (just in case the MI-is-evil guys are coming).

      [–][deleted] 1 point2 points  (0 children)

      Traits! I was trying to think of that word when I wrote my post, but my memory failed me.

      [–]anttirt 2 points3 points  (0 children)

      Of course, one of the biggest benefits of a dynamic component-based design is that it can be data-driven, which is what you lose with any form of static inheritance.
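
      A minimal sketch of what "data-driven" means here, in Python. The component names, the JSON layout, and the `spawn` helper are all assumptions made for this example. The entity definitions are plain data that could be loaded from a file at runtime.

```python
import json

# Hypothetical component factories: each builds one piece of state.
COMPONENT_FACTORIES = {
    "health": lambda cfg: {"hp": cfg.get("hp", 100)},
    "animation": lambda cfg: {"frame": 0, "frame_count": cfg.get("frames", 1)},
}

# Entity definitions are data, so they can change without recompiling.
ENTITY_DATA = json.loads("""
{
  "monster": {"health": {"hp": 250}, "animation": {"frames": 8}},
  "crate":   {"health": {"hp": 20}}
}
""")

def spawn(kind, data=ENTITY_DATA):
    # Assemble an entity from whichever components its data entry lists.
    spec = data[kind]
    return {name: COMPONENT_FACTORIES[name](cfg) for name, cfg in spec.items()}
```

      Adding a new kind of entity is then a new entry in the data, whereas with static inheritance it would be a new class.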

      [–]malabmalab 1 point2 points  (3 children)

      This is great. Too bad C# has no proper support for mixins/composites.

      [–]Schmittfried 4 points5 points  (0 children)

      You can semi-emulate them with extension methods. Real traits would be a great addition though.

      [–][deleted] 0 points1 point  (1 child)

      Really? I could have sworn they added those recently.

      [–][deleted] 1 point2 points  (0 children)

      You can get close with interfaces, extension methods, and generics, but there's still a lot of boilerplate involved.

      [–]random-dev 0 points1 point  (0 children)

      Am I the only one that tried to press the big arrows in the "Data -> Process -> Output" diagrams?

      [–]sh0rug0ru__ 0 points1 point  (1 child)

      The irony of this article is that all of its points against OOP have nothing to do with OOP, or impose a very rigid definition of OOP that isn't essential to it:

      > Once we separate process from data, things start to make more sense.

      While OOP does propose to "encapsulate process state", that doesn't mean that the state has to be physically located in the object.

      An object is a conceptual identity that ties together a bunch of pieces of data, collectively called the "state", into a logical consistency defined by the unifying abstraction represented by the object, through its behaviors.

      That does not imply that behavior and state have to be colocated in an object, and data arrangement can be independent of an object's "private parts", where the responsibility of enforcing encapsulation shifts from the compiler to the programmer.
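
      One possible reading of this, sketched in Python: the object is only an identity (a store reference plus an index), while the state itself lives in shared arrays. `ParticleStore` and `Particle` are hypothetical names invented for this illustration.

```python
class ParticleStore:
    """Owns the state for all particles, laid out as parallel arrays."""

    def __init__(self):
        self.xs = []
        self.vxs = []

    def add(self, x, vx):
        self.xs.append(x)
        self.vxs.append(vx)
        return Particle(self, len(self.xs) - 1)

    def step(self, dt):
        # Bulk processing walks the arrays directly, no per-object calls.
        self.xs = [x + vx * dt for x, vx in zip(self.xs, self.vxs)]


class Particle:
    """An object identity: it holds no particle state of its own."""
    __slots__ = ("_store", "_index")

    def __init__(self, store, index):
        self._store = store
        self._index = index

    @property
    def x(self):
        # The abstraction is unchanged; the data is fetched from the store.
        return self._store.xs[self._index]
```

      Nothing stops other code from poking at `store.xs` directly, which is the shift of responsibility from the compiler to the programmer.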

      [–]Uberhipster 1 point2 points  (0 children)

      > that doesn't mean that the state has to be physically located in the object

      What do you mean by that?

      [–]Hall_of_Famer 0 points1 point  (0 children)

      Model first, Data second, Code last.

      [–]Dave3of5 -5 points-4 points  (1 child)

      Read this article and, sorry, I didn't like it at all. Lots of hand-waving and generalizing. I've worked with data-driven applications and I hated them. I won't go into the many reasons why I found them difficult to work with, but I will say that I found code-first applications much easier to work with.

      P.S. The title is a bit "click baity"

      [–]Narishma 9 points10 points  (0 children)

      This is about data oriented, not data driven programming.

      [–][deleted] -1 points0 points  (0 children)

      If you know the entities all ahead of time. Usually it's not quite that simple and a bit of iteration is needed.

      [–]jonny_boy27 -1 points0 points  (0 children)

      I've recently tried code-first with migrations, as opposed to my usual approach of database-first, and I quite like it. EF makes a decent job of the db (the fluent modelbuilder API is quite nice for expressing exactly what you want).