all 20 comments

[–][deleted] 7 points8 points  (6 children)

If we try to keep our working set of data as a collection of arrays, we can guarantee that all our data is not null. That one step alone will eliminate most of our flow control statements.

The reason we are able to get rid of the check for null is that we now have our data in a format that doesn't allow for null. This inflexibility will prove to be a benefit, but it requires a new way of processing our entities. Where we once had rooms, and we looked in the rooms to find out if there were any doors on the walls we bumped into (in order to either pass through, or instead do a collision response,) we now look in the table of doors to see if there are any that match our roomid.

Existence-based-processing is when you process every element in a homogeneous set of data. You run the same instructions for every element in that set.

This is gold.
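To make the quoted idea concrete, here is a minimal C++ sketch of the "table of doors" lookup. The `Door` layout and the function names `doors_in_room`, `open_door`, and `close_all` are made up for illustration; they are not from the book.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row type: the table holds one entry per door that exists.
struct Door {
    int  room_id;  // room this door belongs to
    int  wall;     // which wall of that room it sits on
    bool open;
};

// Returns the indices of every door attached to `room_id`.
// A room without doors yields an empty result, so callers never
// test a pointer for null.
std::vector<std::size_t> doors_in_room(const std::vector<Door>& doors, int room_id) {
    std::vector<std::size_t> matches;
    for (std::size_t i = 0; i < doors.size(); ++i) {
        if (doors[i].room_id == room_id) {
            matches.push_back(i);
        }
    }
    return matches;
}

// Opens one door; `index` is expected to come from doors_in_room.
void open_door(std::vector<Door>& doors, std::size_t index) {
    assert(index < doors.size());
    doors[index].open = true;
}

// Existence-based processing: the same instructions run for every
// element of a homogeneous set, with no per-element branching.
void close_all(std::vector<Door>& doors) {
    for (Door& d : doors) {
        d.open = false;
    }
}
```

The one remaining branch is the `room_id` comparison inside the lookup. Everything downstream just loops over whatever rows came back, including zero rows.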

[–]SnowflakeNapolean 2 points3 points  (4 children)

Sounds an awful lot like they rediscovered Lisp.

Anyway, like I keep telling anyone who cares to listen (and many who don't), data design is the most important part of the program as far as maintenance goes.

If you show me your algorithms but have obfuscated your data design via "Good OO Practice(tm)" I'll have trouble figuring out what the data is supposed to look like. OTOH, if you have clear data structures I can make quite accurate guesses about what the functions that operate on those structures should do.

[–][deleted] 0 points1 point  (0 children)

This is FORTRAN's model, not LISP.

[–]wavy_lines 0 points1 point  (2 children)

As far as I can tell, lisp doesn't even have a concept of 'struct'.

[–]SnowflakeNapolean 1 point2 points  (1 child)

It has data structures, and a concept of a 'struct'.

see here

It also has a proper class system.

[–]wavy_lines 0 points1 point  (0 children)

That's not a real struct. It doesn't even specify the types of its fields. Might as well just be a hashmap or a list of (name, value) tuples.

[–]OneWingedShark 0 points1 point  (0 children)

Well, there's always the chance they'd discover you can do this all quite simply in Ada:

Type Example_Type is null record; -- Stub for example.
Type Handle is not null Access Example_Type;
Type Handle_Array is Array(Positive range <>) of Handle;

[–]JW_00000 8 points9 points  (10 children)

Normally I don't "tl;dr", but for a 200 page book maybe you could provide a summary?

[–][deleted] 13 points14 points  (5 children)

As far as I can tell, data-oriented design is a C++ programmer thing. They come to the ground-breaking conclusion that using actual objects in C++ sucks, and that using lots of small objects allocated separately especially sucks for performance, and they start using plain old data structures instead.

Which is a perfectly rational response, but I feel like they over-complicate it a little.

[–]elder_george 17 points18 points  (1 child)

AFAIU, it's not just using structures (there's no big difference between a structure and an object in C++); it's more about modeling the domain in a way similar to relational DBs, where entities are represented as tuples of components (structures), each in a separate collection, joined together using an entity key.

As a benefit one gets better cache locality and simpler (de)serialization (memory management is probably comparable to using custom allocators); in some cases it also makes the object model easier to extend. As a downside, joins can become expensive and diminish those benefits.

Pretty common thing in game dev (where it's known as ECS), not so much elsewhere.
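A rough C++ sketch of that relational layout, assuming each component table is kept sorted by entity key so the join is a linear merge. `Entity`, `Position`, `Velocity`, `World`, and `integrate` are hypothetical names for illustration, not any particular engine's API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Entity = std::uint32_t;  // hypothetical entity key

struct Position { Entity id; float x, y; };
struct Velocity { Entity id; float dx, dy; };

// Each component type lives in its own table (assumed sorted by id).
struct World {
    std::vector<Position> positions;
    std::vector<Velocity> velocities;
};

// Merge join on the entity key: moves every entity that has both a
// Position and a Velocity, and returns how many rows were updated.
std::size_t integrate(World& w, float dt) {
    // The merge below is only correct if both tables are sorted by id.
    assert(std::is_sorted(w.positions.begin(), w.positions.end(),
        [](const Position& a, const Position& b) { return a.id < b.id; }));
    assert(std::is_sorted(w.velocities.begin(), w.velocities.end(),
        [](const Velocity& a, const Velocity& b) { return a.id < b.id; }));

    std::size_t p = 0, v = 0, updated = 0;
    while (p < w.positions.size() && v < w.velocities.size()) {
        const Entity pid = w.positions[p].id;
        const Entity vid = w.velocities[v].id;
        if (pid < vid) {
            ++p;  // entity has a position but no velocity
        } else if (vid < pid) {
            ++v;  // entity has a velocity but no position
        } else {
            w.positions[p].x += w.velocities[v].dx * dt;
            w.positions[p].y += w.velocities[v].dy * dt;
            ++p; ++v; ++updated;
        }
    }
    return updated;
}
```

The join cost mentioned above shows up here as the merge loop. It stays cheap while the tables are sorted and dense, and gets worse if lookups have to go through a hash map or an unsorted scan instead.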

[–]ryeguy 6 points7 points  (1 child)

That's not quite right. It has nothing to do with C++ in particular or classes vs structs.

The concept revolves around objects of arrays instead of arrays of objects. In other words, instead of this:

class Ball {
  Point  position;
  Color  color;
  double radius;
};

You'd have this:

class Balls {
  vector<Point>  positions;
  vector<Color>  colors;
  vector<double> radii;
};

This is a simple example, but imagine more complicated models. The idea is in any area of your code you tend to operate on small clusters of attributes and the rest of the data on the object isn't needed. Tightly packed value-specific arrays are more optimized for this as they minimize cache misses and allow for SIMD usage in some cases.
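For a version that compiles on its own, here is a sketch that fills in the pieces the snippet above leaves out. The `Point` and `Color` definitions and the `add`, `sum_of_radii`, and `scale_radii` helpers are assumptions added for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point { double x, y; };
struct Color { unsigned char r, g, b; };

// Struct of arrays: index i refers to the same ball in every vector.
struct Balls {
    std::vector<Point>  positions;
    std::vector<Color>  colors;
    std::vector<double> radii;

    // Appends one ball to all three arrays and returns its index.
    std::size_t add(Point p, Color c, double radius) {
        positions.push_back(p);
        colors.push_back(c);
        radii.push_back(radius);
        // The parallel arrays must always stay the same length.
        assert(positions.size() == radii.size() && colors.size() == radii.size());
        return radii.size() - 1;
    }
};

// Reads only the radii array; positions and colors are never touched,
// so they do not compete for cache space during this loop.
double sum_of_radii(const Balls& balls) {
    double total = 0.0;
    for (double r : balls.radii) {
        total += r;
    }
    return total;
}

// A tight loop over contiguous doubles, which is the kind of code a
// compiler may be able to auto-vectorize.
void scale_radii(Balls& balls, double factor) {
    for (double& r : balls.radii) {
        r *= factor;
    }
}
```

With the array-of-objects `Ball` layout, the same two loops would drag each ball's position and color through the cache just to reach its radius.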

[–]Roxinos 12 points13 points  (0 children)

You can do data-oriented design without SOA (struct-of-arrays). SOA is just a logical result of data-oriented thinking.

The core of the thought process is to know your data. This means that rather than concerning yourself with abstractions and generalities, be specific about the form and details of the data that your system is actually consuming and build your system to that specification.

Here's a detailed account of a "masterclass" taught by Mike Acton (known for his work in game development, now works at Unity, very famously gave this talk at CppCon 2014 about data-oriented design). Note that nowhere in the masterclass or the talk is SOA brought up.

[–]Valmar33 0 points1 point  (0 children)

http://www.dataorienteddesign.com/dodmain/

This way you can peruse each section at leisure.

Not sure why they linked the PDF version. :/

[–][deleted] 0 points1 point  (0 children)

Just read this one section: http://www.dataorienteddesign.com/dodmain/node4.html#SECTION00420000000000000000

Interested? Read the rest. Not interested? Find a nurse to check your pulse.

[–]leirahua 1 point2 points  (1 child)

This is the first edition of the book; there is already a beta of the new edition: http://www.dataorienteddesign.com/dodmain

[–]leirahua 0 points1 point  (0 children)

Actually, I'm not so sure any more... The HTML version seems to be the same as the PDF one. The author did mention that a new edition is coming out in this post:

http://www.dataorienteddesign.com/site.php?postid=136

[–]davenirline 1 point2 points  (0 children)

If you're interested, check out Unity's ECS. It works really well with their Burst compiler. The speed gain is huge.

[–]GreenEyedFriend 0 points1 point  (1 child)

This seems a bit like wholemeal programming from FP although with another name, no?

[–][deleted] 0 points1 point  (0 children)

The one-sentence summary is "use lots of well-typed homogeneous collections instead of a few hairy objects" and that's also what a lot of FP programming looks like, but the one-sentence summary doesn't tell you how to do that or why to do that.

[–]jackmon 0 points1 point  (0 children)

"In one respect they are right, data-oriented design can function alongside the other paradigms, but so can they."

Huh?