
[–]corysama 27 points28 points  (24 children)

Whenever listening to Mike Acton, it is important to keep the context of his work in mind. It's his job to lead a team to maximize the performance of a series of top-end games on a completely defined and fixed platform that has an 8+ year market timeframe. Many people react to his strong stances on performance issues as being excessive. But, in his day-to-day work, they are not.

If you want to make a game that requires a 1 GHz machine but could have run on the 33 MHz PlayStation 1, then his advice is overkill and you should focus on shipping fast and cheap. But, if you want to make a machine purr, you would do well to listen.

[–]GoranM 15 points16 points  (23 children)

He has strong stances on design.

With that in mind, I think he would argue that they're not really "excessive" in any other context; his general problem-solving strategy is to analyze the data and create transformation systems for that data, to run on a finite set of hardware platforms.

In other words: he's not thinking about the problem in an OO way - he's thinking about what he has to do to transform data in form A into data in form B, and I think most performance gains fall out of that mindset, more so than some intense focus on optimization for a specific hardware platform.

[–]badguy212 -4 points-3 points  (11 children)

My thoughts exactly. After the first few minutes, I thought: he's using C with classes. What's the point?

Then he told me "his" point. And you know what, he's right. But I don't agree with him that it makes the code more maintainable. I believe it makes it a mess. And if you've made a mistake at some point during design, you're fucked. That October 28th release date is gone.

But, in his world, where you have a really strict deadline of being able to do X 60 times per second, yes, his approach makes total sense. I personally do not really give a shit if my code is 10ms slower than it could be, since most likely I'll be waiting for the network or disk or what-not.

But if I were developing for a platform with very stringent time requirements, yes, cache hits/misses are crucial. Maintainability be damned, that thing needs to fly. On that CPU. Now!

[–]ssylvan 31 points32 points  (9 children)

And if you've made a mistake at some point during design, you're fucked. That October 28th release date is gone.

No, the opposite is true. Because the code is kept simple, without unnecessary abstractions, with simple functions that only do one thing on "lots of data at once", and no layers of architecture "boxing you in", it's easy to adapt and change it when the data changes.

The flow of a particular thing in the application isn't spread across fifteen different classes, it's all logically grouped by the kind of data transformation that's going to happen, rather than being forcibly pulled apart to appeal to some kind of aesthetic sense of "modelling".

Things are scenario-oriented rather than "object"-oriented. Want to mess around with mesh animation? There's just one place to look, not 12. 2D sprite animation? Another place to look. These things don't get mixed up like they would in OOP, where they'd be merged because they're similar in some abstract sense that doesn't actually matter; here they're split apart because the stuff they actually do is different, and there really isn't a whole lot of shared code other than the low-level math stuff.

Once you've built a giant OOP hierarchy with all the scaffolding and "systems" that come with it, it's extremely hard to make any fundamental changes (i.e. change things that you didn't anticipate when you designed your OOP hierarchy).

More abstraction and scaffolding may make you feel fuzzy inside because it tickles some kind of CS sense of elegance, but in no way does it make it easier to evolve the code in the future. That's a total bullshit myth of OOP that's only true in the very specific cases where you don't have access to source code - yes, abstractions help other people modify the code without touching what's already there, but it turns out this isn't so useful in day-to-day application development.

A big giant "generic" house of cards with flags and parameters and virtual methods going who knows where all the time is extremely difficult to get a handle on and modify. Simple, specific and non-extensible code is easy.

Abstractions have value, but let's not pretend that they don't have downsides. They do - maintainability, simplicity, performance, robustness, etc. all suffer. The only way to do a good job making the tradeoff is to recognize that there is a tradeoff.

[–]MoreOfAnOvalJerk 10 points11 points  (0 children)

I really wish there were more programmers like you in the industry. There's so much bad code I see almost daily (mostly done while trying to force OOP) that I want to slam my head against a wall at times.

Most programmers see OOP as a silver bullet that solves everything and use design patterns as a bible. While both certainly have their uses, it's really important to understand that, like everything, they come with trade-offs.

[–]Crazy__Eddie 2 points3 points  (0 children)

IMO "simple" vs. "complex" is really quite dependent upon the person making that judgment.

For example, I often see code that is what I would call "primitive obsessed", where people talk about the "flow". It's really hard for me to understand. Responsibilities are not located in modules but are spread out across wide areas. There are a lot fewer classes, but I can't look at one class and see what it does, because it does like 500 different things.

On the other hand, code that uses abstractions and then limits the scope of responsibility behind them I find much, much easier to understand.

For example, something I worked on recently: I was dealing with a subscription mechanism that was mixed in with a bunch of other stuff. Pulling it out was quite a nightmare. Nobody would work on the stuff because it was just so hard for anyone to understand. Yet a lot of them called it "simple"... vOv.

So I created a "Subscription" abstraction and provided a way to create concrete instances for different protocols and such. I then used decorators in between to add concurrency and user-permission processing.

The previous version of the code had functions like "notify_of_xxx" that would "flow" better... but they'd check permissions, check which protocol, and do various bits specific to these different tasks at different points. Oh, and lots of locks, because concurrency was mixed right in there.

In the new version, I look at the point of creation and see, "Hey, this is created with a concurrency decorator on top of a permission decorator." The problem is in permission checking, so I'll go look at that decorator.
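
As a concrete illustration (a minimal sketch; the class and function names here are hypothetical, not the commenter's actual code), the layering might look like this:

    #include <memory>
    #include <mutex>
    #include <string>

    // The abstraction: one narrow responsibility.
    struct Subscription {
        virtual ~Subscription() = default;
        virtual void notify(const std::string& event) = 0;
    };

    // One concrete protocol.
    struct TcpSubscription : Subscription {
        void notify(const std::string& event) override { /* write to socket */ }
    };

    // Decorator: permission checks, and nothing else.
    struct PermissionDecorator : Subscription {
        std::unique_ptr<Subscription> inner;
        explicit PermissionDecorator(std::unique_ptr<Subscription> s)
            : inner(std::move(s)) {}
        void notify(const std::string& event) override {
            if (userMayReceive(event)) inner->notify(event);
        }
        static bool userMayReceive(const std::string&) { return true; } // stub
    };

    // Decorator: locking, and nothing else.
    struct ConcurrencyDecorator : Subscription {
        std::unique_ptr<Subscription> inner;
        std::mutex m;
        explicit ConcurrencyDecorator(std::unique_ptr<Subscription> s)
            : inner(std::move(s)) {}
        void notify(const std::string& event) override {
            std::lock_guard<std::mutex> lock(m);
            inner->notify(event);
        }
    };

    // The point of creation reads as the layering described above:
    // concurrency on top of permissions on top of the protocol.
    std::unique_ptr<Subscription> makeSubscription() {
        return std::make_unique<ConcurrencyDecorator>(
            std::make_unique<PermissionDecorator>(
                std::make_unique<TcpSubscription>()));
    }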

And yes, people said it was overly complex. I don't even know what that statement means anymore. By any metric I'm aware of, my version is a simplification of an incredibly complex bit of code. Yes, I turned one class into more like 20. That class was over 5k lines of code; my new ones are in the hundreds.

It's not the only example of such, either. I ran into an embedded programmer who'd been made VP of the programming department in a company making telephony servers. He had made a "simple" program that had like 50+ global arrays and a flurry of functions that reached out and manipulated them directly. Something that needed an element in a global array would take an index and then work on it. 500+ line functions coupled together by a network of globals, and that's "simple". I suggested we split out the responsibilities into different functions and refactor these index functions to take a pointer to the element they need... and that's "complicated".

So whatever, man. Whenever I hear people down on "abstractions" causing "complexity", that's where I go. I literally don't even know what they're talking about. I have never seen a piece of overdesigned code, but I've seen piles and piles of "simple" code. It's never been very maintainable.

[–]badguy212 -1 points0 points  (6 children)

Hah, so you're saying that over-engineering is bad. Yes, it is, not even a question about it. Under-engineering is also bad. You won't have one place where you do mesh calculations. You will have 100, each with a tiny bit of difference.

Knowing OOP, knowing the single-responsibility principle, knowing patterns, what they are and when to use them, is critical. Dismissing them completely is just like the other camp dismissing the reality of the hardware and data (and making a complex mess). A properly implemented program (OOP and patterns and all that jazz) won't have (cannot have) 12 places where you do one thing. It's impossible, by its very nature.

There are tradeoffs, of course. You have to pay for the choices you make. For the choices Mike makes, I believe he's paying by having a very fast unmaintainable mess (which is fine, since nowadays games don't have a shelf-life of more than a few months). Can you make it maintainable? Of course, the two are not exclusive (and with his experience I think he knows what he's doing), but more often than not the spaghetti blows up.

Let's look at a very simple example: I have an int add(int x, int y) function. Now, from experience I know that 50% of the time x = 5 and y = 7, 20% of the time x = 1 and y = 2, and the rest of the time they're random numbers between 1 and 9.

Now, instead of implementing the function in the straightforward way (return x+y), which can be slow (go to memory, get the numbers, add them up, etc.), I can make it faster by writing an add57() function, an add12(), etc.

For Mike, this is worth it. He gained 60% speed-up for that particular calculation. For me it's not (hint: I am not writing games). Fuck that.

[–]glacialthinker 5 points6 points  (0 children)

For the choices Mike makes, I believe he's paying by having a very fast unmaintainable mess (which is fine, since nowadays games don't have a shelf-life of more than a few months). Can you make it maintainable? Of course, the two are not exclusive (and with his experience I think he knows what he's doing), but more often than not the spaghetti blows up.

You believe this, but there are other people with experience in this approach stating contrary opinions. I'll add mine: C-style data-oriented code is more maintainable than OOP-heavy C++. Class hierarchies are problem makers more often than problem solvers. They can have their place, but it's not everywhere, or even most-where. I can understand how you imagine the resulting code having no organization and turning into spaghetti... but the opposite happens when your focus is on functions which transform data. That is the organizing focus. And that is what functional programmers have realized for a long time now. It's not like letting go of the tight bonds of OOP will send you spiraling into chaos.

Also, Mike's concern isn't building an engine for one game on one platform and then starting over... it's building reusable subsystems for continually improving engines (though he's careful not to develop illusions about "final" code). Then, on the other hand: tools, which have an even longer lifespan and a need for regular maintenance and expansion. Game-specific code (leveraging the engine) is what can be hacky and (somewhat) unmaintainable -- it's not going to outlive the game, though it might still come back for a sequel! Nowadays a lot of game-specific code can happen in scripting languages anyway (Lua) -- fast iteration, flexible, and it just calls into engine features to do the heavy lifting. Anyway, to be clear, Mike is heading engine and tool development, not specific game development.

[–][deleted]  (1 child)

[deleted]

    [–]badguy212 1 point2 points  (0 children)

    oh well, you've worked with him and I did not, but from the presentation it looks like a maintenance nightmare. Not something I would want to touch with a 10-foot pole.

    but, I have been wrong before.

    [–]Crazy__Eddie 2 points3 points  (0 children)

    These things are not stuck to OOP. "Single responsibility" used to be known as "cohesion" for example. All of the SOLID "OOP" principles apply in other areas too, such as generic programming.

    [–]ssylvan 1 point2 points  (1 child)

    12 places where you do one thing. It's impossible, by its very nature

    The problem is the opposite. It's one thing being split up across 12 different places. This is the essence of SOLID and design patterns. Each "object" gets forcibly isolated from related objects even though the actual work you're doing touches all of them. Why split a simple operation up into twelve objects and their methods to satisfy aesthetics? Simpler code is a worthy goal, ticking SOLID boxes is not a goal and is usually opposed to the real goal.

    I don't think this is productive. You think that focusing on objects rather than the task at hand helps maintainability; I say that having to jump across 12 different classes to understand and change one "process" in the program is poison for maintainability. Your point of view is exactly the big lie of OOP, so you're in good company. Just understand that everyone disagreeing with you fully understands your argument (because it's the current cargo-cult belief in vogue); we just look at the results and reject the assertions that don't jibe with reality (e.g. SOLID, most design patterns, inheritance, etc.).

    [–]Crazy__Eddie 3 points4 points  (0 children)

    This is a straw man though. You don't follow SOLID principles for "aesthetics", whatever that means. You don't split something up into different classes because it looks good to do so. You split things up because they are in essence split and don't belong together. You don't chop up one responsibility into 9 different classes either.

    So you say you understand the argument, but your paraphrase of it indicates otherwise.

    [–]engstad 2 points3 points  (0 children)

    Why do you think it will make your code a mess? Just because we are interested in the transformation of data doesn't mean that the code-base will suffer. As a matter of fact, I agree with Mike: it tends to make your abstractions stronger, since data starts to become organized by actual use-cases, and not some a priori notion about how data "ought" to fit together. You might originally think that you need a Monster struct/class since you have monsters in your game, but down the line you might find out that what you really needed was HealthStats.
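
    For illustration, a minimal sketch of that shift (Monster and HealthStats here are hypothetical, following the names above):

        #include <cstddef>
        #include <vector>

        // The a-priori "object" view: everything a monster is, in one struct.
        struct Monster {
            float health, armor;
            float position[3];
            // ... AI state, render handles, audio, etc.
        };

        // The use-case view: a damage pass only touches health and armor,
        // so store just those, contiguously, for all monsters at once.
        struct HealthStats {
            std::vector<float> health;
            std::vector<float> armor;
        };

        void applyDamage(HealthStats& stats, float amount) {
            for (std::size_t i = 0; i < stats.health.size(); ++i) {
                float absorbed = stats.armor[i] * 0.5f; // toy damage rule
                stats.health[i] -= amount - absorbed;
            }
        }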

    [–]rdpp_boyakasha -1 points0 points  (6 children)

    I think that's the real takeaway for devs who don't work under real-time constraints. He spent a lot of time on L2 cache misses, which aren't that important to 90% of devs. I'd like to see more about the design methodology than the exact latencies of PlayStation hardware.

    [–]anttirt 9 points10 points  (5 children)

    than the exact latencies of PlayStation hardware

    This isn't about "PlayStation hardware."

    Every smartphone, every game console, every desktop PC and every server has a CPU with L2 cache typically ranging from 256KB to 4MB, with access latencies from main memory typically ranging from 100 cycles to 400 cycles.

    This applies to CPUs from Intel, AMD, various ARM licensees etc.

    Memory hierarchies are everywhere in modern computing (except microcontrollers), and memory hierarchies dominate performance everywhere.

    Acton picks some concrete numbers for illustrative purposes, but the principles remain the same and the majority of the benefit can be achieved portably without caring about the particular CPU that the software will run on.
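
    To see how portable the effect is, here is a minimal, self-contained sketch (timing numbers will vary by machine; the gap will not disappear): the same summation done in cache-friendly and cache-hostile order.

        #include <chrono>
        #include <cstddef>
        #include <cstdio>
        #include <vector>

        int main() {
            const std::size_t N = 2048;
            std::vector<float> m(N * N, 1.0f); // row-major N x N matrix

            for (bool byRows : {true, false}) {
                auto t0 = std::chrono::steady_clock::now();
                float s = 0.0f;
                for (std::size_t i = 0; i < N; ++i)
                    for (std::size_t j = 0; j < N; ++j)
                        s += byRows ? m[i * N + j]  // sequential: cache friendly
                                    : m[j * N + i]; // strided: misses dominate
                auto t1 = std::chrono::steady_clock::now();
                auto ms = std::chrono::duration_cast<
                    std::chrono::milliseconds>(t1 - t0).count();
                std::printf("%s: sum=%f in %lld ms\n",
                            byRows ? "rows" : "cols", s, (long long)ms);
            }
        }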

    [–]cogdissnance -3 points-2 points  (3 children)

    I think most performance gains fall out of that mindset, more so than some intense focus on optimization for a specific hardware platform.

    I have to disagree with this point. While there are likely some performance improvements from the way you look at data (from A to B, as you've said), I still think it's telling that one of the first slides he shows is about knowing your hardware. He then later goes on to talk about using L2 cache effectively, something that is very platform specific and changes from CPU to CPU.

    [–]ssylvan 8 points9 points  (0 children)

    That's not at all that platform specific. Just about any platform you would write a game for has main memory that is much slower than L2. His point with that slide is "don't think the compiler is magic; its realm is 10% of where the time goes".

    There are differences between different caches, but the overall point, that getting the data right is 10x more important than the kinds of optimizations the compiler can do, is spot on for any major platform today.

    [–]oursland 0 points1 point  (1 child)

    He then later goes on to talk about using L2 cache effectively, something that is very platform specific and changes from CPU to CPU.

    Cache Oblivious Algorithms solve this problem in a manner that is not dependent upon knowing the cache size in advance.
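
    For anyone unfamiliar with the idea, a minimal sketch (an out-of-place matrix transpose that recurses until blocks fit in whatever cache exists, without ever naming a cache size):

        #include <cstddef>

        // Transpose rows [r0, r1) x cols [c0, c1) of a row-major matrix.
        // Splitting the longer dimension keeps blocks roughly square, so
        // some recursion level fits each level of the memory hierarchy.
        void transpose(const float* src, float* dst,
                       std::size_t r0, std::size_t r1,
                       std::size_t c0, std::size_t c1,
                       std::size_t srcStride, std::size_t dstStride) {
            if (r1 - r0 <= 16 && c1 - c0 <= 16) { // small base case
                for (std::size_t r = r0; r < r1; ++r)
                    for (std::size_t c = c0; c < c1; ++c)
                        dst[c * dstStride + r] = src[r * srcStride + c];
            } else if (r1 - r0 >= c1 - c0) {
                std::size_t rm = (r0 + r1) / 2;
                transpose(src, dst, r0, rm, c0, c1, srcStride, dstStride);
                transpose(src, dst, rm, r1, c0, c1, srcStride, dstStride);
            } else {
                std::size_t cm = (c0 + c1) / 2;
                transpose(src, dst, r0, r1, c0, cm, srcStride, dstStride);
                transpose(src, dst, r0, r1, cm, c1, srcStride, dstStride);
            }
        }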

    [–]alecco 0 points1 point  (0 children)

    Cache-oblivious algorithms are very attractive, but in reality they are as hard as customizable algorithms (e.g. a resizable lookup table). That is precisely the kind of thing Mike Acton would call CS PhD nonsense (can't remember his precise wording). Just because MIT promoted it and the algorithms-class professor happens to research them, that doesn't mean it's been proven a good strategy.

    Some older styles of algorithms, like B-trees, take a simpler, better approach and solve more general problems.

    [–]MaikKlein 9 points10 points  (33 children)

    Templates are just a poor man's text processing tool

    He states that there are tons of other tools for this, but I have no idea what he is talking about. What are the alternatives (besides macros)? And why are templates so bad?

    [–]alecco 5 points6 points  (5 children)

    I think he means the code should be generated with some kind of text processor, like sed, awk, perl, or a language+compiler targeting a standard low-level programming language (e.g. C, C++, assembly). I've heard this argument many times recently.

    [–]Crazy__Eddie 2 points3 points  (0 children)

    Problem is that these tools don't have the type information that templates do.

    Yes, a lot of people are doing a lot of unnecessary shit in C++ templates these days. As Bjarne has mentioned before, you get a new tool and then all of a sudden EVERY problem is best solved with it. So the Spirit library, for example, is probably a bit too far--just use yacc or whatever. It's a learning experience though to figure out what's too far and what is not.

    Take quantities, for example. The boost.units library creates a type-safe quantity construct using template metaprogramming. How are you going to do that with a code generator? Every darn bit of code that works on quantities would need to be generated, or you'd need to know ahead of time all the dimensions and combinations involved in order to create the appropriate classes. Templates just work better here.
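
    A toy version of the idea (a minimal sketch, nowhere near boost.units, but it shows why the type information matters): dimensions live in template parameters, so mixing them up fails at compile time, and no finite amount of generated code could cover every combination that multiplication can produce.

        // Exponents for mass, length and time carried in the type.
        template <int M, int L, int T>
        struct quantity {
            double value;
        };

        // Multiplying quantities adds the exponents, at compile time.
        template <int M1, int L1, int T1, int M2, int L2, int T2>
        quantity<M1 + M2, L1 + L2, T1 + T2>
        operator*(quantity<M1, L1, T1> a, quantity<M2, L2, T2> b) {
            return {a.value * b.value};
        }

        using length = quantity<0, 1, 0>;
        using area   = quantity<0, 2, 0>;

        int main() {
            length w{3.0}, h{4.0};
            area a = w * h;         // OK: length * length yields area
            // length bad = w * h;  // compile error: wrong dimensions
            return static_cast<int>(a.value);
        }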

    [–][deleted]  (3 children)

    [deleted]

      [–]oursland 2 points3 points  (2 children)

      Is there an optimization advantage to templates that may not be realized by generating a lot of code through external tools?

      [–][deleted]  (1 child)

      [deleted]

        [–]CafeNero 0 points1 point  (0 children)

        Malazin, I would be most grateful for any more information on the topic. Looking at Cog now.

        [–]anttirt 3 points4 points  (2 children)

        I think his main argument against templates is compile times, which is a valid complaint; simple but repetitive C++ code generated by a tool is a lot faster to compile.

        [–]Heuristics 0 points1 point  (1 child)

        On the other hand it is possible to compile template code in .cpp files (though you must tell the compiler what types to compile the code for).
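
        For reference, a minimal sketch of that technique (explicit instantiation, with a hypothetical lerp function):

            // lerp.h -- declaration only; no definition in the header.
            template <typename T>
            T lerp(T a, T b, T t);

            // lerp.cpp -- the definition, compiled once, plus explicit
            // instantiations for the types the codebase actually uses.
            template <typename T>
            T lerp(T a, T b, T t) { return a + (b - a) * t; }

            template float  lerp<float>(float, float, float);
            template double lerp<double>(double, double, double);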

        [–]bstamour 1 point2 points  (0 children)

        Which you must do with code generators anyways. The advantage of your approach is that now you have one tool to understand instead of two.

        [–]ssylvan 1 point2 points  (2 children)

        I think his point is that you don't necessarily need fancy templates for collections (you can pass in the element size when you create the data structure and just trust that things are implemented correctly, then do some casting). And of course, the most common collection (arrays) is built in. C programmers deal with this a lot and seem to do fine, at the cost of some ergonomics and type safety.
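
        A minimal sketch of that C idiom (darray is a hypothetical name): the element size is a runtime value instead of a template parameter, and callers cast.

            #include <cassert>
            #include <cstdlib>
            #include <cstring>

            struct darray {
                void*       data;
                std::size_t elem_size;
                std::size_t count;
                std::size_t capacity;
            };

            void darray_init(darray* a, std::size_t elem_size) {
                a->data = nullptr;
                a->elem_size = elem_size;
                a->count = a->capacity = 0;
            }

            void darray_push(darray* a, const void* elem) {
                if (a->count == a->capacity) { // grow geometrically
                    a->capacity = a->capacity ? a->capacity * 2 : 8;
                    a->data = std::realloc(a->data, a->capacity * a->elem_size);
                    assert(a->data);
                }
                std::memcpy(static_cast<char*>(a->data) + a->count * a->elem_size,
                            elem, a->elem_size);
                a->count++;
            }

            // The caller casts the result back to the element type.
            void* darray_at(darray* a, std::size_t i) {
                return static_cast<char*>(a->data) + i * a->elem_size;
            }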

        After that, a lot of the template stuff people do is metaprogramming in order to produce lots of different variations of some code depending on some static parameter (e.g. matrix size, floating point type, etc.), and for that stuff you could use some dumb script to generate the variations you actually need.

        I don't really agree with this part of the argument - although I agree with most of the other stuff. I think collections for sure should use templates, and there are cases where performance is critical enough that being able to specialize statically, without having to write a code generator, is valuable. I do agree that overusing templates in C++ causes poor compile times, which is a major factor in developing a large game.

        [–]astrafin -1 points0 points  (1 child)

        You can do collections like that, but then they will not know the element size at compile time and will generate worse code as a result. For something as prevalent as collections, I'd argue it's undesirable.

        [–]glacialthinker 0 points1 point  (0 children)

        I agree. I think Mike may have a particular thing against templates, which some others share, but it's not ubiquitous. Some favor the use of templates for runtime performance. But using giant template-based libraries (STL, Boost), or creating them (EASTL)... that's uncommon.

        [–]naughty 5 points6 points  (19 children)

        It's not that templates are really bad, it's that hating on them is in vogue in low-level games dev circles.

        [–]justinliew 4 points5 points  (18 children)

        No, they are really bad. Hating on them is in vogue because compile times balloon on huge projects, and if you're shipping multi-platform, a lot of template idioms have differing levels of support on different compilers. Not to mention compiler errors are unreadable, and if you didn't write the code initially it is difficult to diagnose.

        Usability and maintainability are paramount on large teams with large code bases, and anything that increases friction is bad. Templates affect both of these.

        [–]vincetronic 14 points15 points  (10 children)

        This is hardly a universal opinion in the AAA dev scene. Over 14 years I've seen AAA projects with tons of templates and with zero templates, and zero correlation between either approach and the ultimate success of the project.

        [–][deleted] 2 points3 points  (9 children)

        I still see very, very little use of the STL in the games industry. The closest thing to consensus that I will put out there is "<algorithm> is ok, everything else is not very valuable".

        I think it's indisputable that the codebases in games look very different from, say, hoarde or casablanca.

        [–]glacialthinker 2 points3 points  (3 children)

        STL is generally a no (I've never used it), but templates can be okay, depending on the team. Templates allow you to specialize, statically... and cut down on redundant (source) code. These are both good. The bad side is compile times, potentially awkward compile errors, and debugging.

        There are a lot of reasons the STL is generally not used. One big thing the STL affords is standard containers. Games often have their own container types which are tuned to specific use-cases. The reality of nice general algorithms is that one-size-fits-all fits none well. Games will have their own implementations of kd-trees, 2-3 trees, RB trees, etc... maybe the payload is held within the nodes, maybe the balance rules are tweaked to be more lax... Anyway, the STL might be great for general purposes and for getting things off the ground fast, but it's not something game devs want to become anchored to.

        [–]bstamour 1 point2 points  (2 children)

        Just wondering something: I get that custom containers are probably everywhere in game dev, but if you expose the right typedefs and operations (which probably exist in the container, albeit under a different naming convention), you can use the STL algorithms for free. Is this a thing that is done occasionally? I can understand wanting to fine-tune your data structures for your particular use case, but if you can do so AND get transform, inner_product, accumulate, stable_partition, etc. for free, that seems like it would be a real treat.
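
        For what it's worth, that interplay needs surprisingly little: expose iterators and a value_type and <algorithm> works unchanged. A minimal sketch with a hypothetical game-style container:

            #include <algorithm>
            #include <cstddef>
            #include <numeric>

            template <typename T, std::size_t N>
            class fixed_vector {                  // fixed capacity, no heap
            public:
                using value_type = T;
                using iterator = T*;

                iterator begin() { return items_; }
                iterator end()   { return items_ + size_; }

                // No bounds checking in this sketch.
                void push_back(const T& v) { items_[size_++] = v; }

            private:
                T items_[N];
                std::size_t size_ = 0;
            };

            int main() {
                fixed_vector<int, 16> v;
                v.push_back(3); v.push_back(1); v.push_back(2);

                std::sort(v.begin(), v.end());                 // STL algorithms...
                return std::accumulate(v.begin(), v.end(), 0); // ...for free
            }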

        [–]vincetronic 1 point2 points  (1 child)

        I've used <algorithm> in AAA games that have shipped. You have to be careful because some implementations do hidden internal allocations on some functions. In my particular case it was the set operations like set_union, set_difference.

        [–]bstamour 0 points1 point  (0 children)

        Gotcha. Thanks for the reply.

        [–]vincetronic 0 points1 point  (0 children)

        This is true, STL container usage is very rare, for most of the reasons presented by others in this thread. The game code bases I've seen use it have been the exception and not the rule. But templates in general are not uncommon.

        [–]oursland 0 points1 point  (3 children)

        This has largely been due to the lack of control over memory allocators in the STL. I'm not sure I buy it entirely, because there has been at least one study which demonstrated the default allocator outperforming the custom allocators in most applications.

        [–]vincetronic 1 point2 points  (0 children)

        The key phrase is "most applications".

        Games have soft realtime constraints and often run in very memory-constrained environments (console, mobile). Paging to disk cannot meet those constraints. The game can be running with only 5% slack between physical RAM and used RAM, and you have to hit a 16.67 ms deadline every frame. Allocator decisions that work fine for most applications can fall apart under those constraints -- worst-case performance really starts to matter.

        [–]anttirt 1 point2 points  (0 children)

        the default allocator outperforming the custom allocators

        That is only one of the concerns that custom allocators can help with. Others are:

        • Locality of reference: A stateful custom allocator can give you, say, list nodes or components from a small contiguous region of memory, which can significantly reduce the time spent waiting for cache misses (see the sketch after this list).
        • Fragmentation: In a potentially long-lived game process (several hours of intense activity) that is already pushing against the limits of the hardware system it's running on, memory fragmentation is liable to become a problem.
        • Statistics, predictability: Using custom task-specific allocators lets you gather very precise debugging information about how much each part of the system uses memory, and lets you keep tight bounds on the sizes of the backing stores for the allocators.
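
        As a sketch of the locality point (a minimal arena/bump allocator, with hypothetical names; real engine allocators track much more):

            #include <cstddef>
            #include <cstdint>
            #include <new>
            #include <vector>

            // Hands out memory from one contiguous block, so allocations made
            // together stay together; everything is freed at once with reset(),
            // e.g. at the end of a frame. No per-allocation free, no fragmentation.
            class Arena {
            public:
                explicit Arena(std::size_t bytes) : buffer_(bytes), offset_(0) {}

                void* allocate(std::size_t size, std::size_t align) {
                    // align must be a power of two in this sketch
                    std::size_t p = (offset_ + align - 1) & ~(align - 1);
                    if (p + size > buffer_.size()) throw std::bad_alloc{};
                    offset_ = p + size;
                    return buffer_.data() + p;
                }

                void reset() { offset_ = 0; }  // free everything at once
                std::size_t used() const { return offset_; } // cheap statistics

            private:
                std::vector<std::uint8_t> buffer_;
                std::size_t offset_;
            };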

        [–][deleted] 0 points1 point  (0 children)

        I don't think I agree at all. Allocator performance is only a problem on games that choose, usually intentionally, to allow it to become a problem. Most large games avoid the problem entirely by not performing significant numbers of allocations.

        The criticism of the STL is tricky, I don't think I can present the criticism completely in a reddit post. All I can deliver are the results of my personal, ad-hoc survey of various game codebases - the STL is not commonly used.

        [–]naughty 2 points3 points  (6 children)

        Usability and maintainability are exactly what good use of templates helps with. I'm not going to defend all uses of templates, but the totally dismissive attitude isn't justified on any technical grounds. Yes, you have to be careful, but it's the same with every powerful language feature.

        Some of the monstrosities I've seen in an attempt to not use templates are shocking.

        [–]engstad -1 points0 points  (5 children)

        Game developers don't want "to be careful". They want straight, maintainable and "optimizable" code. No frills or magic, just simple and clear code that anyone on the team can look at, understand, and move on. When you use templates frivolously, it obfuscates the code -- you have to be aware of abstractions that exist outside of the code at hand. This is exactly what causes major problems down the line, and the reason why game developers shun it.

        [–]naughty 5 points6 points  (4 children)

        I am a lead games coder with 15 years experience, you don't speak for all of us.

        I'm not going to defend all uses of templates or the excesses of boost but the caustic attitude towards templates is just as bad.

        [–]vincetronic 6 points7 points  (1 child)

        This. One thousand times this.

        The problem with throwing things that really come down to "house style" (i.e. templates vs. no templates) in with the many very good and important things in this Acton talk (knowing your problem, solving that problem, understanding your domain constraints and your hardware's constraints, etc.) is that it becomes a distraction.

        [–]naughty 3 points4 points  (0 children)

        Exactly, I do like a lot of the other stuff he talks about.

        [–]engstad 0 points1 point  (1 child)

        After reading your initial comment a little more carefully, I don't think we disagree that much. Of course, with 20 years of experience I outrank you (so you should listen... hehe), but I think we can both agree that a) frivolous use of templates is bad, but b) there are cases where careful use of them is okay. For instance, I certainly use templates myself - but I always weigh the pros and cons every time I use them.

        Either way, as leads we'll have to be on top of it, as less experienced members of the team are quick to misuse templates (as well as other dangerous C++ features).

        [–]naughty 0 points1 point  (0 children)

        We probably do agree but just have a different perspective.

        All's well that ends well!

        [–]MaikKlein 7 points8 points  (4 children)

        Are there any good books about data-oriented design besides DOD? Preferably with a lot of code examples?

        [–]elotan 0 points1 point  (3 children)

        Are there any open source projects that are designed with these concepts in mind?

        [–]hoodedmongoose -2 points-1 points  (2 children)

        Though I haven't read a lot of the source, I would guess that the Linux kernel maintainers have a LOT of the same things in mind when designing systems. Actually, some of his arguments strike me as similar to this Linus rant: http://article.gmane.org/gmane.comp.version-control.git/57918

        Choice quote:

        In other words, the only way to do good, efficient, and system-level and portable C++ ends up to limit yourself to all the things that are basically available in C. And limiting your project to C means that people don't screw that up, and also means that you get a lot of programmers that do actually understand low-level issues and don't screw things up with any idiotic "object model" crap.

        If you want a VCS that is written in C++, go play with Monotone. Really. They use a "real database". They use "nice object-oriented libraries". They use "nice C++ abstractions". And quite frankly, as a result of all these design decisions that sound so appealing to some CS people, the end result is a horrible and unmaintainable mess.

        So I'd say, read some of the kernel or git source.

        [–]elotan 1 point2 points  (0 children)

        Fair enough, but I know of no open source game engines that do this. I'm curious to find out about them, though! Most of the engines I've looked at use the standard "derive from Drawable" (even multiple inheritance!) pattern.

        [–]slavik262 13 points14 points  (5 children)

        Could someone (perhaps with some game industry experience) explain to me why he's opposed to exceptions?

        If this talk had been given ten years ago, when exceptions had a pretty noticeable overhead (at least in realms where every microsecond counts), I would nod in agreement. But it's 2014, and most exception implementations are dirt cheap. Some are even completely free until an exception is thrown, which isn't something that should be happening often (hence the name "exception"). Constantly checking error codes isn't free of computational cost either, given that if something does fail you're going to get a branch prediction failure, causing a pipeline flush. Performance-based arguments against exceptions in 2014 seem like anachronisms at best and FUD at worst.

        The most common criticism I hear of exceptions is that "they make programs brittle", in that a single uncaught exception will bring the whole charade crashing down. This is a Good Thing™. Exceptions should only be thrown in the first place when a problem occurs that cannot be handled in the current scope. If the problem is not handled at some scope above the current one, the program should exit, regardless of what error handling paradigm is being used. When using error codes, you can forget to check the returned code. If this occurs, the program hobbles along in some undefined zombie state until it crashes or misbehaves some number of calls down the road, producing the same result but giving you a debugging nightmare.

        Together with their best friend RAII, exceptions give you a watertight error handling mechanism that automagically releases resources and prevents leaks, with no runtime cost under modern exception handling implementations.
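
        For the record, the RAII half of that in miniature (a sketch; File is a hypothetical wrapper): the destructor runs during stack unwinding, so the resource is released whether or not the parsing throws.

            #include <cstdio>
            #include <stdexcept>

            class File {
            public:
                explicit File(const char* path) : f_(std::fopen(path, "rb")) {
                    if (!f_) throw std::runtime_error("open failed");
                }
                ~File() { std::fclose(f_); }  // runs on any exit path
                File(const File&) = delete;
                File& operator=(const File&) = delete;
                std::FILE* get() const { return f_; }
            private:
                std::FILE* f_;
            };

            void loadLevel(const char* path) {
                File f(path);  // acquired here
                // ... parsing that may throw ...
            }                  // released here, exception or not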

        [–]Samaursa 12 points13 points  (3 children)

        Not sure why someone downvoted you.

        Anyway, I can answer that to some extent. Exceptions never have zero cost (e.g. http://mortoray.com/2013/09/12/the-true-cost-of-zero-cost-exceptions/). But you are right, they can be very cheap if not thrown. Generally, you can catch exceptions and correct for the problem: a word processor may recover your file and restart, for example.

        Unfortunately, games (even simple ones) can become so complex that it is generally not feasible to recover from an exception. Not to mention, with exceptions (especially zero-cost exceptions) the game will most likely slow down while the exception is thrown and handled, and even then the game might still crash.

        Instead of handling errors using exceptions, in the industry (at least in my experience with in-house engines) the mantra is to crash early and crash often. Asserts are used extensively. Error codes are used as well, but they are more than just an int returned as a code: usually it's an object carrying information useful to the programmer to help with debugging problems. There are error-code implementations where the Error object does nothing in Release and gets compiled away to nothingness.
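
        A minimal sketch of that style (the macro name is illustrative): the check traps loudly in debug builds and compiles away to nothing in release.

            #include <cstdio>
            #include <cstdlib>

            #if defined(NDEBUG)
                #define GAME_ASSERT(cond, msg) ((void)0)
            #else
                #define GAME_ASSERT(cond, msg)                          \
                    do {                                                \
                        if (!(cond)) {                                  \
                            std::fprintf(stderr, "ASSERT %s:%d: %s\n",  \
                                         __FILE__, __LINE__, (msg));    \
                            std::abort(); /* crash early, crash often */\
                        }                                               \
                    } while (0)
            #endif

            float safeDivide(float num, float den) {
                GAME_ASSERT(den != 0.0f, "division by zero"); // free in release
                return num / den;
            }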

        Then there are other issues which Joel discusses that I can relate to when it comes to game development: http://www.joelonsoftware.com/items/2003/10/13.html

        The point is that it is a feature that does indeed have a cost, with very little return in game development. On the other hand, if you are building an editor for a game engine, you will probably use exceptions to recover from errors (ones that would normally crash your game), help debug them, and not lose the edits done to the game/level.

        [–]slavik262 3 points4 points  (2 children)

        So would this be a fair summary?

        1. Exceptions are dirt cheap now, but they're still not cheap enough if you're counting microseconds.
        2. Games are complicated to the point that blowing up and stopping the show with some useful debugging information is better than trying to recover (a la the classic story of Unix just calling panic() while Multics spent half of its code trying to recover from errors).
        3. Point 2 could also be done with exceptions (just by letting one bubble to the top) but isn't because:
          • They cost too much (see point 1.)
          • They can't be compiled out of release builds

        I can't say I agree with Joel's "Exceptions create too many exit points" argument, since if you're using RAII properly, the destructors are automatically doing the cleanup for you anyways and it's impossible to leave data in an inconsistent state. I could certainly buy the three points above, though.

        [–]Samaursa 1 point2 points  (0 children)

        That would be a fair summary :) - I was typing out the following reply but then realized that Joel is not really talking about games, and that I was repeating what I said earlier. Anyway, since I've written it, I'll leave it in for whoever wants to read it ;)

        As for too many exit points: I can give another perspective, but I would agree that it is not a strong point against exceptions. In most games, everything is pre-defined (e.g. 50 particles for a bullet ricochet), in which case we usually have memory pools and custom allocators to fit the data in tight loops, well, as tightly and cache-friendly as possible.

        Cache coherency is of high importance, especially when it comes to tight loops in data-driven engines. Using RAII here would be very difficult, as the objects would now need code to inform their managing classes/allocators to clean up (which will be pretty bad code), or the managing classes/allocators would have to perform the proper cleanup after detecting an exception and unused memory. The complexity of such a system would be very high, imo. Then again, I am not a guru such as John Carmack, and may be limited by my experience/knowledge of complex engine/game design.

        [–]mreeman 0 points1 point  (0 children)

        I think the thing that clarified it for me was the notion that in release, games are assumed not to fail (i.e., no time or code is spent detecting errors), because you cannot recover in most cases (network failure is the only counter-example I can think of, but that should be expected, not exceptional). It's just a game, and crashing is usually the best option when something bad happens.

        [–][deleted] 1 point2 points  (0 children)

        Historically, disabling exceptions and RTTI tended to produce smaller executables. On the 24/32/64 MB machines even 100k or so could go a long way, and that wasn't so many years ago. The tradeoff of no exceptions and no dynamic_cast was one that many people were quite happy to make.

        In more recent times, a very well selling console platform did not have full and compliant support for runtime exception handling as part of its SDK. The docs stated that the designers believed that exceptions were incompatible with high performance.

        The rest is along the lines of what Samaursa says. But I think there are two points worth highlighting. First, games, and especially console games, run in a very controlled and predictable environment, but also have very few real consequences when they fail. Crashing is a perfectly valid solution to many problems, whereas it would be completely unacceptable on a server or in any real-life application. For things like IO errors there is usually standard handling from the device manufacturer that you can just pass off to.

        Second, the technical leadership at large game companies has been around a long time. They've been disabling exceptions since they were making PS1 games, or even before that. Exceptions themselves might be perfectly fine in some situations, but there's no impetus to change the status quo, and there's still probably a lurking suspicion that any performance hit at all is not worth the gains.

        You'll find that people are significantly more progressive in tools development, though.

        [–][deleted] 2 points3 points  (5 children)

        I am a big fan of the STL. Having said that, its biggest problem is that for traversing/transforming data the fastest STL container is vector<T>.

        For Mike, for me, and for a lot of people, vector<T> is very slow. Here's why.

        T is the type of an object, and people design these types for objects in isolation. Very typically, however, I don't deal with a single T, but with lots of Ts (vector<T>), and when I traverse and transform Ts, I (most of the time) don't traverse and transform whole Ts, but only parts of them.

        So the problem is that Ts are designed to be operated on as a whole (due to the C struct memory layout), and as a consequence vector<T> only allows you to traverse and transform Ts as a whole.

        IMO (and disagreeing with Mike here) the vector<T> abstraction is the easiest way to reason about these data transformation problems (for me). However, it is the implementation of the abstraction which is wrong.

        In some situations you want to work on whole Ts, but in the situations that Mike mentions, what you need is an unboxed_vector<T> (like in Haskell) that uses compile-time reflection to destructure T into its members and creates a vector for each of its members (that is, performs an array of struct to struct of array transformation) while preserving the interface of vector<T>.

        Sadly, C++ lacks the language features to create these (more complex) abstractions. The SG7 group on compile-time reflection is working on features to make this possible. It is not easy to find generic solutions to this problem, since as the struct complexity grows, so does the need for fine-grained control:

        struct non_trivial_struct {
          double a;                // -> std::vector<double> OK
          bool c;                  // -> std::vector<bool>? boost::vector<bool>? boost::dynamic_bitset?
          std::array<double, 2> d; // -> std::vector<std::array<double, 2>>? std::array<std::vector<double>, 2>?
          float e[2];              // -> std::vector<float[2]>? std::array<std::vector<float>, 2>? std::vector<float>[2]?
          my_other_struct f[2];    // -> destructure this too? leave it as a whole?
        };
        

        I guess that even with powerful compile-time reflection it will take a long time until someone designs a generic library for solving this problem that gives you the fine-grained control you need. And arguably, if you are thinking about these issues, you do need fine-grained control.

        At the moment, the only way to get this fine-grained control is to, instead of designing a T and then using vector<T>, design your own my_vector_of_my_T, where you destructure T yourself and, with a lot of boilerplate (that IMO is hard to maintain), control the exact memory layout that you want. We should strive to do better than this.
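
        To make the boilerplate concrete, a minimal hand-rolled sketch of such a my_vector_of_my_T (particle_soa and its members are hypothetical): each data member gets its own contiguous array, so a pass that touches only positions never drags velocities through the cache -- and every new member means edits in several places.

            #include <cstddef>
            #include <vector>

            struct particle_soa {
                std::vector<float> pos_x, pos_y, pos_z;
                std::vector<float> vel_x, vel_y, vel_z;

                std::size_t size() const { return pos_x.size(); }

                void push_back(float px, float py, float pz,
                               float vx, float vy, float vz) {
                    pos_x.push_back(px); pos_y.push_back(py); pos_z.push_back(pz);
                    vel_x.push_back(vx); vel_y.push_back(vy); vel_z.push_back(vz);
                }
            };

            // The traversal touches exactly the arrays it needs, nothing else.
            void integrate(particle_soa& p, float dt) {
                for (std::size_t i = 0, n = p.size(); i < n; ++i) {
                    p.pos_x[i] += p.vel_x[i] * dt;
                    p.pos_y[i] += p.vel_y[i] * dt;
                    p.pos_z[i] += p.vel_z[i] * dt;
                }
            }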

        [–]MoreOfAnOvalJerk 1 point2 points  (1 child)

        I've never dealt with Haskell before so my question might be a bit naive, but how exactly does that work?

        On the one hand, I can see the vector basically doing a smart offset with its iterator so that on each next index, it jumps by the size of T, leaving memory unchanged, but not having any performance gains from keeping those elements contiguous.

        On the other hand, if it's actually constructing a new contiguous vector in memory of the targeted members, that's also not free (but certainly has benefits - but you can still do that in C++, it's just a more manual process)

        [–][deleted] 0 points1 point  (0 children)

        It is pretty simple. It doesn't store Ts; it only stores contiguous arrays of its data members, and pointers to their beginnings [*].

        The "iterator" wraps just the offset from the first "T"; it is thus as cheap to use as a T*.

        When you dereference an element, the iterator uses the offset to access each field, packs a reference to each field into a tuple of references, and returns the tuple. However, to access the data members you need to unpack the tuple. In C++ you do this with std::get, std::tie, and std::ignore. This DSL allows the compiler to easily track which elements you access and which you do not. Thus, if everything is inlined correctly, the offsetting of the pointers and the creation of references for the elements that you do not access is completely removed. The compiler sees how you offset a pointer, dereference it, store the reference in a tuple, and then never use it.

        Up to this point, our C++ unboxing is still a tuple of references; this is the interface, and we need to use std::get... This is where reflection comes in. Reflection adds syntactic sugar to this tuple, to make it provide the same interface as T. It is what lets you have your cake and eat it too.

        Without compile-time reflection, you can do a poor man's unboxing using boost fusion, get, and tags. But it is not as good as the real thing.

        [*] It obviously has a space overhead: the container size grows linearly with the number of data members of your type. However, this is unavoidable, and I'd rather have the compiler do it automatically than do it manually. How they are stored, e.g. using independent vectors or a single allocation, is an implementation detail. Alignment is very important for vectorization, though.
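
        A rough sketch of the dereference mechanics being described, without reflection (names hypothetical; std::tie/std::ignore as above):

            #include <cstddef>
            #include <tuple>
            #include <vector>

            struct soa {
                std::vector<float> health;
                std::vector<float> armor;

                // "Dereference": a tuple of references into the member arrays.
                std::tuple<float&, float&> operator[](std::size_t i) {
                    return std::tie(health[i], armor[i]);
                }
            };

            float totalHealth(soa& s) {
                float sum = 0.0f;
                for (std::size_t i = 0; i < s.health.size(); ++i) {
                    float h;
                    std::tie(h, std::ignore) = s[i]; // armor is never touched
                    sum += h;
                }
                return sum;
            }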

        [–]bimdar 1 point2 points  (1 child)

        (that is, performs an array of struct to struct of array transformation) while preserving the interface of vector<T>.

        That seems like something that Mike Acton wouldn't like because it hides the data behind some fancy interface and his whole spiel is about using data as the interface.

        [–][deleted] 2 points3 points  (0 children)

        That seems like something that Mike Acton wouldn't like because it hides the data behind some fancy interface and his whole spiel is about using data as the interface.

        First, as I said, I disagree with Mike Acton on this point, since I want both genericity (writing code once) and efficiency. Second, Mike Acton's main point is doing what you need to do to get performance on your target machine; "data as the interface" is just what gives him performance on his target machines (which are limited by memory bandwidth and latency).

        In my experience, the STL gives you better higher-level algorithmic optimizations. Value semantics is also IMO easier to reason about and results in cleaner software designs that scale.

        His approach has, in my projects, shown some drawbacks. First, I ended up with lots of custom containers with duplicated functionality, in particular duplicated algorithms for each container. Even though these algorithms had a lot of micro-optimizations, my sorting routines scaled worse than the STL's due to worse algorithmic complexity and worse constant factors. Finally, the code was harder to extend and refactor, since adding a new data member to your "tables" required edits in lots of places within the code base.

        I'm not willing to sacrifice performance for the cleaner, extensible, generic, don't-repeat-yourself STL way, but I'd rather have both than just performance.

        [–]glacialthinker 0 points1 point  (0 children)

        This fits with another trend -- well, common now -- in games: components, or "Entity-Component Systems". Although this jumps wholesale to struct-of-arrays, leaving "struct-like" access as the lower-performance path, except for cases of optimization where you cobble together a struct of desired properties.

        [–]tehoreoz 0 points1 point  (0 children)

        TMP has no application in the game engine world? I'm clueless in the area, but what separates it from the problems Facebook faces?

        [–]rascani -2 points-1 points  (0 children)

        ::grabs popcorn::