My take on Data-Oriented Design
Jul
5
2013

Data-oriented design is a paradigm shift from the classic object-oriented way of doing things. Basically, instead of modelling game objects using object and class hierarchies, we shift our focus to the data that describes these objects, isolate each type of data, and treat all instances of this data type (also often referred to as properties) in one swoop. We're supposed to think of ways to transform data, rather than objects and classes. "Where there's one, there are many" is the mantra of data-oriented design, the question is, what does it mean to you? Chances are you probably already knew all this before coming here, so I'll focus on what it means to me, my attempts to translate it into the real world, and how I use it in Choco.

I'm sure people who are a lot more versed on the subject matter than I am will argue that what I describe has nothing to do with data-oriented design, and that this set up will cause more cache-misses than an object oriented design. I'll be happy to stand corrected and learn something new. For now though, this is how I see it, and to be honest, it feels like a good setup and it is working pretty well.

Introduction to properties

My first contact with data-oriented design, it was described to me as something I surely must've already seen, which I had of course, and I suppose most people in games and graphics programming have come across one way or another; particle effects, or particle systems. Each particle isn't modelled individually, but rather, the whole set is described collectively, and we never update just a single particle. Let's say we define a structure for each particle:

struct particle {
  float origin[3];
  float velocity[3];
} particles[MAX_PARTICLES];
int particle_count;
This structure feels natural. However, we can adjust it a little, to accommodate our two prevailing access patterns:
struct {
  float origin[MAX_PARTICLES][3];
  float velocity[MAX_PARTICLES][3];
  int count;
} particles;
We know that we either only need the origins of each particle, like when we're rendering, or we need origin and velocity in order to update, this structure accommodates both those patterns. We don't need to split individual x-y-z components, as they will always be accessed together. Like so:
for (int i = 0; i < particles.count; ++i) {
  particles.origin[i] += particles.velocity[i];
  particles.velocity[i] += gravity;
}
And equivalent for rendering, or perhaps even with a single draw call, if we can manage. And before you mention it, the code is of course more of the pseudo type as you'd naturally have to update all components, either individually or be a bit funky and play around with SIMD. Also, no Δt in there, as we are of course using fixed timesteps!

How you structure your data comes down to the number of different ways you want to access it. Of course, it might be more convenient to lump together origin and velocity in this case for other reasons, such as keeping other structures small by having to reference only one of our particles, instead of one origin and one velocity. It's a trade off, and when in doubt, give it a KISS.

Our first property

So back to data-oriented design. Or property-centric architecture. I'm sure there are other names hinting at the same idea. Take another look at the code above, what physical property does it actually describe? If we cut the gravity out of the equation[1], we're looking at linear momentum. Add impulse (being the change in momentum) and mass to the calculation, and you should be able to knock your objects around the room all day long.

I say linear momentum, cause right next to it in my code, there's angular momentum, which is essentially the same, only we use quaternions for orientation instead of an origin, torque instead of force, and moment of inertia instead of mass. Now we have our objects spinning and moving about, we can apply the impulse from our collision response with the appropriate arm from the center of mass, and we have the beginning of a nice little physics engine.

Add one more property, a redraw property, or mesh property, or something of a similar name and we start seeing the objects on screen as well.

Taking the bus

Up until this point, everything sounds pretty good, but there's one key piece missing from actually making this work. If all we need to do to make an object behave a certain way is to add a property of a certain type, how do we get these properties to work together? How do we connect the redraw property to the linear and angular momentum properties?

Unless we want every property type to know of every other property type (which would go against pretty much every software engineering principle there is), we need to establish some data exchange format. I went for simple arrays of floats, letting, while manually matching array lengths, so not entirely automatic, but I don't really need it to be, arrays of 3 and 6 floats (or 4 and 8 for quaternions) are common enough to not cause any bigger issues. This is not the place to worry about byte ordering, sure, we can have properties which tie our objects to server state, but we'll deal with that before putting it on the wire, not locally between properties.

So, we have a data exchange format, now it needs to be exchanged. As I see it, we have two viable options, the first one, the obvious one, is to simply use a pointer directly to the data we want to use. That is, when we create our redraw property, we simply give it a pointer to our float[3] for origin, our float[4] for orientation[2], and other frame relevant data.

I didn't like this set-up simply because I wanted to be able to reorder my property instances, either to pack active instances at the front or be able to sort them for whatever reason. This might be a case of premature optimization of course, and I might very well cut myself once or twice on Occam's razor, but going with my gut feeling has served me well so far, on average at least. Keeping track of everyone referencing a property instance to update their reference while reordering is not practical, so we need an indirection, a lookup table. We can use several look-up tables, one per property, or a big global lookup table serving them all. If we use a pointer to a pointer, either way will work. If we're using the pointer to a pointer set-up we'll need a separate entry for each piece of a property, like one for origin (as input/output), and a separate one for the impulse (input) in the linear momentum property, and all need to be fixed up will reordering.

I went for the handle approach (a handle essentially being a glorified array index), which means I'll need a global lookup table, much like a common data bus, or otherwise the properties would need to know which lookup table to use. I use a 24-bits to identify the property, and an 8 bit offset on top pointing to the field to use. The layout of the 24 bits is not really important for this discussion, but for the sake of completeness, I use 8 bit table identifier (to allow dynamic allocation of table space, and to not be forced to have all tables in one big chunk of contiguous memory), 6 bit object generation (to spot dangling pointers) and 10 bit index into the table. I still need to update the lookup table whenever I reorder, but only once per property instance, and if used as an opaque type, the handle is a 32 bit pointer (even saving a bit of memory on 64 bit systems) directly to the data, encapsulating the byte-shift in the lookup makes the code nice and clean too. Having 18 bits to determine the property gives us 260 thousand properties so we can still use them rather generously without running the risk of hitting the ceiling. I also don't think I'll ever need 8 bits for the offset, or I've obviously failed at keeping the properties small. With typical cache line sizes of 64 bytes or smaller, 5 bits would be sufficient if we keep our properties small enough to fit more than once in a cache line.

Conclusion

I embarked on this journey to figure out the secrets of data-oriented design. To be honest, I don't think I'm even near my goal, and I will keep looking. What I did find, however, was a design which appeals to how my brain is wired. Data-oriented design is also all about minimizing cache-misses, and that probably went out the window as soon as I introduced an indirection for moving data around, not to mention the fact that data is wired arbitrarily, as we shall see right at the end of this post. If you think I'm stumbling down the wrong path, feel free to give me a nudge in the right direction.

Don't get me wrong, I think I've understood the basic idea of data-oriented design, I just have issues trying to figure out how to really live it, and how it actually translates to the real world. I'll finish off this post with some small code snippets, what all this currently looks like in my code.

First off, the angular momentum property

struct angular_momentum {
  q4 orientation[2];  // q4 is union of float[4] and a struct of w, x, y, and z
  f3 impulse;         // impulse vector (float[3])
};
struct angular_momentum_private {
  float invmomint;    // inverse moment of inertia
};

static struct angular_momentum instances[MAX_PROPERTIES];
static struct angular_momentum_private pinstances[MAX_PROPERTIES];
static int ninstances = 0;

void angular_momentum_update() {
  for (int i = 0; i < ninstances; ++i) {
    q4 an = instances[i].orientation[0]; // current orientation
    q4 an1 = quinv(instances[i].orientation[1]); // inverse previous orientation
    float *impulse = instances[i].impulse;
    float invmomint = pinstances[i].invmomint;

    // an*an1 = difference between previous and current,
    // an*an1*an = continue with constant angular velocity

    q4 n = qmul(qmul(an, an1), an);

    // ... apply torque using the norm of the impulse vector and moment of
    // inertia to determine angle, and normalized vector as rotation axis.
    // Caution, quaternions are susceptible to aliasing effects if angular
    // velocity gets into the range of a revolution per frame.

    // reset impulse to zero for next frame
    set3(instances[i].impulse, 0.f, 0.f, 0.f);

    instances[i].orientation[1] = an;
    instances[i].orientation[0] = qnormalize(n);
  }
}

Second, how to create a grenade game object, using liner and angular momentum, along with collision and a redraw property. This code is still hypothetical though, I haven't gotten to this point yet.

  property linear = linear_momentum_create(mass, initial_location);
  property angular = angular_momentum_create(moment, initial_orientation);
  property collision = collision_response_create(
    PROP_FIELD(linear, linear_momentum, origin),
    PROP_FIELD(linear, linear_momentum, impulse),
    PROP_FIELD(angular, angular_momentum, impulse),
    collision_geometry);
  property redraw = redraw_create(
    PROP_FIELD(linear, linear_momentum, origin),
    PROP_FIELD(angular, angular_momentum, orientation),
    mesh, material);

The PROP_FIELD macro adds the field offset to the property handle to hide the structure of the property. And unless your collision geometry is always axis aligned (or symmetric), you'd need to pass in the orientation as well, or let the collision geometry be a property on its own. All properties are stored in a game object, which has only two purposes really, allowing us to clean up all properties related to an object when it is destroyed, and allowing us to inspect it in a debugger, everything else is handled by the properties themselves.

And last but not least, the redraw property. The redraw property orders objects into an appropriate structure for fast frustum culling, and comes with two functions being called regularly, one for refreshing objects that have moved about after every physics step, and one per frame, interpolating origins, orientations, animations, etc for redraw. I will not go into detail about spatial partitioning, I just wanted to show the property access

static struct redraw_property {
  property origin;
  property orientation;
} instances[MAX_REDRAW_ENTITIES];

static struct redraw_property_private {
  refresh_entity re;
} pinstances[MAX_REDRAW_ENTITIES];

void refresh_frame(float lerp) {
  refresh_entity queue[REFRESH_QUEUE_SIZE];
  int entities = 0;

  for(each object in potentially visible set) {
    refresh_entity *re = &queue[entities];
    *re = pinstances[object].re;

    float *o = property_get (instances[object].origin);       // f3[2]
    float *q = property_get (instances[object].orientation);  // q4[2]

    lerp3(lerp, re->origin, o);
    lerp4(lerp, re->orientation, q);

    entities++;
  }

  refresh_draw (queue, entities);
}

As you can see, this is where it starts to get hairy regarding cache-misses. Sure, the redraw property instances may be laid out in an efficient manner, but those origins and orientations are likely to be spread out all over the place. And even if they were to be mostly from the same source, they're unlikely to be sorted appropriately for this access pattern. I still like this set-up, I just don't think it's nearly as data-oriented design as I was hoping it would be. This is about as far as I've gotten, so this is where I'll stop. Thanks for reading all the way down here!

[1] I keep it in as a separate field, as it is unlikely to change between frames so I wont need to reset it every frame, but that's beside the point.
[2] I use 6 and 8, i.e. 2×3 and 2×4 and interpolate fractional steps between physics steps.
comments powered by Disqus

Categories

Archives