The game loop is the heart of every game engine. In this posting I explain how our loop is structured and how we manage to integrate a fixed time step with variable frame times. This is my third posting about the development of our in-house game engine Ginkgo. Ginkgo is a component based engine written in C++. I covered the inspirations for Ginkgo in my first posting and the component system in the second posting.

The textbook approach to game loops is to model them as simple loops that first process the input, update all game objects and then render all objects:

while (!quit)
{
	ReadJoypad();
	UpdateScene();
	DrawScene();
	FlipBuffers();
}

In an OOP system, the update function is usually found in the game object. After each object drew itself – or was drawn – the graphics buffer is switched. In a more sophisticated engine this buffer switching might be synced to the vertical refresh of the graphics card.

Jason Gregory of Naughty Dog described the pitfalls of this design in his excellent talk at Game Forum Germany 2011:

“It’s also terribly inefficient
- potential for duplicated computations
- possible reallocation of resources
- poor data and instruction cache coherence
- not conducive to parallel computation

In Ginkgo, every component registers for n update pools. N is usually 1, sometimes 0 and might be >1 in the future. The pools are triggered consecutively. Thus, we can order components, aspects of the same object. There are no game objects, but each assembled component group that forms a runtime game object receives several update calls as each component might be updated with a different pool. Rendering is done by one of the game object’s component among many. All rendering components are currently executed in the same update stage at the end of the loop, though this is not strictly necessary in the long run. Naughty Dog’s engine uses a dynamic bucket-based approach, which is something we might consider in the future, e.g. as soon as our number of update stages goes over a certain threshold.

In order to address the above issues we took the following actions:
- Less duplicated computations by building pools of similar objects and manager objects where it’s appropriate (e.g. the particle system, physics engine, etc.).
- We forbid reallocation during update. Let’s see how that ends ;-)
- We increased cache coherence by putting our components into pools and updating them pool-wise.
- We have not thought about parallel computation yet, but we have a very clear image where sync points will/might be.

But there’s an additional issue with game loops. There are a number of algorithms out there that profit significantly from predictable timing. The Runge-Kutta methods – which are widely used in physics calculations – are an example. Now you can’t change anything about the timing of the screen refresh. But in Ginkgo, the render timing is based on the vertical sync of the graphics card whereas the physics engine we’re using, Box2D, uses integration methods that demand fixed timesteps. Gladly, Glenn Fiedler wrote an excellent article describing how to merge these two time sources into a single game loop, “Fix your timestep!“.

Why not dive straight into the subject matter at hand and look at our game loop:

void Ginkgo::update()
{
	f32 timeScale = _paused ? 0.0f : _timeScale;

	f64 newTime = System::getTime();
	f64 dt = newTime - _systemTime;
	_systemTime = newTime;
	_deltaTime = (f32)dt;

	// For timescaling we need to consider 2 values:
	// 1) _deltaTime: We stretch the time that passed since the last frame with _timeScale
	_deltaTime *= timeScale;

	// 2) _currentTime:  We need to fake this value for all the components and systems that use currentTime() to update their state (TaskManager etc.) in order to reflect that time passes slower/faster than usual.
	// We can do this by moving the _startTime value backward/forward corresponding to the offset of the real vs. the scaled time per frame.
	// Thus currentTime advances slower/faster according to timeScale. This is the most problematic point, as numerical errors will add up here.
	f64 timeOffset = dt * (1.0 - (f64)timeScale);
	_startTime += timeOffset;

	// currenttime is the time since start
	_currentTime = (f32)(_systemTime - _startTime);
	
	// deltatime is the time since the last loop
	_accumulator += _deltaTime;

	u32 numTicks = (u32)glm::floor(_accumulator / TICK_DURATION);

	_accumulator -= numTicks * TICK_DURATION;
	numTicks = glm::min(numTicks, MAX_TICKS);

	for (u32 i = 0; i < numTicks; ++i)
	{
		// We use a fixed-step update function and accumulate between frames
		++_tickCount;
		tickStages();
	}

	// Rest of the accumulator, in percent of dt
	_interpolationFactor = _accumulator / TICK_DURATION;

	updateStages();
}

This game loop first determines a timeScale for this loop. If the game is pause, the timeScale is set to 0.0. The current time is requested from the class “System” that provides a uniform interface to platform-specific functions. “dt” – delta time – stores the seconds since the last iteration. It is scaled by the time scale. A time scale of 2 thus means that the system behaves as if double the time has passed sine the last iteration, doubling the game speed. In order to correctly scale the execution of e.g. scheduled tasks that demand absolute time, the start time of the engine needs to be adapted. The engine thus behaves as if time would always have been that slow or fast.

In order to maintain a constant tick rate, an accumulator is used. This variable gets the last time slice added. Then the number of ticks in the accumulator is calculated. The update() function of all pools registered for ticks is called via tickStages(). This function first updates all PreTick components, then Tick and at last PostTick. These are the three update stages in the tick section. Glenn Fiedler describes the accumulator as follows: “Any remainder in the accumulator is effectively a measure of just how much more time is required before another whole physics step can be taken. For example, a remainder of dt/2 means that we are currently halfway between the current physics step and the next. A remainder of dt*0.1 means that the update is 1/10th of the way between the current and the next state” [1]. This remainder has to be used for interpolating the actual position of e.g. a sprite when the frame needs to be drawn as the drawing occurs between ticks. It is stored in _interpolationFactor. At last, updateStages() is called. This function updates the pools that registered for the PreDraw, Draw and PostDraw update stages as well as all singletons that need frame-wise updates, e.g. the external sound engine.

While our approach to the game loop is not as elegant as Naughty Dog’s, it still solves a number of issues more naive approaches to game loops have. And it integrates nicely with the physics engine. The usual disclaimer: Ginkgo is a work in progress, and every line of code might change in the future. Also, I’m happy for all feedback!

I think I’ll take a leave from Ginkgo for a while and make my next posting about something entirely unrelated. Let’s see how that works out.