Week 1 — Engine Skeleton, Windowing, and the Deterministic Main Loop
Overview
This week sets the structural foundation for everything that follows: a deterministic simulation loop, a windowing layer, and the engine scaffolding that every subsequent system (renderer, physics, sensors, planner) will lean on. By the end you should have a C++20 application that opens a window, reads input, ticks a fixed-timestep simulation, and renders at a variable rate — even if the renderer is currently a clear-color stub. The goal is not visual output yet; it is correctness, modularity, and a main loop that other engineers (and your future self in Week 10) can trust.
A self-driving simulator has three properties that make engine architecture delicate: physics must be deterministic and reproducible so scenario regression tests are meaningful; sensors will be queried at fixed cadences that may differ from the render rate (a 30 Hz camera while rendering at 120 Hz); and the loop will eventually run headless from scenario scripts, with the same code path producing identical results. Decoupling simulation time from wall-clock render time is therefore the first design decision, not later polish. The coordinate and transform math this rests on is assumed from Course 5 — here we fix conventions and build the loop.
Readings
- FCG: graphics pipeline overview. Extract: the conceptual stages (model → world → view → clip → NDC → screen) so the loop knows what the renderer will eventually do.
- MGPV: Vulkan instance/device/swapchain overview (preview only — don’t implement yet). Extract: the taxonomy of objects Week 2 will instantiate.
- 3DMP (review): coordinate-system and transform conventions. Extract: row- vs column-vector convention and why \(M_{world\to camera}=M_{camera\to world}^{-1}\).
- (Linear maps, homogeneous transforms, and quaternions: assumed from Course 5.)
Key Concepts
Coordinate conventions, fixed now
Adopt a right-handed world frame and the ISO 8855 vehicle body frame (\(+x\) forward, \(+y\) left, \(+z\) up); never deviate. Represent rigid transforms as \(4\times4\) homogeneous matrices \(T=\begin{bmatrix}R&t\\0&1\end{bmatrix}\) acting on column vectors (the convention GLM follows), and read composition right-to-left: in \(T_a T_b\,p\), \(T_b\) is applied first. Mismatched frame conventions are the single largest source of integration bugs in robotics codebases; one documented convention prevents most of them.
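A minimal sketch of these conventions, assuming hand-rolled `Vec3` and `Rigid` types invented here purely for illustration (the real skeleton would more likely use GLM). It shows right-to-left composition under the column-vector convention and the rigid inverse built from \(R^\top\) and \(-R^\top t\). The names `apply`, `compose`, `inverse`, and `yawPose` are hypothetical.

```cpp
#include <cmath>

// Hypothetical minimal types; not part of any library.
struct Vec3 { double x, y, z; };

struct Rigid {
    double R[3][3]; // rotation block, row-major storage
    Vec3 t;         // translation
};

// Rotation only: R p.
inline Vec3 rotate(const Rigid& T, const Vec3& p) {
    return { T.R[0][0] * p.x + T.R[0][1] * p.y + T.R[0][2] * p.z,
             T.R[1][0] * p.x + T.R[1][1] * p.y + T.R[1][2] * p.z,
             T.R[2][0] * p.x + T.R[2][1] * p.y + T.R[2][2] * p.z };
}

// Full transform of a point: p' = R p + t (column-vector convention).
inline Vec3 apply(const Rigid& T, const Vec3& p) {
    const Vec3 r = rotate(T, p);
    return { r.x + T.t.x, r.y + T.t.y, r.z + T.t.z };
}

// compose(A, B) = A * B: B is applied first, then A (read right-to-left).
inline Rigid compose(const Rigid& A, const Rigid& B) {
    Rigid C{}; // zero-initialized
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                C.R[i][j] += A.R[i][k] * B.R[k][j];
    C.t = apply(A, B.t); // R_A t_B + t_A
    return C;
}

// Rigid inverse: rotation R^T, translation -R^T t. No general 4x4 inverse needed.
inline Rigid inverse(const Rigid& T) {
    Rigid inv{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            inv.R[i][j] = T.R[j][i];
    const Vec3 r = rotate(inv, T.t);
    inv.t = { -r.x, -r.y, -r.z };
    return inv;
}

// Pose with yaw about +z (counter-clockwise seen from above in a right-handed frame).
inline Rigid yawPose(double yawRad, Vec3 position) {
    const double c = std::cos(yawRad), s = std::sin(yawRad);
    return Rigid{ { {c, -s, 0.0}, {s, c, 0.0}, {0.0, 0.0, 1.0} }, position };
}
```

With a vehicle at world position (10, 5, 0) and a yaw of 90 degrees, a point 2 m ahead in the body frame maps to (10, 7, 0) in the world, and a point 1 m to the left maps to (9, 5, 0). Swapping the arguments of `compose` gives a different transform, which is the class of bug a written-down convention is meant to prevent.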
Fixed vs variable timestep
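A naive loop ticks the simulation with dt = now - last. Because dt then follows wall-clock frame time, no two runs integrate with the same sequence of step sizes, so results are not reproducible, and a single long frame can push the integrator past its stability limit. The canonical accumulator pattern fixes both problems: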
    // Fragment: assumed to live inside main(), with <algorithm> and <chrono> included.
    using clock = std::chrono::steady_clock;  // monotonic clock
    constexpr double kSimDt = 1.0 / 120.0;    // 120 Hz simulation
    double accumulator = 0.0;
    auto last = clock::now();
    while (running) {
        auto now = clock::now();
        double frame = std::chrono::duration<double>(now - last).count();
        last = now;
        accumulator += std::min(frame, 0.25);  // clamp to avoid the spiral of death
        while (accumulator >= kSimDt) {        // consume whole ticks only
            sim.step(kSimDt);
            accumulator -= kSimDt;
        }
        double alpha = accumulator / kSimDt;   // [0,1) render interpolation
        renderer.render(sim.state(), alpha);
    }
Two runs fed the same per-tick input sequence then produce identical simulation states regardless of frame rate, because every call to sim.step receives the same dt.
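The determinism claim can be checked offline by replaying recorded frame durations instead of reading a clock. The sketch below is a variation on the loop above, not the same code: it keeps the accumulator in integer nanoseconds (an assumption made here so that the tick count depends only on total elapsed time, not on floating-point rounding), and it uses a hypothetical `Sim` that is just a unit-mass spring.

```cpp
#include <algorithm>
#include <chrono>
#include <cstdint>
#include <vector>

using Nanos = std::chrono::nanoseconds;

constexpr Nanos  kSimDtNs{8'333'333};   // ~1/120 s as an integer tick length (assumption)
constexpr double kSimDt = 1.0 / 120.0;  // dt handed to the integrator
constexpr Nanos  kMaxFrame{250'000'000}; // same 0.25 s clamp as the loop above

// Hypothetical stand-in for the real simulation: a unit-mass spring.
struct Sim {
    double x = 1.0, v = 0.0;
    std::uint64_t ticks = 0;
    void step(double dt) {
        v += -4.0 * x * dt; // semi-implicit Euler: update velocity first
        x += v * dt;        // then position, using the new velocity
        ++ticks;
    }
};

// Replay a list of frame durations through the accumulator and return the final state.
inline Sim run(const std::vector<Nanos>& frames) {
    Sim sim;
    Nanos accumulator{0};
    for (Nanos frame : frames) {
        accumulator += std::min(frame, kMaxFrame);
        while (accumulator >= kSimDtNs) {
            sim.step(kSimDt);
            accumulator -= kSimDtNs;
        }
    }
    return sim;
}
```

Replaying one second as 100 frames of 10 ms and again as 10 frames of 100 ms should run the same number of ticks and leave `x` and `v` bitwise equal, since both runs execute the same sequence of `step(kSimDt)` calls.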
Why separate sim and render
Rendering is non-deterministic in wall-clock terms (GPU scheduling, thermal state); letting it drive physics couples the world to that noise. Sensors need fixed cadences (a 30 Hz camera fires every 4 ticks at 120 Hz). Headless scenario runs (Week 9) must advance simulation as fast as possible with no rendering. The architecture must support all three without scattered conditionals.
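One way to keep sensor cadence independent of rendering is to schedule sensors by tick index rather than by elapsed time. The following is a sketch under that assumption; `TickSchedule` and `countCaptures` are hypothetical names, and the loop body stands in for a headless run with no renderer attached.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kSimHz = 120; // matches the 120 Hz simulation rate

// Hypothetical helper: a sensor that fires on a fixed tick period.
struct TickSchedule {
    std::uint32_t periodTicks;

    // The sensor rate must divide the simulation rate so the cadence is exact.
    static TickSchedule fromHz(std::uint32_t rateHz) {
        assert(rateHz > 0 && kSimHz % rateHz == 0);
        return TickSchedule{kSimHz / rateHz};
    }

    // True on ticks 0, period, 2*period, ...
    bool due(std::uint64_t tick) const { return tick % periodTicks == 0; }
};

// Headless run: advance `ticks` simulation ticks and count sensor captures.
inline std::uint32_t countCaptures(const TickSchedule& sensor, std::uint64_t ticks) {
    std::uint32_t captures = 0;
    for (std::uint64_t tick = 0; tick < ticks; ++tick) {
        // sim.step(kSimDt) would run here; no rendering is involved.
        if (sensor.due(tick)) ++captures;
    }
    return captures;
}
```

A 30 Hz camera gets a period of 4 ticks, so one simulated second (120 ticks) yields 30 captures whether the loop runs in real time or as fast as the CPU allows.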
Theory Exercises
- Given \(T_{cam\to world}\), derive the view matrix and show the rigid-inverse shortcut \(T^{-1}=\begin{bmatrix}R^\top&-R^\top t\\0&1\end{bmatrix}\).
- Explain model, world, view, and projection spaces and what each transform is responsible for.
- Prove that composing transforms corresponds to changing coordinate frames.
- Contrast fixed and variable timestep; explain why a semi-implicit Euler integrator destabilizes on a dt spike.
- Explain why simulation and render updates must be separated, with a sensor-cadence example.
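A small numerical experiment can accompany the integrator exercise. It is a sketch, not a proof: a unit-mass spring \(\ddot{x}=-kx\) stepped with semi-implicit Euler, where the spring constant and step sizes are illustrative choices. For this system the scheme stays bounded only while \(\sqrt{k}\,dt<2\); the function name `amplitudeAfter` is hypothetical.

```cpp
#include <cmath>

// Unit-mass spring x'' = -k x, stepped with semi-implicit Euler.
// Returns |x| after `steps` steps, starting from x = 1, v = 0.
inline double amplitudeAfter(double k, double dt, int steps) {
    double x = 1.0, v = 0.0;
    for (int i = 0; i < steps; ++i) {
        v += -k * x * dt; // velocity from the current position
        x += v * dt;      // position from the updated velocity
    }
    return std::fabs(x);
}
```

With \(k=400\) (natural frequency 20 rad/s), a step of 1/120 s gives \(\sqrt{k}\,dt\approx0.17\) and the amplitude stays near 1, while a single-frame spike to 0.25 s gives \(\sqrt{k}\,dt=5\) and the state grows by more than an order of magnitude per step.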
Implementation
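Build the engine skeleton: a Window RAII wrapper over GLFW, a Logger, a latched input snapshot, and the accumulator loop above with std::chrono::steady_clock. Wrap math in GLM with GLM_FORCE_DEPTH_ZERO_TO_ONE and GLM_FORCE_RADIANS defined globally (Vulkan clip-space Z is \([0,1]\)). Add GoogleTest with transform/quaternion/accumulator unit tests, including a composition-order test (for example, rotate-then-translate against translate-then-rotate, each with a known expected result) to catch left/right-multiply errors. Associativity is worth testing too, but it holds under either multiplication order and will not catch that class of bug on its own.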
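The latched input snapshot could look roughly like the sketch below. It is written against the standard library only, so the key-code range, the `InputLatch` class, and its `onKey` and `latch` methods are assumptions for illustration; in the real skeleton `onKey` would be driven from the GLFW key callback.

```cpp
#include <bitset>
#include <cstddef>

constexpr std::size_t kKeyCount = 512; // assumed upper bound on key codes

// Immutable view of input for one simulation tick.
struct InputSnapshot {
    std::bitset<kKeyCount> down;     // held at latch time
    std::bitset<kKeyCount> pressed;  // went down since the previous latch
    std::bitset<kKeyCount> released; // went up since the previous latch
};

class InputLatch {
public:
    // Called by the windowing layer whenever a key changes state.
    void onKey(std::size_t key, bool isDown) {
        if (key < kKeyCount) live_.set(key, isDown);
    }

    // Called once per simulation tick: freeze the live state and derive edges.
    InputSnapshot latch() {
        InputSnapshot snapshot;
        snapshot.down     = live_;
        snapshot.pressed  = live_ & ~previous_;
        snapshot.released = ~live_ & previous_;
        previous_ = live_;
        return snapshot;
    }

private:
    std::bitset<kKeyCount> live_;     // updated asynchronously by callbacks
    std::bitset<kKeyCount> previous_; // state at the previous latch
};
```

Because the simulation reads only the snapshot, every tick sees a consistent input state, and a recorded sequence of snapshots can be replayed for regression tests. One limitation of this sketch: a key pressed and released between two latches is not observed at all.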
Benchmark
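Measure four numbers every frame in an ImGui overlay (stub now): total frame time, sim update time, render time, input latency. Use a ScopeTimer RAII helper; report mean + p99 over a sliding 1-second window. Disable V-sync during benchmarking: glfwSwapInterval(0) applies when the stub runs on an OpenGL context, whereas under Vulkan (Week 2) V-sync is governed by the swapchain present mode instead.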
Expected baselines: with a clear-color stub, total frame ~1 ms, sim update ~0.05 ms (empty world), input latency under ~2 ms. Determinism check: identical input sequences yield bitwise-identical sim state across runs.
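A possible shape for the timing helper and the mean + p99 summary, offered as a sketch. The `ScopeTimer` interface (writing milliseconds into a caller-owned `double`), the `Stats` struct, and `summarize` are assumptions; the percentile uses the nearest-rank definition, and the caller is assumed to pass in only the samples from the last second.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <vector>

// RAII timer: writes elapsed milliseconds into `sinkMs` when the scope exits.
class ScopeTimer {
public:
    explicit ScopeTimer(double& sinkMs)
        : sink_(sinkMs), start_(std::chrono::steady_clock::now()) {}
    ~ScopeTimer() {
        sink_ = std::chrono::duration<double, std::milli>(
                    std::chrono::steady_clock::now() - start_).count();
    }
    ScopeTimer(const ScopeTimer&) = delete;
    ScopeTimer& operator=(const ScopeTimer&) = delete;

private:
    double& sink_;
    std::chrono::steady_clock::time_point start_;
};

struct Stats { double mean; double p99; };

// Mean and nearest-rank 99th percentile of the samples in the current window.
// Takes the vector by value because it sorts its own copy.
inline Stats summarize(std::vector<double> samples) {
    assert(!samples.empty());
    std::sort(samples.begin(), samples.end());
    const double sum = std::accumulate(samples.begin(), samples.end(), 0.0);
    const std::size_t n = samples.size();
    const std::size_t rank = (99 * n + 99) / 100; // ceil(0.99 * n), 1-based
    return { sum / static_cast<double>(n), samples[rank - 1] };
}
```

Typical use is one timer per measured region, for example `{ ScopeTimer t(simMs); sim.step(kSimDt); }`, with `simMs` appended to the window afterwards. Reporting p99 next to the mean matters because occasional long frames barely move the mean but are exactly what the spiral-of-death clamp exists for.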
Connections
This loop is the spine. Week 2 plugs Vulkan into the render slot; Week 3 feeds real meshes through the transform pipeline; Week 6’s vehicle physics is reproducible only because the timestep is fixed; Week 9’s scenario regression tests are deterministic only because the loop is. Course 5’s linear algebra is the math under the transform conventions established here.
Further Reading
- Glenn Fiedler, “Fix Your Timestep!” — the canonical accumulator essay.
- Jason Gregory, Game Engine Architecture — the game loop and time management.
- vkguide.dev introduction (Week 2 preview).