Notes on Replication – Source Engine

This article should not be taken as fact; rather, it aims to explain the higher-level design of multiplayer networking in the Source engine. It may not be accurate for modern Source games. It is aimed at programmers who have surface-level multiplayer programming experience.

The Source engine demonstrates a number of important techniques that are used frequently by other games. It is a typical client/server model where each client periodically receives the state of the world from the server. The server is always the authority over the state of the world; the clients influence that state only by sending instructions across the network.

The multiplayer design of the Source engine was inspired by QuakeWorld, and it has a lot to dissect. This article will be the foundation for future articles, going over concepts such as prediction, compensation, and compression. My sources are listed at the bottom; details here may differ from them, but the concepts stay the same.

Tick Rate/Update Rate

Let’s start off with the basics of how the game runs on both the client and server. Counter-Strike: Source runs its servers at a fixed update rate of 66 ticks per second (the tickrate). The game logic is tied to this number, and varying it causes ‘server timing issues’. I am not aware whether this affects player movement or other gameplay aspects (grenade explosion delay, match round timer).
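To make the fixed tickrate concrete, here is a minimal sketch of a fixed-timestep server loop; this is my own illustration, not Valve’s code.

```cpp
#include <chrono>
#include <thread>

// Minimal fixed-timestep server loop (illustrative, not Valve's code).
// At 66 ticks per second each tick represents ~15.15ms of game time,
// regardless of how long the real simulation step takes.
int main() {
    using Clock = std::chrono::steady_clock;
    const auto kTickInterval = std::chrono::microseconds(1'000'000 / 66);
    auto nextTick = Clock::now();
    for (;;) {
        // RunGameTick(); // advance the simulation by exactly one tick
        nextTick += kTickInterval;
        std::this_thread::sleep_until(nextTick); // wait out the rest of the tick
    }
}
```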

The client runs the game at whatever frame rate the player’s computer and settings afford. Frame rate and tick rate most likely match one to one, and it sounds like the movement code can run at arbitrary tick rates. This may not be true; interpolation could be used to make the game feel more responsive than it is, interpolating towards the next movement state while keeping camera motion immediate.

Interestingly, the server has a separate update rate it agrees upon with the client. A client knows best what their internet connection is capable of, and can limit the updates sent from the server to the client. The server has final authority over how many updates it will send per second to each player. It cannot be higher than the tick rate, since faster updates would not contain any new information, but it can be lower. Higher tick rate servers are only part of the problem; network speed and bandwidth are still the two major concerns.
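A sketch of that negotiation could be as simple as a clamp; the function and parameter names here are my own, not Source’s.

```cpp
#include <algorithm>

// Illustrative update-rate negotiation (names are mine, not Source's).
// The client requests a rate it believes its connection can handle; the
// server clamps it to its own limits, and never above the tick rate,
// since updates faster than the simulation carry no new information.
int NegotiateUpdateRate(int clientRequestedRate, int serverMinRate,
                        int serverMaxRate, int tickRate) {
    int rate = std::clamp(clientRequestedRate, serverMinRate, serverMaxRate);
    return std::min(rate, tickRate);
}
```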

The decoupling between tick rate and update rate really shines when it comes to optimizing game servers for responsiveness or cost, and it also allows serving players with poor connections or on mobile devices.

Command Packets and Client-side Prediction

This is part of a very popular pattern used in networked games. Source games are no different in that they want to give players a satisfying playing experience. When you press a movement key, you move, because it is safe to assume that the server will move you in the exact same way. This feels a lot better to play than the alternative, where the server has to move the character and then replicate to the client that the character is moving. That motion is choppy and has to be interpolated to feel smooth, but smoothing adds more latency on top of the round-trip time of the command packet and the processing done on both machines.
So what happens instead is that the client forms a command packet, and then runs the exact same code as the server does with that command packet.

More about the command packet later; first we are going to talk about timing. The server sends the current world state to the client as a very steady stream. It may choose to wait before sending the client an update (rate throttling), but even at the highest send rate, that data takes time to arrive at the client. Data may also arrive out of order, or never arrive at all. The same happens with the stream of data sent by the client to the server. A player may choose to fire their sniper rifle at this exact moment in time by pressing the mouse button, and their screen may flash less than 20ms later, but the command to fire may be received 150ms later on the server. More on hiding latency and compensating for this lag later; for now, just know that there is a discrepancy between when I start moving on my screen, when my character starts moving on the server, and when another player sees my character moving.
The client and server can agree on a shared timer that is agnostic of latency. Both sides actively measure latency, and the server can share packets saying ‘the current time is X, and our latency is Y’. The client then adjusts its current time to X+Y, since the packet was likely sent Y ago.
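A minimal sketch of such a shared clock, assuming a one-way latency estimate already exists; the class and method names are mine.

```cpp
#include <chrono>

// Sketch of a shared clock (names are mine). Each server packet says
// "the current time is X"; it took roughly Y (the measured one-way
// latency) to arrive, so the server clock now reads about X + Y.
class SharedClock {
public:
    void OnServerPacket(double serverTimeX, double oneWayLatencyY) {
        offset_ = (serverTimeX + oneWayLatencyY) - LocalSeconds();
    }
    // Estimated current server time, usable between packets as well.
    double Now() const { return LocalSeconds() + offset_; }
private:
    static double LocalSeconds() {
        using namespace std::chrono;
        return duration<double>(steady_clock::now().time_since_epoch()).count();
    }
    double offset_ = 0.0;
};
```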

A command packet contains a few bytes of data. Different games send different data; in the case of Counter-Strike it contains movement data (forward move, side move, up move), camera rotation (view angles), and which buttons are pressed. It also contains the time at which the command was generated, and for how long the command has been in effect. This is all the data needed to drive the state of the character on both the client and the server. Given two in-sync characters fed an identical command packet, they should end up in the exact same state, unless the character interacts with something dynamic that is not in sync in the world, such as other players that are moving.
The Source engine has a special folder in its source tree for code that is intended for both the server and the client. This folder includes character locomotion and guns.
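As a sketch, a command packet along the lines described above might look like this; the struct and field names are simplified stand-ins, not Source’s exact CUserCmd layout.

```cpp
#include <cstdint>

// Simplified command packet following the description above; field names
// are stand-ins, not Source's exact CUserCmd layout.
struct UserCommand {
    uint32_t commandNumber; // monotonically increasing, used for acks/ordering
    uint32_t tickCount;     // client tick at which the command was generated
    float    viewAngles[3]; // camera rotation (pitch, yaw, roll)
    float    forwardMove;   // desired movement along the view direction
    float    sideMove;      // desired strafe movement
    float    upMove;        // desired vertical movement (ladders, swimming)
    uint32_t buttons;       // bitfield of held buttons (attack, jump, ...)
    float    frameTime;     // how long this command has been in effect
};
```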

It is entirely possible for a client to mispredict. Mispredictions happen when there is any kind of lag, poor network performance, poorly written code, or interaction with a recent unexpected change to the world state. When this happens, the client needs to correct itself as soon as possible, to show the player where it now thinks the server will position the character.
This is implemented by storing a queue of command packets. Whenever a world state update is received from the server, it includes data for the character the player is controlling. The client treats this data as gospel, and then replays any queued command packets issued after the received data. The received data has a timestamp, so the client knows how far to go back.
The first half of the replayed queue has probably already been executed by the server as well, but the second half are command packets still on their way to the server machine. The client is back on track and should have made a proper prediction.
Any small discrepancy can be smoothed out so it’s not jarring for the player; larger discrepancies are typically resolved with a ‘teleport’, though.
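A sketch of that store-and-replay loop, reusing the UserCommand struct from the earlier sketch; all other names are mine, and the movement code is a stand-in.

```cpp
#include <cstdint>
#include <deque>

// Sketch of client-side prediction with replay (names are mine). The
// client keeps every command still in flight; when an authoritative
// state arrives it snaps to it and replays the commands the server has
// not processed yet.
struct PlayerState { float pos[3]; float vel[3]; };

struct PredictedPlayer {
    PlayerState state{};
    std::deque<UserCommand> pendingCommands; // oldest first

    // Stand-in for the shared movement code that runs identically on
    // both client and server.
    void RunCommand(const UserCommand& cmd) {
        for (int i = 0; i < 3; ++i) state.pos[i] += state.vel[i] * cmd.frameTime;
    }

    void OnServerState(const PlayerState& authoritative,
                       uint32_t lastCommandSeenByServer) {
        // Drop commands the server has already applied.
        while (!pendingCommands.empty() &&
               pendingCommands.front().commandNumber <= lastCommandSeenByServer)
            pendingCommands.pop_front();

        // Treat the server's state as gospel, then replay what's in flight.
        state = authoritative;
        for (const UserCommand& cmd : pendingCommands)
            RunCommand(cmd);
    }
};
```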

Interpolation

The incoming world state is intended to be received as a very consistent stream of information, but the nature of networking is ugly: latency is noisy, and packets can be dropped at random. Showing the player the most recently received state of the world is very jerky; characters moving in a predictable straight line will jitter at varying speeds and sometimes teleport backwards. Without interpolation you may want to consider extrapolation: given a position and a velocity, you can move the character to the location and then animate it in the direction of the velocity. If the character moves in a straight line and packets are received without jitter, the character will line up with the position in the next snapshot. There are too many assumptions for this to hold, and it breaks down when the velocity changes.
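Naive extrapolation under those assumptions could look like the following, reusing the PlayerState struct from the prediction sketch.

```cpp
// Naive extrapolation: project the last known position along the last
// known velocity. This only lines up with the next snapshot if the
// character really kept moving in a straight line at constant speed.
PlayerState Extrapolate(const PlayerState& snapshot, float secondsSinceSnapshot) {
    PlayerState out = snapshot;
    for (int i = 0; i < 3; ++i)
        out.pos[i] = snapshot.pos[i] + snapshot.vel[i] * secondsSinceSnapshot;
    return out;
}
```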

Interpolation is there to cover up the ugliness and gives some breathing room to receive world state out of order. Good interpolation also allows occasional state packets to be dropped while still giving a good visual representation. Source by default displays all state 100ms after its shared timestamp. This puts an upper limit on how much latency is acceptable for a smooth experience; having less latency than that would not give you an advantage.

Let’s walk through an example. The server has just computed snapshot #1000 and sends it to the client. S#1000 contains the current server time and the location of the enemy. Around 20ms later the snapshot is received by the client. The latency is (roughly) known, so the client will display the enemy at the location given in S#1000 80ms later, summing to the 100ms interpolation buffer. This will be when the shared timer is at server time + 100ms.
The server update rate is 20Hz, so S#999 was 50ms earlier; we assume it wasn’t dropped. At S#1000’s time + 50ms we’ll be showing S#999’s location, so we interpolate from S#999 to S#1000 between S#1000’s server time + 50ms and S#1000’s server time + 100ms. After that we need to interpolate towards the next snapshot, or extrapolate if we have none.
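A sketch of that interpolation buffer, assuming the 100ms delay and a shared clock as described above; all names are mine.

```cpp
#include <cstddef>
#include <deque>

// Sketch of an interpolation buffer (names are mine). Entities render at
// sharedTime - 100ms, interpolating between the two snapshots that
// bracket that render time.
struct EnemySnapshot { double serverTime; float pos[3]; };

static float Lerp(float a, float b, float t) { return a + (b - a) * t; }

bool SamplePosition(const std::deque<EnemySnapshot>& buffer, // oldest first
                    double sharedTimeNow, float outPos[3]) {
    const double renderTime = sharedTimeNow - 0.100; // 100ms interpolation delay
    for (std::size_t i = 1; i < buffer.size(); ++i) {
        const EnemySnapshot& a = buffer[i - 1];
        const EnemySnapshot& b = buffer[i];
        if (a.serverTime <= renderTime && renderTime <= b.serverTime) {
            const float t = (float)((renderTime - a.serverTime) /
                                    (b.serverTime - a.serverTime));
            for (int k = 0; k < 3; ++k) outPos[k] = Lerp(a.pos[k], b.pos[k], t);
            return true;
        }
    }
    return false; // no bracketing pair: extrapolate or hold the last position
}
```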

Because it takes 50ms before we have to start interpolating towards the snapshot that carries a 100ms delay, at a latency of 20ms we are able to miss a single snapshot, and all that would do is make our interpolation lower quality. When interpolating movement you can imagine a piecewise Bezier curve and removing one of its vertices.

More on why interpolation is a good thing follows in the section on lag compensation.

Delta Compression

Let me start off by saying I have no idea how delta compression is actually done in the Source engine; all I know is that it is done, that packets are ack’ed, and that this information is used. So take this particular section with an extra grain of salt.

During gameplay the clients acknowledge (ACK) the packets they have received from the server. The server puts this knowledge to good use. The first thing it does is determine the round-trip latency by measuring when the ack’ed packet was sent and when the ACK was received. Second, it now knows it can apply delta compression.

First we dive into quantization. Instead of sending a character’s location as three floats for X, Y and Z (12 bytes in total, or 96 bits), coordinates tend to be quantized. You typically determine the possible range of X, Y and Z, and the accuracy you want. Source uses a unit of distance that doesn’t make much sense to explain; modern engines use 1.0f for a meter, or for a single centimeter. When quantizing you want to determine how accurate you need to be, which depends on your use case. For the sake of this example 1.0f represents 1cm, and we want our enemies’ positions to be accurate to 0.1cm. DE_Dust2’s longest side is 79m, but let’s assume our quantization needs plenty of space for level designers and modders to make amazing content, so we specify a playspace of 500 meters in all dimensions. That requires 500,000 possible positions in our quantized space, so let’s round that up to 2^19, which offers 524,288 values. Instead of 32 bits per dimension we now use 19.
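In code, the napkin math above might look like this; a sketch assuming 1.0f = 1cm, a 500m playspace starting at the origin, and 0.1cm precision.

```cpp
#include <cmath>
#include <cstdint>

// Quantization sketch for the numbers above: 500m of playspace at 0.1cm
// precision needs 500,000 steps, rounded up to 2^19 = 524,288, so each
// coordinate fits in 19 bits instead of 32.
constexpr float    kStepCm   = 0.1f;               // 0.1cm precision
constexpr uint32_t kBits     = 19;
constexpr uint32_t kMaxIndex = (1u << kBits) - 1;  // 524,287

uint32_t QuantizeCoord(float centimeters) {
    float index = std::round(centimeters / kStepCm);
    if (index < 0.0f) index = 0.0f;                // clamp into the playspace
    if (index > (float)kMaxIndex) index = (float)kMaxIndex;
    return (uint32_t)index;
}

float DequantizeCoord(uint32_t index) {
    return index * kStepCm; // back to centimeters, within ±0.05cm
}
```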

It is very likely that if we know the exact location of a character, 100ms later the character is still very close to that original location. We can leverage this by referring to the old location (a compressed index into a table of known older snapshots stored by the client); by subtracting the old location from the new location we get a delta location, a location relative to the one in the snapshot. This delta most likely has a small magnitude, nowhere near our original 500 meters, so we do not need 19 bits to represent it. We can say that delta snapshots are limited to 500ms, and that our maximum movement speed is 1 meter per second (very high, but consider mods). This requires us to encode 1000 possible values for the delta of X and Y; our Z probably needs a different range. We can round the 1000 up to 1024 (2^10), using 10 bits each, so only 20 bits for X and Y together, and we probably need 12 more for Z. This is napkin math, so the numbers are rough.
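A sketch of that delta step, building on the quantized indices from the previous sketch; the 10-bit figure comes straight from the napkin math above.

```cpp
#include <cstdint>

// Delta encoding sketch: with a 500ms window and a 1 m/s speed cap the
// delta spans roughly +/-500 steps of 0.1cm, so ~1000 values fit in
// 10 bits (1024 values). The bias shifts the signed delta into an
// unsigned range for transmission.
constexpr int kDeltaBits = 10;
constexpr int kDeltaBias = 1 << (kDeltaBits - 1); // 512

uint32_t EncodeDelta(uint32_t oldIndex, uint32_t newIndex) {
    int32_t delta = (int32_t)newIndex - (int32_t)oldIndex; // in 0.1cm steps
    return (uint32_t)(delta + kDeltaBias);                 // 0..1023 on the wire
}

uint32_t DecodeDelta(uint32_t oldIndex, uint32_t encoded) {
    return (uint32_t)((int32_t)oldIndex + ((int32_t)encoded - kDeltaBias));
}
```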

So without delta compression each packet would carry 96 bits of location data. With delta compression we have a bit more overhead (acknowledging packets, though acks tend to serve multiple purposes; signalling whether we are delta compressed or not, and if so which ack’ed packet we are in reference to), but each subsequent packet carries only roughly 32 bits of location data. The overhead of declaring the delta snapshot is also offset by every other property we now replicate this way.

Serverside Lag Compensation

Let’s recap the system and point out one weakness. We have an outdated state of the world (20ms) onto which we add a delay for the sake of interpolation (80ms). When we line up our sniper rifle on the enemy’s head and fire, we see our crosshair and the head in the same space. There is no accuracy penalty due to movement, so we expect our bullet to hit the target exactly. So we send the command packet to the server (20ms), which receives and processes it. There is no desync, and the location of our character on the server is the same as on the client, but the location of the enemy is very outdated: the head has moved a significant distance.

This makes for a poor player experience, and this is where lag compensation comes in. The server has a trick up its sleeve. It knows the exact timing information of your command packet, and it also knows where the client thought the enemy was at the time of that command packet, by looking at acknowledged packets and rewinding the characters to where they were situated according to the client’s view. At time T + 20ms the server knows what the client was doing at time T, and that the client was making the shot based on the world state of T – 100ms. By recording the hitboxes of all players and rewinding them by 120ms, the server knows fairly accurately where the enemy was located and can trace those hitboxes for collision.
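A sketch of that rewind; all names are mine, and a real implementation would store hitbox transforms per tick in a ring buffer and restore them after the trace.

```cpp
#include <cmath>
#include <vector>

// Lag compensation sketch (names are mine). When a shot command arrives,
// the server rewinds the other players' hitboxes to the time the shooter
// actually saw: the command's timestamp minus the interpolation delay.
struct HitboxRecord {
    double serverTime;
    // ... hitbox transforms for one player at this time
};

struct CompensatedPlayer {
    std::vector<HitboxRecord> history; // a ring buffer in practice

    // Recorded hitboxes closest to the rewind target, if any.
    const HitboxRecord* StateAt(double targetTime) const {
        const HitboxRecord* best = nullptr;
        for (const HitboxRecord& r : history)
            if (!best || std::abs(r.serverTime - targetTime) <
                         std::abs(best->serverTime - targetTime))
                best = &r;
        return best;
    }
};

// commandTime: shared-clock time stamped into the shooter's command packet.
// interpDelay: the shooter's interpolation delay (100ms by default).
double RewindTargetTime(double commandTime, double interpDelay) {
    return commandTime - interpDelay; // trace hitboxes as the shooter saw them
}
```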

This is not necessarily unfair; all players have the same advantage of serverside lag compensation. There are implications, though. It makes killing feel very responsive and fair, but being killed is a different story. 120ms is quite a lot of time: enough for someone to crouch down behind a box or run around a corner. It is actually 140ms for the person who got shot, for it takes time for their crouch command packet to reach the server.
The best mitigation is to not delay the damage information sent to the shot player, and hope they understand. In high-stakes games such as Counter-Strike, this is the main argument for higher-tickrate servers with a higher update rate: to reduce the unfair disadvantage of compensation. Yet the biggest contributor is network latency, which requires better infrastructure.

The second method of mitigating the unfair disadvantage is to limit the advantage of the player with the slow connection, by hard-clamping how far back the server is willing to compensate and forcing shorter interpolation times. Serious players know that connection quality matters and may be willing to invest in a more expensive connection; the rookies will have to live with their disadvantage.

Overwatch takes this to the extreme, which will be a future blog post.

Links

https://developer.valvesoftware.com/wiki/Source_Multiplayer_Networking
https://developer.valvesoftware.com/wiki/Latency_Compensating_Methods_in_Client/Server_In-game_Protocol_Design_and_Optimization

Edit 1: Updated based on feedback