Back-end optimizations: Part 1
Updated: Dec 8, 2018
There's always a trade-off when you decide to write your own stuff vs. using something that is already done. Like I said in previous posts, when we started working on our back-end code, Multiplayer As A Service companies were not where they are now in terms of features, and thus we were forced to make our own custom thing. We already have basic functionality in place, and unless it proves to be a non-viable solution we will stick to it. That said, when I started writing multiplayer code the last thing I cared about was optimization. Sure, you don’t start by doing obviously shitty things… but I did adhere to the ethos that “premature optimization is the root of all evil”.
Let’s put some context here. The first thing to decide was whether to use TCP or UDP. I read a couple of blog posts, articles and reddit threads, and watched a couple of videos. Given the nature of our games we don’t need the super responsiveness of an FPS, a genre that, to my knowledge, is mostly built on top of UDP. On one hand we have advice from the likes of Glenn Fiedler, who tells you to favor UDP over TCP (unless your game is turn-based). You can see a discussion about this here:
On the other hand, lots of RTS games were done with just TCP even in the old days! (Warcraft I, II and Starcraft come to mind). It was a hard call! All bets were off and I made a gut-feeling decision to go the TCP route. Fast-forward... some days ago I watched a very interesting video featuring Pat Wyatt:
He’s a veteran in network programming; probably one of the best in the world. He was the brain behind the network engineering of legendary games such as Guild Wars, Warcraft and Diablo. Guild Wars is a Massively Multiplayer Online Role-Playing Game, and when asked what protocol they used he said they only used TCP. The discussion about TCP vs UDP for Guild Wars starts at minute 47. So that was pretty much it. I feel confident now that if TCP was good enough for them, it should be good enough for us.
So… when I started coding the networking stuff I didn’t bother about the number of packets being sent, or about client-side prediction, interpolation, tweening, etc. Since I was going to run this locally until the gameplay was validated, there was no point worrying about that. The game was sending 30 packets per second (which for TCP is ridiculously high). For a game match that lasts 180 seconds we're talking about 5400 packets per user, per game session. I was sending 30 packets per second to sync with the 30 frames per second the game runs at (1 TCP packet per frame). On a local network with near-zero latency this works fine most of the time. But take this out of my local network and things start to look very ugly.
The first thing to do was reduce the number of packets. For that I decoupled packet processing from the per-frame update loop: packets are now processed based on their ticks and not on the current frame of the rendering loop.
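The post doesn't show the actual code, so here is only a minimal sketch of what processing packets by tick rather than by frame could look like. The names `Packet`, `TickQueue` and `pop_due` are made up for illustration, and the idea of buffering packets in a heap is an assumption, not necessarily how our back-end does it.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Packet:
    # Hypothetical packet: ordered by tick only, payload is ignored when comparing.
    tick: int
    payload: str = field(compare=False)


class TickQueue:
    """Buffers incoming packets and releases them in tick order."""

    def __init__(self):
        self._heap = []

    def push(self, packet):
        # Packets may arrive at any time, independent of the render loop.
        heapq.heappush(self._heap, packet)

    def pop_due(self, current_tick):
        """Return every buffered packet with tick <= current_tick, oldest first."""
        due = []
        while self._heap and self._heap[0].tick <= current_tick:
            due.append(heapq.heappop(self._heap))
        return due
```

The render loop can then call `pop_due(current_tick)` whenever it likes: a slow or fast frame rate only changes how often the queue is drained, not which tick a packet belongs to.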
Second, packets are now sent only when there is a state change in the game. This dramatically reduced the number of packets sent, down to a few hundred for the whole game session! I also managed to synchronize the server ticks with the client ticks with an error margin of about 0-3 ticks (each tick is around 16 milliseconds). By just making the client work in the past by a couple of ticks, I can be confident that the client will never be in a “future” tick.
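To make those two ideas concrete, here is a rough sketch under some assumptions: `ChangeOnlySender`, `maybe_send` and `client_tick` are hypothetical names, the state is assumed to be already serialized to bytes, and the default delay of 3 ticks is simply the worst case of the 0-3 tick error margin mentioned above.

```python
class ChangeOnlySender:
    """Forwards a state update only when it differs from the last one sent."""

    def __init__(self, send):
        # `send` is any callable taking (tick, state_bytes), e.g. a socket wrapper.
        self._send = send
        self._last_state = None

    def maybe_send(self, tick, state_bytes):
        # Skip the packet entirely if nothing changed since the last send.
        if state_bytes == self._last_state:
            return False
        self._last_state = state_bytes
        self._send(tick, state_bytes)
        return True


def client_tick(estimated_server_tick, delay_ticks=3):
    """Run the client slightly in the past so it never gets ahead of the server.

    delay_ticks=3 is an assumption based on the 0-3 tick error margin.
    """
    return max(0, estimated_server_tick - delay_ticks)
```

With ticks of around 16 milliseconds, a 3-tick delay means the client shows the game roughly 48 milliseconds behind the server's estimate.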
Third, I had to reduce the packet size. I had to do a painful refactoring for that because a lot of things were just done in the spirit of moving forward and not with efficiency in mind. By serializing the game state’s fingerprint (which was a string representation of the game map, traps and units) and other things... I managed to go down from more than 1.5 KB to just 220 bytes per packet (on average). For Ethernet, the maximum transmission unit (MTU) is 1500 bytes. That means packets bigger than that get chopped into pieces. I have seen other MTUs as low as 576 bytes. I then assumed that a packet can end up on either kind of link depending on the route it takes. If that is the case, by staying below 576 bytes our packets should get through both without being split (dunno if that will improve performance or not).
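As an illustration of why a binary layout ends up so much smaller than a string, here is a small sketch using Python's `struct` module. The field layout (tick, unit id, x, y, hp) is invented for the example and is not our real packet format; the 40 bytes of overhead assume plain IPv4 and TCP headers without options. Also keep in mind that TCP is a byte stream, so this is only a rough size check and not a guarantee about how the data is split on the wire.

```python
import struct

MIN_MTU = 576  # smallest MTU mentioned in the post

# Hypothetical layout, network byte order, no padding.
HEADER = struct.Struct("!IH")  # tick (uint32), unit count (uint16): 6 bytes
UNIT = struct.Struct("!HBBB")  # unit id (uint16), x, y, hp (uint8 each): 5 bytes


def pack_state(tick, units):
    """Pack a tick and a list of (unit_id, x, y, hp) tuples into bytes."""
    parts = [HEADER.pack(tick, len(units))]
    for unit_id, x, y, hp in units:
        parts.append(UNIT.pack(unit_id, x, y, hp))
    return b"".join(parts)


def unpack_state(data):
    """Inverse of pack_state: returns (tick, list of unit tuples)."""
    tick, count = HEADER.unpack_from(data, 0)
    units = [
        UNIT.unpack_from(data, HEADER.size + i * UNIT.size) for i in range(count)
    ]
    return tick, units


def fits_in_one_packet(payload, mtu=MIN_MTU, overhead=40):
    # overhead=40 assumes 20 bytes of IPv4 header + 20 bytes of TCP header.
    return len(payload) + overhead <= mtu
```

With this made-up layout, 40 units take 6 + 40 × 5 = 206 bytes, which is in the same ballpark as the 220 bytes mentioned above and comfortably under 576 even with headers added.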
And that’s where we are now! For part 2 of the network optimization I am going to focus on client-side prediction and other techniques to smooth out the latency as much as possible. I’ll keep updating. Have a nice weekend!