On most WoW Classic servers, the opening of Ahn'Qiraj should be complete by now. As suspected in advance, the big event brought plenty of drama to various realms. After all, an exclusive mount is at stake in the associated quest series, and then there is the lag festival in Silithus that follows the strike of the gong.
In a new, very extensive "Behind the Scenes" article on the official WoW page, Blizzard has now revealed in detail why the event was still a major challenge in 2020 and how the team tried to prepare the servers technically for the onslaught of players.
War has come upon us. Earlier this month, one of the most eagerly awaited events of World of Warcraft Classic went live: the war effort for Ahn'Qiraj. Entire Classic realms, the concentrated power of Horde and Alliance, teamed up and pooled resources to open the gates and unlock the raids of Ahn'Qiraj. When the War of the Shifting Sands took place for the first (and only) time in 2006, thousands of players headed to Silithus to witness or take part in the chaos. The turnout exceeded even the wildest expectations of the development team many times over. They were simply not prepared for these masses. The servers were overloaded in no time, and many players were caught for twelve hours in a seemingly endless cycle of logging in, losing their connection, and trying to reconnect. All the while, our technicians were scrambling to fix bugs and let players log in.
Although we managed to stabilize the servers during the event and learned a lot from it, there was still plenty of room for improvement. Then, 15 years later, we were ready to revive one of the most epic moments in the history of WoW in WoW Classic. To do this, we had to optimize our servers to prevent lag and server crashes, and do so with twice as many players in Silithus as when the event first ran in 2006.
In this article, we want to shed some light on how we were able to recreate this long-awaited event. We look at how we used automated players and stress tests to identify weaknesses and develop tailor-made optimizations, how we used software to solve problems that hardware could not, and how we were able to run a worldwide event in which server crashes were kept within limits while the gaming experience of WoW Classic was preserved.
The War of the Shifting Sands, Round Two
In considering how to stage this event, we had three specific goals: we wanted to prevent a chain of crashes, increase the number of players that can be in a zone at the same time, and find out how much lag was tolerable before players had to be teleported out of Silithus. Before we look in detail at how we maximized server performance, we need to describe the framework we were working within: the restrictions imposed by the code of WoW Classic, how population limits work, and how all of this affects gameplay.
Beyond the borders
The modern version of World of Warcraft is built on the original code that was published 15 years ago. Since the game's release, we have developed several modern methods for dealing with large numbers of players in Battle for Azeroth, most notably sharding. Shards allow WoW servers to accommodate many more players at the same time than was possible in 2006. In Battle for Azeroth, we use shards to reduce the load on a server by creating a copy of a zone (e.g. Zuldazar) as soon as the number of players reaches a certain threshold. Because player interactions take up a lot of computing power, distributing players across different copies of the zone lets us avoid potential lag.
This is because large numbers of data packets have to pass through the server at all times in order to represent movement and spellcasting accurately. Sharding also mitigates the lag that can arise when players move into a new zone where the number of players exceeds the threshold. It all sounds pretty simple, but there is a catch: WoW Classic was built to be as faithful as possible to the original data from patch 1.12, and that includes the peculiarities of the gameplay. In rare cases, sharding can cause your target (e.g. an opposing player or an NPC) to disappear when you cross into another zone.
Keeping sharding would mean losing some of the nostalgic gameplay moments in which players chase NPCs or other players across zone boundaries. So we had to find a solution that did not affect the original gameplay but still allowed us to have more players on one server without the game becoming unplayable due to lag.
To solve this problem, we decided to use layers, which are copies of entire regions (e.g. the Eastern Kingdoms), to keep population density and lag under control without losing the unforgettable charm of the original game. With this approach, players can lure world bosses into other zones and pursue opposing players across zone borders within a region without running the risk of being reassigned to another shard. But layers were never meant to be a permanent solution. Since the original version 1.12 used neither sharding nor layering, we promised players that layers would be used only for the launch of WoW Classic and would be deactivated over time as players became more evenly distributed across the world.
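The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Blizzard's server code: `CopyAllocator`, `shard_key`, `layer_key` and the `ZONE_TO_REGION` table are hypothetical names, and the threshold is arbitrary. The only point is that a shard is chosen per zone, while a layer is chosen per region, so crossing a zone border inside a region does not change the copy a player is in.

```python
from typing import Dict, List

# Hypothetical mapping of zones to the region (continent) they belong to.
ZONE_TO_REGION = {
    "Elwynn Forest": "Eastern Kingdoms",
    "Westfall": "Eastern Kingdoms",
}


class CopyAllocator:
    """Places players into copies of an area, opening a new copy at a threshold."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.populations: Dict[str, List[int]] = {}  # area -> players per copy

    def assign(self, area: str) -> int:
        """Return the index of the copy the next player in `area` lands in."""
        copies = self.populations.setdefault(area, [])
        for index, population in enumerate(copies):
            if population < self.threshold:
                copies[index] += 1
                return index
        copies.append(1)  # every existing copy is full: open a new one
        return len(copies) - 1


def shard_key(zone: str) -> str:
    """Sharding (assumed model): the copy is chosen per zone."""
    return zone


def layer_key(zone: str) -> str:
    """Layering (assumed model): the copy is chosen per region."""
    return ZONE_TO_REGION[zone]
```

With `layer_key`, Elwynn Forest and Westfall resolve to the same area, so a chase across their border stays within one copy. With `shard_key`, each zone is allocated independently.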
There are a few cases where we are still using layers due to high active player counts (e.g. on the North American server Faerlina), but we have greatly reduced the number of active layers on these realms since the game was released. After 15 years of waiting, the event surrounding Ahn'Qiraj is one of the most eagerly anticipated events of WoW Classic. We expected the largest number of players in a single area yet, aside from the starting areas on launch day, with no layers to mitigate the impact. Without the technological help of layering or sharding, we needed a creative solution, and we needed it quickly.
An unforgettable experience made to measure
We started our search for a solution to the population density without layering or sharding by creating so-called "headless clients", automated player characters that we had imitate the behavior of real players. So we let them cast spells, fight NPCs, and walk around. This gave us an idea of what server performance might look like with thousands of players in a single zone. After these simulations, we organized stress tests with volunteers to observe realistic player behavior and compare this data with our previous results. This showed us where the weak points were and which parts of our server code had the most trouble with a high number of players. We carefully analyzed the data on the server's frame time to see how close it came to the state in which a server stops responding (also known as a deadlock).
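As a rough illustration of this kind of frame-time analysis, the following Python sketch summarizes a series of measured frame times and relates the worst one to a stall threshold. The function name `frame_time_report` and the threshold value are assumptions made for this example; the article does not say which metrics or limits were actually used.

```python
from typing import Dict, Sequence, Union


def frame_time_report(
    frame_times_ms: Sequence[float], stall_threshold_ms: float
) -> Dict[str, Union[float, bool]]:
    """Summarize server frame times relative to a (hypothetical) stall threshold."""
    if not frame_times_ms:
        raise ValueError("at least one frame time is required")
    if stall_threshold_ms <= 0:
        raise ValueError("stall threshold must be positive")
    worst = max(frame_times_ms)
    mean = sum(frame_times_ms) / len(frame_times_ms)
    return {
        "mean_ms": mean,
        "worst_ms": worst,
        # 1.0 or more means the worst frame reached the assumed stall point.
        "worst_share_of_threshold": worst / stall_threshold_ms,
        "stalled": worst >= stall_threshold_ms,
    }
```

Comparing such reports between runs with headless clients and runs with volunteers is one simple way to see whether simulated load resembles real load.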
The next step was to analyze what exactly was affecting server performance so that we could gradually break this daunting task into manageable goals. We were faced with a polynomial problem: the load grows with the square of the number of players, so we could not solve it simply with faster hardware, because hardware performance does not grow anywhere near that fast. Instead, we had to optimize by hand, deliberately choosing which data was transmitted to the players and how often. To illustrate the problem: let's say we have 20 players jumping in circles. The server transmits the actions of each character to the other 19 players using data packets. With this group of 20 players, the server processes 380 data packets (20 players * 19 recipients = 380 packets).
The whole thing becomes even trickier when a large number of players in a zone perform the same action. If we extend our example to 500 players, the server sends 249,500 packets. If we increase the number to 1,500 players, that is already 2,248,500 packets. Depending on what the players are doing, several data packets are transmitted per second, and remember that the examples above refer to only one action at a time. The more data packets the server has to handle, the longer it needs to process the actions of a single player before it can tackle those of the other players. As this problem worsens, the servers approach a deadlock. In WoW Classic we have significantly more players on each realm than we did back in 2006, so we expected more players than ever to be near the gates.
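The arithmetic behind these numbers is simply N * (N - 1): every action of every player is relayed to all other players. A minimal Python sketch, in which the function name `fanout_packets` is chosen for this example, reproduces the figures from the text:

```python
def fanout_packets(players: int, actions_per_player: int = 1) -> int:
    """Packets sent when each player's action is relayed to every other player."""
    if players < 0 or actions_per_player < 0:
        raise ValueError("counts must not be negative")
    return players * (players - 1) * actions_per_player


if __name__ == "__main__":
    for count in (20, 500, 1500):
        print(f"{count:>5} players -> {fanout_packets(count):>9,} packets")
```

Tripling the player count from 500 to 1,500 raises the packet count roughly ninefold, which is why the growth cannot be absorbed by modestly faster hardware.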
Optimizing server performance
Our servers are designed to crash in the event of a deadlock and then restart. So we knew we had to do everything in our power to minimize processing time. After a few tests, it was clear that movement accounted for most of the computing load that was troubling our servers. First, we stopped sending character facing updates (which show the direction each character model is facing) and only sent player updates when a player started, stopped, or steered their character using the keyboard. With latency already at risk from the large number of players, spending computing power on minor facing updates only made the experience worse.
So it was better to cut back on them: we decided to allow more characters in one zone and send facing updates less often. Keep in mind that our goal was to find the limit just short of a server crash while letting as many players as possible into Silithus. After all, it's better to get a few fewer movement updates than not to be able to log in with your character in the first place. We also started throttling low-priority data. Actions that are rated as "less important" should not be sent at the same rate as "more important" actions. Until then, many messages were sent right away regardless of their importance. So we tweaked the code so that less important information was collected and transmitted less frequently.
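The idea of throttling low-priority data can be sketched as follows. This is a simplified Python illustration, not the actual server logic: the class name `UpdateThrottle`, the half-second interval and the two-level priority are assumptions. Important messages are returned for sending immediately, while less important ones are collected and released together once the interval has elapsed.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UpdateThrottle:
    """Sends important messages at once and batches the rest (illustrative only)."""

    low_priority_interval: float = 0.5  # seconds; a made-up value
    _pending: List[str] = field(default_factory=list)
    _last_flush: float = 0.0

    def submit(self, now: float, message: str, important: bool) -> List[str]:
        """Return the messages that should go out on the wire right now."""
        if important:
            return [message]
        self._pending.append(message)
        if now - self._last_flush >= self.low_priority_interval:
            outgoing, self._pending = self._pending, []
            self._last_flush = now
            return outgoing
        return []  # held back until the next flush
```

A facing change submitted shortly after a flush is simply held back, so the recipient sees it a little later but the server sends fewer packets overall.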
Buffs and debuffs also had a negative impact on server performance. All over the world, buffs and debuffs are applied constantly, especially in combat. That may sound trivial, but with so many player characters in a small space, this information has to be communicated to everyone. Similar to the throttling of low-priority data, we bundled buff and debuff updates so that players no longer receive several data packets one after the other.
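Bundling of this kind can be pictured as grouping all pending effect changes (buffs and debuffs) by recipient, so that each recipient gets one packet per update round instead of one packet per change. The Python sketch below is a hypothetical illustration; the function name `bundle_aura_updates` and the event format are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def bundle_aura_updates(events: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (recipient, change) events into one list of changes per recipient."""
    bundled: Dict[str, List[str]] = defaultdict(list)
    for recipient, change in events:
        bundled[recipient].append(change)  # keeps the original order per recipient
    return dict(bundled)
```

Four separate changes addressed to two players then result in two packets rather than four.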
Coping with high numbers of players
While we were optimizing our servers to handle more players in each zone, we knew we couldn't possibly fit the population of an entire realm (more than twice as many players as an original WoW realm from version 1.12) into Silithus at the same time. We had to make the tough decision to restrict access to this zone by determining who could enter and how many players could be there at once. We decided that only level 60 characters were allowed into Silithus, and that even these characters would be denied access once the maximum number of players was reached. This restriction was the right decision, as the event in Silithus was known to be designed for maximum-level characters.
In addition, low-level characters could still participate in the war effort in other zones, for example by fighting the Anubisath that roamed the Barrens and were designed for characters at level 20-30. The second question was this: we knew the maximum number of players an area could hold before the server went down, but how far should we lower that number to get the best server performance for everyone? Through tests, we found that the optimum was around 1,500 players when the characters were all in one place. But since the event took place across the entire zone and players spread out, it turned out that there were only minimal problems.
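The two admission rules described here, a minimum level and a population cap, can be expressed in a small Python sketch. The class name `ZoneGate` is hypothetical, and using 1,500 as the default cap is an assumption based on the optimum mentioned above; the article does not state the cap that was finally applied.

```python
class ZoneGate:
    """Admits characters to a zone by level and population cap (illustrative)."""

    def __init__(self, min_level: int = 60, capacity: int = 1500) -> None:
        self.min_level = min_level
        self.capacity = capacity  # assumed cap, taken from the quoted optimum
        self.inside = set()

    def try_enter(self, character: str, level: int) -> bool:
        """Return True if the character is (or already was) admitted."""
        if character in self.inside:
            return True
        if level < self.min_level or len(self.inside) >= self.capacity:
            return False
        self.inside.add(character)
        return True

    def leave(self, character: str) -> None:
        """Free the character's slot; unknown names are ignored."""
        self.inside.discard(character)
```

A character who is turned away at the cap can be admitted later, as soon as someone else leaves the zone.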
The event was supposed to take place realm-wide, so we had to make sure everything worked across layers. This means that when the bearer of the scepter rings the gong on one layer, the event should also be triggered on all other layers of that realm. Since the trigger of the event was linked to a player action, we wanted to make sure that the scepter bearer was visible to all players on the same realm, on every layer. This created an interesting problem, because the servers now had to pass along information that they would not normally share with one another. Many complications can arise when we compile updates and send them through the servers to make sure the data reaches potentially thousands of players on each layer.
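Conceptually, the cross-layer trigger is a broadcast: an action on one layer starts the event on every layer of the realm exactly once. The Python sketch below models this in a single process with hypothetical `Layer` and `Realm` classes; the real servers have to exchange this information over the network, which is where the complications mentioned above come from.

```python
from typing import List, Optional


class Layer:
    """One copy of the world on a realm (simplified)."""

    def __init__(self, layer_id: int) -> None:
        self.layer_id = layer_id
        self.event_active = False
        self.announced_bearer: Optional[str] = None

    def start_event(self, bearer: str) -> None:
        self.event_active = True
        self.announced_bearer = bearer  # shown to players on this layer


class Realm:
    """A realm consisting of several layers (simplified)."""

    def __init__(self, layer_count: int) -> None:
        self.layers: List[Layer] = [Layer(i) for i in range(layer_count)]

    def ring_gong(self, bearer: str, origin_layer: int) -> int:
        """Start the event on every layer; return how many layers were started."""
        if not 0 <= origin_layer < len(self.layers):
            raise IndexError("unknown layer")
        started = 0
        for layer in self.layers:
            if not layer.event_active:  # ringing again must not restart it
                layer.start_event(bearer)
                started += 1
        return started
```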
We began developing this technology with the introduction of the fishing competition in Stranglethorn. It was later used for the global buffs related to Onyxia, Nefarian, Zul'Gurub and Rend. When we were finally certain that everything was working as intended, we were ready to test all of our technology for the Ahn'Qiraj event.
Experiments with possible solutions
Now that we had resolved our biggest technical problems and found several ways to optimize server performance, it was time to put our work to the test. We built a shortened version of the Ten Hour War that was supposed to last only an hour.
During the first stress test, we let almost all of the players into the zone to see what would happen. At one point we were at almost 150% of the capacity of an entire 1.12 realm. And it was exactly at this point that our test realm crashed. We knew that we had chosen a very high player limit for the zone, and the actual number exceeded even that limit many times over. When we investigated, we found that the code that handled players both entering and leaving a zone was a queue that couldn't cope with a lot of players at once.
As a result, players were not teleported out of the zone and were stuck on flight paths for an unusually long time. We got the server back into shape and continued the stress test, adjusting a few things along the way. We reduced the limit to a point where the lag was still noticeable but bearable, and still kept a much larger number of players in the zone than ever before. The event should have lasted only an hour and a half. In fact, it took up to four hours because of the crashes.
The second stress test was carried out a week later. This enabled us to see whether our optimizations were having an effect. The improvements were noticeable as soon as players logged in to the stress test: no more players were stuck on the flight paths to Silithus! We were able to collect enough data to show how many players Silithus could handle without any major problems. After both tests, we were able to settle on the number that we thought struck the best balance between lag and server stability. The tests confirmed that our optimizations were working and gave us the optimal number of players per zone, so they were a complete success.
Server solutions for all of Azeroth
Originally, the optimizations were meant to be active only in Silithus during the War of the Shifting Sands. After we had made sure that they could be used globally without any problems, we rolled these changes out to the entire game world with patch 1.13.5. As the war effort began, players started turning in resources and looting beetle carcasses en masse. The number of players soared not only in Silithus, but also in the capitals and the open world. These tweaks helped make the gaming experience feel smoother and enabled massive PvP battles across Azeroth. Some players even went so far as to summon the world boss Thunderaan to help them drive the other faction from a hive.
Although the event to open the gates had not yet happened, some servers were experiencing strange bugs that kept their war effort from moving on. Some servers pushed the war effort forward so quickly that the logic handling each turn-in ran into a race condition that prevented the five-day timer from starting. Because the likelihood of such an exception was so small, we were able to fix the error manually for these servers and then make sure that it would not recur in future war efforts on other servers.
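The article does not describe the actual bug, but the general class of problem can be illustrated: when many turn-ins are processed concurrently, the check "has the goal been reached?" and the action "start the timer" must happen as one indivisible step. The Python sketch below uses a lock for this; `WarEffort`, `simulate` and all numbers are assumptions made purely for illustration.

```python
import threading


class WarEffort:
    """Counts turn-ins and starts a timer exactly once (illustrative)."""

    def __init__(self, goal: int) -> None:
        self.goal = goal
        self.delivered = 0
        self.timer_starts = 0
        self._lock = threading.Lock()

    def turn_in(self, amount: int) -> None:
        # The lock makes "add, check the goal, start the timer" one atomic step.
        with self._lock:
            self.delivered += amount
            if self.delivered >= self.goal and self.timer_starts == 0:
                self.timer_starts += 1  # stand-in for starting the five-day timer


def simulate(workers: int, turn_ins_each: int, amount: int, goal: int) -> WarEffort:
    """Run many concurrent turn-ins against one war effort and return it."""
    effort = WarEffort(goal)

    def work() -> None:
        for _ in range(turn_ins_each):
            effort.turn_in(amount)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return effort
```

However the turn-ins interleave, the timer is started once and no delivery is lost.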
After the war effort was complete and the five-day timer had expired, we kept an eye on the Chinese realms, which would be the first to open the gates. The first server in China on which the gong became active was Ouro. We found that most players on every layer were in Silithus. The event would start on several layers at once, each at maximum capacity, with several thousand players at the same time. We had never tried anything like this before. Although there was significant lag, our servers did not crash when the gates were opened for the first time in China.
The gong sounds
On August 4th, it became clear that several realms in North America would be ready to ring the gong shortly after the server restarts. With the help of Game Master accounts and our monitoring tools, we watched these realms with eagle eyes so that we could fix any problems. All realms came up and began the event smoothly. The scepter bearers received their prestigious Black Qiraji Battle Tank as a mount, players could take on even larger beetles, and we were pleased with the stability. While we waited for our first server to complete its five-day wait after the server restart, we noticed a serious problem: the event did not persist across a server restart. So if a server crashed or restarted, all event progress would be lost.
This error had existed since the development of WoW Classic, but until then there simply hadn't been many events that had to persist across a server restart. Our team was able to fix the error quickly, but we had to make sure that no further server restart took place before we had applied the hotfix and saved the current state of all war efforts in our database, all without disturbing the players.
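What persisting event state means can be shown with a minimal Python sketch. It uses a JSON file as a stand-in for the database mentioned above, and the function names and state fields are assumptions made for this example. The state is written to a temporary file and then moved into place, so a crash in the middle of saving does not leave a half-written file behind.

```python
import json
import os
import tempfile
from typing import Any, Dict


def save_event_state(path: str, state: Dict[str, Any]) -> None:
    """Write the state to a temporary file, then atomically replace the old one."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)


def load_event_state(path: str, default: Dict[str, Any]) -> Dict[str, Any]:
    """Load the saved state after a restart, or fall back to a fresh default."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        return dict(default)
```

A server that calls `load_event_state` on startup resumes from the last saved progress instead of starting the event from scratch.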
Some would argue that the server crashes are what made the original war for Ahn'Qiraj so chaotic and therefore memorable. We wanted to evoke the same passion through a much more stable gaming experience that players could share with 1,500 other players in Silithus. We wanted the Ahn'Qiraj war in WoW Classic to be remembered as an event in which as many players as possible could take part in the Ten Hour War without interruption. There were a few server crashes, but the affected realms recovered fully and were back online within minutes without further crashes.
Over 4,000 players worldwide have become Scarab Lords, and that number continues to grow as each server advances in the war effort. The enthusiasm and zeal that players in WoW Classic have shown since the beginning of the war effort for Ahn'Qiraj are indescribable to us. Thank you to everyone who joined us for the second War of the Shifting Sands!