When running SmartFoxServer on machines with large amounts of RAM (e.g. 32/64GB or more) developers often have questions about strategies to make use of all resources, and how to optimize garbage collection.
At first these questions can seem intimidating, considering the vast amount of custom settings available for the Java Runtime and the multiple garbage collection options, each with its own set of configuration choices.
The good news is that navigating the complexity of the JVM is easier than it looks (at least for SFS2X devs), and with this article we hope to make life simpler for most SmartFoxServer users.
» Memory do’s and don’ts
These days even mid-range servers in the cloud offer 32/64GB of RAM (and more), and developers often assume that they need to manually force the largest heap possible to make use of those resources (via the -Xms / -Xmx JVM settings).
The reality is that in 95% of cases letting the JVM manage the heap automatically is the wiser choice. The main reason is that SmartFoxServer 2X does not use much RAM out of the box, and it can handle thousands of connected players even with just 1GB of RAM.
Forcing tens of GBs of heap size is not only ineffective, it can also backfire badly by making the garbage collector work harder for no good reason. This is particularly true when the server is deployed with a very high minimum heap size, such as 16GB (enforced via the -Xms setting).
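If you have inherited an existing installation and are not sure whether a heap size is already being forced, the running JVM can report its own startup arguments. The following is a minimal sketch based on the standard java.lang.management API. The class name HeapFlagCheck and the helper heapFlags are hypothetical names chosen for illustration, and the code reports on whichever JVM it runs in, so to inspect the server's JVM it would have to be called from your Extension code.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: looks for explicit heap size flags among the JVM startup arguments. */
class HeapFlagCheck {

    /** Returns the arguments that explicitly set the initial (-Xms) or maximum (-Xmx) heap size. */
    static List<String> heapFlags(List<String> jvmArgs) {
        List<String> found = new ArrayList<>();
        for (String arg : jvmArgs) {
            if (arg.startsWith("-Xms") || arg.startsWith("-Xmx")) {
                found.add(arg);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM at startup (not the arguments of main)
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        List<String> flags = heapFlags(jvmArgs);

        if (flags.isEmpty()) {
            System.out.println("No -Xms/-Xmx flags found.");
        } else {
            System.out.println("Heap size is forced via: " + flags);
        }
    }
}
```

Note that this sketch only looks at the two flags mentioned in this article; other JVM options that influence heap sizing are not checked.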
For reference this is what the heap looks like on a dedicated 32GB server after booting up a vanilla (i.e. no custom JVM settings) SmartFoxServer 2X.
The first thing to note is that while the maximum heap size is 7.5GB, only 1.6GB is allocated, of which just 200MB is actually in use.
After a certain amount of activity the garbage collector (GC) will figure out how much RAM the server requires and you might be surprised to see that the allocated heap will likely decrease, sometimes substantially.
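The three figures discussed here (maximum, allocated and used heap) can also be read programmatically, which is handy when no monitoring tool is attached. Below is a minimal sketch using the standard MemoryMXBean; "allocated" corresponds to what the JVM API calls committed memory. The class name HeapReport and its helpers are hypothetical, and the values describe the JVM the code runs in, so for the server's heap it would need to run inside an Extension.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Hypothetical helper: prints maximum, allocated (committed) and used heap. */
class HeapReport {

    /** Converts a byte count to whole megabytes for readability. */
    static long toMb(long bytes) {
        return bytes / (1024 * 1024);
    }

    /** Formats a heap usage snapshot as a single line. */
    static String describe(MemoryUsage heap) {
        // getMax() returns -1 when the maximum is undefined
        String max = heap.getMax() < 0 ? "undefined" : toMb(heap.getMax()) + " MB";
        return "max=" + max
                + ", allocated=" + toMb(heap.getCommitted()) + " MB"
                + ", used=" + toMb(heap.getUsed()) + " MB";
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.println(describe(heap));
    }
}
```

Logging this line at regular intervals is one simple way to observe the allocated heap growing and shrinking over time.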
Below we provide a real-life example of this behavior: we ran a test on the same 32GB server just mentioned, with 2500 CCU grouped in Rooms of 8-10 players, where each player broadcasts an update every 66ms (roughly 15 times per second).
What is interesting to note is how the allocated heap grows up to 3GB and then slowly and steadily decreases way below the initial 1.6GB mark we saw at start up, with no connected users.
Why is that? The default GC (the Parallel GC in Java 8) is keeping track of the memory usage and its own performance statistics at every cycle and, based on those metrics, it gradually shrinks (or grows) the different regions of the heap to optimize memory usage and GC efficiency.
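Since the default collector depends on the Java version and on the machine, it can be useful to confirm which collectors the JVM is actually running before reasoning about its behavior. The sketch below lists them through the standard GarbageCollectorMXBean API. The class name ActiveCollectors is hypothetical, and the exact names returned depend on the JVM in use (on HotSpot the Parallel GC is typically reported as "PS Scavenge" and "PS MarkSweep").

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: lists the garbage collectors active in the current JVM. */
class ActiveCollectors {

    /** Returns the names of the collectors as reported by the JVM. */
    static List<String> names() {
        List<String> result = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            result.add(gc.getName());
        }
        return result;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionCount() is the number of collections run so far (-1 if undefined)
            System.out.println(gc.getName() + ": " + gc.getCollectionCount() + " collections");
        }
    }
}
```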
The result is that 2500 realtime CCU in SFS2X use just ~500MB, and if we had manually imposed a larger heap size we would likely have worsened performance by forcing longer GC phases.
» The remaining 5%
We mentioned that default memory settings are usually the best option in 95% of the cases, so when exactly do we need to intervene manually?
Typically we’ll need to increase the heap size when working with memory-intensive Extension code, for example when the Extension runs a large server-side cache of some sort, or when it deals with thousands of very large objects that have a long life cycle.
Generally speaking it’s always best to start with the default settings, monitor the server activity for a while and then decide if more memory is required or not.
» Signs of insufficient heap size
The main indicators of insufficient memory can be:
- OutOfMemoryError(s): the most obvious of red flags, since the JVM throws this error when it fails to allocate new heap space.
- Used heap often reaching the Maximum heap size: another red flag, which sometimes precedes an OutOfMemoryError.
- Allocated heap reaching the Maximum heap size very often: this is less alarming than the previous signs and it should be evaluated together with the used heap and GC activity metrics before proceeding.
- Significant amount of CPU time spent on garbage collection: when monitoring the JVM with tools such as VisualVM you may notice a high share of CPU time spent on GC (e.g. 30% or more).
- Long GC pauses: this one is trickier, as it may not directly indicate a memory issue but rather a GC configuration problem. Combined with the other indicators, it can help evaluate memory performance issues.
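Two of the indicators above, the used heap approaching the maximum and the time spent in garbage collection, can be sampled from inside the JVM. The following is a rough sketch under several assumptions: the class name HeapPressureCheck, its helpers, the 5 second sampling window and the 90% used-heap threshold are all hypothetical choices, while the 30% GC threshold is the example figure from the list. It also measures the accumulated collection time reported by the JVM against wall-clock time, which only approximates the CPU time figure shown by a profiler.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Hypothetical helper: samples heap pressure and GC overhead of the current JVM. */
class HeapPressureCheck {

    static final double USED_RATIO_WARN = 0.90; // assumed threshold, tune to your case
    static final double GC_SHARE_WARN = 0.30;   // example figure from the list above

    /** Fraction of the maximum heap currently in use (0 if the maximum is undefined). */
    static double usedRatio(long usedBytes, long maxBytes) {
        return maxBytes <= 0 ? 0.0 : (double) usedBytes / maxBytes;
    }

    /** Share of the elapsed time spent in GC between two samples. */
    static double gcTimeShare(long gcMillisBefore, long gcMillisAfter, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return 0.0;
        }
        return (double) (gcMillisAfter - gcMillisBefore) / elapsedMillis;
    }

    /** Sums the accumulated collection time (in ms) across all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if undefined for this collector
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        long sampleMillis = 5_000; // sampling window, an arbitrary choice
        long gcBefore = totalGcMillis();
        Thread.sleep(sampleMillis);
        long gcAfter = totalGcMillis();

        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        double used = usedRatio(heap.getUsed(), heap.getMax());
        double gcShare = gcTimeShare(gcBefore, gcAfter, sampleMillis);

        System.out.printf("used/max heap: %.0f%%, time in GC: %.0f%%%n", used * 100, gcShare * 100);
        if (used > USED_RATIO_WARN) {
            System.out.println("Warning: used heap is close to the maximum heap size");
        }
        if (gcShare > GC_SHARE_WARN) {
            System.out.println("Warning: a large share of time is spent in GC");
        }
    }
}
```

A single sample says little on its own; as with the indicators in the list, a warning is only meaningful if it shows up repeatedly over a longer observation period.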
» Heap is not everything
Up to this point we have exclusively talked about the heap size, which holds most of the objects created at runtime, but there’s more. Another memory area that is quite relevant for SmartFoxServer is Metaspace.
Metaspace contains class metadata loaded by the different class-loaders at runtime. The more classes get loaded, the larger this area of memory will grow.
Why is this important for SmartFoxServer 2X? It can be critical because every Extension loaded by the server uses a separate class-loader and Room-level Extensions load a new copy of the Extension classes for every Room created. This in turn can grow the metaspace significantly, especially if we don’t pay attention to how many dependencies we’re using.
This aspect is already discussed in depth in our documentation so we’re not going to repeat the same concepts here. If you would like to go deeper make sure to check the docs here.
However there’s one recommendation that we’d like to highlight: always make sure to deploy only your Extension code as part of the Extension jar file. All the extra dependencies you might have added (support libs, APIs etc…) should be deployed separately under the extensions/__lib__/ folder. If this sounds new to you, please make sure to read the docs linked in the previous paragraph.
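To keep an eye on Metaspace growth while Rooms and Extensions are being created, the JVM exposes both the Metaspace usage and the number of loaded classes. Here is a minimal sketch using the standard management API. The class name MetaspaceReport is hypothetical, and the code assumes a HotSpot JVM (Java 8 or later), where the memory pool is named "Metaspace"; on other JVMs the pool may be missing, in which case the sketch returns -1.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

/** Hypothetical helper: reports Metaspace usage and the loaded class count. */
class MetaspaceReport {

    /** Returns the bytes used by the pool named "Metaspace", or -1 if it is not available. */
    static long metaspaceUsedBytes() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if ("Metaspace".equals(pool.getName())) {
                MemoryUsage usage = pool.getUsage(); // may be null if the pool is not valid
                return usage == null ? -1 : usage.getUsed();
            }
        }
        return -1;
    }

    /** Returns the number of classes currently loaded in the JVM. */
    static int loadedClassCount() {
        return ManagementFactory.getClassLoadingMXBean().getLoadedClassCount();
    }

    public static void main(String[] args) {
        long usedBytes = metaspaceUsedBytes();
        String used = usedBytes < 0 ? "n/a" : (usedBytes / (1024 * 1024)) + " MB";
        System.out.println("Metaspace used: " + used + ", loaded classes: " + loadedClassCount());
    }
}
```

Comparing these two numbers before and after creating a batch of Rooms gives a quick indication of how much class metadata each Room-level Extension adds.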
» Wrapping up
Hopefully with this small guide we’ve provided you with a better understanding of the basics for memory configuration in SmartFoxServer 2X. The main takeaway from this article is that in the majority of cases you should let the JVM manage the heap size, especially if you’re hosting the server on a machine with 32GB+ RAM.
If you need to adjust the heap size manually, do it only after having monitored the server for a while and having identified some of the red flags listed under “Signs of insufficient heap size”.
In the next part of this article series we’ll dive deeper into the world of garbage collection, discuss the different options available and when you should consider using a different GC to improve the server’s performance.