SFS2X multi-threading demystified

Articles in this blog are often inspired by questions and issues raised by our users, and this new entry is no exception. One aspect of SmartFoxServer that seems to intimidate developers is the multi-threaded environment behind custom Extensions, and its implications in terms of concurrency, scalability and performance.

In this new entry we’re going to demystify the subject and demonstrate how simple and painless writing server side code can be, even when many other things are running concurrently.

Before continuing, please note that this article assumes you already know the basics of Extension development. If not, we suggest you hold your horses and take a look at this introductory tutorial first.

» Taming the multi-threaded beast


Many of the support questions we receive concern how to run Extension code efficiently, how to avoid concurrency issues or how to manage threads in the system. The good news is that SmartFoxServer 2X deals with many of these aspects behind the scenes and there’s only a handful of cases where manual intervention is required.

For example, most concurrency-related aspects, such as thread safety and thread management, are handled for you by the system. The server monitors itself and is capable of adjusting its internal resources to sustain higher loads or deal with slow I/O calls that may drain the thread pools.

If you’re entirely new to programming in a multi-threaded environment, this article will clarify which situations need a bit of extra care, should you encounter them during development. On the other hand, Java veterans can learn what exactly the server does for them and where they can intervene manually to get even better results.

» A quick view from the top

If you have consulted our documentation before you should be familiar with this diagram:

It shows the different layers of the server architecture and the flow of data from the client to the server side logic and vice versa. Each of these colored blocks manages a specific activity of the server, and they work in parallel using multiple threads to provide maximum efficiency.

In essence SFS2X uses an event-driven style of programming where your code responds to different types of occurrences (a registration sketch follows the list below):

  • Client requests: any request sent by connected clients via the API’s ExtensionRequest object
  • Server events: any event that the Extension has subscribed to and that is triggered by the server when a specific situation occurs (e.g. a User has joined a Room in the current Zone)
  • Scheduled events: similar to the above, these events are scheduled by your own Extension code to run in the future (e.g. game timers, time based triggers, etc.)
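
As a minimal, illustrative sketch (the "ready" command name and the handler class names are placeholders, not part of the API), this is how the first two kinds of entry points are typically registered in a Zone Extension’s init() method:

import com.smartfoxserver.v2.core.SFSEventType;
import com.smartfoxserver.v2.extensions.SFSExtension;

public class MyZoneExt extends SFSExtension
{
	@Override
	public void init()
	{
		// Client requests: ReadyRequestHandler (defined elsewhere) runs whenever a client
		// sends an ExtensionRequest with the "ready" command
		addRequestHandler("ready", ReadyRequestHandler.class);
		
		// Server events: MyEventHandler runs whenever a User joins a Room in this Zone
		addEventHandler(SFSEventType.USER_JOIN_ROOM, MyEventHandler.class);
	}
}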

Each of these events can call our code from a different thread at any time and thus, potentially, create concurrency issues.

While this may sound a bit scary, in actuality there’s only a handful of cases that require our attention. The most common one is access to shared objects, i.e. objects referenced by multiple classes and accessed simultaneously by different threads.

Below is an example of a simple scenario, where the MyData instance looks like a potential candidate for access by multiple threads and, without proper synchronization, could end up in an inconsistent state.

public class MyEventHandler extends BaseServerEventHandler
{
	private MyData data;
	
	@Override
	public void handleServerEvent(ISFSEvent event) throws SFSException
	{
		// event logic...
		
		data.changeState(someValue);
	}
}

Before we do anything to fix this potential issue we need to remember that by default SFS2X creates a new instance of our event handler on every call, so in this case there is no concurrency problem. Each call will act on a different instance of MyData.

Let’s say however that we’re referencing one or more objects shared via the parent Zone Extension. In this case multiple threads could call our objects concurrently, such as in this snippet:

public class MyEventHandler extends BaseServerEventHandler
{	
	@Override
	public void handleServerEvent(ISFSEvent event) throws SFSException
	{
		GlobalScores scores = ((ParentExt) getParentExtension()).getGlobalScores();
		scores.add(userName, newScore);
		
		// more game logic...
	}
}

Here the GlobalScores instance needs to be treated with extra care, as its add(…) method will be called concurrently. For example, if the class manages data internally using collections, we should make sure to use concurrent collections.

public class GlobalScores 
{	
	private final Map<String, Integer> scoreTable;
	
	public GlobalScores()
	{
		scoreTable = new HashMap<>();
	}
	
	public void add(String userName, int value)
	{
		scoreTable.put(userName, value);
	}
	
	//...
}

Instead of using a regular HashMap, which is not thread safe, we should employ a ConcurrentHashMap from the java.util.concurrent package.
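
For reference, here is the same class revised with a concurrent collection; the rest of the code stays the same:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class GlobalScores 
{	
	private final Map<String, Integer> scoreTable;
	
	public GlobalScores()
	{
		// ConcurrentHashMap handles concurrent put/get calls without external locking
		scoreTable = new ConcurrentHashMap<>();
	}
	
	public void add(String userName, int value)
	{
		scoreTable.put(userName, value);
	}
	
	//...
}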

On the other hand, if the class used a database to handle the scores, we probably shouldn’t worry, as database calls don’t need any extra synchronization on our side.
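
Purely as an illustrative sketch (assuming a database connection is configured for the Zone and a hypothetical scores table exists; the method names follow the server’s IDBManager interface), a database-backed version could delegate the work to the Zone’s DBManager, with no extra synchronization in the class itself:

import java.sql.SQLException;

import com.smartfoxserver.v2.db.IDBManager;

public class GlobalScores 
{	
	private final IDBManager dbManager;
	
	public GlobalScores(IDBManager dbManager)
	{
		// Typically obtained in the Extension via getParentZone().getDBManager()
		this.dbManager = dbManager;
	}
	
	public void add(String userName, int value)
	{
		try
		{
			// Each call runs its own parameterized statement; the JDBC connection pool
			// takes care of serving concurrent callers
			dbManager.executeUpdate("INSERT INTO scores (user_name, score) VALUES (?,?)", new Object[] {userName, value});
		}
		catch (SQLException e)
		{
			// log / handle the error as appropriate
		}
	}
	
	//...
}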

Let’s consider another example: imagine we have added an SFSEventType.ROOM_REMOVED listener on the server side to count the number of games that are completed. When the event fires, we invoke the parent Zone Extension and call its increment() method.

public class MyEventHandler extends BaseServerEventHandler
{	
	@Override
	public void handleServerEvent(ISFSEvent event) throws SFSException
	{
		Room room = (Room) event.getParameter(SFSEventParam.ROOM);
	
		if (room.isGame())
			((MyZoneExt) getParentExtension()).increment();
		
		// more game logic...
	}
}

This is what happens in the Extension’s increment() method:

public class MyZoneExt extends SFSExtension
{
	private int counter = 0;
	
	// ...

	public void increment()
	{
		counter++;
	}
}

Even a simple operation such as counter++ can’t be considered thread safe and, implemented this way, it could cause issues. One solution would be to introduce a lock object and a synchronized block:

public class MyZoneExt extends SFSExtension
{
	private int counter = 0;
	private final Object counterLock = new Object();
	
	// ...

	public void increment()
	{
		synchronized(counterLock)
		{
			counter++;
		}
	}
}

This is fine, but it’s also a bit of an antiquated style of synchronization and there’s a much better solution: using an atomic variable such as AtomicInteger from the Java SDK.

public class MyZoneExt extends SFSExtension
{
	private final AtomicInteger counter = new AtomicInteger();
	
	// ...

	public void increment()
	{
		counter.incrementAndGet();
	}
}

Bottom line: objects referenced by multiple event handlers are potentially subject to concurrent calls and can interfere with each other. To deal with this correctly we can employ different solutions, all of which we have seen above:

  • Thread-safe collections from the java.util.concurrent package (e.g. ConcurrentHashMap)
  • Atomic variables such as AtomicInteger, AtomicLong, etc.
  • Synchronization primitives (synchronized blocks, locks)

Please note that synchronization primitives are at the bottom of our list, as we recommend always prioritizing any of the other options, if applicable.

» Concurrency and SmartFoxServer API

Since Extension code uses quite a lot of objects from the SmartFoxServer SDK such as Room, User, Buddy etc., we have already made sure that all server API calls are thread safe and so are most of the objects obtained from them.

There are exceptions to this rule, namely the data wrapper classes such as SFSObject/SFSArray and those implementing the Variable interface, such as SFSRoomVariable, SFSBuddyVariable, etc.

These are collection-type classes used for sending and receiving data, not for storing it long term. In other words, these objects are used locally in methods to wrap data that needs to be sent out, but they are not recommended as a replacement for standard collections.
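
For instance, a typical request handler creates an SFSObject, fills it and sends it within the same call, never keeping it around as shared state. Here is a minimal sketch (the "scoreUpdate" command name and the values are just examples):

import com.smartfoxserver.v2.entities.User;
import com.smartfoxserver.v2.entities.data.ISFSObject;
import com.smartfoxserver.v2.entities.data.SFSObject;
import com.smartfoxserver.v2.extensions.BaseClientRequestHandler;

public class ScoreRequestHandler extends BaseClientRequestHandler
{
	@Override
	public void handleClientRequest(User sender, ISFSObject params)
	{
		// The SFSObject lives only inside this method: no other thread can see it
		ISFSObject response = new SFSObject();
		response.putUtfString("userName", sender.getName());
		response.putInt("score", 1000);
		
		send("scoreUpdate", response, sender);
	}
}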

Barring the exceptions just mentioned you can assume everything else is thread safe in the server API, and you can refer to the server side documentation for extra details on the specific classes/methods you need to use.

» Creating and managing threads

One misconception we have encountered is that developers sometimes assume they need to create and manage their own threads for slow I/O operations, such as database queries, or for scheduling tasks such as timers and delayed triggers.

In reality SmartFoxServer removes most of these responsibilities from the hands of the developer:

  • Slow I/O is handled by SFS2X via the internal auto load-balancing thread pools that can increase and decrease the number of threads on demand, without manual intervention. To learn more about this feature we recommend this article from our doc website.
  • Scheduling tasks is handled by a specific entity called the TaskScheduler, which allows you to run delayed and repeating tasks. The server offers a central scheduler that can be used without any setup. If you plan to create hundreds of custom tasks it might be better to create a new instance of the TaskScheduler in your own Extension.
    See this tutorial if you want a complete scheduler code example; a minimal sketch also follows this list.
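
As a minimal sketch of the second point (the one-second interval and the task body are arbitrary), a repeating task can be submitted to the central scheduler obtained from the SmartFoxServer singleton:

import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

import com.smartfoxserver.v2.SmartFoxServer;
import com.smartfoxserver.v2.extensions.SFSExtension;

public class MyZoneExt extends SFSExtension
{
	private ScheduledFuture<?> gameTimer;
	
	@Override
	public void init()
	{
		// Run TickTask every second, starting immediately, on the central TaskScheduler
		gameTimer = SmartFoxServer.getInstance().getTaskScheduler().scheduleAtFixedRate(new TickTask(), 0, 1, TimeUnit.SECONDS);
	}
	
	@Override
	public void destroy()
	{
		// Always cancel scheduled tasks when the Extension is destroyed
		gameTimer.cancel(true);
		super.destroy();
	}
	
	private class TickTask implements Runnable
	{
		@Override
		public void run()
		{
			// time-based game logic...
		}
	}
}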

Bottom line: it’s very unlikely you will ever need to deal with creating or managing threads manually when developing server side code. SFS2X can handle different workloads autonomously and if your project makes heavy use of slow I/O calls you will just need to fine tune the thread pools from the AdminTool.

Similarly if your code requires thousands of scheduled tasks, all you need is to create a custom TaskScheduler specifically configured for your needs.

» Excessive blocking I/O

Before wrapping up we’d like to spend a few more words on the topic of blocking I/O, as we’ve helped many clients with projects relying too heavily on it. Calling databases and external web services is typically required in non-trivial applications and is usually not a concern. As we have mentioned in the previous section, SmartFoxServer 2X is able to deal with slow I/O by monitoring the thread pools and resizing them accordingly.

There is however a broader consideration regarding the amount of blocking I/O that a project can use before it becomes a bottleneck and affects the system’s scalability. For instance, calling remote web services over the internet is usually a bad practice, as each call incurs significant latency. It is much better to organize the servers that need to interact with each other in the same private network, to minimize latency.

Other recurring issues are overly frequent database calls resulting in poor Extension response times, or heavy SQL queries that take a long time to execute, causing scalability problems.

How can we evaluate the impact on scalability?
There is no simple answer to this question other than: by testing!

Testing is fundamental. Too often we’ve seen customers launch a multi-server project in production without having run a single stress test, only to find later that there are major performance issues.

We always emphasize that a few weeks of proper testing are well worth the time and money invested, to at least verify that no major bottlenecks are present in the system. Plus, basic testing is relatively simple and cheap, as we have outlined in this article.

Running a few stress tests while monitoring the server’s dashboard will already provide a lot of useful information:

  • CPU usage: gives you an idea of how many CCUs you can run per machine and can highlight potential issues in your code, such as calls that are too demanding or not optimized enough.
  • RAM usage: gives you clues on the amount of RAM necessary for your application and can uncover potential memory leaks.
  • Network usage: provides a ballpark figure of the bandwidth needed per user and the overall requirements for an expected peak time. It may also help find potential areas of improvement to reduce network usage.
  • Threads and queues: this is an indicator of good vs. bad scalability. If you find the server’s queues are often busy and threads are increasing rapidly, that’s an indication of a performance issue that should be further investigated.

» Conclusion

We hope to have clarified the fundamental approach to building your game code with SmartFoxServer 2X. If you have any questions about what we have discussed so far, feel free to post your comments in our support board.