{"id":177,"date":"2015-03-06T17:13:31","date_gmt":"2015-03-06T17:13:31","guid":{"rendered":"http:\/\/smartfoxserver.com\/blog\/?p=177"},"modified":"2015-03-06T17:13:31","modified_gmt":"2015-03-06T17:13:31","slug":"building-a-simple-stress-test-tool","status":"publish","type":"post","link":"https:\/\/smartfoxserver.com\/blog\/building-a-simple-stress-test-tool\/","title":{"rendered":"Building a simple stress test tool"},"content":{"rendered":"<p>One of the questions that often pops up in our forums is &#8220;how do I run a stress test on my game&#8221;?<\/p>\n<p>There are several ways in which this can be done. A simple way to stress test your server side Extension is to build a client application that acts as a player, essentially a &#8220;bot&#8221;, which can be replicated several hundreds or thousands of times to simulate a large number of clients.<!--more--><\/p>\n<h3>\u00bb Building the client<\/h3>\n<p>For this example we will build a simple Java client using the standard <strong>SFS2X Java API<\/strong> which can be <a title=\"Get the latest client API\" href=\"http:\/\/www.smartfoxserver.com\/download\/sfs2x#p=client\">downloaded from here<\/a>. The same could be done using C#, AS3, etc.<\/p>\n<p>The simple client will connect to the server, log in as a guest, join a specific Room and start sending messages. This basic example can serve as a simple template to build more complex interactions for your tests.<\/p>\n<h3>\u00bb Replicating the load<\/h3>\n<p>Before we proceed with the creation of the client logic let&#8217;s see how the &#8220;Replicator&#8221; will work. 
By this name we mean the top-level application that takes a generic client implementation and generates many copies of it at a constant interval, until all &#8220;test bots&#8221; are ready.<\/p>\n<pre class=\"brush: java; title: ; notranslate\" title=\"\">\r\nimport java.io.FileInputStream;\r\nimport java.util.LinkedList;\r\nimport java.util.List;\r\nimport java.util.Properties;\r\nimport java.util.concurrent.ScheduledFuture;\r\nimport java.util.concurrent.ScheduledThreadPoolExecutor;\r\nimport java.util.concurrent.TimeUnit;\r\n\r\npublic class StressTestReplicator\r\n{\r\n\tprivate final List&lt;BaseStressClient&gt; clients;\r\n\tprivate final ScheduledThreadPoolExecutor generator;\r\n\r\n\tprivate String clientClassName;\t\t\/\/ fully qualified name of the client class\r\n\tprivate int generationSpeed = 250; \t\/\/ interval between client connections, in milliseconds\r\n\tprivate int totalCCU = 50;\t\t\t\/\/ total number of concurrent users (CCU)\r\n\r\n\tprivate Class&lt;?&gt; clientClass;\r\n\tprivate ScheduledFuture&lt;?&gt; generationTask;\r\n\r\n\tpublic StressTestReplicator(Properties config)\r\n    {\r\n\t\tclients = new LinkedList&lt;&gt;();\r\n\t\tgenerator = new ScheduledThreadPoolExecutor(1);\r\n\r\n\t\tclientClassName = config.getProperty(&quot;clientClassName&quot;);\r\n\r\n\t\t\/\/ A missing or invalid value leaves the default in place\r\n\t\ttry { generationSpeed = Integer.parseInt(config.getProperty(&quot;generationSpeed&quot;)); } catch (NumberFormatException e) { \/* keep default *\/ }\r\n\t\ttry { totalCCU = Integer.parseInt(config.getProperty(&quot;totalCCU&quot;)); } catch (NumberFormatException e) { \/* keep default *\/ }\r\n\r\n\t\tSystem.out.printf(&quot;%s, %s, %s\\n&quot;, clientClassName, generationSpeed, totalCCU);\r\n\r\n\t\ttry\r\n\t\t{\r\n\t\t\t\/\/ Load main client class\r\n\t\t\tclientClass = Class.forName(clientClassName);\r\n\r\n\t\t\t\/\/ Start generating clients at a fixed rate\r\n\t\t\tgenerationTask = generator.scheduleAtFixedRate(new GeneratorRunner(), 0, generationSpeed, TimeUnit.MILLISECONDS);\r\n\t\t}\r\n\t\tcatch (ClassNotFoundException e)\r\n\t\t{\r\n\t\t\tSystem.out.println(&quot;Specified Client class: &quot; + clientClassName + &quot; not found! 
Quitting.&quot;);\r\n\t\t}\r\n    }\r\n\r\n\tvoid handleClientDisconnect(BaseStressClient client)\r\n\t{\r\n\t\tboolean allDone;\r\n\r\n\t\t\/\/ Remove the client and check the remaining count under the same lock\r\n\t\tsynchronized (clients)\r\n\t\t{\r\n\t\t\tclients.remove(client);\r\n\t\t\tallDone = clients.isEmpty();\r\n\t\t}\r\n\r\n\t\tif (allDone)\r\n\t\t{\r\n\t\t\tSystem.out.println(&quot;===== TEST COMPLETE =====&quot;);\r\n\t\t\tSystem.exit(0);\r\n\t\t}\r\n\t}\r\n\r\n\tpublic static void main(String[] args) throws Exception\r\n\t{\r\n\t\t\/\/ Optional first argument: path of the configuration file\r\n\t\tString defaultCfg = args.length &gt; 0 ? args[0] : &quot;config.properties&quot;;\r\n\r\n\t\tProperties props = new Properties();\r\n\r\n\t\t\/\/ Close the stream once the properties are loaded\r\n\t\ttry (FileInputStream in = new FileInputStream(defaultCfg))\r\n\t\t{\r\n\t\t\tprops.load(in);\r\n\t\t}\r\n\r\n\t\tnew StressTestReplicator(props);\r\n\t}\r\n\r\n\t\/\/=====================================================================\r\n\r\n\tprivate class GeneratorRunner implements Runnable\r\n\t{\r\n\t\t@Override\r\n\t\tpublic void run()\r\n\t\t{\r\n\t\t\ttry\r\n\t\t\t{\r\n\t\t\t\tboolean needMoreClients;\r\n\r\n\t\t\t\t\/\/ Read the list size under the same lock used for writes\r\n\t\t\t\tsynchronized (clients)\r\n\t\t\t\t{\r\n\t\t\t\t\tneedMoreClients = clients.size() &lt; totalCCU;\r\n\t\t\t\t}\r\n\r\n\t\t\t\tif (needMoreClients)\r\n\t\t\t\t\tstartupNewClient();\r\n\t\t\t\telse\r\n\t\t\t\t\tgenerationTask.cancel(true);\r\n\t\t\t}\r\n\t\t\tcatch (Exception e)\r\n\t\t\t{\r\n\t\t\t\tSystem.out.println(&quot;ERROR Generating client: &quot; + e.getMessage());\r\n\t\t\t}\r\n\t\t}\r\n\r\n\t\tprivate void startupNewClient() throws Exception\r\n\t\t{\r\n\t\t\t\/\/ Instantiate the configured client class via its no-argument constructor\r\n\t\t\tBaseStressClient client = (BaseStressClient) clientClass.getDeclaredConstructor().newInstance();\r\n\r\n\t\t\tsynchronized (clients)\r\n\t\t\t{\r\n\t\t\t\tclients.add(client);\r\n\t\t\t}\r\n\r\n\t\t\tclient.setShell(StressTestReplicator.this);\r\n\r\n\t\t\tclient.startUp();\r\n\t\t}\r\n\t}\r\n}\r\n<\/pre>\n<p>The class starts up by loading an external <em>config.properties<\/em> file, which looks like this:<\/p>\n<pre class=\"brush: java; title: ; notranslate\" title=\"\">\r\nclientClassName=sfs2x.example.stresstest.SimpleChatClient\r\n\r\ngenerationSpeed=500\r\n\r\ntotalCCU=20\r\n<\/pre>\n<p>The properties are:<\/p>\n<ul>\n<li>the name of the class to be used as the client 
logic (<em>clientClassName<\/em>)<\/li>\n<li>the total number of clients for the test (<em>totalCCU<\/em>)<\/li>\n<li>the interval between each generated client, expressed in milliseconds (<em>generationSpeed<\/em>)<\/li>\n<\/ul>\n<p>Once these parameters are loaded the test will start by generating all the requested clients via a thread-pool based scheduled executor (<a title=\"javadoc\" href=\"http:\/\/docs.oracle.com\/javase\/7\/docs\/api\/java\/util\/concurrent\/ScheduledThreadPoolExecutor.html\" target=\"_blank\">ScheduledThreadPoolExecutor<\/a>).<\/p>\n<p>To keep the Replicator independent of any specific test class, we have created a base class called <strong>BaseStressClient<\/strong>, which defines a few methods:<\/p>\n<pre class=\"brush: java; title: ; notranslate\" title=\"\">\r\npublic abstract class BaseStressClient\r\n{\r\n\tprivate StressTestReplicator shell;\r\n\r\n\tpublic abstract void startUp();\r\n\r\n\tpublic void setShell(StressTestReplicator shell)\r\n\t{\r\n\t\tthis.shell = shell;\r\n\t}\r\n\r\n\tprotected void onShutDown(BaseStressClient client)\r\n\t{\r\n\t\tshell.handleClientDisconnect(client);\r\n\t}\r\n}\r\n<\/pre>\n<p>The <strong>startUp()<\/strong> method is where the client code gets initialized; it is abstract, so it must be implemented in the child class. 
The <strong>onShutDown(&#8230;)<\/strong> method is invoked by the client implementation to signal to the Replicator that the client has disconnected, so that it can be disposed of.<\/p>\n<h3>\u00bb\u00a0Building the client logic<\/h3>\n<p>This is the code for the client itself:<\/p>\n<pre class=\"brush: java; title: ; notranslate\" title=\"\">\r\npublic class SimpleChatClient extends BaseStressClient\r\n{\r\n\t\/\/ A scheduler for sending messages shared among all client bots.\r\n\tprivate static ScheduledExecutorService sched = new ScheduledThreadPoolExecutor(1);\r\n\tprivate static final int TOT_PUB_MESSAGES = 50;\r\n\r\n\tprivate SmartFox sfs;\r\n\tprivate ConfigData cfg;\r\n\tprivate IEventListener evtListener;\r\n\tprivate ScheduledFuture&lt;?&gt; publicMessageTask;\r\n\tprivate int pubMessageCount = 0;\r\n\r\n\t@Override\r\n\tpublic void startUp()\r\n\t{\r\n\t    sfs = new SmartFox();\r\n\t    cfg = new ConfigData();\r\n\t    evtListener = new SFSEventListener();\r\n\r\n\t    cfg.setHost(&quot;localhost&quot;);\r\n\t    cfg.setPort(9933);\r\n\t    cfg.setZone(&quot;BasicExamples&quot;);\r\n\r\n\t    sfs.addEventListener(SFSEvent.CONNECTION, evtListener);\r\n\t    sfs.addEventListener(SFSEvent.CONNECTION_LOST, evtListener);\r\n\t    sfs.addEventListener(SFSEvent.LOGIN, evtListener);\r\n\t    sfs.addEventListener(SFSEvent.ROOM_JOIN, evtListener);\r\n\t    sfs.addEventListener(SFSEvent.PUBLIC_MESSAGE, evtListener);\r\n\r\n\t    sfs.connect(cfg);\r\n\t}\r\n\r\n\tpublic class SFSEventListener implements IEventListener\r\n\t{\r\n\t\t@Override\r\n\t\tpublic void dispatch(BaseEvent evt) throws SFSException\r\n\t\t{\r\n\t\t    String type = evt.getType();\r\n\t\t    Map&lt;String, Object&gt; params = evt.getArguments();\r\n\r\n\t\t    if (type.equals(SFSEvent.CONNECTION))\r\n\t\t    {\r\n\t\t    \tboolean success = (Boolean) params.get(&quot;success&quot;);\r\n\r\n\t\t    \tif (success)\r\n\t\t    \t\tsfs.send(new LoginRequest(&quot;&quot;));\r\n\t\t 
   \telse\r\n\t\t    \t{\r\n\t\t    \t\tSystem.err.println(&quot;Connection failed&quot;);\r\n\t\t    \t\tcleanUp();\r\n\t\t    \t}\r\n\t\t    }\r\n\r\n\t\t    else if (type.equals(SFSEvent.CONNECTION_LOST))\r\n\t\t    {\r\n\t\t    \tSystem.out.println(&quot;Client disconnected. &quot;);\r\n\t\t    \tcleanUp();\r\n\t\t    }\r\n\r\n\t\t    else if (type.equals(SFSEvent.LOGIN))\r\n\t\t    {\r\n\t\t    \t\/\/ Join room\r\n\t\t    \tsfs.send(new JoinRoomRequest(&quot;The Lobby&quot;));\r\n\t\t    }\r\n\r\n\t\t    else if (type.equals(SFSEvent.ROOM_JOIN))\r\n\t\t    {\r\n\t\t    \tpublicMessageTask = sched.scheduleAtFixedRate(new Runnable()\r\n\t\t\t\t{\r\n\t\t\t\t\t@Override\r\n\t\t\t\t\tpublic void run()\r\n\t\t\t\t\t{\r\n\t\t\t\t\t\tif (pubMessageCount &lt; TOT_PUB_MESSAGES)\r\n\t\t\t\t\t\t{\r\n\t\t\t\t\t\t\tsfs.send(new PublicMessageRequest(&quot;Hello, this is a test public message.&quot;));\r\n\t\t\t\t\t\t\tpubMessageCount++;\r\n\r\n\t\t\t\t\t\t\tSystem.out.println(sfs.getMySelf().getName() + &quot; --&gt; Message: &quot; + pubMessageCount);\r\n\t\t\t\t\t\t}\r\n\t\t\t\t\t\telse\r\n\t\t\t\t\t\t{\r\n\t\t\t\t\t\t\t\/\/ End of test\r\n\t\t\t\t\t\t\tsfs.disconnect();\r\n\t\t\t\t\t\t}\r\n\r\n\t\t\t\t\t}\r\n\t\t\t\t}, 0, 2, TimeUnit.SECONDS);\r\n\t\t    }\r\n\r\n\t\t}\r\n\t}\r\n\r\n\tprivate void cleanUp()\r\n\t{\r\n\t\t\/\/ Remove listeners\r\n    \tsfs.removeAllEventListeners();\r\n\r\n    \t\/\/ Stop task\r\n    \tif (publicMessageTask != null)\r\n\t\t\tpublicMessageTask.cancel(true);\r\n\r\n    \t\/\/ Signal end of session to Shell\r\n    \tonShutDown(this);\r\n\t}\r\n}\r\n<\/pre>\n<p>The class extends the BaseStressClient parent and instantiates the SmartFox API. We then proceed by setting up the event listeners and connection parameters. Finally we invoke the <strong>sfs.connect(&#8230;)<\/strong> method to get started.<\/p>\n<p>Notice that we also declared a <strong>static ScheduledExecutorService<\/strong> at the top of the declarations. 
This is going to be used as the main scheduler for sending public messages at specific intervals, in this case one message every two seconds.<\/p>\n<p>We chose to make it static so that we can share the same instance across all client objects. This way only one thread will take care of all our messages. If you plan to run thousands of clients or use faster message rates you will probably need to increase the number of threads in the constructor.<\/p>\n<h3>\u00bb\u00a0Performance notes<\/h3>\n<p>When replicating many hundreds or thousands of clients we should keep in mind that every new instance of the <strong>SmartFox<\/strong> class (the main API class) will use a certain amount of resources, namely RAM and Java threads.<\/p>\n<p>For this simple example each instance should take ~1MB of heap memory, which means we can expect 1000 clients to take approximately 1GB of RAM. In this case you will probably need to adjust the heap settings of the JVM by adding the usual <strong>-Xmx<\/strong> switch to the startup script.<\/p>\n<p>Similarly, the number of threads in the JVM will increase by 2 for each new client generated, so for 1000 clients we will end up with 2000 threads, which is a pretty high number.<\/p>\n<p>Any relatively modern machine (e.g. 2-4 cores, 4GB RAM) should be able to run at least 1000 clients, although the complexity of the client logic and the rate of network messages may reduce this value.<\/p>\n<p>On more powerful hardware, such as a dedicated server, you should be able to run several thousand CCU without much effort.<\/p>\n<p>Before we start running the test let&#8217;s make sure we have all the necessary monitoring tools to watch the basic performance parameters:<\/p>\n<ul>\n<li>Open the server&#8217;s <strong>AdminTool<\/strong> and select the <strong>Dashboard<\/strong> module. 
This will allow you to check all vital parameters of the server runtime.<\/li>\n<li>Launch your <strong>OS resource monitor<\/strong> so that you can keep an eye on CPU and RAM usage.<\/li>\n<\/ul>\n<p>Here are some important suggestions to make sure that a stress test is executed successfully:<\/p>\n<ul>\n<li><strong>Monitor the CPU and RAM usage after all clients have been generated<\/strong> and make sure you never exceed 90% CPU usage or 90% RAM usage. This is of the utmost importance to avoid creating a bottleneck between client and server. (NOTE: 90% refers to the whole CPU, not just a single core.)<\/li>\n<li><strong>Always run a stress test in a wired Ethernet LAN<\/strong> (local network) where you have access to at least a 100Mbps low-latency connection. A 1Gbps or 10Gbps connection is even better.<\/li>\n<li>To reinforce the previous point: <strong>never run a stress test over a Wi-Fi connection<\/strong> or, worse, against a remote server. The bandwidth and latency of a Wi-Fi connection are not good enough for this kind of test. Remember that the point of these stress tests is to assess the performance of the server and custom Extension, not the network.<\/li>\n<li>Before running a test <strong>make sure the ping time between client and server is no more than 1-5 milliseconds<\/strong>. More than that may suggest an inadequate network infrastructure.<\/li>\n<li>Whenever possible <strong>make sure not to deliver the full list of Rooms to each client<\/strong>. This can be a major RAM eater if the test involves hundreds or thousands of Rooms. To do so simply remove all group references from the &#8220;Default groups&#8221; setting in your test Zone.<\/li>\n<\/ul>\n<h3>\u00bb\u00a0Adding more client machines<\/h3>\n<p>What happens when we hit the dreaded 90% of the machine&#8217;s resources but we need more CCU for our performance test?<\/p>\n<p>It&#8217;s probably time to <strong>add another dedicated machine to run more clients<\/strong>. 
If you don&#8217;t have access to more hardware you may consider running the whole stress test <strong>in the cloud<\/strong>, so that you can choose the size and number of &#8220;stress clients&#8221; to employ.<\/p>\n<p>The cloud is also convenient as it lets you clone one machine setup onto multiple servers, which is a quick way to deploy more instances.<\/p>\n<p>When choosing a cloud provider for your tests make sure that <strong>it doesn&#8217;t charge you for internal bandwidth<\/strong> (i.e. data transfer between private IPs) and that it offers a fast ping time between servers.<\/p>\n<p>We have successfully run many performance tests using <a href=\"https:\/\/jelastic.com\/\">Jelastic<\/a> and <a href=\"http:\/\/www.rackspace.com\/cloud\">Rackspace Cloud<\/a>. The former is economical and convenient for medium-size tests, while the latter is great for very large scale tests and also provides physical dedicated servers on demand.<\/p>\n<p><a href=\"http:\/\/aws.amazon.com\/ec2\/\">Amazon EC2<\/a> should also work fine for these purposes and there are probably many other valid options as well. A quick Google search will turn up more options if you need them.<\/p>\n<h3>\u00bb\u00a0Advanced testing<\/h3>\n<p><strong>1) Login:<\/strong> in our simple example we have used an anonymous login request and we don&#8217;t employ a server side Extension to check the user credentials. Chances are that your system will use a database for login and you wish to test how the DB performs under high traffic.<\/p>\n<p>A simple solution is to pre-populate the users database with index-based names such as User-1, User-2 &#8230; User-N. This way you can build simple client side logic that generates these names with an auto-increment counter and performs the login. Passwords can be handled similarly using the same formula, e.g. 
Password-1, Password-2 &#8230; Password-N.<\/p>\n<p><strong>TIP:<\/strong> When testing a system with an integrated database always monitor the Queue status under the AdminTool &gt; Dashboard. Slow DB transactions will show up in those queues.<\/p>\n<p><strong>2) Joining Rooms:<\/strong> another problem is how to distribute clients to multiple Rooms. Suppose we have a game for 4 players and we want to distribute 1000 clients into Rooms of 4 users. A simple solution is to create this logic on the server side.<\/p>\n<p>The Extension will take a generic &#8220;join&#8221; request and perform a bit of custom logic:<\/p>\n<ul>\n<li>search for a game Room with free slots:\n<ul>\n<li>if found it will join the user there<\/li>\n<li>otherwise it will create a new game Room and join the user<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>Similar logic <a title=\"AutojoinerExtension example\" href=\"http:\/\/smartfoxserver.com\/forums\/viewtopic.php?p=71861#p71861\" target=\"_blank\">has been discussed in detail<\/a> in a post on our support forum.<\/p>\n<h3>\u00bb\u00a0Source files<\/h3>\n<p>The sources of the code discussed in this article are available for download as a zipped project for Eclipse. If you are using a different IDE you can unzip the archive and extract the source folder (<strong>src\/<\/strong>), the dependencies (<strong>sfs2x-api\/<\/strong>) and build a new project in your editor.<\/p>\n<p><a href=\"http:\/\/smartfoxserver.com\/blog\/wp-content\/uploads\/2015\/02\/StressTestExample.zip\">Download the sources.<\/a><\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>One of the questions that often pops up in our forums is &#8220;how do I run a stress test on my game&#8221;? There are several ways in which this can be done. 
A simple way to stress test your server side Extension is to build a client application that acts as a player, essentially a [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[23],"tags":[33,12,34,7,35],"_links":{"self":[{"href":"https:\/\/smartfoxserver.com\/blog\/wp-json\/wp\/v2\/posts\/177"}],"collection":[{"href":"https:\/\/smartfoxserver.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/smartfoxserver.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/smartfoxserver.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/smartfoxserver.com\/blog\/wp-json\/wp\/v2\/comments?post=177"}],"version-history":[{"count":18,"href":"https:\/\/smartfoxserver.com\/blog\/wp-json\/wp\/v2\/posts\/177\/revisions"}],"predecessor-version":[{"id":200,"href":"https:\/\/smartfoxserver.com\/blog\/wp-json\/wp\/v2\/posts\/177\/revisions\/200"}],"wp:attachment":[{"href":"https:\/\/smartfoxserver.com\/blog\/wp-json\/wp\/v2\/media?parent=177"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/smartfoxserver.com\/blog\/wp-json\/wp\/v2\/categories?post=177"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/smartfoxserver.com\/blog\/wp-json\/wp\/v2\/tags?post=177"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}