{"id":1274,"date":"2020-03-02T11:24:04","date_gmt":"2020-03-02T11:24:04","guid":{"rendered":"https:\/\/smartfoxserver.com\/blog\/?p=1274"},"modified":"2020-03-02T17:30:20","modified_gmt":"2020-03-02T17:30:20","slug":"sfs2x-2-15-0-udp-update","status":"publish","type":"post","link":"https:\/\/smartfoxserver.com\/blog\/sfs2x-2-15-0-udp-update\/","title":{"rendered":"SFS2X 2.15.0, UDP update"},"content":{"rendered":"\n<p>SmartFoxServer 2.15.0 provides several improvements for scaling the UDP protocol to uber-high packet rates, which can be useful for real-time games with massive traffic that use very fast updates (e.g. 30 or 50 packet\/s).<\/p>\n\n\n\n<!--more-->\n\n\n\n<h2>\u00bb Background<\/h2>\n\n\n\n<div class=\"wp-block-image noShadow\"><figure class=\"alignright size-large is-resized\"><figure><img loading=\"lazy\" src=\"https:\/\/smartfoxserver.com\/blog\/wp-content\/uploads\/2020\/03\/Funnel.jpg\" alt=\"\" class=\"noShadow wp-image-1285\" width=\"100\" height=\"93\"><\/figure><\/figure><\/div>\n\n\n\n<p>In February 2020 we were contacted by a customer running a large online multiplayer game, who reported a potential <strong>bottleneck<\/strong> when scaling UDP traffic beyond a certain point. 
In particular, the problem seemed to be related more to packet rate (pps) than to bandwidth.<\/p>\n\n\n\n<p>After some investigation we confirmed the customer&#8217;s findings: threads were blocking on the same <strong>DatagramChannel<\/strong> object, causing UDP writes to be executed serially rather than in parallel (a DatagramChannel is essentially the Java NIO abstraction for a UDP socket).<\/p>\n\n\n\n<p>We found this aspect particularly surprising because of the general design of the Java NIO API and because it seems to be a largely overlooked &#8220;feature&#8221; of the Java non-blocking UDP implementation, even among experts (if you want to learn more details, see <a rel=\"noreferrer noopener\" aria-label=\"this post from our forum (opens in a new tab)\" href=\"https:\/\/www.smartfoxserver.com\/forums\/viewtopic.php?p=92089#p92089\" target=\"_blank\">this post from our forum<\/a>).<\/p>\n\n\n\n<h2>\u00bb Multiple iterations<\/h2>\n\n\n\n<div class=\"wp-block-image is-style-circle-mask\"><figure class=\"alignright size-large\"><img loading=\"lazy\" width=\"100\" height=\"100\" src=\"https:\/\/smartfoxserver.com\/blog\/wp-content\/uploads\/2020\/03\/iterations.jpg\" alt=\"\" class=\"wp-image-1311\"\/><\/figure><\/div>\n\n\n\n<p>To fix this issue we had to dive deep into the guts of Java&#8217;s non-blocking API to see exactly which of the many possible solutions would best get rid of potential bottlenecks.<\/p>\n\n\n\n<p>We stacked up the standard SFS2X implementation against multiple prototypes to see the differences:<\/p>\n\n\n\n<ol><li>Non-blocking UDP w\/ DatagramChannel cache (to avoid thread contention)<\/li><li>Non-blocking UDP w\/ DatagramChannel cache + dedicated thread pool and message queue (implementing a more aggressive write policy for UDP, compared to TCP)<\/li><li>Old-school blocking UDP + dedicated thread pool and message queue<\/li><\/ol>\n\n\n\n<p>Running all these tests on a dedicated quad-core Xeon server we found 
that the standard SFS2X implementation (v2.14.0) would max out at ~215 Kpps even though plenty of CPU was still available.<\/p>\n\n\n\n<ul><li><strong>Solution #1<\/strong> provided twice the throughput, with some CPU still to spare (<a rel=\"noreferrer noopener\" aria-label=\"more details here (opens in a new tab)\" href=\"https:\/\/www.smartfoxserver.com\/forums\/viewtopic.php?p=92110#p92110\" target=\"_blank\">more details here<\/a>)<\/li><li><strong>Solution #2<\/strong> was able to max out all the CPU and push over 1 million pps, which is the kind of result we were looking for (for more details check the last section of this article)<\/li><li><strong>Solution #3<\/strong> provided almost the same results as <strong>#2<\/strong><\/li><\/ul>\n\n\n\n<p>At the end of all our tests, <strong>Solution #2<\/strong> emerged as the most efficient way to solve all throughput issues, thanks to the dedicated thread pool and cache, both of which can be fine-tuned based on the use case and hardware available.<\/p>\n\n\n\n<h2>\u00bb Revamped server engine<\/h2>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"alignright size-large\"><figure><img loading=\"lazy\" width=\"100\" height=\"99\" src=\"https:\/\/smartfoxserver.com\/blog\/wp-content\/uploads\/2020\/03\/engine.jpg\" alt=\"\" class=\"wp-image-1313 noShadow\"><\/figure><\/figure><\/div>\n\n\n\n<p>As we&#8217;ve mentioned, <strong>SmartFoxServer 2.15<\/strong> comes with a dedicated thread pool and message queue for UDP communications alongside a new DatagramChannel cache.<\/p>\n\n\n\n<p>What does this all mean for developers?<\/p>\n\n\n\n<p>The good news is that you get all the benefits of the revamped UDP engine out of the box, without any extra requirements. 
By default the server will scale better than before without the need to tweak the configuration or update your code.<\/p>\n\n\n\n<p>If you&#8217;re running a high-traffic real-time game that relies on the UDP protocol, you will be interested in learning about a few new settings available in SFS2X 2.15.<\/p>\n\n\n\n<h3>New settings<\/h3>\n\n\n\n<p>We have introduced three new low-level settings that can be tweaked in certain scenarios. These settings can be added to <strong>SFS2X\/config\/core.xml<\/strong>:<\/p>\n\n\n\n<ul><li><strong>udpSocketWriterThreadPoolSize<\/strong>: sets the size of the dedicated UDP thread pool. If not specified, it uses the same value as&nbsp;<strong>socketWriterThreadPoolSize<\/strong><\/li><li><strong>udpSocketWriterQueueMaxSize<\/strong>: sets the maximum size of the dedicated UDP message queue. Default size = 250000<\/li><li><strong>datagramChannelCacheSize<\/strong>: sets the size of the DatagramChannel cache. Default value = 80<\/li><\/ul>\n\n\n\n<p>For high-traffic real-time games with a high UDP packet rate you may want to fine-tune the thread pool, setting it to the number of cores available on your machine.<\/p>\n\n\n\n<p>If you&#8217;re running a massively multi-core server (e.g. 32+ cores) you may also want to increase the cache size to something like <em>nThreads * 4<\/em>. <br>For instance, on a 96-core machine with extremely high UDP traffic, you could set the UDP thread pool size to 64 and the cache to 64*4 = 256.<\/p>\n\n\n\n<h2>\u00bb Conclusions<\/h2>\n\n\n\n<p><strong>SmartFoxServer 2.15<\/strong> comes as an update for any previous installation of <strong>2.14.x<\/strong>, which makes it particularly easy to upgrade existing setups. 
If you&#8217;re running a game based on the UDP protocol for thousands of players we highly recommend this update; otherwise it can be skipped for the time being.<\/p>\n\n\n\n<p>Finally, for those interested in the fine details, we provide below the results of our <strong>1M+ pps<\/strong> (1 million packets\/sec) stress test done with <strong>SFS2X 2.15<\/strong>, running on a relatively cheap quad-core server.<\/p>\n\n\n\n<h3>2.15 Test results<\/h3>\n\n\n\n<ul><li><strong>Server hardware<\/strong>:<ul><li>Quad-core Xeon E3-1578L (w\/ hyper-threading) @ 2.0 GHz<\/li><li>32 GB RAM<\/li><li>240 GB SSD<\/li><li>10 Gbps network<\/li><li>SFS2X 2.15, JRE 8<\/li><\/ul><\/li><\/ul>\n\n\n\n<ul><li><strong>SmartFoxServer custom settings<\/strong>:<ul><li>Extension thread pool: 12<\/li><li>UDP thread pool: 8<\/li><li>DatagramChannel cache: 128<\/li><\/ul><\/li><\/ul>\n\n\n\n<ul><li><strong>Stress test parameters<\/strong>:<ul><li>Client packet rate: 30 pps<\/li><li>Players per room: 16<\/li><li>Total CCU: 2200<\/li><li>Total Rooms: 138<\/li><li>Client generation speed: 30 ms<\/li><\/ul><\/li><\/ul>\n\n\n\n<p>This means that every Room generates:<\/p>\n\n\n\n<p>16 players * 30 pps = <strong>480 pps<\/strong> (messages sent to the server)<br>480 pps * 16 players = <strong>7680 pps<\/strong> (updates sent back from the server to a single Room)<\/p>\n\n\n\n<p>Global outgoing packet rate is: 7680 pps * 138 Rooms = <strong>~1.06 Mpps<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"800\" height=\"535\" src=\"https:\/\/smartfoxserver.com\/blog\/wp-content\/uploads\/2020\/03\/udp-bench-global.jpg\" alt=\"\" class=\"wp-image-1305\" srcset=\"https:\/\/smartfoxserver.com\/blog\/wp-content\/uploads\/2020\/03\/udp-bench-global.jpg 800w, https:\/\/smartfoxserver.com\/blog\/wp-content\/uploads\/2020\/03\/udp-bench-global-300x201.jpg 300w, https:\/\/smartfoxserver.com\/blog\/wp-content\/uploads\/2020\/03\/udp-bench-global-768x514.jpg 768w, 
https:\/\/smartfoxserver.com\/blog\/wp-content\/uploads\/2020\/03\/udp-bench-global-624x417.jpg 624w\" sizes=\"(max-width: 800px) 100vw, 800px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"800\" height=\"535\" src=\"https:\/\/smartfoxserver.com\/blog\/wp-content\/uploads\/2020\/03\/udp-bench-queues.jpg\" alt=\"\" class=\"wp-image-1306\" srcset=\"https:\/\/smartfoxserver.com\/blog\/wp-content\/uploads\/2020\/03\/udp-bench-queues.jpg 800w, https:\/\/smartfoxserver.com\/blog\/wp-content\/uploads\/2020\/03\/udp-bench-queues-300x201.jpg 300w, https:\/\/smartfoxserver.com\/blog\/wp-content\/uploads\/2020\/03\/udp-bench-queues-768x514.jpg 768w, https:\/\/smartfoxserver.com\/blog\/wp-content\/uploads\/2020\/03\/udp-bench-queues-624x417.jpg 624w\" sizes=\"(max-width: 800px) 100vw, 800px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"800\" height=\"489\" src=\"https:\/\/smartfoxserver.com\/blog\/wp-content\/uploads\/2020\/03\/udp-bench-bmon.jpg\" alt=\"\" class=\"wp-image-1307\" srcset=\"https:\/\/smartfoxserver.com\/blog\/wp-content\/uploads\/2020\/03\/udp-bench-bmon.jpg 800w, https:\/\/smartfoxserver.com\/blog\/wp-content\/uploads\/2020\/03\/udp-bench-bmon-300x183.jpg 300w, https:\/\/smartfoxserver.com\/blog\/wp-content\/uploads\/2020\/03\/udp-bench-bmon-768x469.jpg 768w, https:\/\/smartfoxserver.com\/blog\/wp-content\/uploads\/2020\/03\/udp-bench-bmon-624x381.jpg 624w\" sizes=\"(max-width: 800px) 100vw, 800px\" \/><\/figure>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>SmartFoxServer 2.15.0 provides several improvements for scaling the UDP protocol to uber-high packet rates, which can be useful for real-time games with massive traffic that use very fast updates (e.g. 
30 or 50 packet\/s).<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[23],"tags":[34,117],"_links":{"self":[{"href":"https:\/\/smartfoxserver.com\/blog\/wp-json\/wp\/v2\/posts\/1274"}],"collection":[{"href":"https:\/\/smartfoxserver.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/smartfoxserver.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/smartfoxserver.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/smartfoxserver.com\/blog\/wp-json\/wp\/v2\/comments?post=1274"}],"version-history":[{"count":20,"href":"https:\/\/smartfoxserver.com\/blog\/wp-json\/wp\/v2\/posts\/1274\/revisions"}],"predecessor-version":[{"id":1343,"href":"https:\/\/smartfoxserver.com\/blog\/wp-json\/wp\/v2\/posts\/1274\/revisions\/1343"}],"wp:attachment":[{"href":"https:\/\/smartfoxserver.com\/blog\/wp-json\/wp\/v2\/media?parent=1274"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/smartfoxserver.com\/blog\/wp-json\/wp\/v2\/categories?post=1274"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/smartfoxserver.com\/blog\/wp-json\/wp\/v2\/tags?post=1274"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}