MMO idle disconnection error

moccha
Posts: 112
Joined: 13 Feb 2014, 16:09

MMO idle disconnection error

Postby moccha » 19 Jul 2021, 20:21

This issue happens when I create a server-side MMO room on the fly via getApi().createRoom. If the client does not send an entry point, the server outputs an error and causes disconnections. Steps I take are:

1. I log Player1 in to a zone and join the user to a dynamically created MMORoom named "Player1MMO".
2. I do not send an entry point from the client, which starts the userMaxLimboSeconds timer.
3. After about 50 seconds, the server tries to disconnect the user, but outputs an error:

Code: Select all

[https-jsse-nio-8443-exec-7] websocket.SFS2XWSService     - Error writing to client: { Id: 4, Type: WEBSOCKET, Logged: Yes, IP: 127.0.0.1:52842 }
java.lang.InterruptedException
   at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1367)
   at java.base/java.util.concurrent.Semaphore.tryAcquire(Semaphore.java:415)
   at org.apache.tomcat.util.net.SocketWrapperBase.vectoredOperation(SocketWrapperBase.java:1451)
   at org.apache.tomcat.util.net.SocketWrapperBase.write(SocketWrapperBase.java:1403)
   at org.apache.tomcat.util.net.SocketWrapperBase.write(SocketWrapperBase.java:1374)
   at org.apache.tomcat.websocket.server.WsRemoteEndpointImplServer.doWrite(WsRemoteEndpointImplServer.java:91)
   at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.writeMessagePart(WsRemoteEndpointImplBase.java:494)
   at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.startMessage(WsRemoteEndpointImplBase.java:381)
   at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.sendBytesByCompletion(WsRemoteEndpointImplBase.java:151)
   at org.apache.tomcat.websocket.WsRemoteEndpointAsync.sendBinary(WsRemoteEndpointAsync.java:65)
   at sfs2x.ws.tomcat.websocket.SFS2XWSService._write(SFS2XWSService.java:456)
   at sfs2x.ws.tomcat.websocket.SFS2XWSService.write(SFS2XWSService.java:449)
   at com.smartfoxserver.bitswarm.websocket.tomcat.WebSocketBinaryProtocolCodec.onPacketWrite(WebSocketBinaryProtocolCodec.java:121)
   at com.smartfoxserver.bitswarm.core.BitSwarmEngine.writeToWebSocket(BitSwarmEngine.java:427)
   at com.smartfoxserver.bitswarm.core.BitSwarmEngine.write(BitSwarmEngine.java:408)
   at com.smartfoxserver.bitswarm.io.Response.write(Response.java:70)
   at com.smartfoxserver.v2.api.response.SFSResponseApi.notifyRoomRemoved(SFSResponseApi.java:144)
   at com.smartfoxserver.v2.api.SFSApi.removeRoom(SFSApi.java:389)
   at com.smartfoxserver.v2.api.SFSApi.removeRoom(SFSApi.java:371)
   at com.smartfoxserver.v2.entities.managers.SFSRoomManager.removeWhenEmpty(SFSRoomManager.java:633)
   at com.smartfoxserver.v2.entities.managers.SFSRoomManager.handleAutoRemove(SFSRoomManager.java:611)
   at com.smartfoxserver.v2.entities.managers.SFSRoomManager.removeUser(SFSRoomManager.java:512)
   at com.smartfoxserver.v2.entities.SFSZone.removeUserFromRoom(SFSZone.java:992)
   at com.smartfoxserver.v2.api.SFSApi.leaveRoom(SFSApi.java:1087)
   at com.smartfoxserver.v2.api.SFSApi.leaveRoom(SFSApi.java:1037)
   at com.smartfoxserver.v2.mmo.MMORoomCleaner.kickUserOut(MMORoomCleaner.java:69)
   at com.smartfoxserver.v2.mmo.MMORoomCleaner.run(MMORoomCleaner.java:51)
   at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
   at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
   at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
   at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
   at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)


The way the client responds to this error varies. For HTML5, the client receives a disconnection event, but the server sometimes doesn't disconnect the user properly. For other clients, such as Flash, the user seems to be disconnected, but the client never receives the event and bugs out.

If I send an XY entry point when logging in, this issue does not occur. However, if multiple users are logged in and any one of the clients does not send an entry point, this server error happens and causes all logged-in users' clients to bug out and/or disconnect.

I'm wondering if something is wrong with userMaxLimboSeconds, because when I try to set it to a lower value via setUserMaxLimboSeconds (such as 5 seconds), it still seems to wait 50 seconds before disconnecting the user. Here's what the config settings look like for the MMO room:

Code: Select all

cfg.setName(user.getName() + "MMO");
cfg.setHidden(true);
cfg.setDefaultAOI(new Vec3D(100,100,0));
cfg.setUserMaxLimboSeconds(5);
cfg.setProximityListUpdateMillis(500);
cfg.setSendAOIEntryPoint(true);
cfg.setMaxVariablesAllowed(7);
cfg.setMaxUsers(8);
cfg.setDynamic(true);
cfg.setAutoRemoveMode(SFSRoomRemoveMode.WHEN_EMPTY);
cfg.setRoomVariables(roomVars);
Lapo
Site Admin
Posts: 22999
Joined: 21 Mar 2005, 09:50
Location: Italy

Re: MMO idle disconnection error

Postby Lapo » 19 Jul 2021, 21:22

Hi,
what version of SFS2X are you running exactly?

moccha wrote: 3. After about 50 seconds, the server tries to disconnect the user, but the server outputs an error:

It's not disconnecting the user; it's just removing them from the MMORoom if no position is set for 50-60 seconds after joining. (I think the problem might be connected to the fact that a Task is calling Tomcat's write method, but then the Task ends, and the related thread "dies" before the operation is complete.)

Generally speaking, you're supposed to set a position right after joining the Room; otherwise there's no point in staying in the MMORoom. The position is necessary for the system to locate the User in 2D/3D space and determine which events the player can receive.

As regards setting the userMaxLimboSeconds property to a low value such as 5 seconds, I think the problem is the resolution of the timer that checks for the existence of a position. It is not high-priority enough to run every second, and I think it's set to run every minute.
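
To illustrate the timer-resolution point, here is a minimal sketch in plain Java (not SFS2X code). It assumes the limbo check runs on a fixed period and removes a user at the first check on or after the limbo deadline. The 60-second period is only the guess stated above, the join offset is made up, and the class and method names are hypothetical.

```java
final class LimboSweepModel {

    /**
     * Models a periodic check that fires at t = 0, sweepPeriod, 2 * sweepPeriod, ...
     * A user who joined at joinTime becomes removable at joinTime + limboSeconds
     * and is actually removed at the first check on or after that deadline.
     * Returns how many seconds the user effectively spends in limbo.
     * All values are in seconds and assumed non-negative, with sweepPeriod > 0.
     */
    static long effectiveLimboSeconds(long limboSeconds, long sweepPeriod, long joinTime) {
        long deadline = joinTime + limboSeconds;
        // Round the deadline up to the next multiple of the sweep period.
        long firstCheckAtOrAfterDeadline =
                ((deadline + sweepPeriod - 1) / sweepPeriod) * sweepPeriod;
        return firstCheckAtOrAfterDeadline - joinTime;
    }

    public static void main(String[] args) {
        // Hypothetical case: limbo limit of 5 s, check every 60 s,
        // user joins 10 s after a check has just run.
        System.out.println(effectiveLimboSeconds(5, 60, 10));
        // Same limits, but with a check every second.
        System.out.println(effectiveLimboSeconds(5, 1, 10));
    }
}
```

Under these assumptions a 5-second limit has no visible effect on its own: the removal time is dominated by when the next check happens to run.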

Let us know.
Lapo
--
gotoAndPlay()
...addicted to flash games
moccha
Posts: 112
Joined: 13 Feb 2014, 16:09

Re: MMO idle disconnection error

Postby moccha » 19 Jul 2021, 21:50


I'm currently running 2.17.0.

I tested again and the issue only appears when there's a WebSocket client connected, which is what my HTML5 client is. The problem also persists when I don't set the timeout and leave it at the default value, but thanks for the tip about the minute interval.

The thing that is strange is that if another client is removed from a different room, unrelated to the HTML5 client, all HTML5 clients are still affected and are disconnected.

For example, if I connect with an HTML5 client and a Flash client and then the Flash client is idle too long, the "remove from room" code runs and the HTML5 client disconnects/crashes. I think there's just some sort of problem happening with the websocket. Maybe related to the room being removed (SFSRoomRemoveMode.WHEN_EMPTY) and the user being removed (limbo timeout) at the same time?
Lapo
Site Admin
Posts: 22999
Joined: 21 Mar 2005, 09:50
Location: Italy

Re: MMO idle disconnection error

Postby Lapo » 20 Jul 2021, 10:32

Thanks for the details, we'll investigate and let you know.

Cheers
Lapo

Lapo
Site Admin
Posts: 22999
Joined: 21 Mar 2005, 09:50
Location: Italy

Re: MMO idle disconnection error

Postby Lapo » 20 Jul 2021, 14:39

I've tested the scenario and was not able to reproduce it.
I connected a client via WebSocket and joined an MMORoom without setting any position. The result is that the client is auto-removed from the Room after a while (50 secs) but stays connected until the idle timer kicks in and the inactive user is finally disconnected.

No errors found.
So the question now is, are you running any server side code in the Zone (or Room)? If so it looks like the problem is caused by a Scheduled Task. Can you let us know if you are running any and what they do?

Thanks
Lapo

moccha
Posts: 112
Joined: 13 Feb 2014, 16:09

Re: MMO idle disconnection error

Postby moccha » 21 Jul 2021, 20:07

I tried commenting out all of my code within the main server functions and it still appears to be happening. I am not running any scheduled tasks when this error happens. I have narrowed down the thing that triggers the issue for me to "setAutoRemoveMode(SFSRoomRemoveMode.WHEN_EMPTY)".

I create a room server-side, as follows:

Code: Select all

Room home = getApi().createRoom(user.getZone(), homeConfig(user), user);
getApi().joinRoom(user, home, null, false, null); // Home



And here is my homeConfig function:

Code: Select all

private CreateMMORoomSettings homeConfig(User user)
{
    CreateMMORoomSettings cfg = new CreateMMORoomSettings();

    cfg.setName(user.getName() + "Home");
    cfg.setHidden(true);
    cfg.setDefaultAOI(new Vec3D(100,100,0));
    cfg.setProximityListUpdateMillis(500);
    cfg.setSendAOIEntryPoint(true);
    cfg.setMaxVariablesAllowed(7);
    cfg.setMaxUsers(8);
    cfg.setDynamic(true);
    cfg.setAutoRemoveMode(SFSRoomRemoveMode.WHEN_EMPTY);
    cfg.setRoomVariables(roomVars);

    return cfg;
}


If I comment out "cfg.setAutoRemoveMode(SFSRoomRemoveMode.WHEN_EMPTY)", the server does not generate an error and my client functions as normal. The limbo timer removes them and nothing bad happens.

But when another non-websocket client has their room removed, it also causes the websocket issue and forces the websocket client to disconnect. On my websocket client it's receiving a "disconnect" event.

It seems that the issue affects websocket clients whenever one is connected to the server while the WHEN_EMPTY removal runs, but only when that removal is triggered by the limbo timer kicking someone.

Another test: I tried changing the setUserMaxLimboSeconds to a very high value, but the code still disconnected me after 50 seconds.

Note: Perhaps this problem is caused by Java 11.
moccha
Posts: 112
Joined: 13 Feb 2014, 16:09

Re: MMO idle disconnection error

Postby moccha » 22 Jul 2021, 15:54

Further testing has revealed the following error shows up after the first error I posted:

Code: Select all

Exception: java.lang.IllegalStateException
Message: The remote endpoint was in state [BINARY_FULL_WRITING] which is an invalid state for called method
Description: Error during websocket packet write


I'm under the impression that Tomcat is writing a packet to the websocket client and then the "room kick" function (or whichever comes after) also tries to write, which causes a packet write issue. You can see in the original trace log that there weren't any errors caught from the extension, only backend ones. If I make the non-websocket client disconnect in any other way, no errors are thrown for the websocket client.

My fix for now is to set the rooms to NEVER_REMOVE and then in a server-side USER_LEAVE_ROOM handler:

Code: Select all

if (room.isDynamic() && room.isEmpty())
{
    getApi().removeRoom(room);
}
Lapo
Site Admin
Posts: 22999
Joined: 21 Mar 2005, 09:50
Location: Italy

Re: MMO idle disconnection error

Postby Lapo » 23 Jul 2021, 09:58

Thanks for the added details.
We have reproduced the issue using the same Room settings. The problem is caused by a Task that is responsible for removing users in "Limbo" (who joined but didn't send an initial position).

When the Room is destroyed (via the WHEN_EMPTY rule) the task is also interrupted, which in turn interrupts the WebSocket write in progress. To fix this, we changed the policy that stops the task: instead of interrupting it immediately, we let it finish whatever activity is going on. With that change the problem no longer appears.
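
The difference between the two stop policies can be sketched with plain JDK classes. This is not the SFS2X implementation, and mapping the two policies to `shutdownNow()` and `shutdown()` is an assumption made for illustration only. The blocking `Semaphore` wait stands in for the Tomcat write seen in the stack trace, and the class and method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

final class TaskStopPolicyDemo {

    /** Starts one simulated blocking write, then stops the pool while it is in flight. */
    static String runAndStop(boolean interruptImmediately) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Semaphore writeDone = new Semaphore(0);          // released when the simulated I/O finishes
        CountDownLatch started = new CountDownLatch(1);  // lets the caller wait until the task runs

        Future<String> outcome = pool.submit(() -> {
            started.countDown();
            try {
                writeDone.acquire();                     // blocks like a socket write waiting on a lock
                return "write completed";
            } catch (InterruptedException e) {
                return "write interrupted";              // the failure mode shown in the trace
            }
        });

        started.await();
        if (interruptImmediately) {
            pool.shutdownNow();                          // interrupts the worker thread mid-write
        } else {
            pool.shutdown();                             // rejects new tasks, lets the current one finish
            writeDone.release();                         // the simulated I/O completes normally
        }
        return outcome.get(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        System.out.println("interrupt immediately: " + runAndStop(true));
        System.out.println("let it finish: " + runAndStop(false));
    }
}
```

In this sketch the first call reports "write interrupted" and the second "write completed", which mirrors the before/after behaviour described above.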

If you want we can send an update that you can test on your side and confirm that it works for you as well. We'll then add the fix and a number of other changes to the next update.

To do that, send an email to our support@... address with a reference to this thread.

Cheers
Lapo

moccha
Posts: 112
Joined: 13 Feb 2014, 16:09

Re: MMO idle disconnection error

Postby moccha » 23 Jul 2021, 15:11

Thank you. I'll have the guy who originally found the bug contact you. You can credit him too if you do that.
moccha
Posts: 112
Joined: 13 Feb 2014, 16:09

Re: MMO idle disconnection error

Postby moccha » 02 Sep 2021, 20:14

Hi again,

Was this fix patched into 2.17.3? If not, I will hold off on updating.
Lapo
Site Admin
Posts: 22999
Joined: 21 Mar 2005, 09:50
Location: Italy

Re: MMO idle disconnection error

Postby Lapo » 03 Sep 2021, 08:23

moccha wrote: Was this fix patched into 2.17.3? If not, I will hold off on updating.

Yes, it is fixed in 2.17.3.

Cheers
Lapo
