Potential memory leak in websocket

Post here your questions about SFS2X. Here we discuss all server-side matters. For client API questions see the dedicated forums.

Moderators: Lapo, Bax

htbn_hoang
Posts: 38
Joined: 07 Oct 2013, 19:33

Potential memory leak in websocket

Postby htbn_hoang » 03 Dec 2017, 04:35

Hi,

Our production server keeps crashing every 7~10 days due to OutOfMemoryError. I quite believe there's a memory leak whenever we receive following exception
04:12:42,302 WARN [pool-1-thread-3] websocket.SFS2XWSService - I/O Exception writing text to websocket: org.eclipse.jetty.websocket.api.WebSocketException: RemoteEndpoint unavailable, current state [CLOSING], expecting [OPEN or CONNECTED], { Id: 12, Type: WEBSOCKET, Logged: Yes, IP: 14.169.41.202:58427 }
04:13:14,901 WARN [pool-1-thread-2] compress.CompressExtension -
java.io.EOFException: Connection has been closed locally
at org.eclipse.jetty.websocket.common.io.FrameFlusher.enqueue(FrameFlusher.java:339)
at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.outgoingFrame(AbstractWebSocketConnection.java:614)
at org.eclipse.jetty.websocket.common.extensions.AbstractExtension.nextOutgoingFrame(AbstractExtension.java:182)
at org.eclipse.jetty.websocket.common.extensions.compress.PerMessageDeflateExtension.nextOutgoingFrame(PerMessageDeflateExtension.java:116)
at org.eclipse.jetty.websocket.common.extensions.compress.CompressExtension.access$1000(CompressExtension.java:42)
at org.eclipse.jetty.websocket.common.extensions.compress.CompressExtension$Flusher.compress(CompressExtension.java:534)
at org.eclipse.jetty.websocket.common.extensions.compress.CompressExtension$Flusher.deflate(CompressExtension.java:435)
at org.eclipse.jetty.websocket.common.extensions.compress.CompressExtension$Flusher.process(CompressExtension.java:415)
at org.eclipse.jetty.util.IteratingCallback.processing(IteratingCallback.java:241)
at org.eclipse.jetty.util.IteratingCallback.iterate(IteratingCallback.java:224)
at org.eclipse.jetty.websocket.common.extensions.compress.CompressExtension.outgoingFrame(CompressExtension.java:218)
at org.eclipse.jetty.websocket.common.extensions.ExtensionStack$Flusher.process(ExtensionStack.java:395)
at org.eclipse.jetty.util.IteratingCallback.processing(IteratingCallback.java:241)
at org.eclipse.jetty.util.IteratingCallback.iterate(IteratingCallback.java:224)
at org.eclipse.jetty.websocket.common.extensions.ExtensionStack.outgoingFrame(ExtensionStack.java:296)
at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.uncheckedSendFrame(WebSocketRemoteEndpoint.java:310)
at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.sendAsyncFrame(WebSocketRemoteEndpoint.java:240)
at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.sendStringByFuture(WebSocketRemoteEndpoint.java:412)
at sfs2x.websocket.SFS2XWSService$WSChannelListener.write(SFS2XWSService.java:369)
at com.smartfoxserver.bitswarm.websocket.jetty.WebSocketTextProtocolCodec.onPacketWrite(WebSocketTextProtocolCodec.java:106)
at com.smartfoxserver.bitswarm.core.BitSwarmEngine.writeToWebSocket(BitSwarmEngine.java:417)
at com.smartfoxserver.bitswarm.core.BitSwarmEngine.write(BitSwarmEngine.java:398)
at com.smartfoxserver.bitswarm.io.Response.write(Response.java:71)
at com.smartfoxserver.v2.api.response.SFSResponseApi.sendExtResponse(SFSResponseApi.java:91)
at com.smartfoxserver.v2.api.SFSApi.sendExtensionResponse(SFSApi.java:1515)
at com.smartfoxserver.v2.extensions.BaseSFSExtension.send(BaseSFSExtension.java:487)
at com.smartfoxserver.v2.extensions.BaseSFSExtension.send(BaseSFSExtension.java:462)
at com.exoty.event.EventHandler.test(EventHandler.java:416)
at com.exoty.event.EventHandler.lambda$test$1(EventHandler.java:420)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:514)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:299)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
at java.base/java.lang.Thread.run(Thread.java:844)
04:13:52,447 WARN [qtp1638472859-62] compress.CompressExtension -
java.nio.channels.ClosedChannelException
at org.eclipse.jetty.io.WriteFlusher.onClose(WriteFlusher.java:498)
at org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:113)
at org.eclipse.jetty.io.ChannelEndPoint.close(ChannelEndPoint.java:118)
at org.eclipse.jetty.io.SelectChannelEndPoint.close(SelectChannelEndPoint.java:276)
at org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.shutdownOutput(SslConnection.java:935)
at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.disconnect(AbstractWebSocketConnection.java:311)
at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.access$300(AbstractWebSocketConnection.java:61)
at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection$OnDisconnectCallback.writeFailed(AbstractWebSocketConnection.java:101)
at org.eclipse.jetty.websocket.common.io.FrameFlusher.notifyCallbackFailure(FrameFlusher.java:391)
at org.eclipse.jetty.websocket.common.io.FrameFlusher.enqueue(FrameFlusher.java:339)
at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.outgoingFrame(AbstractWebSocketConnection.java:614)
at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onConnectionStateChange(AbstractWebSocketConnection.java:473)
at org.eclipse.jetty.websocket.common.io.IOState.notifyStateListeners(IOState.java:184)
at org.eclipse.jetty.websocket.common.io.IOState.onReadFailure(IOState.java:498)
at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.readParse(AbstractWebSocketConnection.java:666)
at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onFillable(AbstractWebSocketConnection.java:511)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
at org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:202)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
at java.base/java.lang.Thread.run(Thread.java:844)
04:15:42,300 WARN [qtp1638472859-45] compress.CompressExtension -
java.io.EOFException: Connection has been closed locally
at org.eclipse.jetty.websocket.common.io.FrameFlusher.enqueue(FrameFlusher.java:339)
at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.outgoingFrame(AbstractWebSocketConnection.java:614)
at org.eclipse.jetty.websocket.common.extensions.AbstractExtension.nextOutgoingFrame(AbstractExtension.java:182)
at org.eclipse.jetty.websocket.common.extensions.compress.PerMessageDeflateExtension.nextOutgoingFrame(PerMessageDeflateExtension.java:116)
at org.eclipse.jetty.websocket.common.extensions.compress.CompressExtension.access$1000(CompressExtension.java:42)
at org.eclipse.jetty.websocket.common.extensions.compress.CompressExtension$Flusher.compress(CompressExtension.java:534)
at org.eclipse.jetty.websocket.common.extensions.compress.CompressExtension$Flusher.deflate(CompressExtension.java:435)
at org.eclipse.jetty.websocket.common.extensions.compress.CompressExtension$Flusher.process(CompressExtension.java:415)
at org.eclipse.jetty.util.IteratingCallback.processing(IteratingCallback.java:241)
at org.eclipse.jetty.util.IteratingCallback.succeeded(IteratingCallback.java:365)
at org.eclipse.jetty.websocket.common.extensions.compress.CompressExtension$Flusher.writeSuccess(CompressExtension.java:557)
at org.eclipse.jetty.websocket.common.io.FrameFlusher.notifyCallbackSuccess(FrameFlusher.java:407)
at org.eclipse.jetty.websocket.common.io.FrameFlusher$Flusher.succeedEntries(FrameFlusher.java:242)
at org.eclipse.jetty.websocket.common.io.FrameFlusher$Flusher.succeeded(FrameFlusher.java:232)
at org.eclipse.jetty.io.WriteFlusher$PendingState.complete(WriteFlusher.java:269)
at org.eclipse.jetty.io.WriteFlusher.completeWrite(WriteFlusher.java:394)
at org.eclipse.jetty.io.ssl.SslConnection$1.run(SslConnection.java:101)
at org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint$1.succeeded(SslConnection.java:292)
at org.eclipse.jetty.io.WriteFlusher$PendingState.complete(WriteFlusher.java:269)
at org.eclipse.jetty.io.WriteFlusher.completeWrite(WriteFlusher.java:394)
at org.eclipse.jetty.io.SelectChannelEndPoint$3.run(SelectChannelEndPoint.java:107)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
at java.base/java.lang.Thread.run(Thread.java:844)


I've tried to reproduce this issue on our staging environment with following code

Code: Select all

private void handleUserJoinZone(ISFSEvent evt) {
   ....
   if (user.getSession().getType() == SessionType.WEBSOCKET) {
      SmartFoxServer.getInstance().getTaskScheduler().schedule(() -> this.test(user), 1, TimeUnit.SECONDS);
   }
}
         
private void test(User user) {
   List<Integer> list = IntStream.range(0, 1000).boxed().collect(Collectors.toList());

   ISFSObject data = new SFSObject();
   data.putIntArray("t", list);

   for (int i = 0; i < 1000; i++) {
      extension.send("hoang", data, user);
   }

   if (user.isConnected()) {
      SmartFoxServer.getInstance().getTaskScheduler().schedule(() -> this.test(user), 1, TimeUnit.SECONDS);
   }
}

I close browser tab to generate the exception. Please take a look at attachment for the heap dump.
heapdump.png
(103.51 KiB) Not downloaded yet


SmartfoxServer 2X v2.13
CentOS 7
JDK 9.0.1

I'm not sure if this issue is same as https://stackoverflow.com/questions/34578235/jetty-possible-memory-leak-when-using-websockets-and-bytebuffer

I've tried to upgrade jetty to Release 9.4.8.v20171121 but failed (cannot make webservice works).
So, what do you think about this issue? And how can I upgrade jetty to its latest version?

Thanks,
Hoang
User avatar
Lapo
Site Admin
Posts: 23008
Joined: 21 Mar 2005, 09:50
Location: Italy

Re: Potential memory leak in websocket

Postby Lapo » 04 Dec 2017, 09:40

Hi,
the exception you have posted doesn't indicate an OOM error, in fact it is referring to a generic network error, like a sudden disconnection.
Do you have other log data that shows the OOM caused by Jetty?

What is the average CCU count in your server?
Does the problem appear after you reach a certain amount of CCU?
Can you describe what your application does? Is it a turn based/chat app? Or is it fast/realtime with high packets per second?

Have you tried adding more heap memory to the JVM?

As regards the example code you have sent I'd like to understand this better:
1- the client logs into the server
2- on successful login the server sends an array of 1000 Ints
3- the client receives the data
4- client close the browser tab

This causes an OOM?

Finally, SFS2X has not been tested using JDK 9 so I would recommend sticking with the default JRE 8 as much as possible, especially due to the many changes introduced in JDK 9.

Let us know
Lapo
--
gotoAndPlay()
...addicted to flash games
htbn_hoang
Posts: 38
Joined: 07 Oct 2013, 19:33

Re: Potential memory leak in websocket

Postby htbn_hoang » 04 Dec 2017, 10:08

Hi,

For the exception, I know that it happens because of network error. I make a test to check whether there's a memory leak or not. Every time I can produce this exception (closing browser tab), memory usage will go up and stay there.

On production, memory usage keeps going up hour after hour until the server crash. I'm not really sure but I think we have this issue with SFS 2X 2.13 even on JDK 8. We use SFS 2X for more than 4 years now and we only experience this issue recently.

For your questions:
- Our CCU is go up and down between 800 and 6500 on daily basic.
- Our server crash after ~10 days
- We use SFS for Card Game (game rule only) with simple chat service.
- Our client is written in Unity3D and we support iOS, Android and WebGL.
- Our heap configuration is -Xms3g -Xmx8g (I think it should be enough)

As regards the example code you have sent I'd like to understand this better:
1- the client logs into the server
2- on successful login the server sends an array of 1000 Ints
3- the client receives the data
4- client close the browser tab

This causes an OOM?


You are right about this, but you need to close browser tab while SFS is sending the data. Maybe you can increase to 10K or 100K Ints to make the test a bit easier.

Hope it help,
Hoang
User avatar
Lapo
Site Admin
Posts: 23008
Joined: 21 Mar 2005, 09:50
Location: Italy

Re: Potential memory leak in websocket

Postby Lapo » 04 Dec 2017, 10:53

Thanks for the info.
From your heap dump screenshot it's not very clear what the problem might be. The most numerous objects in memory are Strings and byte[].

Following the Stacktrace article you posted and the links in it I am not too sure this is the same thing. Those bug reports say that there are lots of jetty object leaking into memory. In this case the jetty objects don't seem to be taking much memory. In fact in the screenshot they take ~2% of the heap space.

We'll try to reproduce the disconnection issue and see what happens.

Thanks
Lapo

--

gotoAndPlay()

...addicted to flash games
htbn_hoang
Posts: 38
Joined: 07 Oct 2013, 19:33

Re: Potential memory leak in websocket

Postby htbn_hoang » 04 Dec 2017, 11:36

Hi,

Just to be clear, the screenshot is taken from heap dump of my test, not from our production's. I think I generate 4 or 5 WebSocketException before taking heap dump.
User avatar
Lapo
Site Admin
Posts: 23008
Joined: 21 Mar 2005, 09:50
Location: Italy

Re: Potential memory leak in websocket

Postby Lapo » 04 Dec 2017, 15:35

Okay.
More data from your production server would be very useful to understand what is going on.
Here's an article that could help gathering more info:
http://docs2x.smartfoxserver.com/Advanc ... ive-server

Thanks
Lapo

--

gotoAndPlay()

...addicted to flash games
User avatar
Lapo
Site Admin
Posts: 23008
Joined: 21 Mar 2005, 09:50
Location: Italy

Re: Potential memory leak in websocket

Postby Lapo » 04 Dec 2017, 17:16

One more observation. In your test code you run this:

Code: Select all

   for (int i = 0; i < 1000; i++) {
      extension.send("hoang", data, user);
   }

But this is a non realistic use case. No one does this. You don't send data in a for loop ... so I don't understand what you're really testing here.

I would highly recommend to investigate the production server logs, especially what is causing the OOM and, if possible, a heap dump too.

Thanks
Lapo

--

gotoAndPlay()

...addicted to flash games
htbn_hoang
Posts: 38
Joined: 07 Oct 2013, 19:33

Re: Potential memory leak in websocket

Postby htbn_hoang » 04 Dec 2017, 17:42

I just want to send a huge data to client so that I can generate the exception easier.

Best,
User avatar
Lapo
Site Admin
Posts: 23008
Joined: 21 Mar 2005, 09:50
Location: Italy

Re: Potential memory leak in websocket

Postby Lapo » 05 Dec 2017, 10:13

I understand the huge packet, but this is different.
You're literally bombarding the client with a massive packet flood. I am not surprised to see some kind of exception happening on the server side, because of course Jetty is force to write a bagillion packets per second, but this is no indication of a problem.

On the live/production server there's no such scenario, so it doesn't make much sense to run this test locally and then infer that the same is happening online.

In any case, I am not dismissing your report. I just want to better understand what's the origin of the OOM.
As I've asked before, haven't you investigated the logs to see the stack trace of the OOM? That would be a good place to start.

Thanks
Lapo

--

gotoAndPlay()

...addicted to flash games
htbn_hoang
Posts: 38
Joined: 07 Oct 2013, 19:33

Re: Potential memory leak in websocket

Postby htbn_hoang » 07 Dec 2017, 04:49

Hi,

I have a chance to take a heap dump from our production server today. Below are some useful information. Let me know if you need something else.

leak-overview.png
(21.08 KiB) Not downloaded yet

leak-detail.png
(35.79 KiB) Not downloaded yet

dominator-classes.png
(32.4 KiB) Not downloaded yet


Best,
Hoang
User avatar
Lapo
Site Admin
Posts: 23008
Joined: 21 Mar 2005, 09:50
Location: Italy

Re: Potential memory leak in websocket

Postby Lapo » 07 Dec 2017, 08:56

Thanks.
I am not sure to interpret the data, though. The diagrams are showing that a certain amount of heap is used by jetty classes, but I am not sure why this should be considered a leak. What software are you using to create those graphs?

Are you monitoring the state of the JVM memory? You can do it via the AdminTool's Dashboard, just by keeping it open for, say, a couple of hours and see what diagram comes out. If there's a memory leak you should see a continuous increase in heap memory, even across multiple GC cycles.

Let us know
Lapo

--

gotoAndPlay()

...addicted to flash games
User avatar
Lapo
Site Admin
Posts: 23008
Joined: 21 Mar 2005, 09:50
Location: Italy

Re: Potential memory leak in websocket

Postby Lapo » 07 Dec 2017, 10:01

UPDATE:
I've run a simulation with 600 CCU running fast websocket messages and disconnecting and re-connecting them after a 50 message cycle.
This way a lot of sessions gets created and destroyed. Also I forced the heap size to 180MB to see how the GC behaves.

Monitoring the heap state doesn't show any anomalies: the GC successfully clears the memory when needed and there's no trace of leaks.
Take a look at the attached picture.

cheers
Attachments
heap.png
(79.75 KiB) Not downloaded yet
Lapo

--

gotoAndPlay()

...addicted to flash games
User avatar
rewb0rn
Posts: 177
Joined: 26 Jan 2009, 12:54
Location: Berlin, Germany

Re: Potential memory leak in websocket

Postby rewb0rn » 19 Dec 2017, 13:08

Hey htbn_hoang,

where exactly do you find the OutOfMemoryError, is it logged in the smartfox.log or do you store the console output somewhere? Our server also keeps dropping users massively after a couple of days, and it's only happening on SFS2X v2.13, not on v2.11 or v2.09 that we also use. I have not seen a OutOfMemoryError though, therefore my question where you have seen it.

Thanks,
Ruben

Return to “SFS2X Questions”

Who is online

Users browsing this forum: No registered users and 64 guests