现象

线上服务机器,没有生产日志,JSF接口状态正常

问题

通过jstack发现异常

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
"JSF-BZ-22000-22-T-552" #908 daemon prio=5 os_prio=0 tid=0x00007f925c02c000 nid=0x28dcf waiting for monitor entry [0x00007f8cc82c0000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.logging.log4j.core.appender.OutputStreamManager.writeBytes(OutputStreamManager.java:352)
- waiting to lock <0x00000006c09d6730> (a org.apache.logging.log4j.core.appender.OutputStreamManager)

// ...

"JSF-BZ-22000-22-T-335" #691 daemon prio=5 os_prio=0 tid=0x00007f9270016800 nid=0x28811 runnable [0x00007f9187af9000]
java.lang.Thread.State: RUNNABLE
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:326)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
- locked <0x00000006c056d8a0> (a java.io.BufferedOutputStream)
at java.io.PrintStream.write(PrintStream.java:480)
- locked <0x00000006c052f620> (a java.io.PrintStream)
at org.apache.logging.log4j.core.util.CloseShieldOutputStream.write(CloseShieldOutputStream.java:53)
at org.apache.logging.log4j.core.appender.OutputStreamManager.writeToDestination(OutputStreamManager.java:250)
- locked <0x00000006c09d6730> (a org.apache.logging.log4j.core.appender.OutputStreamManager)
at org.apache.logging.log4j.core.appender.OutputStreamManager.flushBuffer(OutputStreamManager.java:283)
- locked <0x00000006c09d6730> (a org.apache.logging.log4j.core.appender.OutputStreamManager)
at org.apache.logging.log4j.core.appender.OutputStreamManager.drain(OutputStreamManager.java:343)
at org.apache.logging.log4j.core.layout.TextEncoderHelper.drainIfByteBufferFull(TextEncoderHelper.java:258)
- locked <0x00000006c09d6730> (a org.apache.logging.log4j.core.appender.OutputStreamManager)
at org.apache.logging.log4j.core.layout.TextEncoderHelper.writeAndEncodeAsMuchAsPossible(TextEncoderHelper.java:197)
at org.apache.logging.log4j.core.layout.TextEncoderHelper.encodeChunkedText(TextEncoderHelper.java:157)
- locked <0x00000006c09d6730> (a org.apache.logging.log4j.core.appender.OutputStreamManager)

Locked ownable synchronizers:
- <0x0000000703465430> (a java.util.concurrent.ThreadPoolExecutor$Worker)

可以看到几乎所有杰夫线程在都被阻塞在OutputStreamManager.writeBytes上,等待锁<0x0000000703465430>释放。

https://issues.apache.org/jira/browse/LOG4J2-880 上找到了答案,大概意思是
记录日志的时候,如果往控制台打印输出日志的话,会把日志写入缓存,控制台会从缓存中取,但比如控制台没取,这时候控制台会暂停输出,不从缓存中取东西,缓存内容就不会清,日志程序会一直往里写,直到写满,线程就会停止写入,等待缓存可用,表现在程序里,就是writeBytes函数不返回,持有的锁不释放。

部署在docker中的时候,因为docker容器会一直获取标准输出的内容,自己记录docker日志,但是当缓存中的东西比较多的时候,比如日志长度特别长,docker没办法及时清空缓存,也会导致log4j出现这个问题。

解决方案

生产系统中无需往标准输入输出写日志,将日志配置文件中ConsoleAppender组件去除,问题得以解决。