Java JVM On Docker

Introduction

Have you ever experienced "random" failures when running JVM-based applications in Docker? Or some strange freezes? Both may be caused by the poor Docker support in Java 8.

Docker uses control groups (cgroups) to limit resource usage. Limiting the memory and CPU an application can use when it runs in a container is definitely a good idea: it prevents the application from consuming all available memory and/or CPU, which would leave other containers on the same system unable to respond. Limiting resource usage improves the reliability and stability of applications, and it also provides a basis for hardware capacity planning. This is especially important when running containers on an orchestration platform such as Kubernetes or DC/OS.

The JVM can "see" all the memory and all the CPU cores available on the system and aligns itself with those resources. By default it sets the max heap size to 1/4 of the system memory and sizes several thread pools (for example the GC threads) according to the number of physical CPU cores.
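As a quick illustration (a minimal sketch, not part of the original post, with a hypothetical class name), the following program prints the resources the JVM believes it may use. On an old Java 8 build running inside a constrained container it still reports the host's memory and CPU count:

// Hypothetical helper: print the resources this JVM thinks it has.
public class WhatTheJvmSees {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // the maximum heap the JVM is willing to grow to (MaxHeapSize / -Xmx)
        System.out.println("max heap (bytes): " + rt.maxMemory());
        // the processor count used to size internal thread pools
        System.out.println("available processors: " + rt.availableProcessors());
    }
}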

This article uses the official OpenJDK Docker images, which are distributed under the GNU GPL v2 license. The Docker support described here was introduced in Oracle Java SE Development Kit (JDK) 8 update 191. Note that Oracle changed the license policy for Java 8 updates in April 2019: commercial use of Java SE 8 updates later than update 211 is no longer free.


Practice

Let's take a look at the following example:

import java.util.Vector;

// Allocates 1 MB chunks in an endless loop, holding on to every chunk,
// until the JVM (or the container) runs out of memory.
public class MemoryEater
{
    public static void main(String[] args)
    {
        Vector v = new Vector();
        while (true)
        {
            byte b[] = new byte[1048576]; // 1 MB
            v.add(b);                     // keep a reference so it is never collected
            Runtime rt = Runtime.getRuntime();
            System.out.println("free memory: " + rt.freeMemory());
        }
    }
}

We run it on a system with 64GB of memory, so let’s check the default maximum heap size:

docker run -ti openjdk:8u181-jdk
java -XX:+PrintFlagsFinal -version | grep MaxHeap
uintx MaxHeapFreeRatio = 100
uintx MaxHeapSize := 16819159040 {product}

openjdk version "1.8.0_181"
OpenJDK Runtime Environment (build 1.8.0_181-8u181-b13-2~deb9u1-b13)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)

As described above, it is 1/4 of the physical memory, about 16GB. What will happen if we limit the memory using Docker cgroups? Let's check:

docker run -ti -m 512M openjdk:8u181-jdk
javac MemoryEater.java

Note: MemoryEater.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

java MemoryEater

free memory: 1003980048
free memory: 1003980048
free memory: 1003980048
[...]
free memory: 803562640
free memory: 802514048
free memory: 801465456
free memory: 800416864
Killed

The JVM process was killed. Since it was a child process of the container's shell, the container itself survived, but normally, when java is the only process inside the container (running as PID 1), the container itself will crash.

Let’s look into system logs:

dcos-agent-1 kernel: java invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=0
dcos-agent-1 kernel: java cpuset=eca214e0fcd4b245eecb2a80c05e9d7f8688fc36979c510d2fb9afab2ce55712 mems_allowed=0
dcos-agent-1 kernel: CPU: 6 PID: 4142 Comm: java Tainted: G ------------ T 3.10.0-693.17.1.el7.x86_64 #1
dcos-agent-1 kernel: Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015
dcos-agent-1 kernel: Call Trace:
dcos-agent-1 kernel: [<ffffffff816a6071>] dump_stack+0x19/0x1b
dcos-agent-1 kernel: [<ffffffff816a1466>] dump_header+0x90/0x229
dcos-agent-1 kernel: [<ffffffff81187dc6>] ? find_lock_task_mm+0x56/0xc0
dcos-agent-1 kernel: [<ffffffff811f36a8>] ? try_get_mem_cgroup_from_mm+0x28/0x60
dcos-agent-1 kernel: [<ffffffff81188274>] oom_kill_process+0x254/0x3d0
dcos-agent-1 kernel: [<ffffffff812ba2fc>] ? selinux_capable+0x1c/0x40
dcos-agent-1 kernel: [<ffffffff811f73c6>] mem_cgroup_oom_synchronize+0x546/0x570
dcos-agent-1 kernel: [<ffffffff811f6840>] ? mem_cgroup_charge_common+0xc0/0xc0
dcos-agent-1 kernel: [<ffffffff81188b04>] pagefault_out_of_memory+0x14/0x90
dcos-agent-1 kernel: [<ffffffff8169f82e>] mm_fault_error+0x68/0x12b
dcos-agent-1 kernel: [<ffffffff816b3a21>] __do_page_fault+0x391/0x450
dcos-agent-1 kernel: [<ffffffff816b3b15>] do_page_fault+0x35/0x90
dcos-agent-1 kernel: [<ffffffff816af8f8>] page_fault+0x28/0x30
dcos-agent-1 kernel: Task in /docker/eca214e0fcd4b245eecb2a80c05e9d7f8688fc36979c510d2fb9afab2ce55712 killed as a result of limit of /docker/eca214e0fcd4b245eecb2a80c05e9d7f8688fc36979c510d2fb9afab2ce55712
dcos-agent-1 kernel: memory: usage 524180kB, limit 524288kB, failcnt 314788
dcos-agent-1 kernel: memory+swap: usage 1048576kB, limit 1048576kB, failcnt 6
dcos-agent-1 kernel: kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
dcos-agent-1 kernel: Memory cgroup stats for /docker/eca214e0fcd4b245eecb2a80c05e9d7f8688fc36979c510d2fb9afab2ce55712: cache:28KB rss:524152KB rss_huge:0KB swap:524396KB inactive_anon:262176KB active_anon:261976KB inactive_file:8KB active_file:4KB unevictable:0KB
dcos-agent-1 kernel: [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
dcos-agent-1 kernel: [ 1400] 0 1400 4985 418 14 139 0 bash
dcos-agent-1 kernel: [ 4141] 0 4141 4956003 126966 606 137837 0 java
dcos-agent-1 kernel: Memory cgroup out of memory: Kill process 4162 (java) score 1012 or sacrifice child
dcos-agent-1 kernel: Killed process 4141 (java) total-vm:19824012kB, anon-rss:495748kB, file-rss:12116kB, shmem-rss:0kB

Failures like these can be very hard to debug: there is nothing in the application logs. This is especially true on managed systems like AWS ECS.

And what about CPUs? Let's check by running a small program that displays the number of available processors:

public class AvailableProcessors {
    public static void main(String[] args) {
        // check the number of processors available
        System.out.println("" + Runtime.getRuntime().availableProcessors());
    }
}

Let's run it in a Docker container with the number of CPUs set to 1:

$ docker run -ti --cpus 1 openjdk:8u181-jdk
javac AvailableProcessors.java
java AvailableProcessors
12

Not good: there are indeed 12 CPUs on this system. So even though the number of available processors is limited to 1, the JVM will try to use 12.

For example, the number of GC threads is determined by the following rule: on a machine with N hardware threads, where N is greater than 8, the parallel collector uses a fixed fraction of N as the number of garbage collector threads. The fraction is approximately 5/8 for large values of N. For values of N below 8, the number used is N.
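Assuming the heuristic for N greater than 8 is 8 + (N - 8) * 5/8 with integer arithmetic (this exact expression is an assumption based on the commonly quoted HotSpot formula, not something stated in this article), the expected value for our 12-core machine can be computed with a small sketch:

// Hypothetical sketch of the GC thread heuristic described above.
public class GcThreadsEstimate {
    static int parallelGcThreads(int cpus) {
        // N for N <= 8, otherwise 8 + (N - 8) * 5/8, using integer division
        return cpus <= 8 ? cpus : 8 + (cpus - 8) * 5 / 8;
    }
    public static void main(String[] args) {
        System.out.println(parallelGcThreads(12)); // prints 10
    }
}

For 12 hardware threads this gives 8 + (4 * 5) / 8 = 10, which matches the ParallelGCThreads value reported below.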

In our case:

java -XX:+PrintFlagsFinal -version | grep ParallelGCThreads
uintx ParallelGCThreads = 10 {product}

Solution

In newer Java versions (10 and above), Docker support is built in. But sometimes upgrading is not an option, for example when the application is incompatible with a newer JVM.

The good news: Docker support was also backported to Java 8. Let’s check the newest openjdk image tagged as 8u212.

We'll limit the memory to 1G and use 1 CPU: docker run -ti --cpus 1 -m 1G openjdk:8u212-jdk

The memory:

java -XX:+PrintFlagsFinal -version | grep MaxHeap
uintx MaxHeapFreeRatio = 70 {manageable}
uintx MaxHeapSize := 268435456 {product}
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (build 1.8.0_212-8u212-b01-1~deb9u1-b01)
OpenJDK 64-Bit Server VM (build 25.212-b01, mixed mode)
# It’s 256M — exactly 1/4 of allocated memory.

java AvailableProcessors
1

Moreover, there are some new settings for sizing the heap relative to the container's memory limit (see the example after the list):

  • -XX:InitialRAMPercentage
  • -XX:MaxRAMPercentage
  • -XX:MinRAMPercentage
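For example (hypothetical value), running the same 8u212 container with

java -XX:MaxRAMPercentage=75.0 -XX:+PrintFlagsFinal -version | grep MaxHeapSize

should report a MaxHeapSize of roughly 3/4 of the 1G limit instead of the default 1/4.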

If for some reason the new JVM behaviour is not desired, it can be switched off using -XX:-UseContainerSupport.
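For example, running

java -XX:-UseContainerSupport -XX:+PrintFlagsFinal -version | grep MaxHeapSize

inside the same container should show the heap being sized against the host's memory again.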

On Kubernetes
limits.cpu <==> --cpu-quota # the CpuQuota value shown by docker inspect
requests.cpu <==> --cpu-shares # the CpuShares value shown by docker inspect
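For example (hypothetical manifest values), a container with resources.limits.cpu: 1 and resources.requests.cpu: 500m ends up with CpuQuota=100000 and CpuShares=512 in docker inspect (assuming a Docker-based runtime), and a container-aware JVM derives its processor count and thread pool sizes from these values.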

