Alpaquita Linux: Tuning JDK for resource-constrained containers
1. Overview
The worldwide trend of running applications and services in containerized environments is not going to slow down and will continue to drive software and IT infrastructure updates.
When speaking about containers, people usually think about microservices, microcontainers, or minimized resource usage. This is true to some extent: resources can indeed be managed more wisely by using small containers tuned for a particular application and by using cgroups functionality to control the resources utilized by running containers. Control groups (cgroups) is a Linux kernel feature that organizes processes into hierarchical groups whose usage of various resources can be limited and monitored. It is now supported in many Linux distributions, including newer ones like Alpaquita. Since Alpaquita Linux provides container management engines, users can choose either engine and enable cgroups to manage resources on a container host machine.
2. Control groups overview
Control groups v1 and v2 continue to exist in parallel, and both are supported. This article provides examples based on cgroups v1 as the more mature technology that works well with Podman and Docker. The cgroups v1 implementation provides a resource-specific controller hierarchy; in other words, each resource, such as CPU, memory, I/O, and so on, has its own control group hierarchy.
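You can quickly check which cgroup version a host mounts; the following command is one common way to do it (it prints cgroup2fs when cgroups v2 is mounted and tmpfs for a cgroups v1 hierarchy):
stat -fc %T /sys/fs/cgroup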
Let’s set up cgroups in Alpaquita Linux as an example. You can download Alpaquita Linux from the BellSoft website; it is a lightweight, small, and secure distribution tuned to run Java workloads. Any other Linux distribution is also suitable as long as it provides cgroups support and Podman/Docker packages. First, enable and start the cgroups service:
sudo rc-update add cgroups
sudo rc-service cgroups start
You can set up cgroups v1 by editing /etc/rc.conf and assigning the legacy option to rc_cgroup_mode. If you choose to set up cgroups v2, assign the unified option to rc_cgroup_mode instead. For cgroups v2, you can enable the controllers in the rc_cgroup_controllers parameter with the following options: cpuset cpu io memory hugetlb pids.
Example of the rc.conf file modification:
# This sets the mode used to mount cgroups.
# "hybrid" mounts cgroups version 2 on /sys/fs/cgroup/unified and
# cgroups version 1 on /sys/fs/cgroup.
# "legacy" mounts cgroups version 1 on /sys/fs/cgroup
# "unified" mounts cgroups version 2 on /sys/fs/cgroup
rc_cgroup_mode="legacy"
For the changes to take effect, restart the cgroups service or reboot the host:
sudo rc-service cgroups restart
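To verify which hierarchies are mounted after the restart, you can list the cgroup mounts:
mount | grep cgroup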
For proper resource management with cgroups, use the corresponding OCI runtime in the setup. We recommend setting Podman’s crun runtime as the default runtime, since it works well with cgroups v1. Edit the /etc/containers/containers.conf file and change the default OCI runtime value to crun.
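With the stock configuration layout, the runtime is set in the [engine] section; a minimal sketch of the change could look like this:
# /etc/containers/containers.conf
[engine]
# use crun as the default OCI runtime
runtime = "crun"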
Typical usage of resource management is to provide the corresponding options in the podman run command. A few suitable options that you can use right away in your experiments and setups are listed in the table below.
Option | Description |
---|---|
--cgroup-conf=KEY=VALUE | When running on cgroups v2, specify the cgroup file to write to and its value. For example, --cgroup-conf=memory.high=1073741824 sets the memory.high limit to 1073741824 bytes (1 GiB). |
--cpu-period=limit | Set the CPU period for the Completely Fair Scheduler (CFS), which is a duration in microseconds. Once the container’s CPU quota is completely used, it will not be scheduled to run until the current period ends. The default value is 100000 microseconds. |
--cpu-quota=limit | Limit the CPU Completely Fair Scheduler (CFS) quota, that is, the container’s CPU usage. By default, containers run with the full CPU resource. The limit is a number in microseconds. If a number is provided, the container is allowed to use that much CPU time until the CPU period ends (controllable via --cpu-period). |
--cpus=number | Number of CPUs. The default is 0.0, which means no limit. This is shorthand for --cpu-period and --cpu-quota, therefore the option cannot be specified together with --cpu-period or --cpu-quota. |
--cpuset-cpus=number | CPUs in which to allow execution. Can be specified as a comma-separated list (e.g. 0,1), as a range (e.g. 0-3), or any combination thereof (e.g. 0-3,7,11-15). |
--memory, -m=number[unit] | Memory limit. A unit can be b (bytes), k (kibibytes), m (mebibytes), or g (gibibytes). Allows the memory available to a container to be constrained. |
For more information, see the podman run documentation.
The following is an example of resource assignment options for a running container. Typical container hosts have a lot of memory and CPU power; however, it is quite common to set limits for running containers, so that all containers get the resources they need. Here is an example of Podman container execution with cpu=1 and mem=1G limits.
podman run --cpus=1 --memory=1G -it --rm docker.io/bellsoft/liberica-runtime-container:jdk-all-17-glibc cat /sys/fs/cgroup/memory/memory.limit_in_bytes
1073741824
podman run --cpus=1 --memory=1G -it --rm docker.io/bellsoft/liberica-runtime-container:jdk-all-17-glibc cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us
100000
If you want to increase the CPU limit, update the option value accordingly.
podman run --cpus=2 --memory=1G -it --rm docker.io/bellsoft/liberica-runtime-container:jdk-all-17-glibc cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us
200000
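The --cpus value is effectively shorthand arithmetic over the CFS parameters: quota = cpus × period. With the default period of 100000 microseconds, --cpus=2 yields the 200000 quota shown above; you can confirm the period inside the container:
podman run --cpus=2 --memory=1G -it --rm docker.io/bellsoft/liberica-runtime-container:jdk-all-17-glibc cat /sys/fs/cgroup/cpu/cpu.cfs_period_us
100000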
3. JDK performance
When it comes to the JDK, we should consider some specifics to get the best out of containerized applications or services. Even though it is all about microservices, microcontainers, and minimal resource usage, Java containers can suffer from overly limited resources provided to a running container. Let’s explore one example related to Java behavior under a reduced CPU and memory budget. The following example is about garbage collection, which influences overall performance results.
Java version
Modern JDKs are designed to be aware of and friendly to container environments. Certain JDK versions are able to detect enforced resource quotas with Linux control group (cgroup) support. At the moment, all current JDK versions, such as Liberica JDK 17, Liberica JDK 11.0.16+, and Liberica JDK 8u372+, as well as newer JDK versions, support both cgroups v1 and cgroups v2 configurations. This support allows the JVM to detect resource quotas set up when running in a container, so that those quotas can be used for Java operations. Resource limits affect the garbage collector type activated by the JVM, the sizes of thread pools, the default size of the heap, and so forth.
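One convenient way to see what the JVM detects is the -XshowSettings:system option (available on Linux in JDK 11 and later), which prints the operating system metrics the JVM observes, including the container memory limit and the effective CPU count:
podman run --cpus=1 --memory=1G -it --rm docker.io/bellsoft/liberica-runtime-container:jdk-all-17-glibc java -XshowSettings:system -version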
Even though the JDK knows that it is running in a container, some default options are not suitable for Java applications in a container. This is why it makes sense to additionally tune those options, namely -XX:InitialRAMPercentage, -XX:MaxRAMPercentage, and -XX:MinRAMPercentage.
These options are specified as a percentage, which is preferable to setting the maximum and minimum heap size for an application via the -Xmx and -Xms options respectively. The RAMPercentage options set the heap size relative to the memory allocated to the container with the limit parameters, so the heap is resized automatically on redeployment.
In other words, when starting a container with either of the following commands, there is no need to redefine -Xmx and -Xms in the Dockerfile for each memory limit: the -XX:MaxRAMPercentage and -XX:MinRAMPercentage parameters set up in a Dockerfile stay unchanged.
podman run --memory=1G -it --rm docker.io/bellsoft/liberica-runtime-container:jdk-all-17-glibc java -XX:MaxRAMPercentage=50 -XX:MinRAMPercentage=50 -XX:+PrintFlagsFinal -version | grep MaxHeapSize
size_t MaxHeapSize = 536870912 {product} {ergonomic}
size_t SoftMaxHeapSize = 536870912 {manageable} {ergonomic}
podman run --memory=2G -it --rm docker.io/bellsoft/liberica-runtime-container:jdk-all-17-glibc java -XX:MaxRAMPercentage=50 -XX:MinRAMPercentage=50 -XX:+PrintFlagsFinal -version | grep MaxHeapSize
size_t MaxHeapSize = 1067450368 {product} {ergonomic}
size_t SoftMaxHeapSize = 1067450368 {manageable} {ergonomic}
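For illustration, a minimal Dockerfile sketch that bakes the percentage options into the image entry point might look like the following (app.jar is a placeholder for your application):
FROM bellsoft/liberica-runtime-container:jdk-all-17-glibc
# heap scales with the container memory limit instead of a fixed -Xmx
COPY app.jar /app/app.jar
ENTRYPOINT ["java", "-XX:MaxRAMPercentage=50", "-XX:MinRAMPercentage=50", "-jar", "/app/app.jar"]
With this setup, changing the container memory limit at deployment time resizes the heap without rebuilding the image.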
Garbage Collector behavior
Garbage collection is an important and inevitable part of the JVM that has an effect on the overall performance of an application. It is very useful to know which Garbage Collector (GC) is used by the JVM running in a container. To check the type of GC, use the -Xlog:gc=info option. For example, when the container limits the use to a single CPU, the Serial GC is selected. If more than one CPU is active and sufficient memory (at least 2 GB) is allocated to the container, the G1 GC is selected in a container-friendly Java version, such as version 11 or later. Note the selected GC depending on the CPU settings in the following examples:
podman run --cpus=1 --memory=2G -it --rm docker.io/bellsoft/liberica-runtime-container:jdk-all-17-glibc java -Xlog:gc=info -version
[0.003s][info][gc] Using Serial
openjdk version "17.0.6" 2023-01-17 LTS
OpenJDK Runtime Environment (build 17.0.6+10-LTS)
OpenJDK 64-Bit Server VM (build 17.0.6+10-LTS, mixed mode)
podman run --cpus=2 --memory=2G -it --rm docker.io/bellsoft/liberica-runtime-container:jdk-all-17-glibc java -Xlog:gc=info -version
[0.003s][info][gc] Using G1
openjdk version "17.0.6" 2023-01-17 LTS
OpenJDK Runtime Environment (build 17.0.6+10-LTS)
OpenJDK 64-Bit Server VM (build 17.0.6+10-LTS, mixed mode)
Serial GC is the oldest garbage collection mechanism, existing from the early days of Java. It can be suitable for memory- and CPU-constrained devices, but there will be long pauses in application work, especially if a significant amount of memory is involved. If you want your application or service to have lower response latency and to run longer without interruption, assign more resources to take advantage of better performing GC types. The JDK enables G1 GC automatically once resources reach certain limits. The G1 garbage collector has more predictable pause times while achieving higher throughput.
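If you prefer not to rely on ergonomics, the collector can also be selected explicitly, for example forcing G1 even with a single CPU:
podman run --cpus=1 --memory=2G -it --rm docker.io/bellsoft/liberica-runtime-container:jdk-all-17-glibc java -XX:+UseG1GC -Xlog:gc=info -version
Keep in mind that an explicit override is not a substitute for an adequate CPU budget, since G1’s concurrent work has to share the single CPU with the application.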
Performance impact
Running the G1 garbage collector provides performance benefits over Serial GC. Benchmark results can help to see how it all works. A single-threaded benchmark is suitable to run in both configurations, with --cpus=1 and --cpus=2, as long as the memory limit is the same. For simplicity, we chose the JMH project and its corresponding sample benchmarks. The container with JDK 17 (Liberica JDK build 17.0.6+10-LTS) was used to set up JMH and pre-built benchmarks in the default location. You can build the jmh-samples benchmarks from the Java Microbenchmark Harness (JMH) project.
Clone the repository and build the existing sample benchmarks as follows:
git clone https://github.com/openjdk/jmh
cd jmh
mvn clean verify
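The built benchmarks can then be baked into an image; a hypothetical Containerfile for this purpose could be as simple as the following (the /root/jmh path matches the commands used below):
FROM docker.io/bellsoft/liberica-runtime-container:jdk-all-17-glibc
# copy the built JMH tree, including test/target/benchmarks.jar
COPY jmh /root/jmh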
The container with the JMH benchmarks was created, and we can run it multiple times to collect the data needed for comparison. We chose one benchmark for this purpose, since the complete set would take too long to execute.
Since we want to see the difference in the single-threaded benchmark with different garbage collector types, 2 GB of memory was specified as the minimum for G1 to be activated. Selecting either 1 CPU or 2 CPUs activates Serial GC or G1 respectively. Specify 2 CPUs to activate the G1 garbage collector.
podman run --cpus=2 --memory=2G -it --rm 5d412591e12c java -jar /root/jmh/test/target/benchmarks.jar org.openjdk.jmh.samples.JMHSample_25_API_GA
Specify 1 CPU with the same memory for the JVM to use Serial GC.
podman run --cpus=1 --memory=2G -it --rm 5d412591e12c java -jar /root/jmh/test/target/benchmarks.jar org.openjdk.jmh.samples.JMHSample_25_API_GA
CPU option / GC type | Benchmark results |
---|---|
--cpus=1 --memory=2G, Serial GC | 646749.890 ops/s |
--cpus=2 --memory=2G, G1 GC | 694589.369 ops/s |
The benchmark results show that G1 performs better than Serial GC. Assigning 2 CPUs even to single-threaded applications or services can result in better performance. The overall performance difference, measured multiple times under different conditions, is about 4-7%. The absolute numbers can vary slightly depending on the system condition and CPU governor setup at the time.
Native Image
Liberica Native Image Kit (NIK) is based on the open-source GraalVM CE project and delivers a utility capable of converting your JVM-based application ahead of time, under the closed-world assumption, into a fully compiled native executable with almost instant startup time. GraalVM is a high-performance JDK designed to accelerate Java application performance while consuming fewer resources. It offers two ways to run Java applications: on the HotSpot JVM with the Graal just-in-time (JIT) compiler, or as an ahead-of-time (AOT) compiled native executable. Liberica NIK is delivered as a package that can be installed on your build host, since it is not needed at runtime once an application is built as a native image.
The GraalVM Community Edition (the current latest version is 22.3.2) supports Serial GC and Epsilon GC, which perform slower even if more resources are assigned to a container with an application or service. The upcoming GraalVM 23.x releases will include a "parallel" GC implementation that can be enabled using the --gc=parallel option at image build time; the number of worker threads can be set at runtime using the -XX:ParallelGCWorkers option. Worker threads are started early during application startup.
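As a sketch of how that is expected to look (app.jar and the resulting app binary are placeholders), the collector is chosen at build time and tuned at runtime:
native-image --gc=parallel -jar app.jar
./app -XX:ParallelGCWorkers=4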
For more information about Parallel GC for GraalVM, see Parallel garbage collector.
Performance impact
Lower pause times have an effect on application performance. If a particular service needs the lowest possible response times, Serial GC is not your choice.
The new "parallel" GC implementation extends the variety of GC types available in GraalVM Community Edition and helps improve response times.
The corresponding pause results were measured by natively compiling and running the HyperAlloc benchmark from the Heapothesys project. The numbers in the chart below are GC pause times in milliseconds. The benchmark was executed on Ubuntu on an 8-core i7 CPU with 8 worker threads and incremental collection turned off.
The chart and calculated mean values show that the "parallel" GC implementation provides shorter pause times, thus improving performance and latency for applications and services running in similar configurations with two or more available threads or CPUs.
4. Conclusion
Containerized applications and services do not always have to run with minimal resources assigned to a container unless it is strictly required. In many cases, Java applications or services perform better and have better response times if slightly more resources are allocated to a running container, so that a more suitable GC type is eventually used. We recommend setting limits of at least two CPUs and 2 GB of memory for a running container with Java workloads. In some cases, you might want to choose Native Image Kit to get faster startup and lower memory consumption at runtime, but that is a topic for another article.