Java application performance management

Milan Brankovic
6 min read · Jul 22, 2021


Building an application and writing code is one thing: challenging, but also rewarding. Once the application is finished, however, it is easy to forget to check how it behaves under real-world pressure. Sooner or later you will want to optimize some workflows, find out why the application uses so much memory, or examine why it becomes unresponsive. This is where performance management starts. There are two main factors:

  • memory consumption
  • total program runtime/application speed

The performance of a compiled Java program depends on how well the host Java virtual machine (JVM) manages its tasks, and on how well the JVM exploits the features of the underlying hardware and operating system. Any Java performance test or comparison should therefore always report the JVM version and vendor together with the OS and hardware architecture it ran on.
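As a minimal sketch of how such a report could be collected, the class below reads the standard system properties that describe the JVM and the platform. The class and method names (JvmEnvironmentReport, describe) are made up for this illustration; only the property keys and the Runtime call come from the standard library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class JvmEnvironmentReport {

    /** Collects the JVM and platform details worth recording next to a benchmark result. */
    public static Map<String, String> describe() {
        Map<String, String> info = new LinkedHashMap<>();
        info.put("java.version", System.getProperty("java.version"));
        info.put("java.vendor", System.getProperty("java.vendor"));
        info.put("java.vm.name", System.getProperty("java.vm.name"));
        info.put("java.vm.version", System.getProperty("java.vm.version"));
        info.put("os.name", System.getProperty("os.name"));
        info.put("os.version", System.getProperty("os.version"));
        info.put("os.arch", System.getProperty("os.arch"));
        // Number of processors visible to the JVM; it may differ from the physical core count.
        info.put("available.processors",
                String.valueOf(Runtime.getRuntime().availableProcessors()));
        return info;
    }

    public static void main(String[] args) {
        describe().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Printing this map at the start of every measurement run keeps results from different machines comparable.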

Let's look at the different ways to assess and improve application performance.

Code compilation

It is quite revealing to watch the JIT compiler work while a Java application is running. By setting the flag -XX:+PrintCompilation we can enable some simple output regarding the bytecode to native code compilation.

Whenever a method is compiled, a line is printed to the output. Each line consists of (1) the number of milliseconds since the VM started, (2) the order in which the method or code block was compiled, (3) the compilation level (tier), and (4) the method name.

If you want to see the same information on some remote machine where you cannot see the console output you can use the following command:

java -XX:+UnlockDiagnosticVMOptions -XX:+LogCompilation <class name>

which will write the compilation results to a log file.
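To have something to watch, here is a small hypothetical program (HotMethodDemo is a name invented for this sketch) that calls one short method often enough for the JIT compiler to consider it hot. It is only meant as a playground for the flags above, not as a benchmark.

```java
public class HotMethodDemo {

    /** A small method that is called often enough to become a compilation candidate. */
    public static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    /** Calls the hot method repeatedly and returns a checksum so the work is not discarded. */
    public static long run(int iterations) {
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            checksum += sumOfSquares(100);
        }
        return checksum;
    }

    public static void main(String[] args) {
        System.out.println(run(200_000));
    }
}
```

Running it as java -XX:+PrintCompilation HotMethodDemo should, on a HotSpot JVM, produce lines that mention HotMethodDemo::sumOfSquares among the many JDK methods compiled during startup.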

One of the interesting things to check is the size of the code cache, which can be printed with -XX:+PrintCodeCache. The maximum size of the code cache depends on the version of Java you are using. We can change the code cache size with these flags:

  • -XX:InitialCodeCacheSize=n: initial size
  • -XX:ReservedCodeCacheSize=n: max size
  • -XX:CodeCacheExpansionSize=n: how quickly the cache can grow

The size can be given in bytes, kilobytes (by adding the suffix k), megabytes (by adding the suffix m), and so on.

The code cache size can also be monitored with a tool: the JConsole application shows it as a graph over time.
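The same numbers that JConsole plots are exposed through the standard java.lang.management API, so they can also be read from inside the application. The sketch below assumes a HotSpot JVM, where the relevant memory pools have "Code" in their name (a single "Code Cache" pool on Java 8, several "CodeHeap ..." pools on newer versions with a segmented code cache). The class name CodeCacheMonitor is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

public class CodeCacheMonitor {

    /** Returns one line per memory pool whose name suggests it belongs to the code cache. */
    public static List<String> codeCacheUsage() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // Assumption: matching on the pool name, which is JVM-specific.
            if (!name.contains("Code")) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            long maxBytes = usage.getMax(); // -1 means the maximum is undefined
            String max = maxBytes < 0 ? "undefined" : (maxBytes / 1024) + " KB";
            lines.add(String.format("%s: used=%d KB, committed=%d KB, max=%s",
                    name, usage.getUsed() / 1024, usage.getCommitted() / 1024, max));
        }
        return lines;
    }

    public static void main(String[] args) {
        codeCacheUsage().forEach(System.out::println);
    }
}
```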

Additionally, the JDK ships two flavours of the VM with different compiler optimizations: a client-side offering (-client) and a VM tuned for server applications (-server). Although the Server and the Client VMs are similar, the Server VM has been specially tuned to maximize peak operating speed. The Client VM compiler serves as an upgrade for both the Classic VM and the just-in-time (JIT) compilers used by previous versions of the JDK. It does not attempt many of the more complex optimizations performed by the compiler in the Server VM, but in exchange it needs less time to analyze and compile a piece of code. The Server VM contains an advanced adaptive compiler that supports many of the same types of optimizations performed by optimizing C++ compilers, as well as some optimizations that cannot be done by traditional compilers.

Running java with -XX:+PrintFlagsFinal prints the values of all JVM configuration parameters. The list is rather long, so you can pipe the output through grep to find the setting you are interested in.

In that output you can see how many threads can be used for compilation (check the CICompilerCount property). You can increase or decrease the number of threads by setting -XX:CICompilerCount=n; the minimum is two. Another interesting flag is CompileThreshold: its value says how many times a method must be called before it gets compiled. You can decrease this value and tweak it for your purposes.
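If grepping console output is not convenient, the effective value of a flag can also be read at runtime. The sketch below relies on com.sun.management.HotSpotDiagnosticMXBean, which is specific to HotSpot-based JVMs, so treat it as an assumption about the runtime. The class name CompilerFlagReader and the "unavailable" fallback are invented for this example.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class CompilerFlagReader {

    /** Returns the current value of a VM option, or "unavailable" if it cannot be read. */
    public static String readFlag(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return "unavailable";
            }
            // getVMOption throws IllegalArgumentException for unknown option names.
            return bean.getVMOption(name).getValue();
        } catch (IllegalArgumentException e) {
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        System.out.println("CICompilerCount  = " + readFlag("CICompilerCount"));
        System.out.println("CompileThreshold = " + readFlag("CompileThreshold"));
    }
}
```

Starting the program with a different -XX:CICompilerCount=n value and comparing the printed result is a quick way to confirm that a setting was actually picked up.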

How is memory organized?

Java separates its memory into two areas: heap and stack.


New objects are created on the heap. Once your application no longer holds a reference to an object, the Java garbage collector is allowed to delete it and reclaim the memory so that the application can use it again. The heap is logically divided into three spaces:

  • young generation: eden & survivor spaces. All new objects are allocated in this part of the memory. Whenever it fills up, a minor garbage collection is performed.
  • old generation: all long-lived objects that have survived many rounds of minor garbage collection (cleanup of the young generation space) are stored in this area. Whenever it fills up, a garbage collection of this space is performed.
  • PermGen/Metaspace: a special space separated from the main memory heap. The JVM keeps track of loaded class metadata in the PermGen, along with all static content. From Java 8 (1.8) on, this space is replaced by Metaspace, which is allocated from native memory.
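The spaces listed above can be inspected on a running JVM through the standard memory pool MXBeans. The following is a minimal sketch (MemoryPoolOverview is a hypothetical name) that groups the pool names by type. The names themselves depend on the JVM and the selected garbage collector, so the exact output is not something to rely on.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class MemoryPoolOverview {

    /** Groups the names of all memory pools by their type (HEAP or NON_HEAP). */
    public static Map<MemoryType, List<String>> poolsByType() {
        Map<MemoryType, List<String>> result = new EnumMap<>(MemoryType.class);
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            result.computeIfAbsent(pool.getType(), type -> new ArrayList<>())
                  .add(pool.getName());
        }
        return result;
    }

    public static void main(String[] args) {
        poolsByType().forEach((type, names) -> System.out.println(type + ": " + names));
    }
}
```

On a typical HotSpot JVM the HEAP group contains the eden, survivor and old generation pools, while Metaspace shows up under NON_HEAP.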


The stack is where method invocations and local variables are stored. When a method is called, its stack frame is pushed onto the top of the call stack. The stack frame holds the state of the method and the values of all its local variables. The method on top of the stack is always the currently running method for that stack. Each thread has its own call stack.
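Because every call adds a frame, the stack is finite. A rough way to see this is to recurse until the JVM throws StackOverflowError and count the frames, as in the hypothetical StackDepthDemo below. The depth reached depends on the JVM, the thread's stack size and JIT compilation, so the number is only indicative.

```java
public class StackDepthDemo {

    private static int depth;

    /** Each call pushes one more frame onto the current thread's call stack. */
    private static void recurse() {
        depth++;
        recurse();
    }

    /** Recurses until the stack is exhausted and returns how many frames were pushed. */
    public static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError expected) {
            // The stack unwinds here; the heap and other threads are unaffected.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Frames before StackOverflowError: " + measureDepth());
    }
}
```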

Garbage collector

The JVM automatically reclaims memory that is no longer used: the memory of objects that are not referenced any more is released by the garbage collector, which runs from time to time without any action from the application.

Memory settings

The PermGen space size can be set with the following flags:


Strings are placed inside the string pool, which is part of the heap. The string pool is implemented as a hash map. If the number of buckets is too small or too big, looking up a specific string can be inefficient. We can use the -XX:+PrintStringTableStatistics flag to see how the string pool is used. To change the string pool size we can use the option -XX:StringTableSize=n, which sets how big the string table will be (how many buckets it will have). The value should be a prime number. Since the string pool lives on the heap, we must also make sure that the heap is big enough to hold all other objects besides strings. We can set the heap size with

-XX:InitialHeapSize=n (alternative -Xms)
-XX:MaxHeapSize=n (alternative -Xmx)

To see a graphical representation and monitor heap usage over time you can use (J)VisualVM.
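For a quick check without a GUI, the heap limits can also be read from java.lang.Runtime. The sketch below (HeapSizeCheck is a made-up name) prints the maximum, committed and used heap, which is a simple way to verify that the -Xms and -Xmx values were applied.

```java
public class HeapSizeCheck {

    /** Upper limit the heap may grow to (roughly what -Xmx controls). */
    public static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Heap memory currently reserved by the JVM. */
    public static long committedHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    /** Heap memory currently occupied by objects (live or not yet collected). */
    public static long usedHeapBytes() {
        Runtime runtime = Runtime.getRuntime();
        return runtime.totalMemory() - runtime.freeMemory();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println("max heap:       " + maxHeapBytes() / mb + " MB");
        System.out.println("committed heap: " + committedHeapBytes() / mb + " MB");
        System.out.println("used heap:      " + usedHeapBytes() / mb + " MB");
    }
}
```

Running it once with default settings and once with, for example, -Xms64m -Xmx128m shows how the reported values follow the flags.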

Other useful properties that can be set are:


This will generate a heap dump and write it to a file once an OutOfMemoryError is thrown, which is useful when hunting memory leaks. The same dump can be taken from the VisualVM tool. To analyze the heap dump you can use this tool.

Garbage collector tuning

To monitor how often garbage collection takes place you can use the following argument:


As mentioned previously, the heap is separated into several spaces which are sized dynamically. There is an argument which can turn off this automatic sizing:


This flag is enabled by default.

You can use the following flags to fine-tune the heap size:

  • setting how much bigger old generation will be compared to young generation space
  • setting how much of the young generation space should be taken up by survivor spaces
  • setting how many garbage collections an object needs to survive before becoming part of the old generation space (max value is 15)

There are three types of garbage collectors:

  • serial: all application threads are paused while garbage collection runs on a single thread
  • parallel: garbage collection is performed by multiple threads in parallel. This is the default option for Java 8 and below
  • mostly concurrent: close to real-time garbage collection where the application is, for the most part, not paused while the garbage collection is taking place. There are two sub-types of this type:

From Java 9 onward G1GC is the default garbage collector. It has been available since Java 8, but its performance was drastically improved in later releases, notably Java 10.
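To find out which collector a given JVM actually selected, and how busy it has been, the standard GarbageCollectorMXBean can be queried at runtime. This is a minimal sketch (GarbageCollectorReport is a hypothetical name). The bean names it prints are chosen by the JVM, so they differ between collectors and versions.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GarbageCollectorReport {

    /** Returns one line per active collector with its collection count and accumulated time. */
    public static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time are -1 when the collector does not report them.
            lines.add(String.format("%s: collections=%d, time=%d ms, pools=%s",
                    gc.getName(),
                    gc.getCollectionCount(),
                    gc.getCollectionTime(),
                    Arrays.toString(gc.getMemoryPoolNames())));
        }
        return lines;
    }

    public static void main(String[] args) {
        describeCollectors().forEach(System.out::println);
    }
}
```

Calling describeCollectors() periodically and comparing the counts gives a rough picture of how often each generation is collected.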

Assessing performance