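JVM Runtime
Thread-shared areas:
- Java object instances are stored in the heap;
- Constants are stored in the constant pool of the method area;
- Class information loaded by the VM, constants, static variables, JIT-compiled code, and similar data are stored in the method area.
Thread-private areas:
- The stack is private to each thread. Each frame stores the method's local variable table (primitive types and object references), operand stack, dynamic linking information, method return address, and so on.
Each Java program runs in its own JVM, and each thread has its own Java stack; each method invocation corresponds to one frame on that stack.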
Heap
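- A memory area shared by all threads, created when the VM starts
- Used to store object instances
- Its size can be controlled with -Xms (initial size) and -Xmx (maximum size)
- OutOfMemoryError is thrown when the heap has no memory left to complete an instance allocation and cannot be expanded any further
The heap is the main area managed by the garbage collector.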
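To see what the heap settings look like from inside a running program, a minimal sketch follows. It relies only on java.lang.Runtime; the class name HeapInfo is made up for illustration, and the numbers it prints depend on the JVM and the flags it was started with.

```java
class HeapInfo {
    /** Returns {max, total, free} heap sizes in bytes as reported by the running JVM. */
    static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        return new long[] { rt.maxMemory(), rt.totalMemory(), rt.freeMemory() };
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        long mb = 1024 * 1024;
        // maxMemory() roughly reflects -Xmx; totalMemory() is what the heap currently occupies.
        System.out.println("max heap:   " + s[0] / mb + " MB");
        System.out.println("total heap: " + s[1] / mb + " MB");
        System.out.println("free heap:  " + s[2] / mb + " MB");
    }
}
```

Running it once with the defaults and once with, say, -Xmx256m should show the first value change accordingly.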
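Young generation (New/Young): memory for newly created objects is allocated from the young generation.
It is usually divided into an Eden space and Survivor spaces. When Eden runs out of space, the surviving objects are moved to a Survivor space.
The size of the young generation can be set with -Xmn, and the ratio of Eden to Survivor with -XX:SurvivorRatio.
Old generation (Old/Tenured): holds objects that are still alive after several garbage collections.
Permanent generation (Permanent): holds static data such as Java classes and methods. It lives in the method area, is not part of the heap, and has no significant impact on garbage collection. (HotSpot removed the permanent generation in Java 8 in favor of Metaspace.)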
Direct Memory
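Characteristics:
- Direct memory is not part of the VM's runtime data areas, nor is it a memory area defined by the Java Virtual Machine Specification, but it is used frequently
- NIO can use native libraries to allocate off-heap memory directly, and operates on it through a DirectByteBuffer object in the heap that acts as a reference to that memory
- Its size is not limited by the Java heap size, only by the memory of the host machine (server)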
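The NIO point above can be illustrated with a short sketch. It uses ByteBuffer.allocateDirect from java.nio; the class name DirectBufferDemo and the 16-byte size are arbitrary choices for the example.

```java
import java.nio.ByteBuffer;

class DirectBufferDemo {
    /** Writes an int into off-heap memory and returns the buffer, ready for reading. */
    static ByteBuffer roundTrip(int value) {
        // The buffer object lives on the heap; the 16 bytes it refers to do not.
        ByteBuffer buf = ByteBuffer.allocateDirect(16);
        buf.putInt(value);
        buf.flip(); // switch from writing to reading
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer buf = roundTrip(42);
        System.out.println("direct: " + buf.isDirect());
        System.out.println("value:  " + buf.getInt());
    }
}
```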
Compile and Running
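Compiling and executing Java code involves three important mechanisms:
- Java source compilation (.java source file -> .class bytecode file)
- Class loading (ClassLoader)
- Class execution (JVM execution engine)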
Compile
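Java source code cannot be understood by the machine directly. It is first compiled by the compiler into .class bytecode files that the JVM can execute, and then run by the interpreter.
Java source file (.java) –> Java compiler –> Java bytecode file (.class) –> Java interpreter –> execution
- Bytecode files (.class) are platform independent
- Inside Java, characters exist in only one form, Unicode; character conversion happens at the boundary between the JVM and the OS (Reader/Writer)
- A class file consists of structural information, metadata, and method information
Structural information: the class file format version number and the count and size of each section
Metadata: corresponds to declarations and constants in the Java source. It includes the declarations of the class, its superclass, and its implemented interfaces, the field and method declarations, and the constant pool
Method information: corresponds to statements and expressions in the Java source. It includes bytecode, the exception handler table, the sizes of the evaluation stack and local variable area, type records for the evaluation stack, and debug symbol information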
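The remark about character conversion at the Reader/Writer boundary can be made concrete with a small sketch. The class name CharsetBoundaryDemo is hypothetical, and UTF-8 is just one possible external encoding; inside the JVM the text is a Unicode String either way.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class CharsetBoundaryDemo {
    /** Leaves the JVM: a Writer turns Unicode characters into encoded bytes. */
    static byte[] encode(String s) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(bytes, StandardCharsets.UTF_8)) {
            w.write(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Enters the JVM: a Reader turns encoded bytes back into Unicode characters. */
    static String decode(byte[] data) {
        try (Reader r = new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = r.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "\u4e2d"; // one Chinese character, written as an escape
        byte[] utf8 = encode(text);
        System.out.println("chars in JVM: " + text.length());
        System.out.println("bytes outside: " + utf8.length);
        System.out.println("round trip ok: " + decode(utf8).equals(text));
    }
}
```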
ClassLoader
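A Java program is not a single executable file; it is made up of many separate class files. These class files are not all loaded into memory at once but are loaded incrementally as the program needs them.
Class loading in the JVM is done by ClassLoader and its subclasses. The hierarchy and loading order of the class loaders are as follows:
- Bootstrap ClassLoader
The root ClassLoader of the JVM, implemented in C++
Initialized when the JVM starts
Loads rt.jar, which contains all the interfaces defined by the Java specification and their implementations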
The VM implements the bootstrap class loader, which loads classes from the BOOTPATH, including for example rt.jar. For faster startup, the VM can also process preloaded classes via Class Data Sharing.
Class Data Sharing:Class data sharing (CDS) is a feature introduced in J2SE 5.0 that is intended to reduce the startup time for Java programming language applications, in particular smaller applications, as well as reduce footprint. When the JRE is installed on 32-bit platforms using the Sun provided installer, the installer loads a set of classes from the system jar file into a private internal representation, and dumps that representation to a file, called a “shared archive”.
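- Extension ClassLoader
Loads the Java extension APIs (classes in lib/ext)
- App ClassLoader
Loads the classes defined on the classpath
- Custom ClassLoader
A ClassLoader that an application defines for its own needs; for example, Tomcat and JBoss implement their own ClassLoaders according to the J2EE specification
During loading, the JVM first checks whether the class has already been loaded. The check goes bottom-up, from the Custom ClassLoader to the Bootstrap ClassLoader, layer by layer; as soon as one class loader has loaded the class, it is treated as loaded, which ensures the class is loaded only once across all ClassLoaders. Loading itself goes top-down: the upper layers try to load the class first, one layer at a time.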
QA:
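- What is the parent delegation model?
The JVM uses parent delegation by default when loading classes. Put simply, when a class loader receives a request to load a class, it first delegates the task to its parent class loader, and so on recursively. If the parent can load the class, the request returns successfully; only when the parent cannot complete the load does the class loader try to load the class itself.
Benefits:
- Avoids loading the same class twice.
- Better security. Without parent delegation, a user could put their own java.lang.Object class on the classpath, and the uniqueness of Object could no longer be guaranteed.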
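The delegation chain described above can be inspected from Java code. The sketch below walks from the system (application) class loader up through its parents; the class name LoaderChainDemo is invented for the example. The names printed vary by JDK version: on Java 9 and later the extension loader has been replaced by a platform class loader, so the output will not match the lib/ext description literally.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChainDemo {
    /** Walks from the given loader up through its delegation parents. */
    static List<String> chain(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        // The bootstrap loader is part of the VM and is represented as null in Java code.
        names.add("bootstrap (represented as null)");
        return names;
    }

    public static void main(String[] args) {
        chain(ClassLoader.getSystemClassLoader()).forEach(System.out::println);
        // Core classes come from the bootstrap loader, so this prints null.
        System.out.println("loader of String: " + String.class.getClassLoader());
    }
}
```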
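Class Execution
The JVM execution engine is responsible for executing Java bytecode.
The JVM executes class bytecode using a stack-based architecture. When a thread is created, it gets its own program counter (PC) and stack.
- The program counter holds the offset, within the method, of the next instruction to execute.
- The stack holds stack frames, one for each invocation of each method. A frame consists of a local variable area and an operand stack: the local variable area holds the method's local variables and parameters, and the operand stack holds intermediate results produced while the method executes.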
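That each invocation gets its own frame can be observed indirectly by counting stack trace elements at different recursion depths. This is only a sketch: the class name StackFrameDemo is made up, and a stack trace is a Java-level view of the frames, not the VM's internal frame layout.

```java
class StackFrameDemo {
    /** Recurses `depth` times, then reports how many frames the current stack trace shows. */
    static int framesAtDepth(int depth) {
        if (depth == 0) {
            return new Throwable().getStackTrace().length;
        }
        return framesAtDepth(depth - 1);
    }

    public static void main(String[] args) {
        int shallow = framesAtDepth(0);
        int deep = framesAtDepth(5);
        // Every extra call to framesAtDepth adds a frame to the thread's stack.
        System.out.println("frames at depth 0: " + shallow);
        System.out.println("frames at depth 5: " + deep);
    }
}
```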
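Techniques involved:
- Interpretation belongs to first-generation JVMs
- Just-in-time (JIT) compilation belongs to second-generation JVMs
- Adaptive optimization (currently used by Sun's HotSpot JVM) draws on the experience of both generations and combines the two approaches: at first all code is interpreted while its execution is monitored; for frequently called methods, a background thread compiles them to native code and optimizes it; if a method is no longer used frequently, the compiled code is discarded and the method is interpreted again.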
VM Class Loading
The VM is responsible for resolving constant pool symbols, which requires loading, linking and then initializing classes and interfaces.
We will use the term “class loading” to describe the overall process of mapping a class or interface name to a class object, and the more specific terms loading, linking and initializing for the phases of class loading as defined by the JVMS (Java Virtual Machine Specification).
The most common reason for class loading is during bytecode resolution, when a constant pool symbol in the classfile requires resolution. Java APIs such as Class.forName(), ClassLoader.loadClass(), reflection APIs, and JNI_FindClass can initiate class loading.
The VM itself can initiate class loading. The VM loads core classes such as java.lang.Object, java.lang.Thread, etc. at JVM startup. Loading a class requires loading all superclasses and superinterfaces. And classfile verification, which is part of the linking phase, can require loading additional classes.
The VM and Java SE class loading libraries share the responsibility for class loading. The VM performs constant pool resolution, linking and initialization for classes and interfaces. The loading phase is a cooperative effort between the VM and specific class loaders (java.lang.ClassLoader).
Class Loading Phases
The load class phase takes a class or interface name, finds the binary in classfile format, defines the class and creates the java.lang.Class object.
keyword: take,find,define,create
The load class phase can throw a NoClassDefFoundError if a binary representation cannot be found. In addition, the load class phase does format checking on the syntax of the classfile, which can throw a ClassFormatError or UnsupportedClassVersionError. Prior to completing loading of a class, the VM must load all of its superclasses and superinterfaces. If the class hierarchy has a problem such that this class is its own superclass or superinterface (recursively), then the VM will throw a ClassCircularityError. The VM also throws IncompatibleClassChangeError if the direct superinterface is not an interface, or the direct superclass is an interface.
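As a small illustration of a failed load, the sketch below triggers loading explicitly through Class.forName. Note the difference from the text above: when loading is requested through this API, a missing binary surfaces as the checked ClassNotFoundException, whereas NoClassDefFoundError is what shows up when the VM itself fails to load a class during resolution. The class name LoadFailureDemo and the name com.example.DoesNotExist are made up for the example.

```java
class LoadFailureDemo {
    /** Tries to load a class by name and reports what happened. */
    static String tryLoad(String name) {
        try {
            Class<?> c = Class.forName(name);
            return "loaded " + c.getName();
        } catch (ClassNotFoundException e) {
            // No binary representation could be found for this name.
            return "not found: " + name;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryLoad("java.util.ArrayList"));
        System.out.println(tryLoad("com.example.DoesNotExist")); // hypothetical, assumed absent
    }
}
```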
The link class phase first does verification, which checks the classfile semantics, checks the constant pool symbols and does type checking. These checks can throw a VerifyError. Linking then does preparation, which creates and initializes static fields to standard defaults and allocates method tables. Note that no Java code has yet been run. Linking then optionally does resolution of symbolic references.
keyword: Linking,verification,preparation
Class initialization runs the static initializers, and initializers for static fields. This is the first Java code which runs for this class. Note that class initialization requires superclass initialization, although not superinterface initialization.
keyword: static fields
The JVMS specifies that class initialization occurs on the first “active use” of a class. The JLS allows flexibility in when the symbolic resolution step of linking occurs as long as we respect the semantics of the language, finish each step of loading, linking and initializing before performing the next step, and throw errors when programs would expect them.
For performance, the HotSpot VM generally waits until class initialization to load and link a class. So if class A references class B, loading class A will not necessarily cause loading of class B (unless required for verification). Execution of the first instruction that references B will cause initialization of B, which requires loading and linking of class B.
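The "first active use" rule can be observed from plain Java. In the sketch below, the nested class Helper is referenced by the outer class, yet its static initializer only runs when Helper.value() is first called. The names LazyInitDemo, Helper and EVENTS are invented for the example; when loading and linking actually happen is up to the VM, so the sketch only records initialization order.

```java
import java.util.ArrayList;
import java.util.List;

class LazyInitDemo {
    static final List<String> EVENTS = new ArrayList<>();

    static class Helper {
        // Static initializer: the first Java code that runs for Helper.
        static {
            EVENTS.add("Helper initialized");
        }

        static int value() {
            return 42;
        }
    }

    /** Records events around the first active use of Helper. */
    static List<String> run() {
        EVENTS.add("before first use");
        int v = Helper.value(); // first active use triggers initialization
        EVENTS.add("after first use: " + v);
        return EVENTS;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

On a fresh JVM the events come out in the order "before first use", "Helper initialized", "after first use: 42".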
Class Loader Delegation
When a class loader is asked to find and load a class, it can ask another class loader to do the actual loading. This is called class loader delegation.
The first loader is an initiating loader, and the class loader that ultimately defines the class is called the defining loader. In the case of bytecode resolution, the initiating loader is the class loader for the class whose constant pool symbol we are resolving.
Class loaders are defined hierarchically and each class loader has a delegation parent. The delegation defines a search order for binary class representations.
The Java SE class loader hierarchy searches the bootstrap class loader, the extension class loader and the system class loader in that order.
The system class loader is the default application class loader, which runs “main” and loads classes from the classpath. The application class loader can be a class loader from the Java SE class loader libraries, or it can be provided by an application developer.
The Java SE class loader libraries implement the extension class loader which loads classes from the lib/ext directory of the JRE.
GC
Principle
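The GC reclaims objects in memory that are no longer referenced.
Because GC costs resources and time, Java analyzes the lifecycle characteristics of objects and collects them by generation (young and old) to keep the pauses that GC imposes on the application as short as possible.
- Collection of objects in the young generation is called Minor GC;
- Collection of objects in the old generation is called Full GC;
- A GC triggered by an explicit System.gc() call in the program is a Full GC.
Garbage collection is performed by dedicated background GC threads and runs automatically, with no explicit call required. Even an explicit call to java.lang.System.gc() is only a hint to the system to collect garbage; the system may not respond and may ignore it.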
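A weak reference gives a rough way to watch this behavior. The sketch below is illustrative only: the class name GcHintDemo is made up, and whether the first method returns true depends on the JVM and collector, precisely because System.gc() is a hint. The second method shows the part that is guaranteed: an object that is still strongly referenced is never reclaimed.

```java
import java.lang.ref.WeakReference;

class GcHintDemo {
    /** May or may not return true: the VM is free to ignore the hint. */
    static boolean clearedAfterHint() {
        Object payload = new Object();
        WeakReference<Object> ref = new WeakReference<>(payload);
        payload = null;   // drop the only strong reference
        System.gc();      // a request, not a command
        return ref.get() == null;
    }

    /** Always true: a strongly reachable object is never collected. */
    static boolean survivesWhileReferenced() {
        Object payload = new Object();
        WeakReference<Object> ref = new WeakReference<>(payload);
        System.gc();
        return ref.get() == payload;
    }

    public static void main(String[] args) {
        System.out.println("cleared after hint: " + clearedAfterHint());
        System.out.println("survives while referenced: " + survivesWhileReferenced());
    }
}
```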
Memory Leak
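A memory leak occurs when references to objects that are no longer used are held indefinitely, so the GC cannot reclaim them and their memory is never released.
An object is considered leaked when it meets both of these conditions:
- The object is reachable
- The object is useless
Common causes:
- Global collections
- Caches
- ClassLoader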
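The "global collection" cause is the easiest one to reproduce. In the sketch below (all names are hypothetical), each call allocates a buffer that is only needed during the call, but a static list keeps every buffer reachable, so none of them can ever be collected.

```java
import java.util.ArrayList;
import java.util.List;

class LeakDemo {
    // Global collection: everything added here stays reachable as long as the class is loaded.
    static final List<byte[]> HISTORY = new ArrayList<>();

    static void handleRequest() {
        byte[] scratch = new byte[1024]; // useless once this call returns...
        HISTORY.add(scratch);            // ...but still reachable through the static list
    }

    static int retained() {
        return HISTORY.size();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            handleRequest();
        }
        System.out.println("buffers still reachable: " + retained());
    }
}
```

Removing entries when they are no longer needed, or bounding the collection, is the usual fix.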
Optimize GC
Direction:
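- Reduce GC frequency, especially the number of Full GCs. Too many GCs consume a lot of system resources and hurt throughput. Pay particular attention to Full GC, because it collects and compacts the entire heap
- Tune the JVM parameters to speed up garbage collection, and size the regions of the heap sensibly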
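Before tuning, it helps to measure how often collections happen. One way, sketched below under the assumption that the standard java.lang.management API is available, is to read the collector MXBeans. The class name GcStatsDemo is made up, and the collector names in the output depend on which GC the JVM is running.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class GcStatsDemo {
    /** One line per collector: its name, how many collections ran, and total time spent. */
    static List<String> report() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        report().forEach(System.out::println);
    }
}
```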
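Causes of Full GC:
- The old generation runs out of space
Let objects be collected in the young generation whenever possible, and avoid creating very large objects and arrays, which may be allocated directly in the old generation.
- The permanent generation (Permanent Generation) runs out of space
Increase the Perm Gen size and avoid too many static objects.
- System.gc() is called explicitly
Do not trigger garbage collection manually; rely on the JVM's own mechanisms as much as possible.
- Objects promoted from the young generation need more memory than the old generation has left
Balance the sizes of the young and old generations so that promotion from the young generation does not exhaust the old generation and trigger a Full GC to reclaim memory.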
QA:
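What happens if the heap regions are sized badly?
Young generation too small: first, young-generation GCs become very frequent, increasing system overhead; second, large objects go straight into the old generation and use up its remaining space.
Young generation too large: first, the old generation becomes too small (for a fixed total heap), which makes Full GC more likely; second, each young-generation GC takes much longer. As a rule of thumb, a young generation of about 1/3 of the whole heap is reasonable.
Survivor too small: objects move from Eden straight to the old generation, shortening the time they spend in the young generation.
Survivor too large: Eden becomes too small, which increases GC frequency. -XX:MaxTenuringThreshold=n can be used to control how long objects stay in the young generation, so that as many objects as possible are collected there.
Which GC strategies does the JVM offer?
Throughput first: the JVM chooses the GC strategy and the ratio of young to old generation sizes on its own in order to meet the throughput target. The target is set with -XX:GCTimeRatio=n.
Pause time first: the JVM chooses the GC strategy and the ratio of young to old generation sizes on its own, and tries to keep each GC pause of the application within the specified limit. The limit is set with -XX:MaxGCPauseMillis=n.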
JVM Configuration:
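- Heap settings
-Xms: initial heap size
-Xmx: maximum heap size
-XX:NewSize=n: sets the size of the young generation
-XX:NewRatio=n: sets the ratio of the young generation to the old generation. For example, 3 means young:old is 1:3, so the young generation takes 1/4 of the combined young and old generations.
-XX:SurvivorRatio=n: the ratio of Eden to the two Survivor spaces in the young generation. Note that there are two Survivor spaces. For example, 3 means Eden:Survivor = 3:2, so one Survivor space takes 1/5 of the young generation.
-XX:MaxPermSize=n: sets the maximum size of the permanent generation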
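- Collector settings
-XX:+UseSerialGC: use the serial collector
-XX:+UseParallelGC: use the parallel collector
-XX:+UseParallelOldGC: use the parallel old-generation collector
-XX:+UseParNewGC: use the parallel young-generation collector (ParNew)
-XX:+UseConcMarkSweepGC: use the concurrent collector (CMS)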
- Garbage collection statistics
-XX:+PrintGC
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-Xloggc:filename
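- Parallel collector settings
-XX:ParallelGCThreads=n: sets the number of CPUs, that is, the number of parallel collection threads, used by the parallel collector
-XX:MaxGCPauseMillis=n: sets the maximum pause time for parallel collection
-XX:GCTimeRatio=n: sets the share of program run time spent in garbage collection. The formula is 1/(1+n)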
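- Concurrent collector settings
-XX:+CMSIncrementalMode: enables incremental mode, suitable for single-CPU machines.
-XX:ParallelGCThreads=n: sets the number of CPUs, that is, the number of parallel collection threads, used when the concurrent collector collects the young generation in parallel.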
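The ratio flags above are easier to reason about with numbers. The sketch below just applies the formulas stated for -XX:NewRatio, -XX:SurvivorRatio and -XX:GCTimeRatio; the class name HeapRatioDemo and the 1000 MB total are illustrative assumptions, and a real JVM may round or align the resulting sizes. With a 1000 MB total, NewRatio=3 and SurvivorRatio=3, the arithmetic gives a 250 MB young generation split into a 150 MB Eden and two 50 MB Survivor spaces; GCTimeRatio=19 corresponds to a 5% GC time target.

```java
class HeapRatioDemo {
    /** Young generation size for -XX:NewRatio=n, where young:old = 1:n. */
    static double youngSize(double youngPlusOld, int newRatio) {
        return youngPlusOld / (1 + newRatio);
    }

    /** Size of ONE Survivor space for -XX:SurvivorRatio=n, where Eden:Survivor:Survivor = n:1:1. */
    static double survivorSize(double young, int survivorRatio) {
        return young / (survivorRatio + 2);
    }

    /** Eden size for -XX:SurvivorRatio=n. */
    static double edenSize(double young, int survivorRatio) {
        return young * survivorRatio / (survivorRatio + 2);
    }

    /** Target fraction of run time spent in GC for -XX:GCTimeRatio=n. */
    static double gcTimeFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        double total = 1000; // MB, illustrative value
        double young = youngSize(total, 3);
        System.out.println("young:    " + young + " MB");
        System.out.println("old:      " + (total - young) + " MB");
        System.out.println("eden:     " + edenSize(young, 3) + " MB");
        System.out.println("survivor: " + survivorSize(young, 3) + " MB each");
        System.out.println("GC time target: " + gcTimeFraction(19) * 100 + " %");
    }
}
```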
VM Lifecycle
The following section gives an overview of the general-purpose java launcher as it pertains to the lifecycle of the HotSpot VM.
- Launcher
- JNI_CreateJavaVM
- DestroyJavaVM
Launcher
There are several HotSpot VM launchers in the Java Standard Edition. The general-purpose launcher typically used is the java command on Unix and the java and javaw commands on Windows, not to be confused with javaws, which is a network-based launcher.
Parse the command line options. Some of the command line options are consumed by the launcher itself (for example, -client or -server is used to determine and load the appropriate VM library); others are passed to the VM using JavaVMInitArgs.
Establish the heap sizes and the compiler type (client or server) if these options are not explicitly specified on the command line.
Establishes the environment variables such as LD_LIBRARY_PATH and CLASSPATH.
If the java Main-Class is not specified on the command line it fetches the Main-Class name from the JAR’s manifest.
Creates the VM using JNI_CreateJavaVM in a newly created thread (non primordial thread).
Note: creating the VM in the primordial thread greatly reduces the ability to customize the VM, for example the stack size on Windows, and many other limitations.
Once the VM is created and initialized, the Main-Class is loaded, and the launcher gets the main method’s attributes from the Main-Class.
The java main method is then invoked in the VM using CallStaticVoidMethod, using the marshalled arguments from the command line. This is the point at which main() actually starts running.
Once the java main method completes, it is very important to check and clear any pending exceptions that may have occurred and also pass back the exit status. The exception is cleared by calling ExceptionOccurred; the return value of this method is 0 if successful and any other value otherwise, and this value is passed back to the calling process.
The main thread is detached using DetachCurrentThread. Doing so decrements the thread count so that DestroyJavaVM can be called safely, and also ensures that the thread is not performing operations in the VM and that there are no active Java frames on its stack.
JNI_CreateJavaVM
Ensures that no two threads call this method at the same time and that no two VM instances are created in the same process. Note that a VM cannot be created in the same process space once a certain point in initialization is reached, the “point of no return”. This is because the VM creates static data structures that cannot be re-initialized at this time.
Checks to make sure the JNI version is supported, and the ostream is initialized for gc logging. The OS modules are initialized such as the random number generator, the current pid, high-resolution time, memory page sizes, and the guard pages.
The arguments and properties passed in are parsed and stored away for later use. The standard java system properties are initialized.
The OS modules are further initialized based on the parsed arguments and properties: they are set up for synchronization, stack, memory, and safepoint pages. At this time other libraries such as libzip, libhpi, libjava, and libthread are loaded, signal handlers are initialized and set, and the thread library is initialized.
The output stream logger is initialized. Any agent libraries (hprof, jdi) required are initialized and started.
The thread states and the thread local storage (TLS), which holds several thread specific data required for the operation of threads, are initialized.
The global data is initialized as part of the I phase, such as event log, OS synchronization primitives, perfMemory (performance memory), chunkPool (memory allocator).
At this point, we can create Threads. The Java version of the main thread is created and attached to the current OS thread. However this thread will not be yet added to the known list of the Threads. The Java level synchronization is initialized and enabled.
The rest of the global modules are initialized, such as the BootClassLoader, CodeCache, Interpreter, Compiler, JNI, SystemDictionary, and Universe. Note that we have now reached our “point of no return”, i.e., we can no longer create another VM in the same process address space.
The main thread is added to the list, by first locking the Thread_Lock. The Universe, a set of required global data structures, is sanity checked. The VMThread, which performs all the VM’s critical functions, is created. At this point the appropriate JVMTI events are posted to notify the current state.
The following classes java.lang.String, java.lang.System, java.lang.Thread, java.lang.ThreadGroup, java.lang.reflect.Method, java.lang.ref.Finalizer, java.lang.Class, and the rest of the System classes, are loaded and initialized. At this point, the VM is initialized and operational, but not yet fully functional.
The Signal Handler thread is started, the compilers are initialized and the CompileBroker thread is started. The other helper threads, StatSampler and WatcherThreads, are started. At this time the VM is fully functional, the JNIEnv is populated and returned to the caller, and the VM is ready to service new JNI requests.
DestroyJavaVM
Wait until we are the last non-daemon thread to execute, noting that the VM is still functional.
Call java.lang.Shutdown.shutdown(), which will invoke Java level shutdown hooks and run finalizers if finalization-on-exit is enabled.
Call before_exit() to prepare for VM exit: run VM level shutdown hooks (they are registered through JVM_OnExit()), and stop the Profiler, StatSampler, Watcher and GC threads. Post the status events to JVMTI/PI, disable JVMPI, and stop the Signal thread.
Call JavaThread::exit(), to release JNI handle blocks, remove stack guard pages, and remove this thread from Threads list. From this point on we cannot execute any more Java code.
Stop the VM thread; it will bring the remaining VM to a safepoint and stop the compiler threads. At a safepoint, care should be taken not to use anything that could get blocked by a safepoint.
Disable tracing at JNI/JVM/JVMPI barriers.
Set _vm_exited flag for threads that are still running native code.
Delete this thread.
Call exit_globals(), which deletes IO and PerfMemory resources.
Return to caller.
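The Java level shutdown hooks mentioned in the steps above are registered from application code with Runtime.addShutdownHook. A minimal sketch, with the class name ShutdownHookDemo and the hook's thread name chosen for the example:

```java
class ShutdownHookDemo {
    /** Registers a hook that the VM starts during shutdown, and returns it. */
    static Thread register() {
        Thread hook = new Thread(
                () -> System.out.println("shutdown hook running"),
                "demo-shutdown-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        register();
        // After main() returns and the last non-daemon thread ends, the hook is run.
        System.out.println("main() returning");
    }
}
```

When run as a program, "main() returning" is printed first and the hook's message follows as the VM shuts down.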
Thread
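- java.lang.Thread: the thread class of the Java language. Each instance created from this class maps 1:1 to an operating system thread (osthread).
- JavaThread: a C++ class defined in the JVM. A JavaThread instance represents a java.lang.Thread instance inside the JVM. It maintains the thread's state and holds a pointer to the object (oop) created from java.lang.Thread. It also holds a pointer to the corresponding OSThread, through which it obtains the state of the underlying osthread created by the operating system.
- OSThread: a C++ class defined in the JVM that is the JVM's abstraction of the underlying operating system thread. It holds the handle of the thread actually created by the operating system and can be used to query the state of the underlying osthread.
- VMThread: a C++ class defined in the JVM. It is unrelated to user-created threads; it is the thread the JVM itself uses to perform VM operations such as GC.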
Thread Creation and Destruction
There are two basic ways for a thread to be introduced into the VM:
- execution of Java code that calls start() on a java.lang.Thread object
- attaching an existing native thread to the VM using JNI
There are a number of objects associated with a given thread in the VM:
- The java.lang.Thread instance that represents a thread in Java code
- A JavaThread instance that represents the java.lang.Thread instance inside the VM. It contains additional information to track the state of the thread. A JavaThread holds a reference to its associated java.lang.Thread object (as an oop), and the java.lang.Thread object also stores a reference to its JavaThread (as a raw int). A JavaThread also holds a reference to its associated OSThread instance.
- An OSThread instance represents an operating system thread, and contains additional operating-system-level information needed to track thread state. The OSThread then contains a platform specific “handle” to identify the actual thread to the operating system
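JavaThread: the C++ object that HotSpot VM uses internally to manage the execution state of a Java thread. It in turn references an OSThread object, which HotSpot VM uses as an abstract description of the state of the underlying operating system thread. These bookkeeping objects are small enough to be negligible.
HotSpot VM has a platform-specific implementation of OSThread on each platform. It contains the logic that creates the real platform thread (for example, creating an LWP through pthread on Linux), and passes down the stack size so that stack space can be allocated. On Linux, for example, HotSpot VM currently uses a default stack size of 1MB. This stack space is the dominant cost, far larger than the small C++ objects HotSpot VM uses to manage threads.
Therefore, when “unable to create new native thread” occurs on HotSpot VM, the most common cause is that the stack for a Java thread could not be allocated.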
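Since the stack dominates the per-thread cost, one mitigation is to request a smaller stack for threads that do not need deep recursion. The sketch below uses the four-argument Thread constructor, whose stackSize parameter is documented as a suggestion that the VM may round or ignore; the default for all threads can also be changed with -Xss. The class name StackSizeDemo and the 256 KB value are arbitrary choices for the example.

```java
class StackSizeDemo {
    /** Runs a trivial task on a thread created with the requested stack size. */
    static boolean runWithStack(long stackSizeBytes) {
        boolean[] ran = { false };
        // stackSizeBytes is only a hint to the VM.
        Thread t = new Thread(null, () -> ran[0] = true, "small-stack", stackSizeBytes);
        t.start();
        try {
            t.join(); // join() makes the write to ran[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ran[0];
    }

    public static void main(String[] args) {
        System.out.println("ran on 256 KB stack: " + runWithStack(256 * 1024));
    }
}
```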
When a java.lang.Thread is started the VM creates the associated JavaThread and OSThread objects, and ultimately the native thread. After preparing all of the VM state (such as thread-local storage and allocation buffers, synchronization objects and so forth) the native thread is started.
The native thread completes initialization and then executes a start-up method that leads to the execution of the java.lang.Thread object’s run() method. Upon its return, the start-up method terminates the thread after dealing with any uncaught exceptions and interacting with the VM to check whether termination of this thread requires termination of the whole VM. Thread termination releases all allocated resources, removes the JavaThread from the set of known threads, invokes destructors for the OSThread and JavaThread, and ultimately ceases execution when its initial startup method completes.
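From the Java side, this lifecycle looks like start(), run(), optional uncaught-exception handling, and termination. The sketch below makes the uncaught-exception step visible by installing a handler on the thread; the class name ThreadLifecycleDemo and the thread name "worker" are invented for the example.

```java
import java.util.concurrent.atomic.AtomicReference;

class ThreadLifecycleDemo {
    /** Starts a thread whose run() throws, and returns what the handler saw. */
    static String runAndCapture() {
        AtomicReference<String> uncaught = new AtomicReference<>("none");
        Thread t = new Thread(() -> {
            throw new IllegalStateException("boom");
        }, "worker");
        // Called on the dying thread after run() exits with an exception.
        t.setUncaughtExceptionHandler(
                (thread, ex) -> uncaught.set(thread.getName() + ": " + ex.getMessage()));
        t.start();
        try {
            t.join(); // returns once the thread has fully terminated
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return uncaught.get();
    }

    public static void main(String[] args) {
        System.out.println(runAndCapture()); // prints "worker: boom"
    }
}
```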
A native thread attaches to the VM using the JNI call AttachCurrentThread. In response to this an associated OSThread and JavaThread instance is created and basic initialization is performed.
Next a java.lang.Thread object must be created for the attached thread, which is done by reflectively invoking the Java code for the Thread class constructor, based on the arguments supplied when the thread attached. Once attached, a thread can invoke whatever Java code it needs to via the other JNI methods available.
Finally when the native thread no longer wishes to be involved with the VM it can call the JNI DetachCurrentThread method to disassociate it from the VM (release resources, drop the reference to the java.lang.Thread instance, destruct the JavaThread and OSThread objects and so forth).
A special case of attaching a native thread is the initial creation of the VM via the JNI CreateJavaVM call, which can be done by a native application or by the launcher (java.c). This causes a range of initialization operations to take place and then acts effectively as if a call to AttachCurrentThread was made. The thread can then invoke Java code as needed, such as reflective invocation of the main method of an application.
Thread States
The main thread states from the VM perspective are as follows:
- _thread_new: a new thread in the process of being initialized
- _thread_in_Java: a thread that is executing Java code
- _thread_in_vm: a thread that is executing inside the VM
- _thread_blocked: the thread is blocked for some reason (acquiring a lock, waiting for a condition, sleeping, performing a blocking I/O operation and so forth)
For debugging purposes, additional state information is also maintained:
- MONITOR_WAIT: a thread is waiting to acquire a contended monitor lock
- CONDVAR_WAIT: a thread is waiting on an internal condition variable used by the VM (not associated with any Java level object)
- OBJECT_WAIT: a thread is performing an Object.wait() call
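The states listed above are internal to the VM and are not directly visible to Java code. What Java code can observe is the separate, coarser java.lang.Thread.State enum. The sketch below shows the two states that can be observed deterministically; the class name ThreadStateDemo is made up for the example.

```java
class ThreadStateDemo {
    /** Returns the Java-level state before start() and after the thread has finished. */
    static Thread.State[] observe() {
        Thread t = new Thread(() -> { });
        Thread.State before = t.getState(); // NEW: created but not started
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        Thread.State after = t.getState();  // TERMINATED: run() has completed
        return new Thread.State[] { before, after };
    }

    public static void main(String[] args) {
        Thread.State[] states = observe();
        System.out.println("before start: " + states[0]);
        System.out.println("after join:   " + states[1]);
    }
}
```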
Internal VM Threads
- VM thread: This singleton instance of VMThread is responsible for executing VM operations, which are discussed below
- Periodic task thread: This singleton instance of WatcherThread simulates timer interrupts for executing periodic operations within the VM
- GC threads: These threads, of different types, support parallel and concurrent garbage collection
- Compiler threads: These threads perform runtime compilation of bytecode to native code
- Signal dispatcher thread: This thread waits for process directed signals and dispatches them to a Java level signal handling method
All threads are instances of the Thread class, and all threads that execute Java code are JavaThread instances (a subclass of Thread).
The VM keeps track of all threads in a linked list known as the Threads_list, which is protected by the Threads_lock, one of the key synchronization locks used within the VM.