
Dissecting Java object layout and sizeof() computation


This write-up discusses how to do a rough (but close to accurate) mental calculation of a Java object's size, both shallow and deep.

Target Java Virtual Machine: Hotspot 1.7.0_25 64 bit and Hotspot 1.7.0_51 32 bit
Platform: Windows 7 64 bit,  Intel Core i5 Sandy Bridge
Code: Pick this single class from my GitHub repo: [JavaObjectLayout]


This code uses a third-party library, classmexer, which uses Instrumentation to compute the shallow and deep sizes of an object. I wrote the class to verify whether the library returns the correct size.

classmexer answers "what" the size of an object is; it does not explain "why". This write-up is an attempt to find that reason.

JavaObjectLayout uses sun.misc.Unsafe to peek into an object's heap layout. The class is not general purpose, but the approach can be followed to dissect any Java object. You will see absolute offsets used to read memory locations. That is intentional and keeps the demonstration simple.
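
To give a feel for the approach, here is a minimal sketch of asking Unsafe where a field lives inside an object. It is not the code from JavaObjectLayout: the class FieldOffsetProbe, its Sample class and the helper names are hypothetical, and it assumes a HotSpot-style JDK where sun.misc.Unsafe and its objectFieldOffset method are still available (newer JDKs mark them as deprecated and print warnings).

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

class FieldOffsetProbe {
    // Hypothetical sample class for illustration; it is not part of JavaObjectLayout.
    static class Sample {
        int i = 711;
        double d = 202.505;
    }

    // Fetches the Unsafe singleton reflectively from its private static field.
    static Unsafe unsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Returns the byte offset of an instance field, measured from the start of the object.
    static long offsetOf(Class<?> type, String fieldName) {
        try {
            return unsafe().objectFieldOffset(type.getDeclaredField(fieldName));
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // The printed offsets depend on the VM mode (32-bit, compressed or non-compressed Oops).
        System.out.println("i @ " + offsetOf(Sample.class, "i"));
        System.out.println("d @ " + offsetOf(Sample.class, "d"));
    }
}
```

The first field offset tells you the header size for that VM mode, and the gaps between offsets reveal any padding.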

I have tested the code on 32-bit and 64-bit Java 7 HotSpot VMs and all the results match with what classmexer is returning.

If you are seeing compiler warnings while using sun.misc.Unsafe, refer to this page: sun.misc.Unsafe access restrictions


The following layout is specific to Oracle's HotSpot, but conceptually other vendors' VMs (IBM, HP, Azul and others) are likely to follow similar structures.

Generic Java Object Layout



A java object size can be defined as:

Size = (H + L1) or (H + L2)

where H varies depending on whether the VM is 32-bit or 64-bit and, on a 64-bit VM, whether compressed pointers are in use.

Therefore "new Object()" will take: 

(no members, only header)

- 8 bytes on a 32-bit VM
- 12 + 4 = 16 bytes on a 64-bit VM with compressed Oops (12 bytes header + 4 bytes padding)
- 16 bytes on a 64-bit VM with non-compressed Oops

And "new Integer(21)" will take:

- 32-bit VM: 8 + 4 + 4 = 16 (8 bytes header + 4 bytes for int data member + 4 bytes for padding)
- 64-bit VM with compressed Oops: 12 + 4 = 16 (12 bytes header + 4 bytes for int data member)
- 64-bit VM with non-compressed Oops: 16 + 4 + 4 = 24 (16 bytes header + 4 bytes for int data member + 4 bytes for padding)

A java.lang.Integer has the largest size on a 64-bit VM in non-compressed mode.
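
The arithmetic above can be reproduced with a few lines of Java. This is only a sketch: the class BasicSizes and its method names are made up for illustration, and the header sizes are simply the numbers quoted above, hard-coded.

```java
class BasicSizes {
    // Header sizes in bytes for non-array objects, as quoted in the text above.
    static final int HEADER_32 = 8;
    static final int HEADER_64_COMPRESSED = 12;
    static final int HEADER_64_UNCOMPRESSED = 16;

    // Rounds a size up to the next multiple of 8.
    static int padTo8(int size) {
        return (size + 7) / 8 * 8;
    }

    // Size of an object whose fields occupy fieldBytes directly after the header.
    static int sizeOf(int headerBytes, int fieldBytes) {
        return padTo8(headerBytes + fieldBytes);
    }

    public static void main(String[] args) {
        int[] headers = { HEADER_32, HEADER_64_COMPRESSED, HEADER_64_UNCOMPRESSED };
        for (int h : headers) {
            System.out.println("header " + h
                    + ": new Object() = " + sizeOf(h, 0)
                    + ", new Integer(21) = " + sizeOf(h, 4));
        }
    }
}
```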

To learn why padding was applied in some cases, and what the rules are for laying out data members in an object, keep reading...

Header Size (in bytes)


Header size can vary depending on VM architecture and use of compressed pointers, as summarized in this table:

                          32-bit JVM     64-bit JVM, compressed Oops     64-bit JVM, compressed Oops
                                         explicitly enabled via          disabled via
                                         -XX:+UseCompressedOops          -XX:-UseCompressedOops

Reference size            4 bytes        4 bytes                         8 bytes
Non Array Object Header   8 (4+4)        12 (8+4)                        16 (8+8)
Array Object Header       12 (4+4+4)     16 (8+4+4)                      20 (8+8+4)

where e.g.

  • On a 32 bit VM, for a non-array, (4+4) means 4 bytes for mark (hash, lock state etc) and 4 bytes for class-ref.
  • On a 32 bit VM, for an array, (4+4+4) means 4 bytes for mark (hash, lock state etc), 4 bytes for class-ref and 4 bytes for the integer array length.
  • On a 64 bit VM, with or without compressed pointers, mark is always 8 bytes (for both non-arrays and arrays).
  • On a 64 bit VM with compressed pointers, for an array, (8+4+4) means 8 bytes for mark, 4 bytes for class-ref and 4 bytes for array length.
  • On a 64 bit VM without compressed pointers, mark and class-ref are 8 bytes each and the array length is 4 bytes, which sums to 20 (hence padding is required even on empty arrays).
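
The same breakdown can be written down as a small enum, which makes it easy to see how each header total is composed. The enum VmMode and its members are hypothetical names for this sketch; the byte counts are the ones from the table.

```java
enum VmMode {
    BITS_32(4, 4),
    BITS_64_COMPRESSED(8, 4),
    BITS_64_UNCOMPRESSED(8, 8);

    final int markBytes;      // mark word: hash, lock state etc.
    final int classRefBytes;  // reference to the object's class

    VmMode(int markBytes, int classRefBytes) {
        this.markBytes = markBytes;
        this.classRefBytes = classRefBytes;
    }

    // Header of a non-array object: mark + class-ref.
    int objectHeader() {
        return markBytes + classRefBytes;
    }

    // Header of an array: mark + class-ref + 4-byte int length.
    int arrayHeader() {
        return objectHeader() + 4;
    }

    public static void main(String[] args) {
        for (VmMode mode : values()) {
            System.out.println(mode + ": object header " + mode.objectHeader()
                    + ", array header " + mode.arrayHeader());
        }
    }
}
```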


This article does not explain what goes inside a header. We will discuss that some other time.

Padding and 8-byte boundary rule:


Padding means filling unused space with extra bytes so that a field or an object ends (or begins) on a required boundary.

All Java objects (plain objects and arrays) are aligned to an 8-byte boundary,

where an N-byte boundary is a length in bytes that is a multiple of N.

Example:

8-byte boundary = a length that is a multiple of 8, e.g. 8, 16, 24, ...

4-byte boundary = a length that is a multiple of 4, e.g. 4, 8, 12, 16, 20, ...

Therefore if an object does not end on an 8-byte boundary, padding is applied to make the object size a multiple of 8.
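
The boundary rule boils down to a rounding function. Below is a small sketch of it; the class Alignment and its method names are my own, and the bit trick assumes the boundary is a power of two (which 4 and 8 are).

```java
class Alignment {
    // Rounds size up to the next multiple of boundary. The boundary must be a power of two.
    static long alignUp(long size, long boundary) {
        if (boundary <= 0 || (boundary & (boundary - 1)) != 0) {
            throw new IllegalArgumentException("boundary must be a power of two: " + boundary);
        }
        // For a power of two, -boundary is a mask that clears the low bits.
        return (size + boundary - 1) & -boundary;
    }

    // Number of padding bytes needed to reach the next boundary.
    static long padding(long size, long boundary) {
        return alignUp(size, boundary) - size;
    }

    public static void main(String[] args) {
        System.out.println(alignUp(12, 8));  // 16
        System.out.println(padding(20, 8));  // 4
    }
}
```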


Shallow size: the size of the object itself, where member references are counted but the objects they point to are not followed.

In the following example, object o1 has a reference R1. While computing o1's shallow size, R1 is not resolved; only R1's reference size (4 or 8 bytes) is added to the sum.


Shallow size of o1 = header(o1) + bytes required for primitive data members P + reference size of R1 + required padding



Deep size: the size of the object plus the sizes of the objects it references, their references, and so on.

In the following example, R1 and its children are resolved recursively to compute the overall size of o1.



Deep size of o1 = header(o1) + bytes required for primitive data members P + reference size of R1 + size of (R1) + required padding
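
The difference between the two definitions can be sketched on a toy object graph where every node already knows its own shallow size. Everything here (DeepSizeSketch, Node, deepSize) is hypothetical and is not how classmexer is implemented; in particular, counting each object only once when it is shared or part of a cycle is an assumption of this sketch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class DeepSizeSketch {
    // A toy object: its shallow size already includes header, fields, reference slots and padding.
    static final class Node {
        final int shallowSize;
        final List<Node> refs = new ArrayList<Node>();

        Node(int shallowSize) {
            this.shallowSize = shallowSize;
        }
    }

    // Sums the shallow sizes of every object reachable from root, visiting each object once.
    static long deepSize(Node root) {
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> pending = new ArrayDeque<Node>();
        pending.push(root);
        long total = 0;
        while (!pending.isEmpty()) {
            Node current = pending.pop();
            if (!seen.add(current)) {
                continue; // already counted
            }
            total += current.shallowSize;
            for (Node ref : current.refs) {
                pending.push(ref);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Node o1 = new Node(24);
        Node r1 = new Node(48);
        o1.refs.add(r1);
        System.out.println("shallow = " + o1.shallowSize + ", deep = " + deepSize(o1));
    }
}
```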

Layout rules

An object member will either be a primitive or a reference type. The reference could itself be a plain reference or an array reference.







In all the examples:

<H>:  means a header field
<M>: a data or reference member
<P>: padding


Examples:

 

Let's start with simple objects first.

A java.lang.Integer


Integer obj = new Integer( 222 );

Running the test against 32 bit VM gives

Shallow size: 16
Deep size: 16
Shallow and deep layout for an Integer object are same.
-- Object header total 8 bytes --
<H>:  Mark:    (4 bytes)  0xaf41fc81                                 -- (4)
<H>:  Class-Ref:   (4 bytes)  0x37538978                        -- (4) 
<M>: 222 (4)                                                                     -- (4)
<P>: padding 4 bytes to allow integer object to end at an 8 byte boundary.  -- (4)
---------------------------------------------------------
total size = 4 + 4 + 4 + 4 = 16 bytes

In the above case padding was applied because the size otherwise summed to 12, which is not a multiple of 8. The size was rounded up to the next multiple of 8 by adding 4 bytes of padding.

Running against 64-bit compressed pointer mode (-XX:+UseCompressedOops)

Shallow size: 16
Deep size: 16
Shallow and deep layout for an Integer object are same.
-- Object header total 12 bytes --
<H>:  Mark:   (= 8 bytes)  0x1c3aacb401                        -- (8)
<H>:  Class-Ref:   (= 4 bytes)  0xef894ed8                     -- (4)
<M>: 222                                                                         -- (4)
----------------------------------------------------------
total size = 8 + 4 + 4 = 16

Running against 64-bit non-compressed mode (-XX:-UseCompressedOops)

Shallow size: 24
Deep size: 24
Shallow and deep layout for an Integer object are same.
-- Object header total 16 bytes --
<H>:  Mark:   (= 8 bytes)  0x5655d1b401                            -- (8)
<H>:  Class-Ref:   (= 8 bytes)  0x8009aca8                        -- (8)
<M>: 222                                                                             -- (4)
<P>: padding 4 bytes to allow integer object to end at an 8 byte boundary.   -- (4)
-----------------------------------------------
total size = 8 + 8 + 4 + 4 = 24

A java.lang.Double

 

Double obj = new Double( 4567889922.23344 );

Running against 64-bit compressed pointer mode (-XX:+UseCompressedOops)

Shallow size: 24
Deep size: 24
Shallow and deep layout for java.lang.Double object are same.
-- Object header total 12 bytes --
<H>:  Mark:   (= 8 bytes)  0x74c6fd6e01                              -- (8)
<H>:  Class-Ref:   (= 4 bytes)  0xef893f9f                         -- (4)
<P>: padding 4 bytes to align double value to 8 byte boundary      -- (4)
<M>: 4.56788992223344E9                                            -- (8 bytes for double)
---------------------------------------------------------
total =  8 + 4+ 4 + 8 = 24

Above example shows the application of Rule 2 because we padded 4 bytes before laying out double. If you look at the code, double is being fetched at the exact memory offset after those padded bytes.

Running against 64-bit non-compressed mode (-XX:-UseCompressedOops)

Shallow size: 24
Deep size: 24
Shallow and deep layout for java.lang.Double object are same.
-- Object header total 16 bytes --
<H>:  Mark:   (= 8 bytes)  0x34287ca701      -- (8)  
<H>:  Class-Ref:   (= 8 bytes )  0x800930d8       -- (8)  
<M>: 4.56788992223344E9       -- (8)                        
---------------------------------------------------------
total =  8 + 8 + 8 = 24
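
The two Double layouts can be checked with a short calculation. This sketch assumes that a primitive field is placed at the first offset after the header that is a multiple of its own size, which is how I read the padding before the double above. FieldPlacement and its methods are hypothetical names.

```java
class FieldPlacement {
    // Smallest offset >= start that is a multiple of alignment (a power of two).
    static int alignedOffset(int start, int alignment) {
        return (start + alignment - 1) & -alignment;
    }

    // Size of an object holding a single primitive field of fieldSize bytes.
    static int sizeWithSingleField(int headerBytes, int fieldSize) {
        int fieldOffset = alignedOffset(headerBytes, fieldSize); // may leave a gap after the header
        return alignedOffset(fieldOffset + fieldSize, 8);        // whole object ends on an 8-byte boundary
    }

    public static void main(String[] args) {
        // Double with a 12-byte header: the value starts at offset 16, not 12.
        System.out.println("double offset = " + alignedOffset(12, 8));
        System.out.println("size (compressed) = " + sizeWithSingleField(12, 8));
        System.out.println("size (non-compressed) = " + sizeWithSingleField(16, 8));
    }
}
```

With a 16-byte header the double is already aligned, so no gap is needed and both modes end up at 24 bytes.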

Array types

 

 

Examples of arrays and complex structures

 

short[] array


Arrays in Java are objects. A short[] array is no different.

short[ ] obj = new short [ ] { 23, 54, 7 };


Running the test against 32 bit VM gives

Shallow size: 24
Deep size: 24
-- short[] heap layout --
-- Array header total 12 bytes --
<H>:  Mark:   (= 4 bytes)  0xbe727381          -- (4)
<H>:  Class-Ref:   (= 4 bytes)  0x37af67d0     -- (4)
<H>:  Array Size:   (= 4 bytes)  3                  -- (4)
<M>: 23                                                         -- (2 bytes for short)
<M>: 54                                                         -- (2 bytes for short)
<M>: 7                                                          -- (2 bytes for short)
<P>: padding 6 bytes to align array object to 8 bytes boundary
--------------------------------------------------------------
total size = 4 + 4 + 4 + 2 + 2 + 2 + 6 = 24

Running against 64-bit compressed pointer mode (-XX:+UseCompressedOops)

Shallow size: 24
Deep size: 24
-- short[] heap layout --
-- Array header total 16 bytes --
<H>:  Mark:   (= 8 bytes)  0x429bc5f901            -- (8)
<H>:  Class-Ref:   (= 4 bytes)  0xef8801ea        -- (4)
<H>:  Array Size:   (= 4 bytes)  3                        -- (4)
<M>: 23                                                              -- (2 bytes for short)
<M>: 54                                                              -- (2 bytes for short)
<M>: 7                                                                -- (2 bytes for short)
<P>: padding 2 bytes to align array object to 8 bytes boundary
--------------------------------------------------------------
total size = 8 + 4 + 4 + 2 + 2 + 2 + 2 = 24



Running against 64-bit non-compressed mode (-XX:-UseCompressedOops)

Shallow size: 32
Deep size: 32
-- short[] heap layout --
-- Array header total 20 bytes --
<H>:  Mark:   (= 8 bytes)  0x429bc5f901      -- (8)
<H>:  Class-Ref:   (= 8 bytes)  0x7fff0f50 -- (8)
<H>:  Array Size:   (= 4 bytes)  3              -- (4)
<M>: 23                                                    -- (2 bytes for short)
<M>: 54                                                    -- (2 bytes for short)
<M>: 7                                                      -- (2 bytes for short)
<P>: padding 6 bytes to align array object to 8 bytes boundary
--------------------------------------------------------------
total size = 8 + 8 + 4 + 2 + 2 + 2 + 6 = 32 bytes
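
All three short[] results follow from one formula: array header, plus element size times length, rounded up to a multiple of 8. Here is a sketch of it. ArraySizes is a hypothetical name, the header sizes are the ones from the table earlier, and the sketch does not model any gap between the header and the first element, so treat it as an approximation for cases like the ones shown here.

```java
class ArraySizes {
    // Shallow size of a primitive array: header + elements, rounded up to a multiple of 8.
    static int arraySize(int arrayHeaderBytes, int elementBytes, int length) {
        int unpadded = arrayHeaderBytes + elementBytes * length;
        return (unpadded + 7) & ~7;
    }

    public static void main(String[] args) {
        // short[3] in the three VM modes (array headers of 12, 16 and 20 bytes).
        System.out.println(arraySize(12, 2, 3));
        System.out.println(arraySize(16, 2, 3));
        System.out.println(arraySize(20, 2, 3));
    }
}
```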


java.lang.String

 

Strings in the Java 7 builds tested here have two integer fields and one char[] array for data. So the structure conceptually looks like:

struct String
{
   int hash;
   int hash32;
   char[] data;
}


String obj = "enlightenment";

Running against 64-bit compressed pointer mode (-XX:+UseCompressedOops)

Shallow size: 24
Deep size: 72
String is --> char[]: enlightenment, hashCode:-258879020, hash32:0

-- Shallow layout --

-- Object header total 12 bytes --
<H>:  Mark:   (= 8 bytes)  0x3661853301                           -- (8)
<H>:  Class-Ref:   (= 4 bytes)  0xef8816d8                         -- (4)
<M>: 0    (hash32; laid out first here, but that order is not guaranteed)  -- (4)
<M>: -258879020  (hash)                                                    -- (4)
<M>: member char[] array ref size (4 bytes): 0x0               -- (4)
----------------------------------------------------------------------
total size = 8 + 4 + 4 + 4 + 4 = 24

-- Deep layout --

-- Object header total 12 bytes --
<H>:  Mark:   (= 8 bytes)  0x3661853301                                  -- (8)
<H>:  Class-Ref:   (= 4 bytes)  0xef8816d8                                -- (4)
<M>: 0                                                                                        -- (4)
<M>: -258879020                                                                       -- (4)
<M>: member char[] array ref size (4 bytes): 0xf091d1d4         -- (4)

..member char[] array starts..

-- Array header total 16 bytes --
<H>:  Mark:   (= 8 bytes)  0x523e59ca01                                  -- (8)
<H>:  Class-Ref:   (= 4 bytes)  0xef8800ca                               -- (4)
<H>:  Array Size:   (= 4 bytes)  13                                             -- (4)
<M>: data[0] = e                                                                        -- (2 bytes for Java chars)
<M>: data[1] = n
<M>: data[2] = l
<M>: data[3] = i
<M>: data[4] = g
<M>: data[5] = h
<M>: data[6] = t
<M>: data[7] = e
<M>: data[8] = n
<M>: data[9] = m
<M>: data[10] = e
<M>: data[11] = n
<M>: data[12] = t                                                                      -- (13 chars in data[] array)
<P>: padding 6 bytes to allow char[] array to end at an 8 bytes boundary     -- (6)
----------------------------------------------------------------------------------
total size = 8 + 4 + 4 + 4 + 4 + (8 + 4 + 4 + 13 x 2 + 6) = 72

This is a good example of a complex object and shows the application of several layout rules: Rule 1 and Rule 8 (because the array member object was aligned to 8 bytes) and Rule 2 (because the integer members of the String class were laid out before the array reference member in all cases).
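
The 24 and 72 above can be recomputed from the pieces. The sketch below hard-codes the String shape used in this post (two int fields plus one char[] reference), so it does not apply to JDK versions where String is laid out differently. StringSizeSketch and its method names are made up for illustration.

```java
class StringSizeSketch {
    static int pad8(int size) {
        return (size + 7) & ~7;
    }

    // Shallow size of a String with two int fields and one array reference.
    static int shallowString(int headerBytes, int refBytes) {
        return pad8(headerBytes + 4 + 4 + refBytes);
    }

    // Size of the backing char[]: array header + 2 bytes per char, padded.
    static int charArray(int arrayHeaderBytes, int length) {
        return pad8(arrayHeaderBytes + 2 * length);
    }

    // Deep size = String object + its char[] array.
    static int deepString(int headerBytes, int arrayHeaderBytes, int refBytes, int length) {
        return shallowString(headerBytes, refBytes) + charArray(arrayHeaderBytes, length);
    }

    public static void main(String[] args) {
        int length = "enlightenment".length(); // 13 chars
        // 64-bit with compressed Oops: 12-byte object header, 16-byte array header, 4-byte refs.
        System.out.println("shallow = " + shallowString(12, 4));
        System.out.println("deep = " + deepString(12, 16, 4, length));
    }
}
```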

 

 

 

Inheritance rules:


 

Examples:

 

Let's define a simple hierarchy:

    static class SuperClass1
    {
       double d1 = 101.404;
    }

    static class Subclass1 extends SuperClass1
    {
       int i1 = 711;
       double d2 = 202.505;
       boolean b1 = true;
    }


Subclass1 obj = new Subclass1();

Running the test against 32 bit VM gives

Shallow size: 32
Deep size: 32


-- Heap layout --
-- Object header total 8 bytes --
<H>:  Mark:   (= 4 bytes)  0x74042101             -- (4)
<H>:  Class-Ref:   (= 4 bytes)  0x32b3e318     -- (4)
<M>: 101.404                                                    -- (8 bytes for double)
 
..subclass starts next...
 

<M>: 202.505                                                    -- (8 bytes for double in subclass) 
<M>: 711                                                           -- (4) 
<M>: true                                                           -- (1 byte for boolean) 
<P>: 3 bytes to align entire object to an 8-bytes boundary    -- (3)
----------------------------------------------------------------------------------
total =  4 + 4 + 8 + 8 + 4 + 1 + 3 = 32

Running against 64-bit compressed pointer mode (-XX:+UseCompressedOops)

Shallow size: 40
Deep size: 40


-- Heap layout --
-- Object header total 12 bytes --
<H>:  Mark:   (= 8 bytes)  0x4769baee01                                   -- (8)
<H>:  Class-Ref:   (= 4 bytes)  0xef8e4221                                -- (4)
<P>: padding 4 bytes to align for double in super class             -- (4)
<M>: 101.404                                                                             -- (8)
 
..subclass starts next...
 

<M>: 202.505                                                                             -- (8) 
<M>: 711                                                                                    -- (4)
<M>: true                                                                                    -- (1)
<P>: padding 3 bytes to align entire object to an 8-bytes boundary -- (3)
-----------------------------------------------------------------------------------
total = 8 + 4 + 4 + 8 + 8 + 4 + 1 + 3 = 40

Above example shows application of Rule 2 and Rule 4 because members in a class/subclass were always aligned based on those rules.
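
Both Subclass1 results can be reproduced with a simplified model: lay out the superclass fields first, then the subclass fields, put the larger fields of each class first, and align each field to its own size. This is only a sketch of that reading. HierarchyLayout is a hypothetical name, and the model deliberately does not try to reuse gaps left by alignment.

```java
import java.util.Arrays;

class HierarchyLayout {
    static int align(int offset, int alignment) {
        return (offset + alignment - 1) & -alignment;
    }

    // fieldSizesPerClass is ordered superclass first; each inner array holds field sizes in bytes.
    static int instanceSize(int headerBytes, int[][] fieldSizesPerClass) {
        int offset = headerBytes;
        for (int[] classFields : fieldSizesPerClass) {
            int[] sorted = classFields.clone();
            Arrays.sort(sorted);
            // Walk from largest to smallest, aligning each field to its own size.
            for (int i = sorted.length - 1; i >= 0; i--) {
                offset = align(offset, sorted[i]) + sorted[i];
            }
        }
        return align(offset, 8);
    }

    public static void main(String[] args) {
        // SuperClass1 { double } and Subclass1 { int, double, 1-byte field }
        int[][] hierarchy = { { 8 }, { 4, 8, 1 } };
        System.out.println("32-bit: " + instanceSize(8, hierarchy));
        System.out.println("64-bit compressed: " + instanceSize(12, hierarchy));
    }
}
```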

Let's define another class hierarchy to verify our approach further.

    private static class SuperClass3
    {

         // no member field
    }

    private static class Subclass3 extends SuperClass3
    {
       int i1 = 711;
       double d2 = 202.505;
    }


Subclass3 obj = new Subclass3();

Running against 64-bit compressed pointer mode (-XX:+UseCompressedOops)

Shallow size: 24
Deep size: 24


-- Heap layout --
-- Object header total 12 bytes --
<H>:  Mark:   (= 8 bytes)  0x46f0bf3d01            -- (8)
<H>:  Class-Ref:   (= 4 bytes)  0xef8e4571            -- (4)

.. no member in super class, subclass starts next...
 

<M>: 711                                                                 -- (4) -subclass int laid out first
<M>: 202.505                                                          -- (8)
-----------------------------------------------------------------------------
total =  8 + 4 + 4+ 8 = 24

Above example shows application of Rule 5 and 6 where JVM laid subclass integer before double as it saw an opportunity for space optimization there. Hotspot indeed is a really smart VM :)
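
The space optimization can be imitated with a small greedy sketch: if aligning the largest field would leave a gap after the header, first try to drop smaller fields into that gap. This is a simplification for a single class and surely not HotSpot's actual algorithm. GapFillingLayout and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class GapFillingLayout {
    static int align(int offset, int alignment) {
        return (offset + alignment - 1) & -alignment;
    }

    // Instance size for one class whose fields have the given sizes in bytes (powers of two).
    static int instanceSize(int headerBytes, int... fieldSizes) {
        List<Integer> remaining = new ArrayList<Integer>();
        for (int size : fieldSizes) {
            remaining.add(size);
        }
        remaining.sort(Collections.reverseOrder()); // largest first

        int offset = headerBytes;
        if (!remaining.isEmpty()) {
            // Gap that aligning the largest field would otherwise waste.
            int gap = align(offset, remaining.get(0)) - offset;
            int i = 0;
            while (i < remaining.size() && gap > 0) {
                int size = remaining.get(i);
                if (size <= gap && offset % size == 0) {
                    offset += size;       // the field fits into the gap
                    gap -= size;
                    remaining.remove(i);  // removes by index
                } else {
                    i++;
                }
            }
        }
        for (int size : remaining) {
            offset = align(offset, size) + size;
        }
        return align(offset, 8);
    }

    public static void main(String[] args) {
        // Subclass3 { int, double } with a 12-byte header: the int fills the 4-byte gap.
        System.out.println(instanceSize(12, 4, 8));
    }
}
```

Without the gap filling, the same fields would need 12 + 4 (padding) + 8 + 4 = 28 bytes, which rounds up to 32 instead of 24.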


Inner classes

 


Examples:

Static inner classes have no reference to an outer class instance, and we have already covered static class examples. Non-static inner classes are special because they hold a hidden reference to the enclosing instance.

The following class is a non-static inner class of the outer class JavaObjectLayout.

    private class Non_Static_Inner_Class
    {
       int i1 = 8179;
    }

Non_Static_Inner_Class obj = new JavaObjectLayout().new Non_Static_Inner_Class();

Running against 64-bit non-compressed pointer mode (-XX:-UseCompressedOops)

Shallow size: 32
Deep size: 48

-- Object header total 16 bytes --
<H>:  Mark:   (= 8 bytes)  0x7935c7b001         -- (8)
<H>:  Class-Ref:   (= 8 bytes)  0x80323ea0         -- (8)
<M>: 8179                                                            -- (4)
<P>: 4 bytes padding to allow the enclosing-context-ref to begin at an 8 bytes boundary.  -- (4)
<M>: enclosing-context-ref (8 bytes) : 0x1ff3         -- (8)
---------------------------------------------------------------------
shallow size = 8 + 8 + 4 + 4 + 8 = 32
deep size = 8 + 8 + 4 + 4 + 8 + (16) = 48

For deep size calc, (16) is the size of enclosing context object of class JavaObjectLayout. That class does not have any data member, so only header size will be computed which in case of 64-bit non-compressed VM is 16 for non-array objects.
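
The hidden reference can be made visible with plain reflection, without Unsafe. In the sketch below, Outer and Inner are hypothetical stand-ins for JavaObjectLayout and its inner class. Unlike the class above, this Outer has a field that Inner reads, because some newer compilers may drop the hidden reference when the inner class never touches the outer instance.

```java
import java.lang.reflect.Field;

class Outer {
    int tag = 1;

    class Inner {
        int i1 = 8179;

        // Reading an outer field makes sure the compiler keeps the hidden outer reference.
        int outerTag() {
            return tag;
        }
    }

    // Returns the compiler-generated field of Inner that points to the enclosing Outer, or null.
    static Field enclosingRef() {
        for (Field f : Inner.class.getDeclaredFields()) {
            if (f.isSynthetic() && f.getType() == Outer.class) {
                return f;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println("hidden field: " + enclosingRef());
    }
}
```

That synthetic field is the extra reference-sized member that shows up as enclosing-context-ref in the layout above.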

 

Side Notes and References

 

This page does not discuss all manifestations of objects. However the rules governing layout and size calculation remain same and can be applied to arbitrary object structures.

I referred to some other blogs for a better understanding of Java object structure. Here they are:

 * http://www.javamex.com/classmexer/
 * http://www.ibm.com/developerworks/opensource/library/j-codetoheap/index.html
 * http://javadetails.com/2012/03/21/java-array-memory-allocation.html


In my code, all shallow and deep size computation is done via the classmexer API and confirmed via sun.misc.Unsafe.

