
C++ vtables - Part 1 - Basics

(1204 words) Tue, Mar 1, 2016

In this mini-series of posts we’ll explore how clang implements vtables and RTTI. In this
part we’ll start with some basic classes; later parts cover multiple inheritance and
virtual inheritance.

Please note that this mini-series will include some digging into the binary generated for
our different pieces of code via gdb. This is somewhat low-level(ish), but I’ll do all the
heavy lifting for you. I don’t believe many future posts will be this low-level.

Disclaimer: everything written here is implementation specific, may change in any future
version, and should not be relied on. We look into this for educational reasons only.

☑ I agree

cool, let’s start.

Part 1 - vtables - Basics


Estimated read time: ~15 minutes.

Let’s examine the following code:

#include <iostream>

using namespace std;

class NonVirtualClass {
public:
    void foo() {}
};

class VirtualClass {
public:
    virtual void foo() {}
};

int main() {
    cout << "Size of NonVirtualClass: " << sizeof(NonVirtualClass) << endl;
    cout << "Size of VirtualClass: " << sizeof(VirtualClass) << endl;
}

$ # compile and run main.cpp
$ clang++ main.cpp && ./a.out
Size of NonVirtualClass: 1
Size of VirtualClass: 8

NonVirtualClass has a size of 1 because in C++ an object of class type can’t have zero
size: distinct objects must have distinct addresses. However, this is not important right
now.

VirtualClass’s size is 8 on a 64-bit machine. Why? Because there’s a hidden pointer
inside it pointing to a vtable. vtables are static translation tables, created for each
class that has virtual methods. This post series is about their content and how they are
used.

To get some deeper understanding on how vtables look let’s explore the following code
with gdb to find out how the memory is laid out:

#include <iostream>

class Parent {
public:
    virtual void Foo() {}
    virtual void FooNotOverridden() {}
};

class Derived : public Parent {
public:
    void Foo() override {}
};

int main() {
    Parent p1, p2;
    Derived d1, d2;

    std::cout << "done" << std::endl;
}

$ # compile our code with debug symbols and start debugging using gdb
$ clang++ -std=c++14 -stdlib=libc++ -g main.cpp && gdb ./a.out
...
(gdb) # ask gdb to automatically demangle C++ symbols
(gdb) set print asm-demangle on
(gdb) set print demangle on
(gdb) # set breakpoint at main
(gdb) b main
Breakpoint 1 at 0x4009ac: file main.cpp, line 15.
(gdb) run
Starting program: /home/shmike/cpp/a.out

Breakpoint 1, main () at main.cpp:15
15 Parent p1, p2;
(gdb) # skip to next line
(gdb) n
16 Derived d1, d2;
(gdb) # skip to next line
(gdb) n
18 std::cout << "done" << std::endl;
(gdb) # print p1, p2, d1, d2 - we'll talk about what the output means soon
(gdb) p p1
$1 = {_vptr$Parent = 0x400bb8 <vtable for Parent+16>}
(gdb) p p2
$2 = {_vptr$Parent = 0x400bb8 <vtable for Parent+16>}
(gdb) p d1
$3 = {<Parent> = {_vptr$Parent = 0x400b50 <vtable for Derived+16>}, <No data fields>}
(gdb) p d2
$4 = {<Parent> = {_vptr$Parent = 0x400b50 <vtable for Derived+16>}, <No data fields>}

Here’s what we learned from the above:

• Even though the classes have no data members, there’s a hidden pointer to a vtable;
• The vtable for p1 and p2 is the same. vtables are static data per type;
• d1 and d2 inherit a vtable pointer from Parent, which points to Derived’s vtable;
• All vtable pointers point 16 (0x10) bytes into the vtable. We’ll also discuss this
later.

Let’s continue with our gdb session to see the contents of the vtables. I will use
the x command, which dumps memory to the screen. I ask it to print 300 bytes in hex
format, starting at 0x400b40. Why this address? Because above we saw that the vtable
pointer points to 0x400b50, and the symbol for that address is vtable for
Derived+16 (16 == 0x10).

(gdb) x/300xb 0x400b40
0x400b40 <vtable for Derived>: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400b48 <vtable for Derived+8>: 0x90 0x0b 0x40 0x00 0x00 0x00 0x00 0x00
0x400b50 <vtable for Derived+16>: 0x80 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x400b58 <vtable for Derived+24>: 0x90 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x400b60 <typeinfo name for Derived>: 0x37 0x44 0x65 0x72 0x69 0x76 0x65 0x64
0x400b68 <typeinfo name for Derived+8>: 0x00 0x36 0x50 0x61 0x72 0x65 0x6e 0x74
0x400b70 <typeinfo name for Parent+7>: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400b78 <typeinfo for Parent>: 0x90 0x20 0x60 0x00 0x00 0x00 0x00 0x00
0x400b80 <typeinfo for Parent+8>: 0x69 0x0b 0x40 0x00 0x00 0x00 0x00 0x00
0x400b88: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400b90 <typeinfo for Derived>: 0x10 0x22 0x60 0x00 0x00 0x00 0x00 0x00
0x400b98 <typeinfo for Derived+8>: 0x60 0x0b 0x40 0x00 0x00 0x00 0x00 0x00
0x400ba0 <typeinfo for Derived+16>: 0x78 0x0b 0x40 0x00 0x00 0x00 0x00 0x00
0x400ba8 <vtable for Parent>: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400bb0 <vtable for Parent+8>: 0x78 0x0b 0x40 0x00 0x00 0x00 0x00 0x00
0x400bb8 <vtable for Parent+16>: 0xa0 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x400bc0 <vtable for Parent+24>: 0x90 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
...

Note: we’re looking at demangled symbols. If you really want to know, _ZTV is a prefix
for vtable, _ZTS is a prefix for type-string (name) and _ZTI is for type-info.

Here’s Parent’s vtable layout:

Address   Value     Meaning
0x400ba8  0x0       top_offset (more on this later)
0x400bb0  0x400b78  Pointer to typeinfo for Parent (also part of the above memory dump)
0x400bb8  0x400aa0  Pointer to Parent::Foo() [1]. Parent’s _vptr points here.
0x400bc0  0x400a90  Pointer to Parent::FooNotOverridden() [2]

Here’s Derived’s vtable layout:

Address   Value     Meaning
0x400b40  0x0       top_offset (more on this later)
0x400b48  0x400b90  Pointer to typeinfo for Derived (also part of the above memory dump)
0x400b50  0x400a80  Pointer to Derived::Foo() [3]. Derived’s _vptr points here.
0x400b58  0x400a90  Pointer to Parent::FooNotOverridden() (same as Parent’s)

1:

(gdb) # find out what debug symbol we have for address 0x400aa0
(gdb) info symbol 0x400aa0
Parent::Foo() in section .text of a.out

2:

(gdb) info symbol 0x400a90
Parent::FooNotOverridden() in section .text of a.out

3:

(gdb) info symbol 0x400a80
Derived::Foo() in section .text of a.out

Remember that the vtable pointer in Derived pointed to a +16 bytes offset into the
vtable? That offset skips the two header entries (top_offset and the typeinfo pointer),
so the vtable pointer holds the address of the 3rd slot, which is the first method
pointer. Want the 3rd method? No problem - add 2 * sizeof(void*) to the vtable pointer.
Want the typeinfo record? Jump to the pointer before.

Moving on - what about the typeinfo records layout?

Parent’s:

Address   Value     Meaning
0x400b78  0x602090  Helper class for type_info methods [1]
0x400b80  0x400b69  String representing type name [2]
0x400b88  0x0       0: no pointer to a parent typeinfo record (gdb attaches no symbol to this address, so it is probably just padding)

And here’s Derived’s typeinfo record:

Address   Value     Meaning
0x400b90  0x602210  Helper class for type_info methods [3]
0x400b98  0x400b60  String representing type name [4]
0x400ba0  0x400b78  Pointer to Parent’s typeinfo record

1:

(gdb) info symbol 0x602090
vtable for __cxxabiv1::__class_type_info@@CXXABI_1.3 + 16 in section .bss of a.out

2:

(gdb) x/s 0x400b69
0x400b69 <typeinfo name for Parent>: "6Parent"

3:

(gdb) info symbol 0x602210
vtable for __cxxabiv1::__si_class_type_info@@CXXABI_1.3 + 16 in section .bss of a.out

4:

(gdb) x/s 0x400b60
0x400b60 <typeinfo name for Derived>: "7Derived"

If you want to read more about __si_class_type_info you can find some info here, and
also here.

This exhausts my gdb skills, and also concludes this post. I assume some people will
find this too low-level, or maybe just unactionable. If so, I’d recommend skipping parts 2
and 3, jumping straight to part 4.

C++ vtables - Part 2 - Multiple Inheritance

(934 words) Tue, Mar 8, 2016

The world of single-parent inheritance hierarchies is simpler for the compiler. As we saw
in Part 1, each child class extends its parent vtable by appending entries for each new
virtual method.

In this post we will cover multiple inheritance, which complicates things even when only
inheriting from pure-interfaces.

Let’s look at the following piece of code:


class Mother {
public:
    virtual void MotherMethod() {}
    int mother_data;
};

class Father {
public:
    virtual void FatherMethod() {}
    int father_data;
};

class Child : public Mother, public Father {
public:
    virtual void ChildMethod() {}
    int child_data;
};

Child’s layout:

_vptr$Mother
mother_data (+ padding)
_vptr$Father
father_data
child_data [1]

Note that there are 2 vtable pointers. Intuitively I’d expect either 1 or 3 pointers
(Mother, Father and Child). In reality it’s impossible to have a single pointer (more on
this soon), and the compiler is smart enough to combine Child’s vtable entries as a
continuation of Mother’s vtable, thus saving 1 pointer.

Why can’t Child have one vtable pointer for all 3 types? Remember that a Child pointer
can be passed to a function accepting a Mother pointer or a Father pointer, and both
will expect the this pointer to hold the correct data in the correct offsets. These
functions don’t necessarily know of Child , and definitely shouldn’t assume that
a Child is really what’s underneath the Mother / Father pointer they have in their hands.

[1] Unrelated to this topic, but interesting nonetheless, is that child_data is actually
placed inside Father’s padding. This is called ‘tail padding’, and might be the topic of a
future post.

Here’s the vtable layout:

Address   Value     Meaning
0x4008b8  0         top_offset (more on this later)
0x4008c0  0x400930  pointer to typeinfo for Child
0x4008c8  0x400800  Mother::MotherMethod(). _vptr$Mother points here.
0x4008d0  0x400810  Child::ChildMethod()
0x4008d8  -16       top_offset (more on this later)
0x4008e0  0x400930  pointer to typeinfo for Child
0x4008e8  0x400820  Father::FatherMethod(). _vptr$Father points here.

In this example, an instance of Child will have the same address when cast to a Mother
pointer. But when casting to a Father pointer, the compiler offsets the pointer so that it
points to the _vptr$Father part of Child (3rd field in Child’s layout, see the layout
above).

In other words, for a given Child c;: (void*)&c != (void*)static_cast<Father*>(&c).


Some people don’t expect this, and maybe some day this information will save you
some debugging time. I found it useful more than once.

But wait, there’s more.

What if Child decided to override one of Father’s methods? Consider this code:

class Mother {
public:
    virtual void MotherFoo() {}
};

class Father {
public:
    virtual void FatherFoo() {}
};

class Child : public Mother, public Father {
public:
    void FatherFoo() override {}
};

This gets tricky. A function may take a Father* argument and call FatherFoo() on it.
But if you pass a Child instance, it is expected to invoke Child’s overridden method
with the correct this pointer. However, the caller doesn’t know it’s really holding
a Child. It has a pointer to the offset inside Child where Father’s layout is. Someone
needs to offset this, but how is it done? What magic does the compiler perform to get
this to work?

[Before we answer that, note that overriding one of Mother’s methods is not really tricky
as the this pointer is the same. Child knows to read beyond the Mother vtable and
expects the Child methods to be right after that.]

Here’s the solution: the compiler creates a ‘thunk’ method that corrects this and then
calls the ‘real’ method. The address of the thunk method will sit under Child’s Father
vtable, while the ‘real’ method will be under Child’s vtable.

Here’s Child’s vtable:

0x4008e8 <vtable for Child>: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x4008f0 <vtable for Child+8>: 0x60 0x09 0x40 0x00 0x00 0x00 0x00 0x00
0x4008f8 <vtable for Child+16>: 0x00 0x08 0x40 0x00 0x00 0x00 0x00 0x00
0x400900 <vtable for Child+24>: 0x10 0x08 0x40 0x00 0x00 0x00 0x00 0x00
0x400908 <vtable for Child+32>: 0xf8 0xff 0xff 0xff 0xff 0xff 0xff 0xff
0x400910 <vtable for Child+40>: 0x60 0x09 0x40 0x00 0x00 0x00 0x00 0x00
0x400918 <vtable for Child+48>: 0x20 0x08 0x40 0x00 0x00 0x00 0x00 0x00

Which means:

Address   Value     Meaning
0x4008e8  0         top_offset (soon!)
0x4008f0  0x400960  typeinfo for Child
0x4008f8  0x400800  Mother::MotherFoo()
0x400900  0x400810  Child::FatherFoo()
0x400908  -8        top_offset
0x400910  0x400960  typeinfo for Child
0x400918  0x400820  non-virtual thunk to Child::FatherFoo()

Explanation: as we saw earlier, Child has 2 vtables - one used for Mother and Child,
and the other for Father. In Father’s vtable, FatherFoo() points to a thunk,
while Child’s vtable points directly to Child::FatherFoo().

And what’s in this thunk, you ask?

(gdb) disas /m 0x400820, 0x400850
Dump of assembler code from 0x400820 to 0x400850:
15 void FatherFoo() override {}
0x0000000000400820 <non-virtual thunk to Child::FatherFoo()+0>: push %rbp
0x0000000000400821 <non-virtual thunk to Child::FatherFoo()+1>: mov %rsp,%rbp
0x0000000000400824 <non-virtual thunk to Child::FatherFoo()+4>: sub $0x10,%rsp
0x0000000000400828 <non-virtual thunk to Child::FatherFoo()+8>: mov %rdi,-0x8(%rbp)
0x000000000040082c <non-virtual thunk to Child::FatherFoo()+12>: mov -0x8(%rbp),%rdi
0x0000000000400830 <non-virtual thunk to Child::FatherFoo()+16>: add $0xfffffffffffffff8,%rdi
0x0000000000400837 <non-virtual thunk to Child::FatherFoo()+23>: callq 0x400810 <Child::FatherFoo()>
0x000000000040083c <non-virtual thunk to Child::FatherFoo()+28>: add $0x10,%rsp
0x0000000000400840 <non-virtual thunk to Child::FatherFoo()+32>: pop %rbp
0x0000000000400841 <non-virtual thunk to Child::FatherFoo()+33>: retq
0x0000000000400842: nopw %cs:0x0(%rax,%rax,1)
0x000000000040084c: nopl 0x0(%rax)

Like we discussed - offsetting this and calling FatherFoo(). And by how much should
we offset this to get Child? By top_offset! The add of 0xfffffffffffffff8 in the thunk is
-8, the same value stored in the Father part of the vtable.

[Please note that I personally think that the name non-virtual thunk is extremely
confusing as this is the entry in the virtual table to the virtual function. I’m not sure
what’s not virtual about it, but that’s just my opinion.]

Stay tuned for Part 3 - Virtual inheritance - where things get even funkier.

C++ vtables - Part 3 - Virtual Inheritance

(1763 words) Tue, Mar 15, 2016

In Part 1 and Part 2 of this series we talked about how vtables work in the simplest
cases, and then in multiple inheritance. Virtual inheritance complicates things even
further.

As you may remember, virtual inheritance means that there’s only one instance of a
base class in a concrete class. For example:

class ios ...
class istream : virtual public ios ...
class ostream : virtual public ios ...
class iostream : public istream, public ostream

If it weren’t for the virtual keyword above, iostream would in fact have two instances
of ios, which may cause sync headaches and would just be inefficient.

To understand virtual inheritance we will investigate the following piece of code:

#include <iostream>

using namespace std;

class Grandparent {
public:
    virtual void grandparent_foo() {}
    int grandparent_data;
};

class Parent1 : virtual public Grandparent {
public:
    virtual void parent1_foo() {}
    int parent1_data;
};

class Parent2 : virtual public Grandparent {
public:
    virtual void parent2_foo() {}
    int parent2_data;
};

class Child : public Parent1, public Parent2 {
public:
    virtual void child_foo() {}
    int child_data;
};

int main() {
    Child child;
}

Let’s explore child. I’ll start by dumping a whole lot of memory just where Child’s
vtable begins, like we did in previous posts, and will then analyze the results. I suggest
quickly glancing over the output here and coming back to it as I reveal details below.

(gdb) p child
$1 = {<Parent1> = {<Grandparent> = {_vptr$Grandparent = 0x400998 <vtable for Child+96>, grandparent_data = 0}, _vptr$Parent1 = 0x400950 <vtable for Child+24>, parent1_data = 0}, <Parent2> = {_vptr$Parent2 = 0x400978 <vtable for Child+64>, parent2_data = 4195888}, child_data = 0}

(gdb) x/600xb 0x400938
0x400938 <vtable for Child>: 0x20 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400940 <vtable for Child+8>: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400948 <vtable for Child+16>: 0x00 0x0b 0x40 0x00 0x00 0x00 0x00 0x00
0x400950 <vtable for Child+24>: 0x70 0x08 0x40 0x00 0x00 0x00 0x00 0x00
0x400958 <vtable for Child+32>: 0xa0 0x08 0x40 0x00 0x00 0x00 0x00 0x00
0x400960 <vtable for Child+40>: 0x10 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400968 <vtable for Child+48>: 0xf0 0xff 0xff 0xff 0xff 0xff 0xff 0xff
0x400970 <vtable for Child+56>: 0x00 0x0b 0x40 0x00 0x00 0x00 0x00 0x00
0x400978 <vtable for Child+64>: 0x90 0x08 0x40 0x00 0x00 0x00 0x00 0x00
0x400980 <vtable for Child+72>: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400988 <vtable for Child+80>: 0xe0 0xff 0xff 0xff 0xff 0xff 0xff 0xff
0x400990 <vtable for Child+88>: 0x00 0x0b 0x40 0x00 0x00 0x00 0x00 0x00
0x400998 <vtable for Child+96>: 0x80 0x08 0x40 0x00 0x00 0x00 0x00 0x00
0x4009a0 <VTT for Child>: 0x50 0x09 0x40 0x00 0x00 0x00 0x00 0x00
0x4009a8 <VTT for Child+8>: 0xf8 0x09 0x40 0x00 0x00 0x00 0x00 0x00
0x4009b0 <VTT for Child+16>: 0x18 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x4009b8 <VTT for Child+24>: 0x98 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x4009c0 <VTT for Child+32>: 0xb8 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x4009c8 <VTT for Child+40>: 0x98 0x09 0x40 0x00 0x00 0x00 0x00 0x00
0x4009d0 <VTT for Child+48>: 0x78 0x09 0x40 0x00 0x00 0x00 0x00 0x00
0x4009d8: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x4009e0 <construction vtable for Parent1-in-Child>: 0x20 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x4009e8 <construction vtable for Parent1-in-Child+8>: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x4009f0 <construction vtable for Parent1-in-Child+16>: 0x50 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x4009f8 <construction vtable for Parent1-in-Child+24>: 0x70 0x08 0x40 0x00 0x00 0x00 0x00 0x00
0x400a00 <construction vtable for Parent1-in-Child+32>: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400a08 <construction vtable for Parent1-in-Child+40>: 0xe0 0xff 0xff 0xff 0xff 0xff 0xff 0xff
0x400a10 <construction vtable for Parent1-in-Child+48>: 0x50 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x400a18 <construction vtable for Parent1-in-Child+56>: 0x80 0x08 0x40 0x00 0x00 0x00 0x00 0x00
0x400a20 <typeinfo name for Parent1>: 0x37 0x50 0x61 0x72 0x65 0x6e 0x74 0x31
0x400a28 <typeinfo name for Parent1+8>: 0x00 0x31 0x31 0x47 0x72 0x61 0x6e 0x64
0x400a30 <typeinfo name for Grandparent+7>: 0x70 0x61 0x72 0x65 0x6e 0x74 0x00 0x00
0x400a38 <typeinfo for Grandparent>: 0x50 0x10 0x60 0x00 0x00 0x00 0x00 0x00
0x400a40 <typeinfo for Grandparent+8>: 0x29 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x400a48: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400a50 <typeinfo for Parent1>: 0xa0 0x10 0x60 0x00 0x00 0x00 0x00 0x00
0x400a58 <typeinfo for Parent1+8>: 0x20 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x400a60 <typeinfo for Parent1+16>: 0x00 0x00 0x00 0x00 0x01 0x00 0x00 0x00
0x400a68 <typeinfo for Parent1+24>: 0x38 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x400a70 <typeinfo for Parent1+32>: 0x03 0xe8 0xff 0xff 0xff 0xff 0xff 0xff
0x400a78: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400a80 <construction vtable for Parent2-in-Child>: 0x10 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400a88 <construction vtable for Parent2-in-Child+8>: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400a90 <construction vtable for Parent2-in-Child+16>: 0xd0 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x400a98 <construction vtable for Parent2-in-Child+24>: 0x90 0x08 0x40 0x00 0x00 0x00 0x00 0x00
0x400aa0 <construction vtable for Parent2-in-Child+32>: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400aa8 <construction vtable for Parent2-in-Child+40>: 0xf0 0xff 0xff 0xff 0xff 0xff 0xff 0xff
0x400ab0 <construction vtable for Parent2-in-Child+48>: 0xd0 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x400ab8 <construction vtable for Parent2-in-Child+56>: 0x80 0x08 0x40 0x00 0x00 0x00 0x00 0x00
0x400ac0 <typeinfo name for Parent2>: 0x37 0x50 0x61 0x72 0x65 0x6e 0x74 0x32
0x400ac8 <typeinfo name for Parent2+8>: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400ad0 <typeinfo for Parent2>: 0xa0 0x10 0x60 0x00 0x00 0x00 0x00 0x00
0x400ad8 <typeinfo for Parent2+8>: 0xc0 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x400ae0 <typeinfo for Parent2+16>: 0x00 0x00 0x00 0x00 0x01 0x00 0x00 0x00
0x400ae8 <typeinfo for Parent2+24>: 0x38 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x400af0 <typeinfo for Parent2+32>: 0x03 0xe8 0xff 0xff 0xff 0xff 0xff 0xff
0x400af8 <typeinfo name for Child>: 0x35 0x43 0x68 0x69 0x6c 0x64 0x00 0x00
0x400b00 <typeinfo for Child>: 0xa0 0x10 0x60 0x00 0x00 0x00 0x00 0x00
0x400b08 <typeinfo for Child+8>: 0xf8 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x400b10 <typeinfo for Child+16>: 0x02 0x00 0x00 0x00 0x02 0x00 0x00 0x00
0x400b18 <typeinfo for Child+24>: 0x50 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x400b20 <typeinfo for Child+32>: 0x02 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400b28 <typeinfo for Child+40>: 0xd0 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x400b30 <typeinfo for Child+48>: 0x02 0x10 0x00 0x00 0x00 0x00 0x00 0x00
0x400b38 <vtable for Grandparent>: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x400b40 <vtable for Grandparent+8>: 0x38 0x0a 0x40 0x00 0x00 0x00 0x00 0x00
0x400b48 <vtable for Grandparent+16>: 0x80 0x08 0x40 0x00 0x00 0x00 0x00 0x00

Wow. That’s a lot of input. 2 new questions that immediately pop up: what’s a VTT and
what’s a construction vtable for X-in-Child ? We’ll answer these soon enough.

Let’s start with Child’s memory layout:

Size     Value
8 bytes  _vptr$Parent1
4 bytes  parent1_data (+ 4 bytes padding)
8 bytes  _vptr$Parent2
4 bytes  parent2_data
4 bytes  child_data
8 bytes  _vptr$Grandparent
4 bytes  grandparent_data (+ 4 bytes padding)

Indeed, Child has only 1 instance of Grandparent. The non-trivial thing is that it is last
in memory even though it is topmost in the hierarchy.

Here’s the vtable layout:

Address   Value      Meaning
0x400938  0x20 (32)  virtual-base offset (we’ll discuss this soon)
0x400940  0          top_offset
0x400948  0x400b00   typeinfo for Child
0x400950  0x400870   Parent1::parent1_foo(). Parent1’s vtable pointer points here.
0x400958  0x4008a0   Child::child_foo()
0x400960  0x10 (16)  virtual-base offset
0x400968  -16        top_offset
0x400970  0x400b00   typeinfo for Child
0x400978  0x400890   Parent2::parent2_foo(). Parent2’s vtable pointer points here.
0x400980  0          vcall offset for grandparent_foo() (not a virtual-base offset: Grandparent has no virtual bases)
0x400988  -32        top_offset
0x400990  0x400b00   typeinfo for Child
0x400998  0x400880   Grandparent::grandparent_foo(). Grandparent’s vtable pointer points here.

Above there’s a new concept - virtual-base offset . We’ll soon understand why it’s
there.

Let’s further explore these weird-looking construction tables. Here’s construction
vtable for Parent1-in-Child:

Value      Meaning
0x20 (32)  virtual-base offset
0          top-offset
0x400a50   typeinfo for Parent1
0x400870   Parent1::parent1_foo()
0          vcall offset for grandparent_foo() (not a virtual-base offset: Grandparent has no virtual bases)
-32        top-offset
0x400a50   typeinfo for Parent1
0x400880   Grandparent::grandparent_foo()

At this point I think it would be clearer to describe the process rather than dump more
tables with random numbers on you. So here goes:

Imagine you’re Child. You are asked to construct yourself on a fresh new piece of
memory. Since you’re inheriting Grandparent directly (that’s what virtual inheritance
means), first you will call its constructor directly (if it weren’t virtual inheritance
you’d call Parent1’s constructor, which in turn would have called Grandparent’s
constructor). You set this += 32 bytes, as this is where Grandparent’s data sits, and you
call the constructor. Easy peasy.

Next it’s time to construct Parent1. Parent1 can safely assume that by the time it
constructs itself Grandparent has already been constructed, so it can, for instance,
access Grandparent’s data and methods. But wait, how can it know where to find this
data? It’s not even near Parent1’s variables!

Enter the construction vtable for Parent1-in-Child. This table is dedicated to
telling Parent1 where to find the pieces of data it can access. this is pointing to
Parent1’s data. The virtual-base offset tells it where it can find Grandparent’s data:
jump 32 bytes ahead of this and you’ll find Grandparent’s memory. Get it? The
virtual-base offset is similar to top_offset but for virtual classes.

Now that we understand this, constructing Parent2 is basically the same, only using the
construction vtable for Parent2-in-Child. And indeed, Parent2-in-Child has
a virtual-base offset of 16 bytes.

Take a moment to let all this info sink in. Are you ready to continue? Good.

Now let’s get back to that VTT thingy. Here’s the VTT layout:

Address   Value     Symbol                                         Meaning
0x4009a0  0x400950  vtable for Child + 24                          Parent1’s entries in Child’s …
0x4009a8  0x4009f8  construction vtable for Parent1-in-Child + 24  Parent1’s methods in Paren…
0x4009b0  0x400a18  construction vtable for Parent1-in-Child + 56  Grandparent’s methods f…
0x4009b8  0x400a98  construction vtable for Parent2-in-Child + 24  Parent2’s methods in Pa…
0x4009c0  0x400ab8  construction vtable for Parent2-in-Child + 56  Grandparent’s methods for P…
0x4009c8  0x400998  vtable for Child + 96                          Grandparent’s entries in Chi…
0x4009d0  0x400978  vtable for Child + 64                          Parent2’s entries in Child’s v…


VTT stands for virtual-table table, which means it’s a table of vtables. This is the
translation table that knows if, for example, Parent1’s constructor is called for a
standalone object, for a Parent1-in-Child object, or for a Parent1-in-SomeOtherObject
object. In the dump above it sits immediately after Child’s vtable, but the compiler
doesn’t depend on that placement: the VTT is an ordinary symbol, and under the Itanium
C++ ABI the constructor of the complete object passes the relevant VTT entry to each
base-class constructor as a hidden argument. Either way, there’s no need to keep another
pointer in the objects themselves.

Pheww… many details, but I think we covered everything I wanted to cover. In Part 4
we will talk about higher level details of vtables. Don’t miss it as it’s probably the most
important post in this series!

C++ vtables - Part 4 - Compiler-Generated Code

(742 words) Tue, Mar 22, 2016

So far in this mini-series we learned how the vtables and typeinfo records are placed in
our binaries and how the compiler uses them. Now we’ll understand some of the work
the compiler does for us automatically.

Constructors
For any class’s constructor the following code is generated:

• Call parent(s) constructors if there are any;
• Set vtable pointer(s) if there are any;
• Initialize members according to initializer list;
• Execute code inside constructor’s brackets.

All of the above can happen without explicit code:

• Parent default constructors are called automatically unless otherwise specified;
• Members are default initialized unless they have a default value or an entry in the
initializer list;
• The entire constructor can be marked = default;
• Only the vtable assignment is always hidden.

Here’s an example:

#include <iostream>
#include <string>

using namespace std;

class Parent {
public:
    Parent() { Foo(); }
    virtual ~Parent() = default;
    virtual void Foo() { cout << "Parent" << endl; }
    int i = 0;
};

class Child : public Parent {
public:
    Child() : j(1) { Foo(); }
    void Foo() override { cout << "Child" << endl; }
    int j;
};

class Grandchild : public Child {
public:
    Grandchild() { Foo(); s = "hello"; }
    void Foo() override { cout << "Grandchild" << endl; }
    string s;
};

int main() {
    Grandchild g;
}

Let’s write the pseudo-code for each class’s constructor:

Parent:
1. vtable = Parent’s vtable;
2. i = 0;
3. Call Foo();

Child:
1. Call Parent’s default c’tor;
2. vtable = Child’s vtable;
3. j = 1;
4. Call Foo();

Grandchild:
1. Call Child’s default c’tor;
2. vtable = Grandchild’s vtable;
3. Call s’s default c’tor;
4. Call Foo();
5. Call operator= on s;

Given this, it’s no surprise that in the context of a class’s constructor, the vtable
pointer points to that very class’s vtable rather than to the concrete class’s. This means
that virtual calls are resolved as if no inheritors exist. Thus the output is:

Parent

Child

Grandchild

What about pure virtual functions? If they are not implemented (yes, you can implement
pure virtual functions, but why would you?) and you call them from a constructor, you’re
probably (and hopefully) going to crash. Some compilers actually emit a diagnostic about
this, which is cool.

Destructors
As one might imagine, destructors behave the same as constructors, only in reverse
order.

Here’s a quick thought-exercise: why do destructors change the vtable pointer to point
to their own class’s vtable rather than keep it pointing to the concrete class’s? Answer:
Because by the time the destructor runs, any inheriting class has already been
destroyed. Calling such a class’s methods is not something you want to do.

Implicit casts
As we saw in Part 2 & Part 3, a pointer to a child is not necessarily equal to the same
instance’s parent pointer (like in multiple inheritance).

Yet, there’s no added work for you (the developer) to call a function that receives a
parent’s pointer. This is because the compiler implicitly offsets the pointer when you
up-cast pointers and references to parent classes.

Dynamic casts (RTTI)


Dynamic casts use the typeinfo records we explored in Part 1. They work at runtime by
looking at the typeinfo record that’s 1 pointer before what the vtable pointer points to,
and use the class described there to check whether or not a cast is possible.

This explains the cost of dynamic_cast when used a lot.

Method pointers
I plan to write a full post about method pointers in the future. Until then I’d like to stress
that a method pointer pointing at a virtual function will actually call the overridden
method (unlike non-member function pointers).

// TODO: add a link when the post is alive

Test yourself!
You should now be able to explain to yourself why the following piece of code behaves
the way it does:

#include <iostream>

using namespace std;

class FooInterface {
public:
    virtual ~FooInterface() = default;
    virtual void Foo() = 0;
};

class BarInterface {
public:
    virtual ~BarInterface() = default;
    virtual void Bar() = 0;
};

class Concrete : public FooInterface, public BarInterface {
public:
    void Foo() override { cout << "Foo()" << endl; }
    void Bar() override { cout << "Bar()" << endl; }
};

int main() {
    Concrete c;
    c.Foo();
    c.Bar();

    FooInterface* foo = &c;
    foo->Foo();

    BarInterface* bar = (BarInterface*)(foo);
    bar->Bar(); // Prints "Foo()" - WTF?
}

This concludes my first blog post, which grew to become a 4 piece post. I hope you
learned some new things, I know I sure did.
