Performance cost of RTTI vs programmed type system

Discussion:

Sebastian Karlsson

2008-07-31 22:31:56 UTC

Hi, I'm kind of comparing RTTI vs a programmed type system of the type:

enum Type { blah, etc };

class Base
{
public:
virtual ~Base() {}
virtual Type GetType() const = 0;
};

For this type of system I would have to make a virtual function call,
in this regard I reckon RTTI will perform the same, or perhaps even
better. The problem is that the typeinfo object needs to be
constructed and returned, and it seems way more costly than a simple
enum.

My question really is, why was typeinfo designed like this? Wouldn't
it be better if typeinfo were just a typedef for an int, which should
be more than enough to uniquely represent all the possible classes in
even the largest system? If the user wants a string representation,
then surely that could've been accomplished with a function call using
the unique id. The only problem I see with this approach is when
linking with different libraries, but wouldn't the linker be able to
patch this up then anyway? Even as it is now, typeinfo holds no
guarantee of being unique across compilers / platforms etc. Alternatively,
there could be some form of weak_typeid() operator with fewer
guarantees. RTTI could've been a perfect fit for me: in my use case I
don't mind paying for the slight memory overhead, but I'm having a hard
time justifying its use due to the performance implications of the
construction / destruction of typeinfo.

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Erik Wikström

2008-08-01 01:55:45 UTC

Post by Sebastian Karlsson
enum Type { blah, etc };
class Base
{
public:
virtual Type GetType() const = 0;
};
For this type of system I would have to make a virtual function call,
in this regard I reckon RTTI will perform the same, or perhaps even
better. The problem is that the typeinfo object needs to be
constructed and returned, and it seems way more costly than a simple
enum.

It has been some time since I looked at RTTI, but I suspect that a
type_info object for every used type (or for every type that needs one,
if the compiler can figure that out) could be created at startup, and
all typeid would have to do is return a reference to that object.

--
Erik Wikström


Alberto Ganesh Barbati

2008-08-01 01:56:11 UTC

I understand one term of the comparison, but not the other. Could you
please post some code (or pseudocode) showing the "other" approach using
RTTI that you are considering? You know, there are many ways to use RTTI
and if you want to get a proper answer it's essential that you are clear
about which is the approach you are talking about. Moreover, it might
also help to know the use case you are facing...

Ganesh


blargg

2008-08-01 09:11:03 UTC

Post by Sebastian Karlsson
enum Type { blah, etc };
class Base
{
public:
virtual Type GetType() const = 0;
};

If speed is most important, you could even store the type as a field in
the object, making GetType() a single inline memory fetch and nothing
more:

class Base {
Type const type;
public:
Base( Type t ) : type( t ) { }
Type GetType() const { return type; }
};

If you're comparing to something like

class Base2 {
public:
virtual ~Base2() { }
std::type_info const& GetType() const { return typeid (*this); }
};

then on most compilers I'd expect two memory fetches to get the type: one to
read the vtable pointer, and another to get the pointer to the type_info
object. A smart compiler could perhaps store the type_info object in the
vtable, perhaps at a negative offset, eliminating one of the memory
fetches.
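
To make the comparison concrete, here is a minimal sketch of the two
approaches side by side. The class names (TaggedBase, RttiBase and the
PhysicsActor variants) and the enumerator names are made up for
illustration; the fetch counts discussed above are what one would
typically expect, not something this code measures.

```cpp
#include <cassert>
#include <typeinfo>

// Hand-rolled tag: the type lives in a plain data member, so GetType()
// is a non-virtual inline read of that member.
enum Type { kBase, kPhysicsActor };   // hypothetical tag values

class TaggedBase {
    Type const type_;
public:
    explicit TaggedBase(Type t) : type_(t) {}
    Type GetType() const { return type_; }
};

class TaggedPhysicsActor : public TaggedBase {
public:
    TaggedPhysicsActor() : TaggedBase(kPhysicsActor) {}
};

// RTTI: the class must be polymorphic (here via the virtual destructor)
// for typeid(*this) to report the dynamic type.
class RttiBase {
public:
    virtual ~RttiBase() {}
    std::type_info const& GetType() const { return typeid(*this); }
};

class RttiPhysicsActor : public RttiBase {};

void demo_tag_vs_rtti() {
    TaggedPhysicsActor tagged;
    TaggedBase const& tb = tagged;
    assert(tb.GetType() == kPhysicsActor);

    RttiPhysicsActor actor;
    RttiBase const& rb = actor;
    assert(rb.GetType() == typeid(RttiPhysicsActor));
    assert(rb.GetType() != typeid(RttiBase));
}
```

The tag version costs one extra member per object and relies on every
constructor passing the right value; the RTTI version needs no
cooperation from derived classes.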


Mathias Gaunard

2008-08-02 00:21:05 UTC

Post by Sebastian Karlsson
The problem is that the typeinfo object needs to be
constructed and returned, and it seems way more costly than a simple
enum.

That's incorrect.
std::type_info is never constructed at runtime. typeid() only returns
a reference to a std::type_info that is in static memory.
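
A small sketch of what that means in code, assuming C++11 or later for
the static_assert and using made-up Component / PhysicsActor classes:
user code can only bind references to type_info objects that the
implementation already provides, because the class is not copyable.

```cpp
#include <cassert>
#include <type_traits>
#include <typeinfo>

struct Component { virtual ~Component() {} };   // hypothetical base class
struct PhysicsActor : Component {};             // hypothetical derived class

// std::type_info cannot be copied, so a program never creates its own
// type_info objects; it only refers to existing ones.
static_assert(!std::is_copy_constructible<std::type_info>::value,
              "type_info is not copyable");

bool is_physics_actor(Component const& c) {
    std::type_info const& ti = typeid(c);   // binds a reference, no copy is made
    return ti == typeid(PhysicsActor);
}

void demo_typeid_reference() {
    PhysicsActor actor;
    Component plain;
    assert(is_physics_actor(actor));
    assert(!is_physics_actor(plain));
}
```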

Moreover, RTTI has tables to allow hierarchical type identification.
(If you have a class hierarchy A : B : C, and you have an A object
accessed through a pointer/reference to C, you can identify that your
object is actually a valid B)
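
As a sketch of that hierarchical lookup, with the A : B : C hierarchy
from the paragraph above plus a made-up Unrelated class for contrast:

```cpp
#include <cassert>

struct C { virtual ~C() {} };
struct B : C {};
struct A : B {};
struct Unrelated : C {};   // hypothetical sibling branch, not a B

// dynamic_cast walks the RTTI hierarchy data to decide whether the
// object that p points to has a B subobject.
bool is_a_B(C const* p) {
    return dynamic_cast<B const*>(p) != nullptr;
}

void demo_hierarchy() {
    A a;
    Unrelated u;
    C c;
    assert(is_a_B(&a));    // an A is also a B
    assert(!is_a_B(&u));   // shares only the C base
    assert(!is_a_B(&c));   // a plain C is not a B
}
```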


Sebastian Karlsson

2008-08-02 16:58:23 UTC

Post by Mathias Gaunard
That's incorrect.
std::type_info is never constructed at runtime. typeid() only returns
a reference to a std::type_info that is in static memory.

Thanks, this is what I missed. I'll go for RTTI then.

My use of RTTI would mostly be of the type:

// component is exactly a PhysicsActor, so it is safe to downcast.
if( typeid( component ) == typeid( PhysicsActor ) )

And some sparse use of dynamic_cast where appropriate. The games
development community has a very strong dislike for RTTI, which, going
by this information, seems pretty unfounded. For the above usage it
should in practice perform very much like a hand-crafted RTTI system,
if I'm not mistaken.


Seungbeom Kim

2008-08-03 09:02:51 UTC

Post by Sebastian Karlsson
// component is exactly a PhysicsActor, so it is safe to downcast.
if( typeid( component ) == typeid( PhysicsActor ) )
And some sparse use of dynamic_cast where appropriate.

Note that if component is an object of a type publicly derived from
PhysicsActor, it still "is-a" PhysicsActor, and dynamic_cast will
succeed, but the above typeid equality test will not. This is why
dynamic_cast is usually preferred to a typeid equality test.
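
A minimal sketch of the difference, where RagdollActor is a made-up
class publicly derived from PhysicsActor and Component is an assumed
polymorphic base:

```cpp
#include <cassert>
#include <typeinfo>

struct Component { virtual ~Component() {} };   // assumed polymorphic base
struct PhysicsActor : Component {};
struct RagdollActor : PhysicsActor {};          // hypothetical derived type

// True only when the dynamic type is exactly PhysicsActor.
bool exact_match(Component const& c) {
    return typeid(c) == typeid(PhysicsActor);
}

// True for PhysicsActor and for anything publicly derived from it.
bool is_a_match(Component const& c) {
    return dynamic_cast<PhysicsActor const*>(&c) != nullptr;
}

void demo_typeid_vs_dynamic_cast() {
    PhysicsActor actor;
    RagdollActor ragdoll;
    assert(exact_match(actor) && is_a_match(actor));
    assert(!exact_match(ragdoll));   // typeid equality rejects the derived type
    assert(is_a_match(ragdoll));     // dynamic_cast accepts it
}
```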

--
Seungbeom Kim


Mathias Gaunard

2008-08-03 20:16:49 UTC

Post by Sebastian Karlsson
The games
development community has a very strong dislike for RTTI, which, going
by this information, seems pretty unfounded.

From my personal experience, C++ code within the game development
community is fairly ugly, hacky and C-ish.
That aside, there are real issues with RTTI:
- It takes up some memory, as the lookup tables need to be stored
somewhere. Current compilers are still unable to remove unused extern
global variables, due to the separate compilation system.
- typeid(SomeType) is not a compile-time constant, preventing
constant-time visitation [1], which is possible with ints. Visitation
has to be done through virtual functions, which are intrusive and
cannot be templates. That is mainly due to the fact that the type
identifier is an address in static memory, which is not resolved
before link-time.

[1] A possible way would be to mangle the type names of the visited
possibilities at compile-time and build an automaton to identify the
result of std::type_info::name in constant time. That would still be
less efficient, of course.
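
To illustrate the second point, here is a sketch contrasting the two
dispatch styles. The Shape, Circle and Square classes and the
enumerators are hypothetical; whether the switch actually compiles to a
jump table depends on the compiler.

```cpp
#include <cassert>
#include <typeinfo>

enum Type { kCircle, kSquare };   // hypothetical integer type ids

struct Shape {
    virtual ~Shape() {}
    virtual Type GetType() const = 0;
};
struct Circle : Shape { Type GetType() const { return kCircle; } };
struct Square : Shape { Type GetType() const { return kSquare; } };

// Enumerators are integral constant expressions, so they can be case
// labels and the compiler is free to emit a jump table.
int sides_by_tag(Shape const& s) {
    switch (s.GetType()) {
        case kCircle: return 0;
        case kSquare: return 4;
    }
    return -1;
}

// typeid yields a reference to a std::type_info object, not an integral
// constant, so it cannot be a case label; dispatch becomes a chain of
// comparisons whose length grows with the number of types.
int sides_by_typeid(Shape const& s) {
    if (typeid(s) == typeid(Circle)) return 0;
    if (typeid(s) == typeid(Square)) return 4;
    return -1;
}

void demo_dispatch() {
    Circle c;
    Square q;
    assert(sides_by_tag(c) == 0 && sides_by_typeid(c) == 0);
    assert(sides_by_tag(q) == 4 && sides_by_typeid(q) == 4);
}
```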


Alberto Ganesh Barbati

2008-08-04 03:21:37 UTC

Post by Mathias Gaunard
From my personal experience, C++ code within the game development
community is fairly ugly, hacky and C-ish.

Sooo true! Sigh...

Post by Mathias Gaunard
- It takes up some memory, as the lookup tables need to be stored
somewhere. Current compilers are still unable to remove unused extern
global variables, due to the separate compilation system.

Actually a lot of current compilers *are* able to do that, including two
that are widely used in the game development industry: VC++ and
CodeWarrior. BTW, both of them have link-time code generation, so they
are no longer pure separate compilation systems.

Anyway, RTTI tables for polymorphic types cannot be stripped from the
final executable in most cases, because the compiler is not able to
detect that they won't be needed at runtime.

Ganesh


Mathias Gaunard

2008-08-04 21:58:30 UTC

Post by Alberto Ganesh Barbati
Anyway, RTTI tables for polymorphic types cannot be stripped from the
final executable in most cases, because the compiler is not able to
detect that they won't be needed at runtime.

That's because even with link-time code generation, some symbols are
still unresolved (the ones from linked libraries and system calls).
And those symbols usually do not expose the global variables they
access. A shame.

The worst is that even with this code:

struct Foo
{
int value;

Foo(int i) : value(i) {}
virtual ~Foo() {}
};

void do_something(const int&);

int main()
{
Foo foo(42);
do_something(foo.value);
}

The RTTI information is not removed, at least with LLVM.


Kevin Frey

2008-08-05 05:12:38 UTC

What happens when using RTTI across a shared module boundary such as a DLL?
It seems to me that RTTI will be broken in that regard, and require special
handling, even IF the classes in question (the main executable and the DLL)
are in fact the same class.


Mathias Gaunard

2008-08-05 16:33:23 UTC

Post by Kevin Frey
What happens when using RTTI across a shared module boundary such as a DLL?

Nothing special.
Using RTTI is like accessing a global variable, whose address is the
first word of the object.

Post by Kevin Frey
It seems to me that RTTI will be broken in that regard

There is no problem at all.

I don't really see how that is related to the message you replied to
though.


Eugene Gershnik

2008-08-05 21:28:25 UTC

Post by Kevin Frey
What happens when using RTTI across a shared module boundary such as a DLL?

Depends on the compiler.

Post by Kevin Frey
It seems to me that RTTI will be broken in that regard, and require special
handling, even IF the classes in question (the main executable and the DLL)
are in fact the same class.

MSVC, for example, handles this situation but at the cost of *always*
using string comparisons of the mangled class names to check for type
'sameness'. This slows RTTI down considerably and makes it perform
worse than manual type identifiers.
IIRC GCC uses plain pointer comparisons, so on this compiler RTTI won't
work reliably across shared libraries. OTOH such an implementation is
probably as fast as a manual scheme.

--
Eugene


Mathias Gaunard

2008-08-06 14:28:24 UTC

Post by Eugene Gershnik
IIRC GCC uses plain pointer comparisons, so on this compiler RTTI won't
work reliably across shared libraries. OTOH such an implementation is
probably as fast as a manual scheme.

I don't know much about shared libraries/DLLs, but why is it a
problem?


Pete Becker

2008-08-06 19:58:25 UTC

Post by Mathias Gaunard

I don't know much about shared libraries/DLLs, but why is it a
problem?

Sometimes the compiler generates separate RTTI objects in different
DLLs for the same type. If the mechanism for comparing RTTI objects
hasn't been designed properly, this situation can fool it, i.e. it's a
GCC bug.
--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference"
(www.petebecker.com/tr1book)


Mathias Gaunard

2008-08-07 10:56:58 UTC

Post by Pete Becker

Post by Mathias Gaunard

I don't know much about shared libraries/DLLs, but why is it a
problem?

Sometimes the compiler generates separate RTTI objects in different
DLLs for the same type.

I see.
Simple fix: don't compare instances of types that you want to identify
at runtime that weren't created in the same DLL.
That doesn't seem like a very serious limitation to me.


Pete Becker

2008-08-07 15:58:29 UTC

Post by Mathias Gaunard

Post by Pete Becker

Post by Mathias Gaunard

I don't know much about shared libraries/DLLs, but why is it a
problem?

Sometimes the compiler generates separate RTTI objects in different
DLLs for the same type.

I see.
Simple fix: don't compare instances of types that you want to identify
at runtime that weren't created in the same DLL.
That doesn't seem like a very serious limitation to me.

Well, it is. <g> For example, catching an exception with something other
than ... depends on being able to identify its type. But again: there's
no inherent reason this can't be done right.
--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference"
(www.petebecker.com/tr1book)


Eugene Gershnik

2008-08-06 22:58:26 UTC

Post by Mathias Gaunard

I don't know much about shared libraries/DLLs, but why is it a
problem?

Imagine how you as a compiler/linker author would manage type_info
objects. Once you see a class that requires one, you need to generate a
type_info object and make typeid return it. The final 'module' (i.e.
either application or shared library) should get exactly one copy of
type_info for each class that requires it. Inside this module
everything is great. Two types can be checked for sameness by simply
comparing the addresses of their type_info objects. This happens inside
dynamic_cast, catch() clauses or type_info::operator==.

Now consider a class that has identical copies in two modules (EXE and
DLL or two DLLs). This could happen if the class is defined in a
shared header file or if both modules link with the same static lib.
If this class needs type_info in both modules, they will each end up
with a separate copy of it. You can probably already see where this
leads. If an object of this class is created in one module, then passed
to another, and typeid/dynamic_cast/catch is applied there, the
type_info address it gets from the object will not match the one the
module associates with the class.

What happens next is up to the compiler/linker/loader. As I said in my
previous post some may choose to handle this situation and some may
not. When they don't there are all sorts of weird things that may
happen. The relatively well-known one is that an exception thrown from
one shared library may not be caught in another.
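
The two comparison strategies mentioned in this thread can be sketched
as follows. The function names are made up, and this is only an
illustration of the idea, not what any particular compiler's runtime
literally does.

```cpp
#include <cassert>
#include <cstring>
#include <typeinfo>

// Strategy 1: compare addresses. Cheap, but only correct if every type
// has exactly one type_info object in the whole process, which is the
// assumption that duplicated copies in separate modules break.
bool same_type_by_address(std::type_info const& a, std::type_info const& b) {
    return &a == &b;
}

// Strategy 2: compare the implementation-defined names. Slower, but two
// separate type_info copies for the same class still compare equal as
// long as both modules produce the same name for it.
bool same_type_by_name(std::type_info const& a, std::type_info const& b) {
    return std::strcmp(a.name(), b.name()) == 0;
}

void demo_compare_by_name() {
    assert(same_type_by_name(typeid(int), typeid(int)));
    assert(!same_type_by_name(typeid(int), typeid(double)));
}
```

Within a single module both strategies normally agree; the difference
only shows up when an object crosses a module boundary.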

This situation can theoretically be dismissed as a simple violation of
ODR but this isn't very helpful. Shared libraries that use common data
structures have to violate ODR almost by definition.

--
Eugene
