
Wednesday, 1 June 2016

MCTS - Prologue - Lions

Once upon a time, in a land far far away (Guildford) there was a game of myth and magic. The destiny of a great studio rested on the shoulders of few (hundred) developers.

Known by many titles ... artist, programmer, chicken chaser, bastard … collectively they went by the moniker ‘Lions’. Valiantly battling the demons of bugs and deadlines as they sought that elusive grail of fun.

But alas, they succumbed to the fate that oft befalls many a heroic developer. A game shelved, a team scattered, their Fable ended. Only to be recounted many times across the watering holes of this land we call the internet.

As their Legends slip into the sands of time, so it rests upon your humble scribe to chronicle some of his adventures whilst exploring new lands with those brave few.

A journey into the depths of Monte Carlo, lost alone in a forest of trees (oh so many trees!), searching for a way out.

Tales of triumph and failure, to inspire and to guide … or perhaps, to serve as a warning for the next foolhardy traveller who is thinking of taking this less travelled road!

Tuesday, 26 January 2016

Reflecting with Coffee (Part 4)

The journey so far ... Part 1 - all about the wonderful things we get with reflection,  Part 2 - what a type introspection runtime might look like and Part 3 - extracting the symbols from our source code with clang.

The final step is to connect the two together and fill in a few details. The reflector code is deliberately simple; the script does all the processing work. It is relatively straightforward and does not really require much explanation. Its job is to take the JSON output from reflector and fill in the TypeInfoImpl::Create() specialisation.

Any scripting language would do here. In the past I probably would have reached for Perl, but my language du jour is CoffeeScript. There are various reasons for this but feel free to insert your language of choice here :) Preferably one with a handy template engine; I happen to be using doT.js.

There are a few extra bits and bobs I added that may be worth a mention. For convenience, fundamental types (floats, ints, etc.) get "type converters" for converting to and from strings. Nothing particularly exciting.

Container types are more fun. If a class field is a container (say std::vector) then we need a way to iterate over it when saving, or to insert into it when loading it back in. As this is abstract metadata kind of stuff, we need to provide an abstract runtime interface for it.

This means that in the script we need to identify the container type and write the appropriate iterator and inserter for the field. I'm using std::function to wrap up the implementation in an abstract callable interface.
    
struct GWRTTI_API Field
{
    // returns a callable that steps through the container for a given object
    std::function< std::pair<void*,const TypeInfo*>() > (*Iterator)( void* );

    // returns a callable that appends items (the int is the expected count)
    std::function< bool(void*) >                        (*Inserter)( void*, int );
};

The iterator returns a pointer to the next item along with a type, both are null when we are at the end. So in the case of a std::vector we can implement it with a funky little mutable lambda.
    
fields[ 2 ].Iterator = []( void* o ) -> std::function< std::pair< void*,const TypeInfo* >() >
{
    auto obj = reinterpret_cast<GameObject*>( o );
    auto itr = std::begin( obj->Components );
    auto end = std::end( obj->Components );

    return [=]() mutable -> std::pair< void*,const TypeInfo* >
    {
        if( itr == end ) return std::make_pair( nullptr, nullptr );
        auto cur = *itr++;
        return std::make_pair( cur, cur->GetType() );
    };
};
This slightly obscure bit of code is a function that returns a std::function, which is implemented by the lambda. The lambda holds, and can change (hence mutable), the iterators for the container. Gotta love C++11 :)

Inserters are slightly easier. I pass in the size merely so I can pre-allocate the space and the function returned provides the interface for adding to the collection.
fields[ 2 ].Inserter = []( void* o, int size ) -> std::function< bool(void*) >
{
    auto obj = reinterpret_cast<GameObject*>( o );

    if( size > 0 )
    {
        obj->Components.reserve( obj->Components.size() + size );
    }

    return [=]( void* i ) -> bool
    {
        obj->Components.push_back( reinterpret_cast< Component* >( i ) );
        return true;
    };
};
There is no type checking; the assumption is that this is done by the serialiser. Of course we would need to implement iterators and inserters for all our collection types, but this suffices for the proof of concept code.




Sunday, 17 January 2016

Clangerizing (Part 3)

After a long and tedious disquisition on the specifics of implementing an RTTI system, we get to the fun (!?) stuff ... generating the data, or clangerizing as I am calling it.

This is a two-phase process:
  1. Using clang to parse the source code and extract symbols
  2. Running a script over the output to generate the C++ code
As I often change my mind about the RTTI implementation, having the second part as a script makes it easier to pander to my whimsical nature.

For the clang bit, we are going to write a recursive AST visitor. There are plenty of examples of how to do this online already but one more won't hurt :) The sample code for this post is on github.

Clang is awesome: modular, and it makes building tools that plug into the underlying libraries easy. Getting it to compile, on the other hand, not so much.

Once you have compiled all the source, the easiest way to get started is to copy and modify an existing tool from the llvm/tools/clang/tools folder. You'll also need to modify the CMakeLists.txt in the folder above and regenerate the make files to add it to the list of things to be built.

Pro-tip! The make files with LLVM/Clang have a "fast" option if you just want to build a single project. For example:
make reflector/fast
So to the code ...

Getting clang to do its thing is just a case of parsing the command line and creating a ClangTool instance with a FrontEnd action that gets invoked for every file to be processed.
int main( int argc, const char* argv[] )
{
    // file to parse
    std::vector< std::string > files;
    files.push_back( argv[1] );

    // compilation options
    auto options = FixedCompilationDatabase::loadFromCommandLine( argc, argv );

    // run tool        
    ClangTool Tool( *options, files );
    return Tool.run( newFrontendActionFactory< ReflectFrontendAction >().get() );
}

The FixedCompilationDatabase is just a complicated way of saying "read my compilation options from the command line" (defines, include paths, etc.). Normally clang looks for these in a corresponding JSON file, but if you are not using cmake then I find this an easier way to integrate it into my build environment.

Note that the compilation options come after a "--" on the command line. So invoking your tool would look something like this ...
reflector myfile.cpp -- -Wall -Isome/path -DDEBUG=1

Our front end action creates an Abstract Syntax Tree consumer, which unsurprisingly is an interface onto the syntax tree of our source code. There are various overridable functions, but for our purposes we are going to find the top level tags (structs, classes, unions or enums) and pass off the rest of the work to our ASTVisitor, which recursively walks the underlying symbols.
   
class ReflectASTConsumer : public ASTConsumer
{
    public:

        virtual bool HandleTopLevelDecl( DeclGroupRef group ) override
        {
            // for each declaration

            for( auto itr = group.begin(); itr != group.end(); ++itr )
            {
                // if it is a "Tag" (class, enum, etc)

                if( auto decl = dyn_cast<TagDecl>( *itr ) )
                {
                    // traverse it!

                    mVisitor.TraverseDecl( decl );
                }
            }

            return true;
        }

        virtual void Initialize( ASTContext& Context ) override
        {
            // keep hold of context as we need it for getCommentForDecl
            gContext = &Context;
        }


    protected:

        ReflectVisitor  mVisitor;
};

class ReflectFrontendAction : public ASTFrontendAction
{
    public:

        virtual std::unique_ptr< ASTConsumer >
        CreateASTConsumer( CompilerInstance& CI, StringRef file ) override
        {
            return make_unique< ReflectASTConsumer >();
        }
};

The visitor class is similar. We implement the required function and use the API to get the things we want to write out ...
    
class ReflectVisitor : public RecursiveASTVisitor< ReflectVisitor >
{
    public:

        // when we "visit" a record declaration (struct, class or union) ...

        bool VisitCXXRecordDecl( CXXRecordDecl* decl )
        {
            // get type and name

            std::string type = decl->getKindName().str();
            std::string name = decl->getQualifiedNameAsString();

            // get base classes

            for( auto& base : decl->bases() )
            {
                std::string type = base.getType().getAsString();
            }

            // get fields

            for( const auto& field : decl->fields() )
            {
                std::string name = field->getName().str();
                std::string type = field->getType().getAsString( pp );
            }

            return true;
        }
};

That is pretty much all there is to walking our code and reading the declarations. However, there is one final step we need to add before we are done.

Our reflection library requires us to be able to add additional annotations to the source code. Things that are not present in the AST but required for validation or context, for example ranges of values ...
    
/// min=0, max=100
float someField;
Conveniently, clang provides a getCommentForDecl function for just this kind of purpose. This returns any "doxygen style" comment immediately above the declaration (or above its parent in the case of classes; inherited comments are very convenient![1]).

FYI these are special comment blocks that start with three '/' or two '*', i.e. look like this ...
    
/// this is a special comment
or
    
/**
 * so is this ...
 */

NB: in the case of my reflection code, I look for an additional %% to mean a special "reflection instruction".

Comments take a little bit of work to unpack from clang ...

std::string GetComment( TagDecl* decl )
{
    std::string str;

    auto comment = gContext->getCommentForDecl( decl, nullptr );

    if( comment == nullptr )
    {
        return str;
    }

    for( auto commentItr = comment->child_begin();
         commentItr != comment->child_end();
         ++commentItr )
    {
        auto commentSection = *commentItr;

        // we only care about the paragraph blocks

        if( commentSection->getCommentKind() != BlockContentComment::ParagraphCommentKind )
        {
            continue;
        }

        // concatenate the text runs within the paragraph

        for( auto textItr = commentSection->child_begin();
             textItr != commentSection->child_end();
             ++textItr )
        {
            if( auto textComment = dyn_cast<TextComment>( *textItr ) )
            {
                str += textComment->getText().str();
            }
        }
    }

    return str;
}


And ... huzzah! That is all there is to it :)

Clang handles all the C++ complexities for us and provides a function to get additional "meta data". So it is relatively straightforward to pull out the information we need. The next step is to connect the output from reflector to our RTTI data model, coming up in part 4.

The full source code is on github along with some sample output.




[1] Alternatively you could use __attribute__(( annotate(...) )); whilst not supported in MSVC, it could easily be defined out there and only read by the parser, if you wanted to use macros instead of comments.

Monday, 11 January 2016

Reflections on Introspection (Part 2)

Following on from part 1, the idea is to create a run-time type system using clang to auto-generate the content.

But before we can get onto all the fun stuff, what does the run-time type data look like anyway? If you want to play along at home, I have thrown the code up on github.

One of the things we want to do is automatic serialisation, because writing load and save code is just dull. For example, given some data we want to walk the structure and convert it to JSON. So given a class, we need a name, description of its fields, types and so forth. Something like this ...
    struct TypeInfo
    {
        // name
        // list of fields
        // list of base classes
        // any custom attributes
    };
For starters, we should probably be able to grab the type data for any class. C++ gives us the typeid() operator for static types. This is a nice feature, so let's steal it for our own Type() function :)
    Thingamajig thingy;

    TypeInfo* type = Type< Thingamajig >();
    TypeInfo* type = Type( &thingy );
Which we can implement with templates, deriving a TypeInfoImpl from TypeInfo to hold the data for each specific type.
    template< typename T >
    struct TypeInfoImpl : public TypeInfo
    {
        static const TypeInfo* GetType()
        {
            static TypeInfo info;
            return &info;
        }
    };
And then our Type functions becomes ...
    template< typename T > inline const TypeInfo* Type()
    {
        return TypeInfoImpl::GetType();
    }

    template< typename T > inline const TypeInfo* Type( const T* )
    {
        return TypeInfoImpl::GetType();
    }
Which is quite neat; with just this we can start working with (static) types by getting their TypeInfo and comparing the pointers.
    float f;

    if( Type( &f ) == Type<float>() )
    {
         // do some things here
    }
Of course we want to fill out TypeInfo with all the details, which is where our automagical clang tool comes into play, but more on that later.

This we will implement with a “Create” function that we specialise for each type and call during instantiation.
    template<> void TypeInfoImpl< Vector3 >::Create()
    {
         // do the things
    }
Sounds easy enough, just a couple of niggly little details to make it work nicely for multithreaded code and windows DLL’s.

Even though we will only be calling Create() during instantiation, there is still a possibility that two threads could end up inside here at the same time causing “bad things” to happen.

We could guard the code with mutexes but that seems a bit wasteful. It is much easier to ensure everything is set up during the global construction phase (guaranteed thread-safe, whereas scoped static initialisation is not).

This also allows us to separate the acquisition of the TypeInfo pointer from its instantiation. This might not be an obvious benefit but may become a little clearer when we consider dynamically loading types from DLL's.

So ... mix in a little auto-registration
    template< typename T >
    struct Register
    {
        Register()
        {
            const TypeInfo* info = TypeInfoImpl<T>::GetType();
            reinterpret_cast< TypeInfoImpl<T>* >( info )->Create();
        }
    };
And wrap it up in a macro for convenience.
    #define REGISTER_CONCAT(a,b) a##b
    #define REGISTER_CREATENAME(c) REGISTER_CONCAT( __rtti_, c )
    #define REGISTER(T) \
        static Register<T> REGISTER_CREATENAME( __COUNTER__ );
I’m using __COUNTER__ here to create a unique name for our registrant; type names can contain funky characters, so we can't simply append the type name[1].

Then so long as we register our types in advance it all works.

Shared libraries are slightly more problematic.

Windows DLL's get their own copy of global and static variables. This means that unless we attach __declspec( dllexport ) / __declspec( dllimport ) as appropriate to our symbols, they are statically linked into each module individually, i.e. those static TypeInfo's in the GetType() function are unique to each module.

Right now Type<float>() will return a different pointer depending on which module called it. The standard C++ RTTI has the same issue, which is why std::type_info has the hash_code() function to compare types.

We could add __declspec( dllexport ) to the TypeInfoImpl template, which will work so long as we are happy linking to and declaring all our types in the same module. However, on a large code base it is probably preferable to be able to define types in multiple modules and load them dynamically without any linkage shenanigans.

This means I could write a serialisation function in one module that can read or write any type from any module, without actually linking to the implementation or including the headers. This is the power of reflection!

The solution here is to create all the TypeInfo's in one place during registration and for GetType() to look up the type by name and hold the pointer, rather than the TypeInfo itself. For loading and script binding we will need to keep a registry of types we can look up by name anyway so this ties in nicely.

Something like this ...
    __declspec( dllexport ) const TypeInfo* FindOrCreate( const char* );

    template< typename T >
    struct TypeInfoImpl : public TypeInfo
    {
        // either find or create the metadata entry based on its type name
        static const TypeInfo* GetType()
        {
            static const TypeInfo* info = FindOrCreate( typeid(T).name() );
            return info;
        }

        // we specialise Create() per type and call it once from 
        // the module that registers it
        void Create();
    };
But hang on … doesn’t typeid(T).name() require the standard RTTI to be enabled? Yes it does. But didn’t you say most games turn off RTTI because games programmers are paranoid control freaks? Well that sounds like something I would say.

Every game I have worked on has had RTTI disabled. Maybe the reasoning for turning it off does not really hold any more but old habits die hard so let us assume we cannot rely on it being available. This means we need another way of identifying the type.

I have seen some implementations "declare" types in headers to solve this problem, which is OK I suppose but I am getting old and having more things to remember is too much for my little brain. I like the elegance of it "just working".

A neat little trick is to use __PRETTY_FUNCTION__ (or __FUNCSIG__ if we are using MSVC). This expands to the function signature as a string, including the template arguments. It looks something like this ...
    "const struct TypeInfo *__cdecl TypeInfoImpl<struct Vector3>::GetType(void)"
So we can use this to create a unique identifier for the type.
    static const TypeInfo* GetType()
    {
        #ifdef  _MSC_VER
            static const TypeInfo* info = FindOrCreate( __FUNCSIG__ );
        #else
            static const TypeInfo* info = FindOrCreate( __PRETTY_FUNCTION__ );
        #endif

        return info;
    }
So that is the basic framework for our registry of types that supports dynamic loading and multithreading.

Working with run-time types (as opposed to the static compile time types so far) is just a case of adding a virtual function to each class that requires RTTI.
    virtual const TypeInfo* GetType() const { return Type( this ); }
Simples.

All that remains is to fill in the TypeInfo struct, which our snazzy clang tool will take care of for us, but you can take a sneak peek at the end result here.

The final thing worth some discussion is offsetof().

Part of type introspection is looking up member variables of a class by name and getting pointers to them. A typical implementation uses the offsetof() macro.

This works something like this ...
    #define offsetof( Type, Member ) ((size_t)(&((Type*)nullptr)->Member))

    struct MyThingy
    {
         float someField;
    };

    size_t offset = offsetof( MyThingy, someField );

    MyThingy thingy;
    float* pSomeField = (float*) ( (char*)(&thingy) + offset );
The idea is that we store the offset in bytes for a given member variable alongside the name, so at run-time we can work out the actual address given an instance of that class. This works for plain old data structures and probably most cases we care about.

However, this falls over for more complicated data layouts (virtual inheritance) and is dubious with multiple inheritance.

As we are going to generate the TypeInfo data anyway we can do better by adding a getter function that will work with virtual inheritance. I also have the vague notion that we could override this with some kind of per-type customisation in the future, which will probably never happen but humour me :)

So for fields, we add a getter function ...
    struct Field
    {
        const char*     Name;
        const TypeInfo* Type;
        void* (*Get)( void* ); // getter function (takes pointer to object)
    };
And we can generate a custom lambda to fill it in.
    fields[ 2 ].Get = []( void* o ) -> void* { return &reinterpret_cast<Vector3*>( o )->z; };
I have missed out a few details but hopefully this covers the important points.


[1] Just a little aside about the registrant pattern in C++. If the registrant classes are included directly in an exe or DLL then they will work fine. It is possible for them to be missed out if they are contained in a static library. Static libraries are really just a collection of object files, linkers only pull in an object file if a symbol from that file is referenced, but that is outside the scope of this article ;)

Friday, 1 January 2016

Reflection on Reflections (Part 1)

The holidays, always a good time for reflection ... and being a programmer, reflection of course can only possibly mean "the ability of a program to examine the type or properties of an object at runtime".

Yep, this is hardly breaking new ground but I have to amuse myself somehow. So why bother with reflection? Well it gives us some cool things, such as ...

  • Automatic serialisation
Who likes writing loading and saving code? Or packaging data for a network stream? Surely there is more interesting code to write.
  • Tool and script integration
Also, adding script bindings for functions or populating editor controls is hardly a scintillating task. It would be nice if our classes integrated automatically with the rest of the game systems.
  • Attaching additional attributes to variables
And for that matter, validating input or adding meta data (like ranges for values) should be at least somewhat streamlined.

In short, type introspection simplifies a lot of tedious and error-prone stuff. And I for one am in favour of all things that mean I have to work less. Although somehow I always seem to spend more effort working less, go figure.

Now, if you are writing your game in a managed language (hello Unity) then you are probably wondering what all the fuss is about ... um, good point! I suggest you stop reading the rest of this drivel and actually go make yourself a game ...

Still here? OK, then let's go ahead and build this mousetrap! Down in C++ land we don't get much for free, but I guess we knew that already. Standard C++ gives us some limited RTTI if we actually turn it on, but where would the fun be in that?

To get to this fabled land where things happen automagically, we need a data model. A bunch of meta-data that describes our types, names, fields, attributes and what have you.

The question is how do we create it.

We could hard code it all, but that sounds like "work". We can do funky things with templates, but this only gets us part of the way there (via a chunk of incomprehensible code). A separate "description file" is just one more thing to manage, and I won't even comment on the idea of trying to scrape symbols out of PDB files.

As I have already gone to the trouble of writing all that lovely (?) code in the first place, lets use that and whip up some meta-data as a pre-build step.

The usual approach here is to pop some special macros above various symbols that the tool picks up but the compiler ignores (à la mode de Unreal).

UPROPERTY(min=10, max=100)
float ThisIsMySpecialFloat;

Apart from a few ugly macros it seems solid, but writing a tool to parse C++ properly, taking into account the language, pre-processor shenanigans and the like, is not a simple task.

Fortunately, the clang guys and gals have already done all the hard work for me and gone to the trouble of wrapping it up into a convenient library to boot, which was nice :)

So that's the idea. Over the next couple of days, throw together some prototype code that utilises clang to do the heavy lifting. Then I can sit back and bask in the reflected glow of glorious meta-data ... and actually get on with the project I was meant to be working on before this sidetracked me!

Progress, after all, is made by lazy men trying to find an easier way to do something ... or something, I digress.

Tuesday, 30 October 2012

Day 22 – Quantified Self

It has been a few days and I have some Zeo data to share.

It helped me almost immediately identify a problem with my core sleep. I had assumed the reason I felt so rough in the mornings was part of the adaptation process; however, the Zeo showed that I was trying to wake up during deep sleep. Digging back through my sleep log helped me identify my ultradian rhythm and move the core sleep back into alignment with my body clock. The next day I felt perfectly fine.



The graphs show that I am getting between 60-80 minutes of both REM and SWS during my core sleeping hours (see below). The body requires 90 minutes of each and I appear to be making the remainder up during the nap time. This demonstrates that not only has my body adapted (I have a compressed sleep cycle that minimises light sleep) but that I am also getting a healthy amount of the right kind of sleep.

Unfortunately the Zeo does not upload the data for my naps, so I have to check this manually after each one.

By sheer serendipity, I was pointed towards a “Quantified Self” (QS) group last week, which meets at the Google campus in Shoreditch. I had not heard the term before, but it turns out there are other strange people in the world who share my obsession with measuring and experimenting with life.

Encouraged by some of the talks I have started using fitbit to track my activity levels, food that I eat, weight and so on. Interestingly, the Fitbit pedometer can be used to monitor your sleep quality by strapping it to your wrist. The body makes different movements during different phases of sleep. I am sceptical as to how accurate this is but I may try it and compare it against the Zeo data just to see.

As an aside, it occurred to me that the evidence for a technological singularity would be observable, and could even be brought about, through QS culture. Although I suspect the transhumanists will be disappointed for some time to come. Just a thought.

Now I am left wondering what other parts of my life I can instrument ;)

h+

Wednesday, 24 October 2012

Day 16 – Hello Zeo

As you may or may not know, sleep is broken down into five different phases but it is the last two that your body needs – SWS (slow wave sleep where your body attends to your immune system) and REM sleep (where your mind organises your memories). The others are not required.
You need about 90 minutes of both SWS and REM to be fit and healthy. The idea behind poly-sleep is to optimise your time spent asleep to maximise SWS and REM and to minimise or cut out the others. The problem is how do you know if you are actually getting the right type of sleep or how you might change your sleeping patterns to get the maximum effect?
I have been keeping a “sleep log” (below) to try and keep a subjective track of my sleeping and alertness patterns. However, I have now got my hands on a Zeo. This is a neat little box that monitors your brainwaves as you sleep to tell you exactly what kind of sleep you are getting, and it will allow me to play with the nap times to see if I can improve the results. Regardless of whether or not you are attempting to poly-sleep, it seems like a useful exercise and plays nicely with my obsession with data and metrics.
Sleep Log - Day 16

Generally now I am in the swing of things. The Everyman (E3) schedule has been much less brutal than the Uberman, but at the same time I have not adapted as quickly and occasionally I fail to sleep during the nap time. By this time on the Uberman I was pretty much always sleeping during the naps.
After doing a bit of research, it seems I could have chosen my nap times better. Hopefully with the Zeo I can experiment a little with moving the nap times around and optimising my REM and SWS.