Sunday, September 14, 2008

return_by_smart_ptr policy for Boost Python

As part of my ongoing work to create some minimal Python bindings for Nebula3 I've implemented a new return value policy for Boost Python that's tailored for Nebula3 singletons. There were two existing potential candidates, but neither quite fit the job. One was the manage_new_object policy, which deletes the C++ object embedded in a Python object when the Python object is destroyed. That leads to problems like this:
>>> from pynebula3.Core import CoreServer
# C++ singleton object is created and embedded in a new Python object
>>> coreServer = CoreServer.Create()
>>> print coreServer
<pynebula3.Core.CoreServer object at 0x00AB1340>
# existing C++ singleton object is embedded in a new Python object
>>> coreServer2 = CoreServer.Instance()
>>> print coreServer2
<pynebula3.Core.CoreServer object at 0x00AB1378>
>>> print coreServer2.AppName
Nebula3
# the C++ singleton object embedded in the Python object is destroyed
>>> coreServer = None
# remaining Python object now contains a dangling pointer to the C++ singleton object
>>> print coreServer2
<pynebula3.Core.CoreServer object at 0x00AB1378>
>>> print coreServer2.AppName # crash!

The other candidate was the reference_existing_object policy, which is the one usually suggested for singletons. Under this policy the C++ object embedded in a Python object is not deleted when the Python object is destroyed; you have to delete the C++ object explicitly. The problem is that you should never delete a reference-counted Nebula object explicitly: you should use AddRef()/Release() to adjust the reference count, and the object is deleted automatically when the count hits zero. So you'd have to do something like this:

>>> from pynebula3.Core import CoreServer
>>> coreServer = CoreServer.Create()
>>> coreServer.AddRef()
>>> print coreServer.AppName
Nebula3
>>> coreServer.Release()
>>> coreServer = None
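For reference, the semantics the new policy relies on look roughly like this. This is only a self-contained sketch of an intrusive reference-counted pointer; Nebula3's actual Core::RefCounted and Ptr<T> classes are more elaborate, and the member names here are illustrative:

```cpp
#include <cassert>

// Toy stand-in for Nebula3's Core::RefCounted base class.
class RefCounted
{
public:
    RefCounted() : refCount(0) {}
    void AddRef() { ++this->refCount; }
    // the object deletes itself when the last reference disappears
    void Release() { if (0 == --this->refCount) delete this; }
    int GetRefCount() const { return this->refCount; }
protected:
    virtual ~RefCounted() {}
private:
    int refCount;
};

// Toy stand-in for Nebula3's Ptr<T> intrusive smart pointer: it calls
// AddRef() when it takes ownership and Release() when it lets go.
template <class T>
class Ptr
{
public:
    Ptr() : obj(0) {}
    explicit Ptr(T* p) : obj(p) { if (this->obj) this->obj->AddRef(); }
    Ptr(const Ptr& rhs) : obj(rhs.obj) { if (this->obj) this->obj->AddRef(); }
    ~Ptr() { if (this->obj) this->obj->Release(); }
    Ptr& operator=(const Ptr& rhs)
    {
        if (rhs.obj) rhs.obj->AddRef(); // AddRef first so self-assignment is safe
        if (this->obj) this->obj->Release();
        this->obj = rhs.obj;
        return *this;
    }
    T* operator->() const { return this->obj; }
private:
    T* obj;
};

class CoreServer : public RefCounted {};
```

Every Ptr<T> that goes out of scope calls Release(), so the object is destroyed exactly when the last reference to it disappears; that's the automatic cleanup the new policy gets by embedding a Ptr<T> in the Python object instead of a raw pointer.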

Two lines to create an object and two lines to destroy it? No thanks! And so I set out to create my own policy. Essentially what I wanted was the behavior of the manage_new_object policy, where the C++ object is deleted when the Python object is destroyed, but instead of embedding the C++ object itself I wanted it to embed a Nebula smart pointer to the C++ object. In fact manage_new_object already embeds smart pointers rather than plain pointers, they're just not Nebula smart pointers. So all I had to do was take the manage_new_object policy and make it use Nebula smart pointers. Here is the new policy:

// return_by_smart_ptr.h

#include <boost/python/detail/prefix.hpp>
#include <boost/python/detail/indirect_traits.hpp>
#include <boost/mpl/if.hpp>
#include <boost/python/to_python_indirect.hpp>
#include <boost/type_traits/composite_traits.hpp>
#include "pynebula3/foundation/core/pointee.h"

namespace boost {
namespace python {
namespace detail {

// attempting to instantiate this type will result in a compiler error,
// if that happens it means you're trying to use return_by_smart_pointer
// on a function/method that doesn't return a pointer!
template <class R>
struct return_by_smart_ptr_requires_a_pointer_return_type
# if defined(__GNUC__) && __GNUC__ >= 3 || defined(__EDG__)
    {}
# endif
    ;

// this is where all the work is done, first the plain pointer is
// converted to a smart pointer, and then the smart pointer is embedded
// in a Python object
struct make_owning_smart_ptr_holder
{
    template <class T>
    static PyObject* execute(T* p)
    {
        typedef Ptr<T> smart_pointer;
        typedef objects::pointer_holder<smart_pointer, T> holder_t;

        smart_pointer ptr(const_cast<T*>(p));
        return objects::make_ptr_instance<T, holder_t>::execute(ptr);
    }
};

} // namespace detail

struct return_by_smart_ptr
{
    template <class T>
    struct apply
    {
        typedef typename mpl::if_c<
            boost::is_pointer<T>::value,
            to_python_indirect<T, detail::make_owning_smart_ptr_holder>,
            detail::return_by_smart_ptr_requires_a_pointer_return_type<T>
        >::type type;
    };
};

}} // namespace boost::python
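One piece not reproduced above is the pynebula3/foundation/core/pointee.h header. For Boost Python to accept Ptr<T> as a holder it has to be able to recover T from Ptr<T>, which is done through a specialization of boost::python::pointee (along with a get_pointer() overload for Ptr<T>). The following is only a standalone sketch of the trait pattern involved, with boost::python::pointee replaced by a local template and a hypothetical Ptr<T> with a get() accessor; it is not the actual header:

```cpp
#include <cassert>
#include <type_traits>

// minimal stand-in for Nebula3's Ptr<T>
template <class T>
class Ptr
{
public:
    explicit Ptr(T* p = 0) : obj(p) {}
    T* get() const { return obj; }  // hypothetical raw-pointer accessor
private:
    T* obj;
};

// primary template, declared only; stand-in for boost::python::pointee
template <class P> struct pointee;

// for plain pointers, pointee<T*>::type is T
template <class T> struct pointee<T*> { typedef T type; };

// the specialization pointee.h would provide: map Ptr<T> back to T,
// so the library can deduce the pointee type from the holder type
template <class T> struct pointee< Ptr<T> > { typedef T type; };

// get_pointer() overload, found via argument-dependent lookup,
// lets generic code extract the raw pointer from a Ptr<T>
template <class T> T* get_pointer(const Ptr<T>& p) { return p.get(); }
```

With the real boost::python::pointee specialized this way, class_<T, Ptr<T>, ...> and pointer_holder<Ptr<T>, T> know how to get from the smart pointer to the wrapped class.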
Using Core::CoreServer as an example again, I can bind it like this:
namespace bp = boost::python;

bp::class_<Core::CoreServer, Ptr<Core::CoreServer>, boost::noncopyable>("CoreServer", bp::no_init)
    .def("Create", &Core::CoreServer::Create, bp::return_value_policy<bp::return_by_smart_ptr>())
    .staticmethod("Create")
    .def("HasInstance", &Core::CoreServer::HasInstance)
    .staticmethod("HasInstance")
    .def("Instance", &Core::CoreServer::Instance, bp::return_value_policy<bp::return_by_smart_ptr>())
    .staticmethod("Instance")
    .add_property("AppName",
        bp::make_function(&Core::CoreServer::GetAppName, bp::return_value_policy<bp::return_by_value>()),
        bp::make_function(&Core::CoreServer::SetAppName)
    )
    // etc.
    ;

And then use it in Python:

>>> from pynebula3.Core import CoreServer
# singleton instance is created and stored in a new Ptr<CoreServer> instance
# which is in turn embedded in a new Python object
>>> coreServer = CoreServer.Create()
>>> print coreServer
<pynebula3.Core.CoreServer object at 0x00AB1340>
# existing singleton instance is stored in a new Ptr<CoreServer> instance
# which is embedded in a new Python object
>>> coreServer2 = CoreServer.Instance()
>>> print coreServer2
<pynebula3.Core.CoreServer object at 0x00AB1378>
>>> print coreServer2.AppName
Nebula3
# the Ptr<CoreServer> instance embedded in the Python object is deleted,
# the singleton instance itself stays alive because another Ptr<CoreServer>
# instance still references it
>>> coreServer = None
>>> print coreServer2
<pynebula3.Core.CoreServer object at 0x00AB1378>
>>> print coreServer2.AppName
Nebula3
# the remaining Ptr<CoreServer> instance embedded in the Python object is deleted,
# since no other Ptr<CoreServer> instances reference the singleton instance it
# too is deleted
>>> coreServer2 = None

And that's all there is to it, though it remains to be seen how well it all works in practice.

Wednesday, September 10, 2008

Packages in Python extension modules

I've been working on some Python bindings for Nebula3 in the last few days, using Boost Python. At this point I don't intend to bind everything in Nebula3, I just need access to a few of the classes via Python (more on that in some future post). Boost Python is pretty nice, once you figure out the basics. The Boost build system (bjam) on the other hand is a royal pain to figure out if you want to use it to build your own projects, so I won't be using it for anything other than building the Boost libraries.

Anyway, in this post I'm going to explain how to create a package-like Python C/C++ extension module. I wanted all the Nebula3 Python bindings in one C++ extension module, with each Nebula3 namespace in its own sub-module. Python packages are usually defined using a hierarchy of directories, so I want the equivalent of:

/mypackage
    __init__.py
    /Util
        __init__.py
        string.py
    /IO
        __init__.py
        uri.py

First of all you need to indicate to Python that your module is actually a package; you do that by setting the __path__ attribute on the module to the name of the module.

// mypackage.cpp

BOOST_PYTHON_MODULE(mypackage)
{
    namespace bp = boost::python;

    // specify that this module is actually a package
    bp::object package = bp::scope();
    package.attr("__path__") = "mypackage";

     export_util();
     export_io();
}

Now we can create the Util sub-module.

// export_util.cpp

void export_util()
{
    namespace bp = boost::python;
    // map the Util namespace to a sub-module
    // make "from mypackage.Util import <whatever>" work
    bp::object utilModule(bp::handle<>(bp::borrowed(PyImport_AddModule("mypackage.Util"))));
    // make "from mypackage import Util" work
    bp::scope().attr("Util") = utilModule;
    // set the current scope to the new sub-module
    bp::scope util_scope = utilModule;
    // export stuff in the Util namespace
    bp::class_<Util::String>("String");
    // etc.
}

The PyImport_AddModule() call registers the sub-module in Python's sys.modules dictionary, which makes "from mypackage.Util import <whatever>" work, and assigning the sub-module to the package's Util attribute makes "from mypackage import Util" work. And here's the IO sub-module.

// export_io.cpp

void export_io()
{
    namespace bp = boost::python;

    // map the IO namespace to a sub-module
    // make "from mypackage.IO import <whatever>" work
    bp::object ioModule(bp::handle<>(bp::borrowed(PyImport_AddModule("mypackage.IO"))));
    // make "from mypackage import IO" work
    bp::scope().attr("IO") = ioModule;
    // set the current scope to the new sub-module
    bp::scope io_scope = ioModule;
    // export stuff in the IO namespace
    bp::class_<IO::URI>("URI",
        "A URI object can split a Uniform Resource Identifier string into "
        "its components or build a string from URI components.")
        .def(bp::init<const Util::String&>());
}

Once you've built the extension module you can import it into Python and check that it works just like a regular package.

>>> import mypackage
>>> from mypackage.Util import String
>>> from mypackage import Util
>>> from mypackage.IO import URI
>>> from mypackage import IO

This is something that doesn't seem to be well documented anywhere at this point, so hopefully this short writeup will save a few people some time. Note that while the code here uses Boost Python it should be relatively simple to adapt it to use the plain Python C API.