7.11.14

Running with pointers II

(This is the second in an occasional series of explorations of some of the stranger areas of C++ syntax.)

Let's assume that we're all familiar with the RAII idiom (which is basically a fancy way to say "using destructors to manage resources correctly"). So let's consider the following listing, and ask ourselves: what output might it produce, and "Is it safe?"

#include <stdio.h> // for printf()
#include <memory> // for std::unique_ptr<>

struct Lock {
    void lock() {
        printf("Lock::lock()\n");
        data = std::unique_ptr<int>(new int(23));
    }

    void unlock() {
        printf("Lock::unlock()\n");
        data.release();
    }

protected:
    // valid during lock: using unique_ptr for safety
    std::unique_ptr<int> data;
};

// RAII for safety
struct AutoLock {
    AutoLock(Lock &l): l(l) { l.lock(); }
    ~AutoLock() { l.unlock(); }

    Lock &l;
};

struct SafeCode: Lock {
    void safe() {
        AutoLock(*this);
        // we can access the data now...
        printf("My data is safe here!\n");
        printf("Locked data = %i\n", *data);
    }
};

int main() {
    SafeCode s;
    s.safe();
    return 0;
}

We have a type called Lock that has lock() and unlock() methods. These methods guard a resource which is only valid when the lock is locked. (For our purposes, the 'resource' we're managing is the number 23, and it is accessible directly through a member variable—neither of these are particularly defensible design decisions, but can be ignored.) We're even using a fancy C++11 unique_ptr to manage our resource to make sure we don't get burned by raw pointer errors (which might be ironic if it wasn't a terrible analogy). AutoLock wraps the lock's API in its constructor and destructor, which should ensure that Lock::data should be valid for the lifetime of an AutoLock object.

The type SafeCode inherits from Lock and provides a method called safe() which demonstrates the usage of AutoLock, which ought to be safe, right? It's right there in the name of the function.
So we'd expect to see this:

Lock::lock()
My data is safe here!
Locked data = 23
Lock::unlock()

SPOILERS: No, it's not safe. What I actually see, compiling with gcc -std=c++11 on Ubuntu 14.04 is this:

Lock::lock()
Lock::unlock()
My data is safe here!
Segmentation fault (core dumped)

Ouch. So what went wrong? The problem is actually with the definition of the AutoLock object at the start of safe(). We forgot to give it a name, so it is immediately destroyed after its creation, and more importantly, before we try to access the data. If we give it a name (so AutoLock lock(*this); or even AutoLock _(*this); would suffice) it will survive until the end of the scope it's contained in (i.e. the end of the function safe()).

We'll need to go digging in the standard to find out more. Section 12.2 (Temporary objects) contains verbiage which would indicate that the anonymous AutoLock instance is "a temporary whose lifetime is not extended by being bound to a reference". Because it has no name, it cannot possibly be referenced again after its introduction, so the compiler is justified in placing the call to its destructor immediately after the object is constructed. (We will leave aside the question of why we are allowed to create an anonymous instance like this, and if this is ever legitimate—please post a comment if you have a situation where defining an anonymous temporary variable like this is a valid and useful technique.)

The use of unique_ptr<> is pretty much a red herring, I just threw that in to dissuade critiquing which might've resulted from int *data; in the definition of Lock. (And, I suppose, smart pointers are now standard, so we should use them given the option. Unless, I also suppose, you really are concerned about performance, and you have a profile trace demonstrating that the use of a smart pointer is your performance bottleneck. But this is a lot of parenthetical supposition.) One thing I will say though, is that at least unique_ptr causes the broken code to crash—with a raw pointer, the program happily reported that the value of data was zero, and continued to run to completion. Although given the nature of the bug, we could assume that literally anything could happen.

"Forget it Jake, it's undefined behaviour..."