Thursday, October 2, 2014

VC++: steady_clock is not steady



Update July 12, 2015: The problem described below is fixed in the VS2015 RC


C++11 introduced a number of new clock classes, among them system_clock and steady_clock. system_clock is meant to represent wall time; in other words, if a clock adjustment takes place or daylight saving time goes into or out of effect, it may jump forward or backward at a rate that does not reflect the actual passage of time. These properties make it ideal for reporting the actual time. steady_clock, on the other hand, is meant to always increase monotonically, making it ideal for measuring intervals between events. However, Microsoft screwed up here: as of VC++ 2013, at any rate, steady_clock jumps around with changes to the wall clock just like system_clock. I demonstrated this with the simple program below:

#include <Windows.h>
#include <chrono>
#include <cstdint>   // uint64_t
#include <cstdio>    // getchar
#include <iostream>

using namespace std::chrono;

int main()
{
    uint64_t gtcStart = ::GetTickCount64();
    auto steadyStart = steady_clock::now();
    auto systemStart = system_clock::now();
    auto monoStart = monotonic_clock::now();   // monotonic_clock is a pre-standard name VC++ still ships; it is not part of C++11
    auto hrStart = high_resolution_clock::now();

    getchar();

    auto gtcElapsedSec = (::GetTickCount64() - gtcStart) / 1000;
    auto steadyElapsedSec = duration_cast<seconds>(steady_clock::now() - steadyStart).count();
    auto systemElapsedSec = duration_cast<seconds>(system_clock::now() - systemStart).count();
    auto monoElapsedSec = duration_cast<seconds>(monotonic_clock::now() - monoStart).count();
    auto hrElapsedSec = duration_cast<seconds>(high_resolution_clock::now() - hrStart).count();

    std::cout << "GetTickCount64(): " << gtcElapsedSec << " sec" << std::endl;
    std::cout << "steady_clock: " << steadyElapsedSec << " sec" << std::endl;
    std::cout << "system_clock: " << systemElapsedSec << " sec" << std::endl;
    std::cout << "monotonic_clock: " << monoElapsedSec << " sec" << std::endl;
    std::cout << "high_resolution_clock: " << hrElapsedSec << " sec" << std::endl;
}

While the code was sitting on getchar(), I snuck into Control Panel and set the clock ahead an hour, then returned to the program and entered a char. The output was:

GetTickCount64(): 13 sec
steady_clock: 3611 sec
system_clock: 3611 sec
monotonic_clock: 3611 sec
high_resolution_clock: 3611 sec

Now, if you're using Boost in your project (and any C++ project of substance really should), boost::chrono::steady_clock has the same syntax and semantics, but actually works. Otherwise, unless you want to write your own clock, it looks like you're stuck with GetTickCount64() until the next release of Visual Studio.
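If you do decide to roll your own, a minimal monotonic clock over GetTickCount64() is only a few lines. This is just a sketch; the class name and the choice of millisecond resolution are mine, not anything Microsoft or Boost ships:

#include <Windows.h>
#include <chrono>

// A bare-bones monotonic clock built on GetTickCount64().
// Resolution is limited to the system tick (typically 10-16 ms),
// but it is immune to wall-clock adjustments.
struct tick_count_clock
{
    typedef std::chrono::milliseconds duration;
    typedef duration::rep rep;
    typedef duration::period period;
    typedef std::chrono::time_point<tick_count_clock, duration> time_point;
    static const bool is_steady = true;

    static time_point now()
    {
        return time_point(duration(::GetTickCount64()));
    }
};

It drops into the duration_cast code above exactly like the standard clocks do.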

Thursday, August 14, 2014

A PropertyGrid gotcha

In a nutshell: when an object is placed in a PropertyGrid, properties that reuse the same DisplayNameAttribute in different categories lead to surprising selection behavior.

I decline to call it a bug because I suspect the authors of the framework would claim this is how it's supposed to behave, but it's still the sort of thing that can trip developers up. Imagine you set a PropertyGrid's SelectedObject to an instance of this class:

using System.ComponentModel;   // Category, DisplayName

class Demo
{
    [Category("Category1")]
    [DisplayName("Value")]
    public int Val1 { get; set; }

    [Category("Category2")]
    [DisplayName("Value")]
    public int Val2 { get; set; }

    [Category("Category3")]
    [DisplayName("Value")]
    public int Val3 { get; set; }
}

Sure, we're reusing the display name "Value," but since each instance is in a different category, everything should be fine, right? Well, not exactly. Under the hood, the grid determines equality between GridEntry objects by comparing the combination of label, property type, and tree depth. And in this case, "label" means display name. So it doesn't matter if the actual code names of the underlying properties differ: as long as their display names, types, and depths are the same, the grid will consider their entries equal. One consequence of this is that if PropertyGrid.Refresh() is called while the Category3 Value is selected, the selection will move to the Category1 Value, presumably because it is the first entry in the tree where .Equals(oldSelection) returns true.

Luckily, there is a workaround. PropertyGrid omits tab characters when it prints display names, so you can simply add a different number of \t's to the beginning or end of each one to make them unique. A bit of a hack to be sure, but it does the job.

Thanks to stackoverflow for assisting me in my investigation of this issue.

Monday, August 11, 2014

A make_shared caveat

When you allocate a new object for a shared_ptr using the obvious approach:

std::shared_ptr<Foo> p(new Foo(a1, a2)); // case 1

you wind up with (at least) two dynamic memory allocations: one for the Foo object itself, and another for the shared_ptr's control block (which includes the strong and weak reference counts).

That's not usually a big deal, but putting the whole thing into one block would improve locality of reference and make construction slightly faster. The make_shared library function can help us with that:

std::shared_ptr<Foo> p = std::make_shared<Foo>(a1, a2); // case 2

make_shared will typically place the shared_ptr control block and the allocated object itself into a single chunk of memory. I say "typically" because the standard does not require implementers to do things this way, but in practice, they do.

This seems like win-win, but there is a catch. Imagine a situation where the Foo object we created above is referenced by one shared_ptr and two weak_ptrs (i.e. its strong count is one and its weak count is two). Now suppose that one shared_ptr goes away. If we'd allocated the Foo as in case 1, at this point we'd be able to free all the memory it used to occupy and go on our way. But if we'd done things as in case 2, this isn't possible. Why? Because we still need to keep the control block around until the weak count goes to zero, the control block and the Foo share a single allocation, and it isn't possible to free just part of an allocation. So while the Foo's destructor will be called the moment the strong count goes to zero as expected, if it was created using make_shared, its memory can't be returned to the system until the weak count reaches zero as well. In situations where dangling weak_ptrs can hang around more or less indefinitely, this can lead to some surprising memory leaks.
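Here is a minimal sketch of the difference. Foo is given a large payload purely to make the memory question visible; the destructor fires at the same moment in both cases, but the point at which the allocation itself can be reclaimed differs:

#include <memory>

struct Foo { char payload[4096]; };

int main()
{
    // Case 1: Foo and the control block are separate allocations
    std::shared_ptr<Foo> p1(new Foo());
    std::weak_ptr<Foo> w1 = p1;
    p1.reset(); // ~Foo runs and the 4 KB Foo block is freed here;
                // only the small control block survives for w1's sake

    // Case 2: one combined allocation via make_shared
    auto p2 = std::make_shared<Foo>();
    std::weak_ptr<Foo> w2 = p2;
    p2.reset(); // ~Foo runs here, but the combined block (control block
                // plus the 4 KB Foo) cannot be freed until w2 is gone
    return 0;
}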

Friday, August 1, 2014

pimpl makes move constructors easy

I described the pimpl idiom and some of its uses in a previous post. Another little fringe benefit is that it makes writing move constructors really, really easy. Consider this (partial) class:

class TuneResult
{
private:
    std::string m_name;
    std::map<std::pair<double, double>, double> m_measurements;
    std::pair<double, double> m_optimum;
    std::vector<std::string> m_logMessages;
};

Without pimpl, adding a move constructor requires a bit of effort:

class TuneResult
{
public:
    TuneResult(TuneResult&& other)
        : m_name(std::move(other.m_name))
        , m_measurements(std::move(other.m_measurements))
        , m_optimum(std::move(other.m_optimum))
        , m_logMessages(std::move(other.m_logMessages))
    {    
    }

private:
    std::string m_name;
    std::map<std::pair<double, double>, double> m_measurements;
    std::pair<double, double> m_optimum;
    std::vector<std::string> m_logMessages;
};

It's not hard to imagine that becoming much longer for classes with more members. But if you've already pimpled anyway, moving becomes trivial:

// h file

class TuneResultImpl;

class TuneResult
{
public:
    TuneResult(TuneResult&& other);
    
private:
    friend class TuneResultImpl;
    std::unique_ptr<TuneResultImpl> m_impl;
};

// cpp file 

class TuneResultImpl
{
    friend class TuneResult;

private:
    TuneResult* m_self;

    std::string m_name;
    std::map<std::pair<double, double>, double> m_measurements;
    std::pair<double, double> m_optimum;
    std::vector<std::string> m_logMessages;
};

TuneResult::TuneResult(TuneResult&& other)
    : m_impl(std::move(other.m_impl))
{
    m_impl->m_self = this; // don't forget!
}
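
If the class also needs move assignment, it's the same trick plus the same fixup. (This is my extension of the example; the matching declaration would of course have to be added to the header.)

TuneResult& TuneResult::operator=(TuneResult&& other)
{
    if (this != &other)
    {
        m_impl = std::move(other.m_impl);
        m_impl->m_self = this; // don't forget this here either!
    }
    return *this;
}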

Thursday, June 26, 2014

C++: Tying object lifetimes together with shared_ptr and custom deleters

Once upon a time, when a great man died his slaves would be killed and buried alongside him as gifts to carry into the netherworld. Our civilizations have mostly moved beyond both slavery and grave gifts, but our code has not: it is actually quite common to have a series of subordinate helper objects that you want to keep around for exactly as long as some master object is alive. The most straightforward way of doing this, of course, is to have the slave objects be actual member variables of the master's class. But that won't work if you've made the choice to keep your objects loosely coupled (for any of several excellent reasons). For example, consider a class that allows builders to attach custom behaviors to it using the observer pattern:

class Master
{
public:
   struct IObserver { /* ... */ };

   // WEAKLY attach an observer that will be notified when this object changes
   void AddChangedObserver(std::weak_ptr<IObserver> observer);
};

// write the changes to disk
class Serializer : public Master::IObserver { /* ... */ };

// update UI to reflect changes to the object
class UiUpdater : public Master::IObserver { /* ... */ };

std::shared_ptr<Master> MakeMaster()
{
   auto master = std::make_shared<Master>();

   auto serializer = std::make_shared<Serializer>();
   master->AddChangedObserver(serializer);

   auto uiUpdater = std::make_shared<UiUpdater>();
   master->AddChangedObserver(uiUpdater);    

   return master; // danger!
}

There is a problem here. We want serializer and uiUpdater to stick around as long as the returned Master does, but that requires us to keep a shared_ptr to each somewhere. The code is not doing that now, so both will be deleted (and implicitly deregistered) as soon as MakeMaster() returns. This will not cause any undefined behavior, but we won't get the attached behaviors we were expecting from these objects either.

So where do we keep the shared_ptrs to the two observers? We could return them from MakeMaster() along with the main object, but that's just punting to the caller (who shouldn't need to know that we're attaching behaviors anyway). A better solution is to leverage the ability to give shared_ptr a custom deleter. Instead of returning master directly, we can do this:

std::shared_ptr<Master> master2(master.get(),
    [master, serializer, uiUpdater](Master*) mutable
    {
        master.reset();
        serializer.reset();
        uiUpdater.reset();
    });
return master2;

This requires a bit of explanation. First of all, we are creating a new reference count/shared_ptr "family" over the Master object we created earlier. Normally having more than one reference count for the same object would eventually result in a double-deletion disaster, but that won't happen here. The reason is that we're giving this second shared_ptr a custom deleter lambda that doesn't actually delete anything; instead, it simply hangs on to "keepalive" shared_ptrs to the three objects via captures. However, when the master2 family's strong reference count reaches zero and this deleter lambda is called, it will reset* all three of them; it is this action that will result in the actual deletion of the three objects via the ordinary shared_ptr mechanism. Which is exactly what we want.

*Note: It is important to actually reset the captured shared_ptrs in the deleter and not simply rely on them being destroyed when the lambda is. If this wasn't done, weak_ptrs in the master2 family would be able to keep the objects alive.
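
As a quick sanity check on lifetimes, this is how the factory is meant to behave in use (a sketch; the comments describe the intent, not output from a real run):

{
    std::shared_ptr<Master> m = MakeMaster();
    // While m is alive, change notifications reach both observers,
    // because the keepalive deleter still holds shared_ptrs to them.
}
// m's strong count hits zero here: the deleter runs, resets master,
// serializer, and uiUpdater, and all three objects are destroyed.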

Thursday, June 5, 2014

A compatible animated ActionBar refresh icon

Many apps put a refresh icon on the ActionBar, and it's nice to have that icon animate while a refresh is in progress. The internet is full of suggestions on how to do this, but most either require third party libraries (e.g. ActionBarSherlock) or are not compatible with older versions of the API. Here is some simple code that will rotate the refresh icon clockwise for API 4+ (disclaimer: I've only tested it down to API 10).

Note: The code assumes you already have ic_action_refresh in res/drawable. You can get the stock Android refresh icon from the Action Bar Icon Pack here.

res/anim/spin_clockwise.xml:
<?xml version="1.0" encoding="utf-8"?>

<rotate xmlns:android="http://schemas.android.com/apk/res/android"
    android:duration="1000"
    android:fromDegrees="0"
    android:interpolator="@android:anim/linear_interpolator"
    android:pivotX="50%"
    android:pivotY="50%"
    android:toDegrees="360" />

res/layout/refresh_action_view.xml:
<?xml version="1.0" encoding="utf-8"?>

<ImageView xmlns:android="http://schemas.android.com/apk/res/android"
    android:contentDescription="Refresh"
    android:layout_width="wrap_content"
    android:layout_height="wrap_content"
    android:src="@drawable/ic_action_refresh"
    android:paddingLeft="12dp"
    android:paddingRight="12dp"
    />

AnimatingRefreshButtonManager.java:
import android.content.Context;
import android.support.v4.view.MenuItemCompat;
import android.view.MenuItem;
import android.view.View;
import android.view.animation.Animation;
import android.view.animation.AnimationUtils;

public class AnimatingRefreshButtonManager
{
    private final MenuItem mRefreshItem;
    private final Animation mRotationAnimation;

    private boolean mIsRefreshInProgress = false;

    public AnimatingRefreshButtonManager(Context context,
       MenuItem refreshItem)
    {
        // null checks omitted for brevity

        mRefreshItem = refreshItem;
        mRotationAnimation = AnimationUtils.loadAnimation(
            context, R.anim.spin_clockwise);

        mRotationAnimation.setAnimationListener(
            new Animation.AnimationListener()
            {
                @Override public void onAnimationStart(Animation animation) {}

                @Override public void onAnimationEnd(Animation animation) {}

                @Override public void onAnimationRepeat(Animation animation)
                {
                    // If a refresh is not in progress, stop the animation
                    // once it reaches the end of a full cycle
                    if (!mIsRefreshInProgress)
                        stopAnimation();
                }
            }
        );

        mRotationAnimation.setRepeatCount(Animation.INFINITE);
    }

    public void onRefreshBeginning()
    {
        if (mIsRefreshInProgress)
            return;
        mIsRefreshInProgress = true;

        stopAnimation();
        MenuItemCompat.setActionView(mRefreshItem,
            R.layout.refresh_action_view);
        View actionView = MenuItemCompat.getActionView(mRefreshItem);
        if (actionView != null)
            actionView.startAnimation(mRotationAnimation);
    }

    public void onRefreshComplete()
    {
        mIsRefreshInProgress = false;
    }

    private void stopAnimation()
    {
        View actionView = MenuItemCompat.getActionView(mRefreshItem);
        if (actionView == null)
            return;
        actionView.clearAnimation();
        MenuItemCompat.setActionView(mRefreshItem, null);
    }
}

Skeleton Activity demonstrating usage:
import android.app.Activity;
import android.view.Menu;
import android.view.MenuItem;

public class SomeActivity extends Activity
{
    private AnimatingRefreshButtonManager mRefreshButtonManager;
    
    @Override public boolean onCreateOptionsMenu(Menu menu)
    {
        getMenuInflater().inflate(R.menu.main, menu);

        MenuItem refreshItem = menu.findItem(R.id.action_refresh);

        mRefreshButtonManager =
           new AnimatingRefreshButtonManager(this, refreshItem);

        return super.onCreateOptionsMenu(menu);
    }
    
    @Override public boolean onOptionsItemSelected(MenuItem item)
    {
        switch (item.getItemId())
        {
            case R.id.action_refresh:
                mRefreshButtonManager.onRefreshBeginning();
                // start the refresh
                return true;
        }

        return super.onOptionsItemSelected(item);
    }
    
    // how you detect this is application-specific
    void onRefreshComplete()
    {
        mRefreshButtonManager.onRefreshComplete();
    }
}

Monday, May 19, 2014

Android: Stop your ringtones before sending them to the garbage collector

For some time, I've been aware that the warning "MediaPlayer finalized without being released" was occasionally showing up in my app's logcat output. This was a bit mystifying since I don't use MediaPlayer anywhere, but there were more important things to work on.

Today I finally got fed up and tried to figure out what was going on. It turns out that ringtones are the culprit. RingtoneManager.getRingtone() calls open() on the Ringtone object it returns. Ringtone.open() sets up a MediaPlayer. And this MediaPlayer is only released if you call Ringtone.stop() before letting the thing get GC'ed. Note that you do not actually need to play the ringtone for this to be true. As far as I can tell, none of this is documented.

Anyway, I have no idea how much harm there is in neglecting to stop your ringtones and release their media players, but doing so will at least get rid of a warning message.

Monday, May 12, 2014

A trick for tracking down tough memory leaks in VC++

Memory leaks are a common problem in C++. Even if you use smart pointers everywhere, you can still leak memory if you create a shared_ptr cycle. If this happens in a debug build and you've enabled leak detection as described here, Visual Studio will alert you in the output at shutdown:
Detected memory leaks!
Dumping objects ->
C:\PROGRAM FILES\VISUAL STUDIO\MyProjects\leaktest\leaktest.cpp(20) : {10021} 
normal block at 0x00780E80, 64 bytes long.
 Data: <                > CD CD CD CD CD CD CD CD CD CD CD CD CD CD CD CD
Object dump complete.



The filename and line number may or may not be present, and may or may not be useful. If they identify one of your own files, and if there aren't too many different paths to the indicated location, they might be all you need to know. But if they point to something in the guts of vector or make_shared, they tell you practically nothing.

That "{10021}," on the other hand, has potential. What it's telling you is that the block returned by dynamic allocation #10021 was never freed. Unfortunately, by this point it's too late to say what allocation #10021 was. You can run the program again and use this technique to break when it occurs, but if there's any indeterminacy in what you allocate and when, it's not going to be the same call stack. In particular, if your application is multi-threaded (and what applications aren't anymore), none of the allocation numbers after the second thread spawns will be stable from one run to the next.

The third-party tool Visual Leak Detector is designed to help you in these situations, but (at least in my experience) it doesn't always give the right answer. So here is a sort of brute-force approach that can be used when all else fails.

The brute-force leak detector


(tested with VS2008-2012; things may be different in 2013)
  1. Get debug symbol info for the VC++ runtime if you don't have it already. The quickest way is Tools > Options > Debugging > Symbols > check "Microsoft Symbol Servers."
  2. Comment out code that spawns nonessential threads if they aren't part of the problem, and do anything else you can think of to temporarily reduce concurrency. The goal is to get to a place where the leaked allocation's number, while not completely stable, is at least restricted to a known, reasonably narrow range.
  3. Set a breakpoint in dbgheap.c at the top of _heap_alloc_dbg_impl(). If intellisense is working, you can get to this file by typing _crtBreakAlloc somewhere in your code and doing a "Go To Definition" on it. Otherwise check "[drive]\Program Files (x86)\Microsoft Visual Studio [version]\VC\crt\src."
  4. Right-click the breakpoint dot and choose "Condition..." Make the condition "_lRequestCurr >= [minimum] && _lRequestCurr <= [maximum]," where "[minimum]" and "[maximum]" are the endpoints for the range you observed in step #1.
  5. Right-click the breakpoint dot again and select "When Hit...". Choose "Print a message" and make that message "***{_lRequestCurr} : $CALLSTACK". Make sure "Continue execution"  is checked.
  6. Start the program. It will run much more slowly than usual due to the need to evaluate the tracepoint each time the code passes over it. If you already have the leak narrowed down to a particular area of the code, it will help to disable the tracepoint until the suspect code is reached.
  7. Once you're sure you've done whatever it takes to trigger the leak, shut the program down. Note the allocation numbers in Visual Studio's memory leak messages.
Every time a dynamic allocation happens, the tracepoint will dump the allocation number to the output followed by its callstack. So once the program exits and you see a message like the one above indicating that allocation 10021 was leaked, you can simply search the output for "***10021" to find the corresponding callstack. With any luck, that will tell you what you need to know.

Tuesday, April 22, 2014

C++: Hiding the implementation

C++ is a powerful language, but it does have a few aggravations (some would quibble with "a few"; others would quibble with "some"). Large among them is the header/source file dichotomy. If you have a class that you want to make visible to the rest of your application, you need to write an outline-like definition of it in one file (an #include-able header), and put the actual code for its methods in another (a source file).

For example, the header for a class representing a monster in an RPG might look like this:

Standard header file

// Revenant.h

#include "Armor.h"

class Revenant
{
public:
 Revenant();

 int GetHp() const;
 void SetHp(int newHp);

 void TakeAttack(int attackPower);

private:
 int GetDamageFromAttack(int attackPower) const;

private:
 static const int STARTING_HP = 20;

 int m_hp;
 Armor m_armor;
};

The corresponding source file might look like this:

Standard source file

// Revenant.cpp

#include "Revenant.h"
#include "ArmorFactory.h"

Revenant::Revenant()
 : m_hp(STARTING_HP)
 , m_armor(ArmorFactory::CreatePlateArmor())
{
}

int Revenant::GetHp() const
{
 return m_hp;
}

void Revenant::SetHp(int newHp)
{
 m_hp = newHp;
}

void Revenant::TakeAttack(int attackPower)
{
 int damage = GetDamageFromAttack(attackPower);
 SetHp(GetHp() - damage);
}

int Revenant::GetDamageFromAttack(int attackPower) const
{
 int damage = attackPower;
 damage -= m_armor.MitigateDamage(damage);
 // apply more mitigations, etc.
 return damage;
}

The problem with header files


Right away you should notice one of the problems with the header file approach: you're forced to repeat a great deal of information. Luckily you don't need to worry about the two copies gradually drifting out of sync because, with a couple of exceptions, the compiler won't let you get away with that. But it's still extra work. Writing code in the source file and you decide you want to add a private function? You'll need to switch to the header, scroll to the correct position, write the new function's signature, switch back to the implementation and write the same signature again (with a syntax that is just different enough to prevent you from using straight copy/paste), and only then start writing the actual code. There are some good tools that can automate much of this, but it's still a hassle, and you can't depend on automatic tools to keep things organized as well as you could by hand.

The other, bigger problem is more subtle. The need to declare all your member variables and functions in the header forces you to expose the private workings of your class to the entire outside world. Anyone reading Revenant.h will see that a Revenant starts with 20 HP, and may come to depend on that detail. More damningly, because Revenant has a private member variable of type Armor, any file that wants to use Revenant will need to #include Armor.h, even if it never actually uses Armor directly. In addition to making more work for the preprocessor, this means that any change to Armor.h will require all consumers of Revenant to be recompiled. Something similar happens if you decide to change Revenant's implementation details, perhaps by splitting GetDamageFromAttack() into several functions that each apply a different type of mitigation. This would mean that everyone using Revenant would need to be recompiled, even though no one should need to care about the details of how you calculate damage. Thanks to transitive dependencies, over time this sort of arrangement can easily lead to a situation where changing a single "hot" header file causes a good chunk of the source files in the project to be recompiled.

pimpl: A partial solution


A big part of the reason we need to fuss around with all this header stuff is because the compiler needs to know the size of any object that you pass around. This means if you declare a parameter or variable of type MyAwesomeObject, the compiler will need to read all of MyAwesomeObject's details from its header file so it can determine how much space to set aside for it. But there is a loophole: if we replace all those MyAwesomeObjects with pointers to MyAwesomeObjects, and we never try to dereference any of those pointers, then we no longer need to tell the compiler anything about MyAwesomeObject except that it is the name of a class. This is because pointers are always the same size no matter what they point to. This is true for smart pointers too, since they are just glorified wrappers around raw pointers when all is said and done.

This fact is exploited in the idiom known as pimpl (either "private implementation" or "pointer to implementation"), which solves some of the header file issues by hiding implementation details in the source file. The basic idea is to define a second, private "impl" class within the source file to hold all the private methods and members of the public "interface" class. Your public class then declares a private pointer to an instance of the impl and no other private functions or variables. You still need to forward-declare the impl in the header so the compiler can make sense of the symbol, but includers don't need to know anything about it beyond its name. Revenant might look like this after being pimpled:

pimpl example

// Revenant.h

#include <memory>

// forward declaration for the impl
// #includers don't get to know anything about it except its name
class RevenantImpl;

class Revenant
{
public:
 Revenant();
 Revenant(const Revenant& other);
 Revenant& operator =(const Revenant& other);
 
 int GetHp() const;
 void SetHp(int newHp);

 void TakeAttack(int attackPower);

private:
 // all (other) private implementation contained within
 std::shared_ptr<RevenantImpl> m_impl;
};



// Revenant.cpp

#include "Revenant.h"
#include "Armor.h"
#include "ArmorFactory.h"

class RevenantImpl
{
private:
 static const int STARTING_HP = 20;

 int m_hp;
 Armor m_armor;

public:
 RevenantImpl()
  : m_hp(STARTING_HP)
  , m_armor(ArmorFactory::CreatePlateArmor())
 {
 }

 int GetHp() const
 {
  return m_hp;
 }

 void SetHp(int newHp)
 {
  m_hp = newHp;
 }

 void TakeAttack(int attackPower)
 {
  int damage = GetDamageFromAttack(attackPower);
  SetHp(GetHp() - damage);
 }

private:
 int GetDamageFromAttack(int attackPower) const
 {
  int damage = attackPower;
  damage -= m_armor.MitigateDamage(damage);
  // apply more mitigations, etc.
  return damage;
 }
};



Revenant::Revenant()
 : m_impl(std::make_shared<RevenantImpl>())
{
}

Revenant::Revenant(const Revenant& other)
 : m_impl(std::make_shared<RevenantImpl>())   // must exist before operator= can copy into it
{
 *this = other;
}

Revenant& Revenant::operator =(const Revenant& other)
{
 *m_impl = *other.m_impl;
 return *this;
}

int Revenant::GetHp() const
{
 return m_impl->GetHp();
}

void Revenant::SetHp(int newHp)
{
 m_impl->SetHp(newHp);
}

void Revenant::TakeAttack(int attackPower)
{
 m_impl->TakeAttack(attackPower);
}

This has solved our biggest problems—other coders won't see our starting HP anymore unless they deliberately go looking for it, Revenant's consumers don't need to know anything about Armor, and we can modify the impl all day long without forcing anyone else to recompile.

But we've only solved the repetition problem for the private functions. For the public ones we've actually made things worse since we now need to write each one's signature three times. We also had to jump through some hoops to add deep-copy semantics to the class (another option would have been to make the object noncopyable). Finally, we've got all these stub functions that do nothing but delegate to identical functions in the impl. The performance impact is minimal, but it's that much extra code to read and write. This final point can be ameliorated somewhat by making everything in the impl public and dividing the logic between the two classes:

A more freeform pimpl example

// Revenant.cpp

#include "Revenant.h"
#include "Armor.h"
#include "ArmorFactory.h"

class RevenantImpl
{
public:
 static const int STARTING_HP = 20;

 int m_hp;
 Armor m_armor;

public:
 RevenantImpl()
  : m_hp(STARTING_HP)
  , m_armor(ArmorFactory::CreatePlateArmor())
 {
 }

 int GetDamageFromAttack(int attackPower) const
 {
  int damage = attackPower;
  damage -= m_armor.MitigateDamage(damage);
  // apply more mitigations, etc.
  return damage;
 }
};



Revenant::Revenant()
 : m_impl(std::make_shared<RevenantImpl>())
{
}

Revenant::Revenant(const Revenant& other)
 : m_impl(std::make_shared<RevenantImpl>())   // must exist before operator= can copy into it
{
 *this = other;
}

Revenant& Revenant::operator =(const Revenant& other)
{
 *m_impl = *other.m_impl;
 return *this;
}

int Revenant::GetHp() const
{
 return m_impl->m_hp;
}

void Revenant::SetHp(int newHp)
{
 m_impl->m_hp = newHp;
}

void Revenant::TakeAttack(int attackPower)
{
 int damage = m_impl->GetDamageFromAttack(attackPower);
 SetHp(GetHp() - damage);
}

This works, but it's ugly. The code is scattered across two locations. Sometimes you need to prefix a member's name with m_impl-> and sometimes you don't, and if you move code from one class to the other, you'll need to either add or remove those prefixes. Further, although I don't show it here, in a real class it's likely that at some point the impl will need to call a function in the outer class, which means it must be provided with a reference to it. All this adds up to one more piece of state information you need to keep in your head when you're reading and writing code, and there's enough of that already.

Interface-based programming: a better solution


A technique called interface-based programming takes things one step further in an attempt to solve the same problems pimpl addresses without introducing the new ones it creates (I am not the biggest fan of the name "interface-based programming" since that term already has a similar but broader definition, but that's what people call it). With interface-based programming, the header file contains nothing but a pure, abstract definition of the class's public interface and a free or static factory function for creating concrete instances. The concrete class itself lives entirely within the source file. Let's take a look at Revenant again:

Interface-based programming example

// Revenant.h

#include <memory>

class Revenant
{
public:
 // static factory function instead of ctor
 static std::shared_ptr<Revenant> Create();

 // virtual dtor is essential!
 virtual ~Revenant() {}
 
 // interface requires clone semantics instead of copy
 virtual std::shared_ptr<Revenant> Clone() const = 0;

 virtual int GetHp() const = 0;
 virtual void SetHp(int newHp) = 0;

 virtual void TakeAttack(int attackPower) = 0;

private:
 // copy not implemented; hide the assignment operator
 Revenant& operator =(const Revenant&);
};



// Revenant.cpp

#include "Revenant.h"
#include "Armor.h"
#include "ArmorFactory.h"

namespace
{
 class RevenantImpl : public Revenant
 {
 private:
  static const int STARTING_HP = 20;

  int m_hp;
  Armor m_armor;

 public:
  RevenantImpl()
   : m_hp(STARTING_HP)
   , m_armor(ArmorFactory::CreatePlateArmor())
  {
  }

  virtual std::shared_ptr<Revenant> Clone() const override
  {
   auto clone = std::make_shared<RevenantImpl>();
   clone->m_hp = m_hp;
   clone->m_armor = m_armor;
   return clone;
  }

  virtual int GetHp() const override
  {
   return m_hp;
  }

  virtual void SetHp(int newHp) override
  {
   m_hp = newHp;
  }

  virtual void TakeAttack(int attackPower) override
  {
   int damage = GetDamageFromAttack(attackPower);
   SetHp(GetHp() - damage);
  }

 private:
  int GetDamageFromAttack(int attackPower)
  {
   int damage = attackPower;
   damage -= m_armor.MitigateDamage(damage);
   // apply more mitigations, etc.
   return damage;
  }
 };
}

std::shared_ptr<Revenant> Revenant::Create()
{
 return std::make_shared<RevenantImpl>();
}

This really gives us the best of both worlds. As with pimpl, the header only communicates the class's public interface. But unlike pimpl, the entire implementation is now contained within a single class. We've also gotten rid of pimpl's stub functions and triple-repeats, though we still had to write the signatures of the public methods twice (that's as good as we're going to do). As an added bonus, Revenant is now a true interface in the technical sense of the word, which means we gain the ability to mock and stub it in tests for free.

A couple details to note: first, since client code can only access RevenantImpl via a pointer to the base class Revenant, it is essential that Revenant have a virtual destructor; otherwise deleting a Revenant would have undefined behavior. If this hadn't been done in the code above, for example, it's likely that the Armor member variable's  destructor would not be called properly, and any resources it held would be leaked.

Second, since Revenant is an interface now, it no longer makes sense to copy instances of itwe need to use clone semantics instead. In addition to adding Clone() to the interface, this means we should hide the assignment operator. If we didn't, someone could get away with writing "*revenantPointer1 = *revenantPointer2". The compiler would accept this, but the call would do nothing since the Revenant interface itself doesn't contain anything to copy.

Inheritance


If I've neglected to mention inheritance in any of the above, it's because I practically never use it; I've been pretty thoroughly converted to the composition camp. But if you want to hide your private implementation and retain the ability for other classes to derive from you, all is not lost. You're pretty much out of luck with interface-based programming, but pimpl can still get the job done. You'll just need to add any protected methods or members you want to expose to your descendants to your header; anything in your private impl will be invisible to them (they'll be able to see the m_impl pointer itself if you make it protected, but they won't know anything about the structure of the object it points to).
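
A sketch of that arrangement, with purely illustrative names:

// Base.h

#include <memory>

class BaseImpl;

class Base
{
public:
 Base();
 virtual ~Base(); // defined in Base.cpp, where BaseImpl is complete

protected:
 // anything descendants are allowed to use has to be surfaced here...
 int GetSharedCounter() const;
 void BumpSharedCounter();

private:
 // ...while everything else stays hidden; make m_impl protected if
 // derived classes need to pass it around, but they still never see
 // BaseImpl's definition
 std::unique_ptr<BaseImpl> m_impl;
};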

pimpl vs. interface-based programming


I've described two different techniques for hiding implementation details in C++ here; how do you decide which to use when? Or when to use them at all? A lot of it is personal preference, but I follow an algorithm that goes something like this:

  • I don't bother hiding implementation at all if:
    • It's a simple object where any of this would be overkill
    • I need all the object's data to be contiguous in memory
    • I don't want to allocate anything on the heap
    • Performance is so critical that I can't afford either interface-based-programming's virtual calls or pimpl's extra delegations (very, very rare)
  • Otherwise, I use pimpl if:
    • I need to be able to allocate instances on the stack
    • I can't afford the extra overhead of the virtual function calls
    • I'm working with some monster legacy class that can't easily be converted to interface-based programming. In this case, I'll add an impl to hold new private data and functionality while leaving everything else the way it is.
    • I need to be able to inherit from the class
  • Otherwise, the default choice is interface-based programming

Wrapup


Implementation-hiding is one of those game-changing concepts that can radically alter how you design and think about your code. I use it for virtually every class I write these days, and can hardly imagine doing things any other way (every time I have to go back and read something I wrote n years ago before I learned about these concepts, I die a little inside). Interface-based programming in particular has a way of forcing you to design your code in a way that makes it more modular and testable almost by accident. And anything that makes you think about your class's public interface up-front and enforces not just a conceptual but also a physical separation between interface and implementation can only be a good thing.



Sunday, April 20, 2014

RingtonePreference and SharedPreferences.OnSharedPreferenceChangeListener

Revision

This problem was actually something different than what I (and stackoverflow) originally thought. RingtonePreference notifies listeners just fine; it has to, because the notification happens at the level of the SharedPreference file, and there's nothing client code can do to shut that behavior off. What was actually happening was this: RingtonePreference launches a new activity that partially conceals the PreferenceActivity. This causes the PreferenceActivity to be paused. And I was unregistering my listener in onPause(). I switched the (un)registration to onStart()/onStop() instead, and everything worked fine.

Original Post

Since I seem to have gotten into the habit of documenting Android framework bugs and surprises as I encounter them, here's another. RingtonePreference doesn't notify the underlying SharedPreference file's OnSharedPreferenceChangeListeners when it persists its changes. A workaround is to register a Preference.OnPreferenceChangeListener on each individual RingtonePreference instead. Just remember that this handler is notified before the change gets persisted to storage, so if you need to use the new value in the handler, get it from the Object that is passed in rather than from the SharedPreferences file itself.

Also remember that both OnPreferenceChangeListener and OnSharedPreferenceChangeListener are notified before any dependency-related updates are applied, so for a Preference whose enabled state is linked to a different Preference, you can't rely on isEnabled() returning the correct value inside either type of handler.

Thursday, April 17, 2014

The revamped C++11 auto keyword: blessing and curse

In 2011, the C++11 (aka C++0x) standard was accepted and subsequently implemented at least in part by most major compilers. One of the more talked-about changes to the language was the expansion of the auto keyword.

auto has been in C since the beginning as a way to denote automatic (as opposed to static, register, or external) storage for a variable. But since auto is implied by default when no storage class is specified, no one ever bothered to actually write it. Until 2011. With C++11, the word auto can also be used in place of a type identifier when creating a variable as a way to tell the compiler to deduce the type from context. This change was nominally motivated by a need to support template programming techniques where the type really can't be known until compile time, but what auto really gets used for 99% of the time is saving keystrokes. No longer must you, weary coder, suffer wrist strain from typing out travesties like:

Some::Long::Namespace::SomeLongClassName<SomeTemplateType>* myVar =
   new Some::Long::Namespace::SomeLongClassName<SomeTemplateType>();

Instead, you can write:

auto myVar = new Some::Long::Namespace::SomeLongClassName<SomeTemplateType>();

Voila! auto has nearly halved the number of characters needed to construct a new instance of this particular object. And if that's as far as things went, I doubt anyone would have a problem with it. But like any good thing, auto can be taken too far. The existence of auto means that authors in principle need never write a typename on the left side of an assignment again, and more than a few do exactly that. This certainly makes things more convenient for you as a writer. The problem is that (hate to break it to you) the writer's convenience doesn't matter. Code should always be written for the convenience of the reader. The crucial question, then, is: does auto make code easier or harder to read? And the answer to that question, as usual, is that it depends.

But before going further, let's address a couple pro-auto arguments I simply don't buy. You'll hear that if you really want to know what an auto is, any modern IDE will have intellisense features that can tell you with little fuss. With C++ that may or may not be true. C++ is a notoriously difficult language to parse in real time, and while there are a bunch of tools that can make reasonable guesses and be right most of the time as long as you don't try to do anything too tricky, only the compiler is able to make a final, authoritative judgement. Further, code is not always read in an IDE. If I'm doing a diff, or getting a snippet in an email, or reading something that's been posted online, there are no tools available to help figure out what that auto actually is, and I just have to hope it doesn't matter.

Next, some people will tell you that you shouldn't be overly concerned with types when you're reading code anyway, and that it's better to create good variable names and have the types abstracted away so you can reason about the code at a higher level. And if a high-level overview is what I'm after, that may be a valid point. But what if it's not? What if the code doesn't work, and the whole reason I'm reading it is to figure out where it's wrong? Bugs and the devil are in the details, and they're that much harder to find when the details are hidden. Now, if I've got decent intellisense (again, by no means a given with C++), I can hover over a variable name and get a popup telling me what its type is. But I'll probably forget as soon as I start thinking about something else, especially if there are multiple auto variables I'm trying to track at the same time. When you're debugging, you want to maximize the amount of information you can access at a glance. auto is your enemy here.

Others will claim that auto makes refactoring easier. If GetValue() returns int, but everyone who calls it stores the result in an auto, I can later change it to return double without breaking compilation all over the place. My response to that is if you're going to change something as fundamental as a function's return type, you darn well better take the time to examine each and every caller and make sure it will still function correctly after the change. The fact that the code won't compile until you do so helps enforce that sort of diligence, and using auto to get around the "problem" just makes it that much easier to write bugs.

So when is it ok to use auto? My opinions are always evolving and I've gotten quite a bit more permissive as time has gone by, but here are the cases where I currently allow myself the convenience:

1) The type already appears on the right-hand side of the expression.

auto somethingConcrete = dynamic_cast<LongConcreteName*>(somethingAbstract);

We're typing enough here already without repeating LongConcreteName. I don't feel the slightest bit of ambivalence about using auto in these situations.

Sometimes the type isn't spelled out on the right side, but can be confidently deduced from what is there.

auto mySuperWidget = SuperWidget::Create();

SuperWidget::Create() had better return a SuperWidget. True, we don't know if we're getting a SuperWidget*, a shared_ptr<SuperWidget>, or a stack SuperWidget. This used to bother me and I can contrive situations where making the wrong assumption would cause a bug, but I've learned to be a little flexible here. These days I tend to always return shared_ptr from my factory functions, so I typically use auto in those cases and only write the type out when it needs to be something different.

A more dicey example is something like this:

auto mySuperWidget = myWidgetFactory.CreateSuperWidget();

Am I getting a concrete SuperWidget? An ISuperWidget? An IWidget whose concrete type is SuperWidget? I go back and forth on cases like this. It comes down to how much ambiguity there really is given the classes and interfaces that really exist in the codebase, and what kind of mood I'm in.

2) You're capturing the return value from some function, and you're not doing anything with it except passing it off to someone else.

auto result = ComputeResult();
return FilteredResult(result);

This is not fundamentally different from:

return FilteredResult(ComputeResult());

3) You're capturing the return value from a well-known library function that any likely reader will be familiar with.

auto it = myStlCollection.begin();

I think it's safe to assume that anyone who's working with a C++/STL codebase will know that begin() returns some sort of iterator. In addition, the full typenames for STL iterators are really ugly if you have to write them out completely. Which leads to...

4) The actual typename is really long and ugly. A project I'm working on right now has a function that returns:

std::shared_ptr<Reactive::IObservable<std::shared_ptr<std::vector<double>>>> 

(it actually has quite a lot of functions that return things of that sort). Although having that all in front of you does communicate a lot of important information, this is a case where the inconvenience of having to look the type up may be less than the inconvenience of having a nearly 80-character typename filling up the screen. In situations like these it's usually better to use a typedef to communicate the gist of the type with a smaller number of characters (although many of the complaints I have about auto work against typedef too), but if you have a whole bunch of functions all returning ugly types and they're all just a little bit different from each other, that may not be practical.
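To illustrate the typedef option: a single alias takes most of the sting out of the type above (the alias name is mine; Reactive::IObservable comes from that project):

// one alias communicates the gist in a handful of characters
typedef std::shared_ptr<Reactive::IObservable<std::shared_ptr<std::vector<double>>>>
    ObservableDoubleBuffer;

// and a declaration using it is far easier on the eyes, e.g.
// ObservableDoubleBuffer GetMeasurementStream();   // hypothetical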

What all of these except #4 boil down to is simply this: only use auto when the reader will know what the type is anyway. If I could have only one rule, that's what it would be.

Used, properly, auto is great for both writer and reader. After getting used to having it, I've been surprised how frustrating it is to work in languages like Java that lack an equivalent, and I certainly wouldn't want to go back to the days before it existed. But used willy-nilly by authors who don't want to type ten characters instead of four, or (worse) who can't be bothered to check what the result of some expression is, it can lead to incomprehensible, bug-prone code that can't be debugged. Enjoy responsibly.