For example, the header for a class representing a monster in an RPG might look like this:
Standard header file
    // Revenant.h
    #include "Armor.h"

    class Revenant
    {
    public:
        Revenant();

        int GetHp() const;
        void SetHp(int newHp);
        void TakeAttack(int attackPower);

    private:
        int GetDamageFromAttack(int attackPower) const;

    private:
        static const int STARTING_HP = 20;
        int m_hp;
        Armor m_armor;
    };
The corresponding source file might look like this:
Standard source file
    // Revenant.cpp
    #include "Revenant.h"
    #include "ArmorFactory.h"

    Revenant::Revenant()
        : m_hp(STARTING_HP)
        , m_armor(ArmorFactory::CreatePlateArmor())
    {
    }

    int Revenant::GetHp() const
    {
        return m_hp;
    }

    void Revenant::SetHp(int newHp)
    {
        m_hp = newHp;
    }

    void Revenant::TakeAttack(int attackPower)
    {
        // run the attack through the mitigation logic before applying it
        int damage = GetDamageFromAttack(attackPower);
        SetHp(GetHp() - damage);
    }

    int Revenant::GetDamageFromAttack(int attackPower) const
    {
        int damage = attackPower;
        damage -= m_armor.MitigateDamage(damage);
        // apply more mitigations, etc.
        return damage;
    }
The problem with header files
The other, bigger problem is more subtle. The need to declare all your member variables and functions in the header forces you to expose the private workings of your class to the entire outside world. Anyone reading Revenant.h will see that a Revenant starts with 20 HP, and may come to depend on that detail. More damningly, because Revenant has a private member variable of type Armor, any file that wants to use Revenant will need to #include Armor.h, even if it never actually uses Armor directly. In addition to making more work for the preprocessor, this means that any change to Armor.h will require all consumers of Revenant to be recompiled. Something similar happens if you decide to change Revenant's implementation details, perhaps by splitting GetDamageFromAttack() into several functions that each apply a different type of mitigation. This would mean that everyone using Revenant would need to be recompiled, even though no one should need to care about the details of how you calculate damage. Thanks to transitive dependencies, over time this sort of arrangement can easily lead to a situation where changing a single "hot" header file causes a good chunk of the source files in the project to be recompiled.
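To make the Armor dependency concrete, here is a minimal single-file sketch of why a by-value member drags the full definition along. The Armor contents and the names RevenantByPointer, RevenantByValue and CheckLayouts are placeholders invented for this illustration, not part of the real classes.

```cpp
#include <cassert>

class Armor; // forward declaration: the name is known, the size is not

// Compiles with only the forward declaration, because a pointer has the
// same size no matter what Armor turns out to contain.
struct RevenantByPointer
{
    int m_hp;
    Armor* m_armor;
};

// Hypothetical stand-in for the definition that would live in Armor.h.
class Armor
{
public:
    int m_defense;
    int m_durability;
};

// Only compiles once Armor is fully defined: the Armor is stored inside
// the object, so its size feeds into the size of RevenantByValue.
struct RevenantByValue
{
    int m_hp;
    Armor m_armor;
};

// Illustrative check of the layout claims above.
void CheckLayouts()
{
    assert(sizeof(RevenantByValue) >= sizeof(int) + sizeof(Armor));
    assert(sizeof(RevenantByPointer) >= sizeof(Armor*));
}
```

Moving the definition of Armor below RevenantByValue makes the file fail to compile, which is the single-file equivalent of a consumer being forced to #include Armor.h.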
pimpl: A partial solution
This fact is exploited in the idiom known as pimpl (either "private implementation" or "pointer to implementation"), which solves some of the header file issues by hiding implementation details in the source file. The basic idea is to define a second, private "impl" class within the source file to hold all the private methods and members of the public "interface" class. Your public class then declares a private pointer to an instance of the impl and no other private functions or variables. You still need to forward-declare the impl in the header so the compiler can make sense of the symbol, but includers don't need to know anything about it beyond its name. Revenant might look like this after being pimpled:
pimpl example
    // Revenant.h
    #include <memory>

    // forward declaration for the impl
    // #includers don't get to know anything about it except its name
    class RevenantImpl;

    class Revenant
    {
    public:
        Revenant();
        Revenant(const Revenant& other);
        Revenant& operator =(const Revenant& other);

        int GetHp() const;
        void SetHp(int newHp);
        void TakeAttack(int attackPower);

    private:
        // all (other) private implementation contained within
        std::shared_ptr<RevenantImpl> m_impl;
    };

    // Revenant.cpp
    #include "Revenant.h"
    #include "Armor.h"
    #include "ArmorFactory.h"

    class RevenantImpl
    {
    private:
        static const int STARTING_HP = 20;
        int m_hp;
        Armor m_armor;

    public:
        RevenantImpl()
            : m_hp(STARTING_HP)
            , m_armor(ArmorFactory::CreatePlateArmor())
        {
        }

        int GetHp() const
        {
            return m_hp;
        }

        void SetHp(int newHp)
        {
            m_hp = newHp;
        }

        void TakeAttack(int attackPower)
        {
            int damage = GetDamageFromAttack(attackPower);
            SetHp(GetHp() - damage);
        }

    private:
        int GetDamageFromAttack(int attackPower) const
        {
            int damage = attackPower;
            damage -= m_armor.MitigateDamage(damage);
            // apply more mitigations, etc.
            return damage;
        }
    };

    Revenant::Revenant()
        : m_impl(std::make_shared<RevenantImpl>())
    {
    }

    // deep copy: build a fresh impl from the other one
    // (assigning through m_impl here would dereference a null pointer)
    Revenant::Revenant(const Revenant& other)
        : m_impl(std::make_shared<RevenantImpl>(*other.m_impl))
    {
    }

    Revenant& Revenant::operator =(const Revenant& other)
    {
        *m_impl = *other.m_impl;
        return *this;
    }

    int Revenant::GetHp() const
    {
        return m_impl->GetHp();
    }

    void Revenant::SetHp(int newHp)
    {
        m_impl->SetHp(newHp);
    }

    void Revenant::TakeAttack(int attackPower)
    {
        m_impl->TakeAttack(attackPower);
    }
This has solved our biggest problems—other coders won't see our starting HP anymore unless they deliberately go looking for it, Revenant's consumers don't need to know anything about Armor, and we can modify the impl all day long without forcing anyone else to recompile.
But we've only solved the repetition problem for the private functions. For the public ones we've actually made things worse since we now need to write each one's signature three times. We also had to jump through some hoops to add deep-copy semantics to the class (another option would have been to make the object noncopyable). Finally, we've got all these stub functions that do nothing but delegate to identical functions in the impl. The performance impact is minimal, but it's that much extra code to read and write. This final point can be ameliorated somewhat by making everything in the impl public and dividing the logic between the two classes:
A more freeform pimpl example
    // Revenant.cpp
    #include "Revenant.h"
    #include "Armor.h"
    #include "ArmorFactory.h"

    class RevenantImpl
    {
    public:
        static const int STARTING_HP = 20;
        int m_hp;
        Armor m_armor;

    public:
        RevenantImpl()
            : m_hp(STARTING_HP)
            , m_armor(ArmorFactory::CreatePlateArmor())
        {
        }

        int GetDamageFromAttack(int attackPower) const
        {
            int damage = attackPower;
            damage -= m_armor.MitigateDamage(damage);
            // apply more mitigations, etc.
            return damage;
        }
    };

    Revenant::Revenant()
        : m_impl(std::make_shared<RevenantImpl>())
    {
    }

    // deep copy: build a fresh impl from the other one
    // (assigning through m_impl here would dereference a null pointer)
    Revenant::Revenant(const Revenant& other)
        : m_impl(std::make_shared<RevenantImpl>(*other.m_impl))
    {
    }

    Revenant& Revenant::operator =(const Revenant& other)
    {
        *m_impl = *other.m_impl;
        return *this;
    }

    int Revenant::GetHp() const
    {
        return m_impl->m_hp;
    }

    void Revenant::SetHp(int newHp)
    {
        m_impl->m_hp = newHp;
    }

    void Revenant::TakeAttack(int attackPower)
    {
        int damage = m_impl->GetDamageFromAttack(attackPower);
        SetHp(GetHp() - damage);
    }
This works, but it's ugly. The code is scattered across two locations. Sometimes you need to prefix a member's name with m_impl-> and sometimes you don't, and if you move code from one class to the other, you'll need to either add or remove those prefixes. Further, although I don't show it here, in a real class it's likely that at some point the impl will need to call a function in the outer class, which means it must be provided with a reference to it. All this adds up to one more piece of state information you need to keep in your head when you're reading and writing code, and there's enough of that already.
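For the curious, the back-reference case might look roughly like the sketch below. It is squeezed into one file, swaps in std::unique_ptr and makes the class noncopyable (the alternative mentioned earlier), and leaves Armor out entirely. ApplyAttack, m_owner and DemoBackReference are names made up for the illustration, and the "halve the attack" rule is a placeholder for real mitigation logic.

```cpp
#include <cassert>
#include <memory>

// --- would live in Revenant.h ---
class RevenantImpl; // forward declaration only

class Revenant
{
public:
    Revenant();
    ~Revenant(); // defined below, where RevenantImpl is a complete type

    // noncopyable: the impl holds a reference back to this exact object
    Revenant(const Revenant&) = delete;
    Revenant& operator =(const Revenant&) = delete;

    int GetHp() const;
    void SetHp(int newHp);
    void TakeAttack(int attackPower);

private:
    std::unique_ptr<RevenantImpl> m_impl;
};

// --- would live in Revenant.cpp ---
class RevenantImpl
{
public:
    explicit RevenantImpl(Revenant& owner)
        : m_owner(owner)
        , m_hp(20)
    {
    }

    // hypothetical helper: halves the attack (placeholder mitigation), then
    // calls back into the outer class through the stored reference
    void ApplyAttack(int attackPower)
    {
        int damage = attackPower / 2;
        m_owner.SetHp(m_owner.GetHp() - damage);
    }

    Revenant& m_owner; // the extra piece of state the impl now carries
    int m_hp;
};

Revenant::Revenant()
    : m_impl(new RevenantImpl(*this))
{
}

Revenant::~Revenant() = default;

int Revenant::GetHp() const { return m_impl->m_hp; }
void Revenant::SetHp(int newHp) { m_impl->m_hp = newHp; }
void Revenant::TakeAttack(int attackPower) { m_impl->ApplyAttack(attackPower); }

// Illustrative usage: 20 starting HP minus half of a 10-point attack.
void DemoBackReference()
{
    Revenant revenant;
    revenant.TakeAttack(10);
    assert(revenant.GetHp() == 15);
}
```

Note how the two classes now point at each other. That circularity is exactly the extra bookkeeping complained about above, and it is also why copying gets awkward: a copied impl would still refer to the original owner.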
Interface-based programming: a better solution
Interface-based programming example
    // Revenant.h
    #include <memory>

    class Revenant
    {
    public:
        // static factory function instead of ctor
        static std::shared_ptr<Revenant> Create();

        // virtual dtor is essential!
        // (the "= 0 {}" form is a compiler extension, so use a plain body)
        virtual ~Revenant() {}

        // interface requires clone semantics instead of copy
        virtual std::shared_ptr<Revenant> Clone() const = 0;

        virtual int GetHp() const = 0;
        virtual void SetHp(int newHp) = 0;
        virtual void TakeAttack(int attackPower) = 0;

    private:
        // copy not implemented; hide the assignment operator
        Revenant& operator =(const Revenant&);
    };

    // Revenant.cpp
    #include "Revenant.h"
    #include "Armor.h"
    #include "ArmorFactory.h"

    namespace
    {
        class RevenantImpl : public Revenant
        {
        private:
            static const int STARTING_HP = 20;
            int m_hp;
            Armor m_armor;

        public:
            RevenantImpl()
                : m_hp(STARTING_HP)
                , m_armor(ArmorFactory::CreatePlateArmor())
            {
            }

            virtual std::shared_ptr<Revenant> Clone() const override
            {
                auto clone = std::make_shared<RevenantImpl>();
                clone->m_hp = m_hp;
                clone->m_armor = m_armor;
                return clone;
            }

            // no Revenant:: qualifier here: these are members of RevenantImpl
            virtual int GetHp() const override
            {
                return m_hp;
            }

            virtual void SetHp(int newHp) override
            {
                m_hp = newHp;
            }

            virtual void TakeAttack(int attackPower) override
            {
                int damage = GetDamageFromAttack(attackPower);
                SetHp(GetHp() - damage);
            }

        private:
            int GetDamageFromAttack(int attackPower) const
            {
                int damage = attackPower;
                damage -= m_armor.MitigateDamage(damage);
                // apply more mitigations, etc.
                return damage;
            }
        };
    }

    std::shared_ptr<Revenant> Revenant::Create()
    {
        return std::make_shared<RevenantImpl>();
    }
This really gives us the best of both worlds. As with pimpl, the header only communicates the class's public interface. But unlike pimpl, the entire implementation is now contained within a single class. We've also gotten rid of pimpl's stub functions and triple-repeats, though we still had to write the signatures of the public methods twice (that's as good as we're going to do). As an added bonus, Revenant is now a true interface in the technical sense of the word, which means we gain the ability to mock and stub it in tests for free.
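As a rough illustration of the testing bonus, a hand-rolled mock might look something like this. The interface is a trimmed copy of Revenant (no Create() or Clone()) so the sketch compiles on its own; MockRevenant, CastFireball and TestCastFireball are hypothetical names, and a real project might well use a mocking library instead.

```cpp
#include <cassert>

// Trimmed copy of the Revenant interface, repeated so this compiles alone.
class Revenant
{
public:
    virtual ~Revenant() {}
    virtual int GetHp() const = 0;
    virtual void SetHp(int newHp) = 0;
    virtual void TakeAttack(int attackPower) = 0;
};

// Hypothetical test double: records calls instead of computing damage.
class MockRevenant : public Revenant
{
public:
    MockRevenant()
        : m_hp(0)
        , m_attackCount(0)
        , m_lastAttackPower(0)
    {
    }

    int GetHp() const override { return m_hp; }
    void SetHp(int newHp) override { m_hp = newHp; }

    void TakeAttack(int attackPower) override
    {
        ++m_attackCount;
        m_lastAttackPower = attackPower;
    }

    int m_hp;
    int m_attackCount;
    int m_lastAttackPower;
};

// Hypothetical code under test: it only knows about the interface.
void CastFireball(Revenant& target)
{
    const int fireballPower = 12;
    target.TakeAttack(fireballPower);
}

// The test checks the interaction without touching Armor or ArmorFactory.
void TestCastFireball()
{
    MockRevenant mock;
    CastFireball(mock);
    assert(mock.m_attackCount == 1);
    assert(mock.m_lastAttackPower == 12);
}
```

Because CastFireball takes a Revenant&, nothing about it has to change between production and test code.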
A couple of details to note: first, since client code can only access RevenantImpl via a pointer to the base class Revenant, it is essential that Revenant have a virtual destructor; otherwise deleting a RevenantImpl through a Revenant pointer is undefined behavior. In practice, the likely result is that the Armor member variable's destructor would not be called, and any resources it held would be leaked. (A std::shared_ptr created by std::make_shared<RevenantImpl>() happens to remember the concrete type and would still clean up correctly, but a raw pointer or a std::unique_ptr<Revenant> would not, so don't rely on that.)
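The destructor point can be checked with a small experiment. The sketch below uses a cut-down interface and a hypothetical TrackedArmor that just counts how many times it has been destroyed; it owns the object through std::unique_ptr<Revenant> so that destruction really does go through the base class.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for Armor that counts destructions.
struct TrackedArmor
{
    static int s_destroyed;
    ~TrackedArmor() { ++s_destroyed; }
};
int TrackedArmor::s_destroyed = 0;

// Cut-down interface, repeated so the sketch compiles on its own.
class Revenant
{
public:
    virtual ~Revenant() {} // remove "virtual" and the delete below becomes undefined behavior
    virtual int GetHp() const = 0;
};

class RevenantImpl : public Revenant
{
public:
    int GetHp() const override { return 20; }

private:
    TrackedArmor m_armor;
};

// Destroys a RevenantImpl through a base-class pointer and reports how
// many TrackedArmor destructors ran as a result.
int CountArmorDestroyedViaBasePointer()
{
    TrackedArmor::s_destroyed = 0;
    {
        std::unique_ptr<Revenant> revenant(new RevenantImpl());
        assert(revenant->GetHp() == 20);
    } // unique_ptr deletes through Revenant* here
    return TrackedArmor::s_destroyed;
}
```

With the virtual destructor in place the function returns 1, meaning the armor was cleaned up. Without it the behavior is undefined, so there is no count worth predicting.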
Second, since Revenant is an interface now, it no longer makes sense to copy instances of it—we need to use clone semantics instead. In addition to adding Clone() to the interface, this means we should hide the assignment operator. If we didn't, someone could get away with writing "*revenantPointer1 = *revenantPointer2". The compiler would accept this, but the call would do nothing since the Revenant interface itself doesn't contain anything to copy.
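On a C++11 compiler, a variation on the same idea is to delete the assignment operator rather than declare it private and leave it unimplemented, so that misuse fails at compile time with a clearer message. This is a sketch of that alternative, not what the code above does; the interface is trimmed, and the RevenantImpl constructor taking an HP value is an assumption made to keep the example short.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Trimmed copy of the interface, repeated so the sketch compiles alone.
class Revenant
{
public:
    virtual ~Revenant() {}

    // clone semantics instead of copy
    virtual std::shared_ptr<Revenant> Clone() const = 0;
    virtual int GetHp() const = 0;

    // "*a = *b" is now rejected by the compiler
    Revenant& operator =(const Revenant&) = delete;
};

// Compile-time confirmation that assignment through the interface is gone.
static_assert(!std::is_copy_assignable<Revenant>::value,
              "Revenant should not be assignable");

class RevenantImpl : public Revenant
{
public:
    explicit RevenantImpl(int hp) : m_hp(hp) {}

    std::shared_ptr<Revenant> Clone() const override
    {
        // build a brand-new object carrying the same state
        return std::make_shared<RevenantImpl>(m_hp);
    }

    int GetHp() const override { return m_hp; }

private:
    int m_hp;
};

// Illustrative usage: the clone is a distinct object with equal state.
void DemoClone()
{
    std::shared_ptr<Revenant> original = std::make_shared<RevenantImpl>(7);
    std::shared_ptr<Revenant> copy = original->Clone();
    assert(copy->GetHp() == 7);
    assert(copy.get() != original.get());
}
```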
Inheritance
pimpl vs. interface-based programming
- I don't bother hiding implementation at all if:
  - It's a simple object where any of this would be overkill
  - I need all the object's data to be contiguous in memory
  - I don't want to allocate anything on the heap
  - Performance is so critical that I can't afford either interface-based-programming's virtual calls or pimpl's extra delegations (very, very rare)
- Otherwise, I use pimpl if:
  - I need to be able to allocate instances on the stack
  - I can't afford the extra overhead of the virtual function calls
  - I'm working with some monster legacy class that can't easily be converted to interface-based programming. In this case, I'll add an impl to hold new private data and functionality while leaving everything else the way it is.
  - I need to be able to inherit from the class
- Otherwise, the default choice is interface-based programming
Wrapup
Implementation-hiding is one of those game-changing concepts that can radically alter how you design and think about your code. I use it for virtually every class I write these days, and can hardly imagine doing things any other way (every time I have to go back and read something I wrote n years ago before I learned about these concepts, I die a little inside). Interface-based programming in particular has a way of forcing you to design your code in a way that makes it more modular and testable almost by accident. And anything that makes you think about your class's public interface up-front and enforces not just a conceptual but also a physical separation between interface and implementation can only be a good thing.