Expressing Implementation Sameness in Modern C++

Daisy Hollman

CppCon 2023
Twitter: @The_Whole_Daisy
Email: cpp@dsh.fyi

Disclaimers

  • I'm terrified
    • I often give talks about "programming", not software engineering
    • This is a software engineering talk, and you might not agree with me on everything.
    • I have some slides without code on them (not many), and I don't usually do that
  • It's really hard to talk about sameness and repetition in code on slides
    • In many recommendations, the maximum amount of code that you should be "comfortable" repeating is about the same as the amount of code that fits on a slide 😅
    • Please try to think critically about the concepts themselves rather than the specific examples I give in slides
  • I'm mostly interested in very large scale code that is meant to be maintained for a very long time

Common Storylines

  • Code is written to be read by humans
    • If you're a professional software engineer, you should be able to come up with multiple ways to get the computer to do basically the same thing
    • You should think about code in terms of how the reader will experience it
  • Cognitive load: the amount of mental effort required to understand and use code
    • When writing "real" code, almost everything you do should be focused on reducing cognitive load for the reader
  • Information loss: when the reader of the code doesn't have an easy way to retrieve a piece of information that's obvious to the writer
    • Start thinking of copy-paste this way!
    • Thought experiment: imagine what code and programming languages would look like if we disabled copy-paste in code editors
    • Copy-pasting code is selfish: it saves you time at the expense of someone else's time

Non-themes

  • I don't care how long it takes you to type something!
    • If you're typing most of the code you write, you need a better development environment!
  • I don't care about adding a layer or two of abstraction (within reason)
    • If you don't currently know off the top of your head the keybinding to jump to the definition of a function (and back) in your development environment, you need a better IDE!
  • I don't care about adding (linear) compilation cost (within reason)
    • Distributed builds, caching, smaller files, modules, and more can all help with this
    • …but beware of non-linear scaling in compilation cost!
    • …and beware of using arcana (e.g., weird language features) that increase cognitive load!
  • I'm not interested in making it faster for you to write code
    • Reading and maintaining code efficiently is far more important!


Anyway...

Here we go!

🌼 🤷🏼‍♀️ 🌼

Two different kinds of Sameness

Interface Sameness

  • Enables users to treat things the same way in their code (create their own sameness)
  • Cannot be removed later
    • You can't change your user's code (at least not easily)
    • Once you allow users to treat things as the same, it's very hard to change that later
  • When done well: low code coupling ("the degree of interdependence between software modules")

Implementation Sameness

  • Enables readers to understand and use existing sameness or similarity
  • Can be changed or removed at any point in the future when it stops being helpful
  • When done well: high code cohesion ("the degree to which the elements inside a module belong together")

Don't Repeat Yourself (DRY)

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system


Andy Hunt and Dave Thomas, The Pragmatic Programmer

DRY programming isn't (always) about typing less!



Mechanisms for Expressing "Sameness" in C++

(Either implementation or interface sameness)

"Traditional" Polymorphism

  • Notice that the code reuse happens in the calling code
  • In other words, "traditional" polymorphism is oriented around interface sameness
  • It can be used for implementation sameness, but often with some added runtime cost
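A minimal sketch of reuse happening in the calling code. The Animal hierarchy and describe() are illustrative names, not taken from the original slide code.

```cpp
#include <string>

// Hypothetical interface, assumed for illustration.
struct Animal {
    virtual ~Animal() = default;
    virtual std::string noise() const = 0;
};
struct Cow : Animal { std::string noise() const override { return "moo"; } };
struct Hen : Animal { std::string noise() const override { return "cluck"; } };

// Calling code: written once against the interface, reused for every Animal.
// Each call to noise() goes through virtual dispatch at runtime.
std::string describe(const Animal& animal) {
    return "The animal says " + animal.noise();
}
```

Cow and Hen share nothing but the interface; the only code written once is describe(), which is the caller's code.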

Templates

  • The code for vector<int> and vector<string> is the same (implementation sameness)
  • The interface is also the same (interface sameness)
  • But this doesn't always have to be the case!
  • handle_disturbance() could be thought of as part of the non-intrusive interface of vector<unique_ptr<Animal>>
  • handle_disturbance() is not part of the interface of vector<int>
  • In other words, we've introduced an interface difference while still maximizing implementation sameness
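A sketch of what this might look like, assuming a hypothetical Animal type with a startle() member. Only handle_disturbance() is named in the talk; its signature and body here are guesses.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical element type, assumed for illustration.
struct Animal {
    virtual ~Animal() = default;
    virtual void startle() = 0;
};
struct Sheep : Animal {
    bool startled = false;
    void startle() override { startled = true; }
};

// Non-intrusive interface: a free function that only accepts this one
// instantiation of vector. vector<int> shares the same implementation
// (the vector template) but has no handle_disturbance().
std::size_t handle_disturbance(std::vector<std::unique_ptr<Animal>>& animals) {
    for (auto& animal : animals) animal->startle();
    return animals.size();
}
```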

Interface difference with implementation sameness

(for intrusive interfaces)



Question: what's the /* ... */?

Interface difference with implementation sameness

(for intrusive interfaces)


  • Conceptually, the /* ... */ is something like...

Interface difference with implementation sameness

(A bad solution)

  • Why is this bad?
    • Leaks implementation details into the interface
    • Not restrictive enough
    • Still doesn't express all aspects of sameness

Mixins

  • How do we usually spell OfThings in C++?
    • Templates!

Mixins

  • We've succinctly expressed the implementation sameness between Barnyard and Canvas

Mixins

  • The class template OwningCollection is typically called a mixin
  • In this case, we've also introduced some interface sameness because we're using public inheritance
    • This reuse is helpful because it reuses names to reduce cognitive load ("names mean the same thing in different use cases")
    • but we've maintained some control over code coupling because OwningCollection<Animal> is unrelated to OwningCollection<Shape>
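A minimal sketch of such a mixin. OwningCollection, Barnyard, Canvas, Animal, and Shape are the names used in the talk; the members add() and size(), and the Goat type, are assumptions made for illustration.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Animal { virtual ~Animal() = default; };
struct Shape  { virtual ~Shape() = default; };
struct Goat : Animal {};  // hypothetical concrete type

// Mixin: the shared "owning collection" implementation, written once.
template <class T>
class OwningCollection {
public:
    void add(std::unique_ptr<T> item) { items_.push_back(std::move(item)); }
    std::size_t size() const { return items_.size(); }

private:
    std::vector<std::unique_ptr<T>> items_;
};

// Public inheritance also shares the names add() and size() (interface
// sameness), but the two base classes are unrelated types.
class Barnyard : public OwningCollection<Animal> {};
class Canvas   : public OwningCollection<Shape> {};
```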

More Mixins

Question: what is the "sameness" that's not expressed here?

Element-wise comparisons!

More Mixins

  • ComparableElementwise uses the curiously recurring template pattern (CRTP)
  • The elements() member function is sometimes called a customization point
  • Customization points give us a succinct way to express the differences in our code, enabling the expression of sameness
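One way this could be sketched, assuming elements() returns something iterable. ComparableElementwise and elements() are from the talk; Polygon and the use of friend operators (rather than whatever the original slide used) are assumptions.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// CRTP mixin: Derived supplies elements(); the mixin supplies comparison.
template <class Derived>
struct ComparableElementwise {
    friend bool operator==(const Derived& a, const Derived& b) {
        const auto& x = a.elements();
        const auto& y = b.elements();
        return std::equal(x.begin(), x.end(), y.begin(), y.end());
    }
    friend bool operator!=(const Derived& a, const Derived& b) {
        return !(a == b);
    }
};

// Hypothetical user of the mixin.
class Polygon : public ComparableElementwise<Polygon> {
public:
    explicit Polygon(std::vector<int> sides) : sides_(std::move(sides)) {}
    // Customization point: the only thing that differs between users.
    const std::vector<int>& elements() const { return sides_; }

private:
    std::vector<int> sides_;
};
```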

Reusing customization points

  • The customization point elements() can be used to express multiple unrelated types of "sameness"

Mixin Mixins

  • This down-cast is a common pattern when using CRTP
  • Common pattern? Say it with code!
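"Saying it with code" might look like the sketch below: the down-cast lives in one small base class that other mixins reuse. CrtpMixin, self(), SummableElementwise, and Scores are hypothetical names; only the elements() customization point comes from the talk.

```cpp
#include <utility>
#include <vector>

// A mixin for mixins: the CRTP down-cast, written exactly once.
template <class Derived>
class CrtpMixin {
protected:
    Derived& self() { return static_cast<Derived&>(*this); }
    const Derived& self() const { return static_cast<const Derived&>(*this); }
};

// A second, unrelated kind of sameness built on the same elements()
// customization point.
template <class Derived>
class SummableElementwise : public CrtpMixin<Derived> {
public:
    int sum() const {
        int total = 0;
        for (int element : this->self().elements()) total += element;
        return total;
    }
};

class Scores : public SummableElementwise<Scores> {
public:
    explicit Scores(std::vector<int> values) : values_(std::move(values)) {}
    const std::vector<int>& elements() const { return values_; }

private:
    std::vector<int> values_;
};
```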

C++23's Deducing this

  • What's wrong with this?
    • It mixes interface sameness with implementation sameness

What other forms of sameness do we have here?

  • This sort of "sameness" requires reflection to express (in general)
  • C++ needs this badly (and it doesn't have to be good)

Language Support for Mixins?

In C++20...

  • operator<=> is a language-level mixin for comparisons
  • We're "adding" reflection to C++ piecewise
  • This is not sustainable
    • One-off features like this increase cognitive load

Qualifier Forwarding

  • "Perfect forwarding" (available since C++11)
  • T&& is called a "forwarding reference" (or sometimes, a "universal reference")
  • It has to be exactly in that form to act that way
  • Equivalent to...
  • The rvalue reference version is harder to write...
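A minimal sketch of perfect forwarding. The names sink(), relay(), and Category are hypothetical; the overloads of sink() only report which value category arrived.

```cpp
#include <string>
#include <utility>

enum class Category { lvalue, rvalue };

// Two overloads that report how their argument arrived.
inline Category sink(const std::string&) { return Category::lvalue; }
inline Category sink(std::string&&) { return Category::rvalue; }

// T&& on a deduced template parameter is a forwarding reference:
// T deduces to std::string& (or const std::string&) for lvalues and to
// std::string for rvalues, and std::forward restores that category.
template <class T>
Category relay(T&& value) {
    return sink(std::forward<T>(value));
}
```

Without the template, relay() would need one overload per qualifier combination, each with the same body.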

Qualifier Forwarding in Member Functions

  • But we're missing something...
  • And actually, if we want to be thorough, we'd need at least 3:
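The three overloads might look something like this sketch. Barnyard is a name from the talk, but the names() accessor and its member are assumptions.

```cpp
#include <string>
#include <utility>
#include <vector>

class Barnyard {
public:
    // Same body three times; only the qualifiers differ.
    std::vector<std::string>&       names() &       { return names_; }
    const std::vector<std::string>& names() const & { return names_; }
    std::vector<std::string>&&      names() &&      { return std::move(names_); }

private:
    std::vector<std::string> names_;
};
```

A fourth overload for const && is possible too, which is why the count is "at least" three.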

Qualifier Forwarding in Member Functions

  • C++23 deducing this to the rescue!

Name Forwarding (Missing from C++)

  • Why is this better than exposing the vector directly to the user?
  • This would be pretty easy to build on top of relatively basic reflection and reification
  • Languages like Ruby and Python have this as a library feature
  • This is important because it reuses understanding and propagates code changes, not because it reduces typing
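Since C++ has no name forwarding, today it has to be written out by hand. The sketch below assumes a Barnyard that wraps a vector of names; every member is illustrative. Each forwarded function repeats a name and a meaning that std::vector already defines, and nothing ties the two together if either changes.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class Barnyard {
public:
    // Hand-written forwarding: only these names are exposed, so users
    // cannot (for example) erase or reorder the underlying vector.
    auto begin() const { return animals_.begin(); }
    auto end() const { return animals_.end(); }
    std::size_t size() const { return animals_.size(); }
    bool empty() const { return animals_.empty(); }

    void add(std::string name) { animals_.push_back(std::move(name)); }

private:
    std::vector<std::string> animals_;
};
```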

Don't Repeat Yourself (DRY)

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system


Andy Hunt and Dave Thomas, The Pragmatic Programmer

"Every piece of knowledge"


Question: What are the pieces of knowledge here?


The Owning and the Collection parts!

Separating the Ownership Mechanism

  • This is a typical pattern for a class template customization point
  • What else could Owner be?
    • shared_ptr
    • small buffer optimization
  • Warning: this can easily be taken too far!
    • Good customization point design (and when to use it) takes a career to learn
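One possible shape for this, as a sketch only: the ownership mechanism becomes a template template parameter with a default. The parameter name Owner comes from the talk; the rest of the class is assumed.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// "Collection" knowledge lives here; "Owning" knowledge is delegated to Owner.
template <class T, template <class...> class Owner = std::unique_ptr>
class OwningCollection {
public:
    void add(Owner<T> item) { items_.push_back(std::move(item)); }
    std::size_t size() const { return items_.size(); }

private:
    std::vector<Owner<T>> items_;
};
```

With this shape, OwningCollection<int> keeps unique ownership, while OwningCollection<int, std::shared_ptr> shares ownership with its callers.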

Standard template parameter customization points

  • std::vector<T, Allocator=std::allocator<T>>
    • The expandable container with contiguous storage "piece of knowledge" is separable from how that storage is created
  • std::queue<T, Container=std::deque<T>>
    • The semantics of a queue are separable from the way that the items in the queue are stored
      • This is actually an example of layering of customization points:
        • std::queue<T, Container=std::deque<T, std::allocator<T>>>
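A short sketch of swapping the storage underneath std::queue while keeping queue semantics. The aliases and drain_sum() are hypothetical helpers.

```cpp
#include <deque>
#include <list>
#include <queue>
#include <type_traits>

using DefaultQueue = std::queue<int>;                 // stored in std::deque<int>
using ListQueue    = std::queue<int, std::list<int>>; // stored in std::list<int>

static_assert(std::is_same<DefaultQueue::container_type, std::deque<int>>::value,
              "std::queue defaults to std::deque");

// Written once against queue semantics; works for either storage choice.
template <class Queue>
int drain_sum(Queue queue) {
    int total = 0;
    while (!queue.empty()) {
        total += queue.front();
        queue.pop();
    }
    return total;
}
```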

Standard template parameter customization points

  • std::unique_ptr<T, Deleter=std::default_delete<T>>
    • The unique ownership "piece of knowledge" is separable from how that memory (and even the object itself!) is created or destroyed
    • Where is the allocation and creation portion?
      • At the construction site!
      • But also...
      • Usually, however...

Down the rabbit hole: unique_ptr

  • std::unique_ptr<T, Deleter=std::default_delete<T>>
    • Does anything feel "off" here?
    • This design isolates the "piece of knowledge" that needs to persist for the lifetime of the object.
      • …sort of. What else does it do?
        • unique_ptr<T, Deleter>::pointer is (basically) Deleter::pointer if that type exists, otherwise T*.
        • Example:
        • Another example use case: boost::offset_ptr
        • Question: is the pointer type a separable piece of knowledge?
    • Another question: are allocation and destruction separable pieces of knowledge?
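To make the Deleter::pointer behavior concrete, here is a sketch with a hypothetical Handle type standing in for something like a file descriptor. Handle, HandleCloser, and UniqueHandle are invented for illustration; the closer only counts calls instead of releasing a real resource.

```cpp
#include <cstddef>
#include <memory>
#include <type_traits>

// Hypothetical non-pointer handle; -1 plays the role of "null".
struct Handle {
    int id = -1;
    Handle() = default;
    Handle(std::nullptr_t) {}
    explicit Handle(int value) : id(value) {}
    explicit operator bool() const { return id != -1; }
    friend bool operator==(Handle a, Handle b) { return a.id == b.id; }
    friend bool operator!=(Handle a, Handle b) { return a.id != b.id; }
};

struct HandleCloser {
    using pointer = Handle;  // picked up as unique_ptr<...>::pointer
    int* closed_count;       // assumption: counts calls instead of closing anything
    void operator()(Handle) const { ++*closed_count; }
};

using UniqueHandle = std::unique_ptr<int, HandleCloser>;

static_assert(std::is_same<UniqueHandle::pointer, Handle>::value,
              "Deleter::pointer replaces T*");
static_assert(std::is_same<std::unique_ptr<int>::pointer, int*>::value,
              "without Deleter::pointer, the pointer type is T*");
```

So the Deleter parameter carries two things here: how to destroy, and what the stored "pointer" is.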

Don't Repeat Yourself (DRY)

Where does "separable" come into this?

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system


Andy Hunt and Dave Thomas, The Pragmatic Programmer

Down the rabbit hole: unique_ptr

  • When you're stuck on a design, look at analogous situations
    • Why?
      • Reducing cognitive load!
      • But also...DRY! You might find a way to avoid repeating yourself!
  • What is the most analogous standard library class template?
    • shared_ptr!
  • From cppreference.com: "Every standard library component that may need to allocate or release storage, from std::string, std::vector, [...], to shared_ptr, does so through an Allocator."
  • So allocating and releasing storage are not separable pieces of knowledge in the standard library
  • The authoritative representation for allocation and destruction of storage in the standard library is the Allocator abstraction

unique_ptr and shared_ptr: 🤦🏼‍♀️ 🤦🏼‍♀️ 🤦🏼‍♀️

  • So unique_ptr<T, Deleter> is "broken," right?
    • It doesn't use the authoritative representation for allocation and destruction.
  • Fortunately, shared_ptr uses allocators, so...
    • …wait...🤦🏼‍♀️
    • We have two options for destruction:
      • d(ptr)
      • alloc.destroy(ptr) then alloc.deallocate(ptr, 1)
        • Not used (at least not how you think it's used...)

allocate_shared to the rescue?

  • Fortunately, we have allocate_shared
  • "Unlike the shared_ptr constructors, allocate_shared does not accept a separate custom deleter: the supplied allocator is used for destruction of the control block and the T object, and for deallocation of their shared memory block."
    • 🎉
  • So we have an easy way to be consistent:
    • We can use make_shared when no allocation or deletion customization is needed,
    • and allocate_shared when we need to customize either of those
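A sketch of the allocate_shared route, using a hypothetical CountingAllocator that only tracks how many blocks are live. make_tracked() is likewise an invented helper.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical minimal allocator: counts live allocations, delegates storage
// to std::allocator.
template <class T>
struct CountingAllocator {
    using value_type = T;
    int* live_blocks;

    explicit CountingAllocator(int* counter) : live_blocks(counter) {}
    template <class U>
    CountingAllocator(const CountingAllocator<U>& other)
        : live_blocks(other.live_blocks) {}

    T* allocate(std::size_t n) {
        ++*live_blocks;
        return std::allocator<T>{}.allocate(n);
    }
    void deallocate(T* p, std::size_t n) {
        --*live_blocks;
        std::allocator<T>{}.deallocate(p, n);
    }
};

template <class T, class U>
bool operator==(const CountingAllocator<T>& a, const CountingAllocator<U>& b) {
    return a.live_blocks == b.live_blocks;
}
template <class T, class U>
bool operator!=(const CountingAllocator<T>& a, const CountingAllocator<U>& b) {
    return !(a == b);
}

// The allocator is the single place that controls both acquiring and
// releasing the storage; there is no separate deleter argument.
inline std::shared_ptr<int> make_tracked(int value, int* live_blocks) {
    return std::allocate_shared<int>(CountingAllocator<int>(live_blocks), value);
}
```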

Why not allocate_unique?

allocate_unique's complexity would result from the fact that the Standard Library currently doesn't contain enough machinery to implement it—specifically, to adapt allocator syntax to deleter syntax. (Returning unique_ptr<T, unspecified> would be inconvenient for users.) This is unlike allocate_shared, because shared_ptr is powered by type erasure.


Stephan T. Lavavej, N3588

Deleter abstraction

While this proposal doesn't provide allocate_unique (and is recommending that it never be provided), one way to provide it without introducing a wrapper class could be to unify deleters and allocators in unique_ptr's specification. That is, specifying that if the expression d(ptr) is not valid, then D shall meet the Allocator requirements.


Stephan T. Lavavej, N3588



(end unique_ptr rabbit hole)

Concepts

  • Where do concepts fit in with interface and implementation sameness?
    • 🚨 C++20 concepts extract interface sameness without your permission! 😱
  • Concepts allow library users to accidentally create code coupling between unrelated modules based only on names
  • Concept checks don't check namespaces (especially when checking intrusive names like this!)

More tools for expressing "sameness"

That I don't have time to talk about 😭

  • "Normal" functions (bonus: also available in C)
  • Macros and Code generation
    • "Gross," but sometimes better than repeating things!
  • Customization Point Objects (CPOs)
  • Type erasure
  • constexpr functions
    • "Sameness" of compile-time and runtime implementations
  • Dependency injection (missing from C++, needs reflection)
  • Aspect-oriented programming (missing from C++, needs reflection)
  • Decoration (missing from C++, needs reflection)
  • …you get the idea


Thanks!

Twitter: @The_Whole_Daisy
Email: cpp@dsh.fyi