Thoughts: Approaches to encapsulation

One part of programming I enjoy is the creative process--taking a thing you've sketched out on a whiteboard and making it come alive. Once you've been at this long enough, though, you start to see common problems crop up, no matter what problem you're solving or what language you're solving it in. I'd like to think through one of those today (or at least start to...).

The problem is, I can't really give a snazzy name to the problem itself. When you're dealing with multithreaded programming, you can say "I have concurrency problems," and someone in the ether will grok what you mean. When dealing with GUIs, you can talk Model-View-Controller (MVC) all day. As my friend Chuck would say, what I have here is a "meta" problem--a problem about a problem.

The best way I can describe it is a system that is "over-engineered" or "over-factored". At the machine level, a program is an algorithm--a sequence of unambiguous instructions executed by a computer one at a time. In the early days of computing, that's all there was: sequences of instructions, written in machine language. Then came assembly, a shorthand way of writing machine language. At that point, people started seeing commonality among lines of assembly: "Hey, these 5 lines are repeated 200 times in our code!" So people came up with ways to factor out those common pieces into their own blocks, and functions (or procedures) were born. Languages became more abstract, and programmers became "software engineers". Like real engineers, we want to reuse as much prior art as possible to get the job done--prior art is often stable and debugged.

But this is where I'm getting to the problem--as engineers, we like things to be strict and uniform. We try to make systems "freeze" so that we can layer things one atop another, each layer forming an abstraction for the layers above it. For instance, when I write a networking application, I can create a software construct called a "Socket", which connects to another machine...I need not worry about what sort of network is between our two machines, or if that other machine is running a different operating system. The socket abstraction takes care of all that.

Here's where it starts getting dicey. Sometimes we overdo that layering and factoring, and it creates a problem, like the one I'm facing today: I have two systems that do pretty much the same thing. One is a monolithic Win32 app; the other is a collection of 3 separate modules layered like this:


--------------------
|     UI layer     |
|------------------|
|    Core layer    |
|------------------|
|  Communication   |
|      layer       |
--------------------


Oh, and by the way, the Communication layer runs as a separate service/daemon process from the other two layers. So here's the problem: the UI layer needs a piece of information found deep in the Communication layer. In the monolithic app, this is no problem: I just added another getXXX() function to the communication piece, and then had the UI code call it. It wasn't that clean, but it was about half a day's work to get working properly.

The layered system, by contrast, has taken 2 months of design, coordination, and back-and-forth to achieve the same thing we could do in the monolithic app in a matter of hours.
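To make the contrast concrete, here's a rough sketch of what "just add a getter" turns into once the layers are stacked. Everything in it is made up for illustration: the class names, the get_link_status() method, and the "connected" value are assumptions, not code from either system. It also runs all three layers in one process, which skips the hardest part of the real setup (the Communication layer living in its own daemon).

```python
# Illustrative sketch only: all names here are hypothetical.

class CommunicationLayer:
    """Bottom layer: owns the detail the UI eventually wants."""

    def __init__(self):
        self._link_status = "connected"  # made-up piece of data

    def get_link_status(self):
        return self._link_status


class CoreLayer:
    """Middle layer: has no use for the data itself, but must forward it."""

    def __init__(self, comm):
        self._comm = comm

    def get_link_status(self):
        # Pure pass-through, added only so the UI can reach the data.
        return self._comm.get_link_status()


class UILayer:
    """Top layer: only ever talks to the Core layer."""

    def __init__(self, core):
        self._core = core

    def status_line(self):
        return "Link: " + self._core.get_link_status()


ui = UILayer(CoreLayer(CommunicationLayer()))
print(ui.status_line())
```

One new piece of data means touching every layer's interface, and each of those changes is something that has to be designed and agreed on. In the monolith, the UI would simply call the getter directly.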

To be blunt: I've seen this problem before. You need to pass an additional parameter WAAAAY down the call stack, or you need to pass back 2 new pieces of data where you'd designed your oh-so-nice system to use only 1. I've seen some good and bad approaches to this:

  • Use global variables. That way, everything in the namespace has access to the data it needs, if it needs it. You'd be surprised how often I've seen this in production software, especially the sort that started out as a "little thing" and grew into a "big hairy mess". Global variables are evil--they're easy, but they quickly make a codebase unmanageable.

  • Use an external datastore to serialize the data from the upper layer, and suck it back at the lower layer, or vice-versa. That is, circumvent the callstack, and use a database or the registry as a global variable. LOTS of Win32 programs do this, and I think it's a mistake. Just like with global variables, you quickly lose track of who's depending upon this external store, and it makes your application more difficult to port or refactor. However, this is the devil's bargain that most production apps are making today.

  • Pass parameters as variable-length lists. This leads to a big, hairy mess, because flattening a named parameter list into an array is a lossy operation. I meant "address", but your code flattened that into array element 0... Which reveals my intent better? Perl does this, and I don't like it much. Some homegrown Java frameworks do this too. ;-)

  • Pass parameters as maps/hashmaps/dictionaries. This is the Ruby way, seems like, and it's a nice way of getting runtime flexibility without losing the semantics of named parameters.

  • Disallow mutable state altogether and let people compose functions as they need. Yeah, you probably knew I was coming around to this: Functional programming, Erlang, Lisp, etc. These have a different take on encapsulation altogether--forget O-O, just be very strict about who can change the state of a variable: NO ONE CAN.
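The difference between the variable-length-list approach and the map approach is easier to see side by side. This is only a sketch: connect_positional, connect_named, and the "address", "port", and "timeout" keys are names I invented for the example (the default port of 80 is arbitrary too), not part of any real framework.

```python
# Illustrative sketch only: neither function comes from a real framework.

def connect_positional(args):
    # The caller meant "address" and "port", but here they are just
    # element 0 and element 1; the names did not survive the trip.
    return f"connecting to {args[0]}:{args[1]}"


def connect_named(params):
    # The names travel with the values, and optional keys can be
    # added later without disturbing existing callers.
    address = params["address"]
    port = params.get("port", 80)      # arbitrary default for the example
    timeout = params.get("timeout")    # new, optional piece of data
    suffix = f" (timeout {timeout}s)" if timeout is not None else ""
    return f"connecting to {address}:{port}{suffix}"


print(connect_positional(["example.org", 8080]))
print(connect_named({"address": "example.org", "port": 8080}))
print(connect_named({"address": "example.org", "timeout": 5}))
```

Adding "timeout" to the list version would mean every caller has to agree on what element 2 means. In the map version, callers that don't care about it never notice it exists.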



Yeah, I've got a proper strawman, I know...generalizing from one particular over-factored system to why we should all chuck it and start learning Functional Programming (FP). Like I said...these were just 'thoughts'.
