Anatomy of Software Applications

This is kinda "duh" stuff, but here goes: What does all professional software have in common? What attributes do all non-trivial programs share?

When you're in school and you write assignment programs, they're often command-line, single-execution processes. These toy programs exist as drivers for some concept you're trying to grasp--string operations, numerical calculations, or object-oriented programming. An example might be: "Read a file that contains a maze and print out the steps to find the exit as coordinates."

That's not real world, however. What makes a real-world application?

I came up with six characteristic areas:

  • Type/Mode

  • Algorithms/Business Logic

  • Communications

  • Logging/Debugging

  • Persistence/Migration

  • Metadata

Type/Mode
What kind of app is it? Some examples would be: command-line, GUI, service/daemon, web app. For instance, Gmail is a web app used for reading mail online. Grep is a command-line app for filtering textual input against a pattern. Whatever your app is, it's one or more of these types.

Algorithms/Business Logic
This is what your app does--the problem that it solves. Be it as simple as 10 PRINT "Hello World" or as complex as Excel or Mathematica, your app has business logic inside it.

Communications
Your app is running on a computer somewhere, but without communicating with the outside world, it's useless. Logic is blind, deaf, and dumb, and must have some data upon which to operate. This can be anything from reading STDIN and writing STDOUT, through files on disk, sockets, and communication ports, up to a GUI.
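Here's a rough sketch of that split in Python. The function names (shout, run) are made up for illustration; the point is only that the logic never learns where its data came from.

```python
import io


def shout(line: str) -> str:
    """Business logic: has no idea where the text came from."""
    return line.upper()


def run(source, sink) -> int:
    """Communications: pull lines from `source`, push results to `sink`."""
    count = 0
    for line in source:
        sink.write(shout(line))
        count += 1
    return count


# Demo with in-memory streams. Pass sys.stdin and sys.stdout (or an open
# file) instead and the logic is none the wiser.
demo_out = io.StringIO()
lines_handled = run(io.StringIO("hello\nworld\n"), demo_out)
```

Anything that iterates over lines can be the source, and anything with a write() method can be the sink.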

* * *

Okay, okay, *every* app (even our maze example) fulfills these first three. It's the next three that get interesting.

Logging/Debugging
In real-world apps, we need some way to know what the app's doing as it's running, and we need a way to tunnel deeper and deeper into its innards as needed. They don't make MRIs for software (yet), so we need to weave logging into our apps somehow. This can take several forms: log files with varying severity levels, debug consoles or dashboards, or even runtimes that let us modify application logic while the app is running. One thing I've noticed--systems without logging don't survive long. You can't get into the field and hook up your debugger; you need enough information in a logfile to localize and reproduce the bug.
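As a minimal sketch of the "log file with severity levels" flavor, here's what it might look like with Python's standard logging module. The logger name "mazeapp", the messages, and the temp-directory log path are all invented for the example.

```python
import logging
import os
import tempfile


def build_logger(path: str, level: int = logging.INFO) -> logging.Logger:
    """Return a logger that writes timestamped, leveled lines to `path`."""
    logger = logging.getLogger("mazeapp")  # hypothetical app name
    logger.setLevel(level)
    logger.handlers.clear()   # avoid duplicate handlers if called twice
    logger.propagate = False  # keep these lines out of the root logger
    handler = logging.FileHandler(path, encoding="utf-8")
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)
    return logger


log_path = os.path.join(tempfile.mkdtemp(), "app.log")
log = build_logger(log_path)

log.debug("cell (3, 4) visited")  # dropped: below the INFO threshold
log.info("maze loaded: 20x20")
log.error("no exit reachable from (0, 0)")

# When a bug shows up in the field, dial the verbosity up
# without touching the business logic.
log.setLevel(logging.DEBUG)
log.debug("backtracking from (7, 2)")
```

The threshold is the useful part: normal runs stay quiet, and the deeper detail is already woven in for the day you need it.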

An accessory to the above is automated crash reporting--sending the corefile, dump, and/or logfile back to the home office whenever you fail.

Persistence/Migration
Oftentimes, you want to save information between runs of your program, or visits to your site. If you could buy an infinite amount of RAM and ensure your computer never powered down, you could save it in data structures in memory. Sadly, this won't work. So, we need some form of external persistence--files, the Windows Registry, a database, or one of those newfangled "clouds" like Amazon's S3.

This persistence stuff is non-trivial, and whole sub-industries spring up (no pun intended!) to service it. Holy wars emerge. One solution is slow and easy; another is fast and cumbersome. That's just for the current version.

It gets really nasty if you have a particular class of app--the client-hosted app with persistence. In this case, when you change from v1 to v2 of your app, you must MIGRATE the data from its previous form to the new version's tangled web of hacked-up "schema". People will go to extreme lengths to avoid migrating data, even writing whole new, incompatible apps just so they needn't worry about it.
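One common way to tame this (a sketch, not the only approach) is to stamp the data store with a schema version and apply numbered upgrade steps in order. The example below assumes SQLite and uses its built-in user_version pragma as the stamp; the contact table and both migration steps are hypothetical.

```python
import sqlite3

# Hypothetical schema history: entry i upgrades a database
# from version i to version i + 1.
MIGRATIONS = [
    "CREATE TABLE contact (id INTEGER PRIMARY KEY, name TEXT NOT NULL)",
    "ALTER TABLE contact ADD COLUMN email TEXT NOT NULL DEFAULT ''",
]


def migrate(conn: sqlite3.Connection) -> int:
    """Apply every step the database hasn't seen yet; return the new version."""
    version = conn.execute("PRAGMA user_version").fetchone()[0]
    for target, statement in enumerate(MIGRATIONS[version:], start=version + 1):
        conn.execute(statement)
        conn.execute(f"PRAGMA user_version = {target}")
        conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]


# Simulate a v1 install that already holds the user's data.
conn = sqlite3.connect(":memory:")
conn.execute(MIGRATIONS[0])
conn.execute("PRAGMA user_version = 1")
conn.execute("INSERT INTO contact (name) VALUES ('Ada')")
conn.commit()

new_version = migrate(conn)  # applies only the v1 -> v2 step
rows = conn.execute("SELECT name, email FROM contact").fetchall()
```

Running migrate() a second time does nothing, which is what you want when the app can't know how many versions a given user skipped.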

Metadata
Good metadata management is the mark of professional software. Sure, you can hard-code things like strings, timeout values, and input validation regexes into your code, but WHY do that? Every programming environment has good-to-great support for storing these metadata values in some easily accessible external format--an INI file, .rc file, Windows Registry, Java .properties, YAML file, or (gulp) Spring XML. All hew down the same tree--factoring out stuff that doesn't belong in source code.
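For instance, here's a small sketch using Python's standard configparser to pull a timeout, a validation regex, and a string out of an INI file. The section names, keys, and values are invented, and the INI text is inlined only so the example runs by itself.

```python
import configparser
import re

# Stand-in for the contents of a hypothetical app.ini shipped beside the program.
INI_TEXT = r"""
[network]
timeout_seconds = 30

[validation]
zip_code_pattern = ^\d{5}(-\d{4})?$

[ui]
greeting = Hello, World
"""

config = configparser.ConfigParser()
config.read_string(INI_TEXT)  # a real app would call config.read("app.ini")

timeout = config.getint("network", "timeout_seconds", fallback=10)
zip_code = re.compile(config.get("validation", "zip_code_pattern"))
greeting = config.get("ui", "greeting")

# A key missing from the file falls back to a default instead of crashing.
retries = config.getint("network", "retries", fallback=3)
```

Changing the timeout or tightening the regex is now an edit to a text file, not a rebuild.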

* * *

My big beef with the above is apps that confuse Persistence and Metadata. Yes, if you oversimplify, both involve storing values across instances/invocations on some external medium. But I have two guidelines I like to follow:

  • Don't use a metadata medium as a database. Specifically, the Windows Registry. Yes, it's convenient, but don't be lazy.

  • Don't use a general-purpose storage medium for your metadata. Specifically, a database. Yes, it's convenient, but don't be lazy.

I know plenty of successful apps that abuse the above (I've worked on two!), but on a greenfield app, I'd keep the metadata in a read-often/write-rarely medium, like the Registry or a YAML file, and the persistent data in a read/write-optimized, easy-to-migrate medium like an RDBMS.

Like I said, a lot of "duh" stuff for those of us in the industry, but planning for each of these in a new app (particularly logging/debug) is worth the effort.
