Failing to fully specify what happens with input buffers. Returned buffers too.
For example, ponder a hypothetical (yeah, right) library routine, call it new_form, which takes a NULL-terminated (rather than NUL-terminated, which is different, but you knew that) buffer of field pointers. You call it and the fields in the buffer are now part of a brand-spanking new form. Yay, us. Anything that handles even part of the form and field stuff is welcome, as it’s a pain to handle (though I could rant about ncurses for a while, but not today).
But… what happens to that buffer? Is it now the property of the form library? Did it make a copy and now I can delete my copy? If the docs don’t say, I don’t know.
Looking at the source is no help–I don’t care what the source says, I care about what guarantees are made by the API. Just because the library makes a copy now (it doesn’t, I looked, but that’s a separate rant) doesn’t mean that it has to in the future. Someone might, at some random time after now, decide that since the behaviour’s unspecified it’s open to change, and change it. I’m perfectly fine with that, too, as flexibility is good, and I for one have no compunctions about viciously and egregiously changing behaviours that I’ve promised are undefined. (I take a certain glee in it, in fact 🙂) But I should at least know what behaviours are undefined, dammit! Don’t make me guess.
APIs are promises of behavior. Be specific with your promises, and clear about what you’re not promising.
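To make that concrete, here is a minimal sketch of what a spelled-out promise might look like. Everything in it (form_create, form_destroy, field_t, form_t) is a made-up name for illustration, not the API of any real forms library, and the copy-the-array policy is just one choice a library could make. The point is the comment block, not the policy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types for illustration; not any real library's API. */
typedef struct field { const char *label; } field_t;

typedef struct form {
    field_t **fields;   /* private copy of the caller's pointer array */
    size_t    count;    /* number of fields, excluding the NULL terminator */
} form_t;

/*
 * form_create: build a form from a NULL-terminated array of field pointers.
 *
 * Ownership contract (the part the docs have to state):
 *   - The array itself is copied; the caller may free or reuse it on return.
 *   - The field objects are borrowed; they must outlive the form.
 *   - The returned form belongs to the caller; release it with form_destroy.
 * Returns NULL if fields is NULL or an allocation fails.
 */
form_t *form_create(field_t *const *fields)
{
    if (fields == NULL)
        return NULL;

    /* Count entries up to (not including) the NULL terminator. */
    size_t n = 0;
    while (fields[n] != NULL)
        n++;

    form_t *form = malloc(sizeof *form);
    if (form == NULL)
        return NULL;

    form->fields = malloc((n + 1) * sizeof *form->fields);
    if (form->fields == NULL) {
        free(form);
        return NULL;
    }

    /* Copy the pointers and the terminator; the fields are not duplicated. */
    memcpy(form->fields, fields, (n + 1) * sizeof *form->fields);
    form->count = n;
    return form;
}

/* Frees the form and its private array, never the borrowed fields. */
void form_destroy(form_t *form)
{
    if (form != NULL) {
        free(form->fields);
        free(form);
    }
}
```

With a contract like that in the docs, I know I can free my array the moment form_create returns, and I know I can’t free the fields until the form is gone. No source diving required.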
…having callback function arguments that do not take a corresponding invocation-specific data pointer.
You want to have a function that takes a function pointer, and have your library call that function at some point in the future if some event happens? Cool! Works for me. I like those. (Well, sorta. Event/callback/async programming is a pain.) However… the signature should never be:
int register_callback(func_pointer_t callback);
Bad! Bad programmer! No cookie! That signature should be:
int register_callback(func_pointer_t callback, void *extra_data);
Or, if you’d rather, take a struct that has the function pointer and callback data in it, if you don’t want to manage the two pointers in your library. The signature for that callback pointer should be:
int callback_func(struct instance_data *lib_data, void *extra_data);
though I’m less adamant about that. Very very simple signature lists are best.
Why? Simple. While you may think that I’m going to have a custom callback routine with private embedded data in it all primed and ready for your particular call, you’re wrong. This is C we’re talking here–it’s not like we have closures, so there’s no way to have any sort of data bound at runtime. If I want to bind any data to the call it means I need to stick it in a global somewhere. Blech. Very, very ungood.
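Here is a rough sketch of the difference the data pointer makes, using the two signatures above. The fixed-size registry, fire_event, and struct tally are all invented for the example, and a real library would do rather more bookkeeping. Note that the same callback function gets registered twice, each time with its own state, and no globals are needed on the caller’s side.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for whatever the library passes about the event. */
struct instance_data { int event_code; };

typedef int (*func_pointer_t)(struct instance_data *lib_data, void *extra_data);

#define MAX_CALLBACKS 8

/* Library side: each slot keeps the function AND the caller's pointer. */
static struct {
    func_pointer_t callback;
    void          *extra_data;
} registry[MAX_CALLBACKS];
static size_t registered;

/* Returns 0 on success, -1 if callback is NULL or the table is full. */
int register_callback(func_pointer_t callback, void *extra_data)
{
    if (callback == NULL || registered == MAX_CALLBACKS)
        return -1;
    registry[registered].callback = callback;
    registry[registered].extra_data = extra_data;
    registered++;
    return 0;
}

/* Calls every registered callback, handing back the pointer it was
 * registered with. Returns the number of callbacks invoked. */
int fire_event(int event_code)
{
    struct instance_data data = { event_code };
    int calls = 0;
    for (size_t i = 0; i < registered; i++) {
        registry[i].callback(&data, registry[i].extra_data);
        calls++;
    }
    return calls;
}

/* Caller side: per-registration state lives wherever the caller likes. */
struct tally { int total; };

static int add_event_code(struct instance_data *lib_data, void *extra_data)
{
    struct tally *t = extra_data;   /* recover the caller's own state */
    t->total += lib_data->event_code;
    return 0;
}
```

Register add_event_code once with &first and once with &second, fire an event, and each tally gets updated independently. Try doing that with the one-argument signature.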
It gets even worse when dealing with any sort of indirect access to the library–like, say, if you’re trying to do this from an interpreter. For that to work without any sort of data pointer requires creating a custom C function, either at compile-time for the module (which requires having a C compiler of some sort handy) or at runtime (which requires the capability of creating new functions on the fly) neither of which is particularly desirable. (Parrot, for example, doesn’t need a C compiler handy to interface to most C libraries)
Postgres, pleasantly, doesn’t make this mistake. You should endeavor not to make it either.
If you want a really good reason, consider the following. Someone is writing an interface to your library for an interpreted language. Perl, Python, Ruby, Java, something on .NET–doesn’t matter. The program runs, conceptually at least, on the interpreter. The interface writer wants to be able to write those callback functions in the interpreted language.
With a separate data parameter, it’s easy. The interpreter builds some sort of closure structure, sets the callback function to be an entry point to the interpreter, and the callback data to be that closure structure. When the callback’s made, the entry point function yanks all the info it needs out of the data parameter, sets up the world, and calls into the properly set up interpreter. While there may be a lot of really nasty funkiness going on in there, it’s at least doable.
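In sketch form, the trick looks something like this. The struct script_closure, interpreter_invoke, and library_deliver names are all hypothetical, and the "interpreter" here is a toy that just adds a captured number to its argument; a real one would look up the named function and run it. What matters is that one C entry point serves every script-level callback, because the closure rides along in the data parameter.

```c
#include <assert.h>
#include <stddef.h>

/* Same callback shape as above: library data plus a caller-supplied pointer. */
struct instance_data { int event_code; };
typedef int (*func_pointer_t)(struct instance_data *lib_data, void *extra_data);

/* Hypothetical stand-in for an interpreter-level closure. */
struct script_closure {
    const char *function_name;    /* which script function to run */
    int         captured_offset;  /* stand-in for captured variables */
    int         last_result;      /* what the "script" last returned */
};

/* Toy interpreter: ignores function_name and just adds the captured
 * value to the argument. A real interpreter would dispatch on the name. */
static int interpreter_invoke(struct script_closure *closure, int argument)
{
    closure->last_result = argument + closure->captured_offset;
    return closure->last_result;
}

/* The single C entry point shared by every script-level callback.
 * It pulls the closure out of extra_data and calls into the interpreter. */
static int interpreter_entry_point(struct instance_data *lib_data, void *extra_data)
{
    struct script_closure *closure = extra_data;
    return interpreter_invoke(closure, lib_data->event_code);
}

/* Stand-in for the library: it calls back with whatever pair it was given. */
static int library_deliver(func_pointer_t callback, void *extra_data, int event_code)
{
    struct instance_data data = { event_code };
    return callback(&data, extra_data);
}
```

Hand the library interpreter_entry_point plus a different closure for each script callback, and any number of them can be pending at once without stepping on each other.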
With no data parameter, though… you’re stuck. The only way to do it, short of generating a new custom function pointer (which isn’t that tough, but is painfully non-portable and something that gives most people a screaming fit to even think about) is to stuff the information you need for the callback into a C global somewhere. The problem there is that it means you can only have one pending callback (which is often suboptimal) and you’ve got potentially unpleasant threading issues. This is an especially egregious mistake with things like GUI interfaces where you may have dozens or hundreds of some sort of thing instantiated. At least there you’ve often got an OO interface, so there’s the data in the objects, but even then it makes the low-level stuff annoying.
Generally people who use the libraries realize the problem straightaway; the trouble is that the people using the libraries often aren’t the people writing them…
Charles Miller complains about an obvious flaw in his design:
If I had access to my own source-code, and could recompile myself whenever anything went wrong, life would be so much easier.
Interesting starting point for home-grown automated GUI-testing (although selecting a button via its window text won’t work well for us).
The canonical discussion of FP arithmetic (for this audience, at least) is a 1991 paper by David Goldberg titled “What Every Computer Scientist Should Know About Floating-Point Arithmetic” via [Eric Gunnerson’s C# Compendium]
[via Thinking In .NET]