Someone asked me about the rationale for the decision, in my FSet 2.0 release candidate, to have no default default for maps and seqs, so that an out-of-domain lookup would signal an error. I started to write an answer, but after putting the arguments for and against this change down on the page and mulling them over for a few days, I concluded it was a mistake and decided to reverse it.
So in FSet 2.0, it will still be the case, unless you specify otherwise, that an out-of-domain lookup on a map, or an out-of-bounds lookup on a seq, will simply return nil (with a nil second value). You do, as before, have the option to specify a different default, and now you also have the option to specify no default, if you want out-of-domain/bounds lookups to signal an error.
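To make the retained behavior concrete, here is a minimal sketch of lookups on a map with the nil default default and on a copy given an explicit default. It assumes FSet is already loaded (for example through Quicklisp) and uses only `empty-map`, `with`, `with-default`, and `lookup`; the function name `lookup-demo` is purely illustrative, and the syntax for requesting no default at all is deliberately left out.

```lisp
;; Assumes FSet is already loaded, e.g. via (ql:quickload "fset").
;; LOOKUP-DEMO is a hypothetical name used only for this illustration.

(defun lookup-demo ()
  "Collect the two values returned by LOOKUP in three situations."
  (let* ((m  (fset:with (fset:empty-map) :a 1)) ; one pair; default is NIL
         (m0 (fset:with-default m 0)))          ; same pairs; default is 0
    (list (multiple-value-list (fset:lookup m :a))     ; key present: 1 and a true second value
          (multiple-value-list (fset:lookup m :b))     ; key absent: NIL and NIL
          (multiple-value-list (fset:lookup m0 :b))))) ; key absent: 0 and NIL
```

In every case the second value tells you whether the key was actually in the domain, so a stored nil can still be told apart from a missing key.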
I have tagged v2.0.0-rc1.
This has been a difficult decision that I have changed my mind about a few times. Let me summarize the arguments for and against the change. I'll start with some in favor of not having a default default:
- It will be simpler to explain to new FSet users that the map or seq has a default only if explicitly given one.
- Users will supply a default of nil only for those maps and seqs which actually have out-of-domain/bounds lookups done on them. More maps and seqs will have no default, which will surface cases where an intended invariant, that the lookups are all in-domain, is violated; this will improve the overall robustness of their code.
- Some operations, primarily map-union, map-intersection, and compose, are easier to use when their arguments have no defaults; if they have nil defaults, the function passed in to combine or map values (often specified as a lambda expression) must explicitly handle nil, which is often inelegant. If there is no default default, fewer people will trip over this speed bump.
Some arguments in favor of a nil default default:
- It's consistent with FSet past practice; having no default default will require migration effort on the part of FSet users.
- It's consistent with the majority of CL collection accessors (assoc, gethash, nth).
- It's consistent with other FSet behaviors, such as that of arb on an empty set, which returns two nil values.
Minimizing migration effort is somewhat desirable, of course, but I try not to overweight it. There's an old story I once heard about Stu Feldman, the original author of make. He wrote it and passed it around to his colleagues at Bell Labs. Pretty soon he realized that the syntax was a dumpster fire, but he didn't want to fix it, the story goes, because he already had ten users. And now millions of us have to live with it.
So I'm willing to impose some migration pain on existing users, as long as it doesn't seem excessive, if I believe they themselves will be happier in the long run. It's not that their interests don't count; it's just that future benefits can outweigh present pain. And in this case, I think the amount of present pain would not have been large; I did the conversion on some of my own code that uses FSet, and it didn't seem very hard. So all told, the migration argument carried a little weight, but not a huge amount.
As for the CL collection accessors, there is some inconsistency there already. The sequence accessors svref, elt, and aref do signal an error on an out-of-bounds index, except perhaps at safety 0. (Surprisingly, at least to me, of these only elt is specified to signal an error, but the other two do so also in all the implementations I've tried.) nth is a funny case; at least in the major implementations, on a nonnegative index greater than or equal to the length of the list, it just returns nil, but on a negative index it signals an error. The consistency-with-CL argument is thus not quite as strong as it may sound, when CL isn't even completely self-consistent. Of course, the map accessors assoc and gethash do return nil on an out-of-domain lookup. All told, again, this argument carries somewhat more weight for me than the migration argument, but it's not overwhelming.
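The split described above can be seen with standard Common Lisp alone. The following is a small sketch; the two function names are made up for illustration, and the elt check assumes safe code, since behavior at safety 0 is not guaranteed.

```lisp
;; Standard Common Lisp only; FSet is not needed here.
;; Both function names are hypothetical, chosen for this illustration.

(defun missing-key-results ()
  "Out-of-domain lookups with GETHASH, ASSOC, and NTH."
  (let ((table (make-hash-table :test #'equal)))
    (list (multiple-value-list (gethash "missing" table)) ; value and found-p are both NIL
          (assoc :missing '((:a . 1)))                    ; no matching pair: NIL
          (nth 5 '(a b c)))))                             ; index past the end of the list: NIL

(defun elt-signals-p (sequence index)
  "Return T if ELT signals an error for INDEX on SEQUENCE, else NIL.
Assumes safe code; nothing is promised at safety 0."
  (handler-case (progn (elt sequence index) nil)
    (error () t)))
```

Calling `(elt-signals-p (vector 1 2 3) 5)` reports whether your implementation signals on the out-of-bounds index, while `(missing-key-results)` shows the quiet nil results from the map-like accessors and nth.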
The argument from internal consistency of FSet was the one that tipped the balance for me. There are other access operations besides lookup that indicate failure by returning a second (or sometimes third) value which is false. I suppose I could have changed these to signal errors also, but this seemed a bridge too far; in the cases of set and bag operations, there isn't currently a way you could select between the error behavior and the return-nil behavior, the way that the choice of defaults allows you to do for maps and seqs.
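As one illustration of this convention, here is a sketch of arb on an empty and a non-empty set. It assumes FSet is already loaded and uses only `arb`, `empty-set`, and the `set` constructor macro; the name `arb-demo` is hypothetical.

```lisp
;; Assumes FSet is already loaded, e.g. via (ql:quickload "fset").
;; ARB-DEMO is a hypothetical name used only for this illustration.

(defun arb-demo ()
  "Collect the values ARB returns for an empty and a one-element set."
  (list (multiple-value-list (fset:arb (fset:empty-set))) ; nothing to pick: NIL and NIL
        (multiple-value-list (fset:arb (fset:set 42)))))  ; the only member, then a true second value
```

The false second value plays the same role here that it does for a lookup that falls back to the default: it reports failure without signaling an error.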
I also tried to estimate the frequency of the following two cases:
- In a no-default-default FSet, how often would users have to add an explicit :default nil to prevent undesired lookup errors?
- In a nil-default-default FSet, how often would users have to add an explicit :no-default or :no-default? t to cause errors on out-of-domain lookups, or for reasons having to do with map-union and similar operations?
Although it's hard to be extremely confident about my estimates without seeing a lot of code others have written against FSet, my experience suggests that the former would be several times as frequent as the latter. This argument also helps tip the balance toward a nil default default.