Posted to tcl by kbk at Mon Jan 23 19:25:25 GMT 2017
- Some work that I've been messing with recently in the tclquadcode
- compiler raises an issue that has been a niggling itch to me for
- nearly the entire 25+ years that I've been using Tcl: the fact that
- our comparison operations do not define an ordering.
- The issue is, of course, that we use the same notation for string and
- numeric comparisons, which yields anomalies when comparing operands of
- mixed type. In particular, we have the cycle:
- "0x10" < "0y" (string comparison)
- "0y" < "1" (string comparison again)
- "1" < "0x10" (numeric comparison)
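- The cycle can be reproduced directly in tclsh. This is a minimal sketch,
- assuming a stock Tcl 8.5 or later interpreter; the variable names are
- chosen only for illustration.

```tcl
# The three operands from the cycle above.
set a "0x10"
set b "0y"
set c "1"

# "0y" is not a number, so the first two tests fall back to string comparison.
set ab [expr {$a < $b}]
set bc [expr {$b < $c}]
# Both operands parse as integers here, so this is 1 < 16.
set ca [expr {$c < $a}]

# All three results are 1, so '<' does not order these three values.
puts "a<b: $ab  b<c: $bc  c<a: $ca"
```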
- I'd like to get a TIP rolling to introduce some means of declaring
- what style of comparison is meant. We have already introduced one
- disambiguation (which was a start, but not nearly enough), with the
- 'eq' and 'ne' operators. {$x eq $y} and {$x ne $y} are always string
- comparisons.
- At the very least, I want to propose 'lt', 'le', 'gt' and 'ge', to
- complete the set of string comparison operators.
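- To make the intended semantics concrete, here is a rough sketch of what
- the string-only operators would compute, written with [string compare].
- The procs str_lt, str_le, str_gt and str_ge are hypothetical stand-ins
- for the proposed operators, not existing commands.

```tcl
# '==' compares numerically when both operands look like numbers;
# 'eq' always compares the strings.
set numeric_equal [expr {"1" == "1.0"}]   ;# 1
set string_equal  [expr {"1" eq "1.0"}]   ;# 0

# Hypothetical stand-ins for the proposed 'lt', 'le', 'gt', 'ge' operators.
proc str_lt {a b} { expr {[string compare $a $b] <  0} }
proc str_le {a b} { expr {[string compare $a $b] <= 0} }
proc str_gt {a b} { expr {[string compare $a $b] >  0} }
proc str_ge {a b} { expr {[string compare $a $b] >= 0} }

# Under string comparison the three values from the cycle are ordered
# consistently: "0x10" before "0y" before "1".
puts [str_lt 0x10 0y]   ;# 1
puts [str_lt 0y 1]      ;# 1
puts [str_lt 1 0x10]    ;# 0
```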
- That alone is not enough. I also want to have some way of forcing a
- numeric comparison.
- One possibility is that the desired semantics can actually be achieved
- in current code, by writing
- [expr {+$x < +$y}]
- Since unary '+' requires a numeric argument, this expression will have
- my desired effect of making sure that $x and $y are both numeric. If
- we go this route, we should probably document this as an idiom in the
- manual pages that discuss arithmetic expressions. It could also have
- a significant impact on performance, for instance of code
- compiled by tclquadcode, because in a form like:
- for {set x $a} {$x < $b} {incr x} { ... }
- it may be impossible for the compiler to prove that $b is
- numeric. Without such a guarantee, the compiler is forced to generate
- code for a possible string comparison on each trip through the loop. I
- have notes on how the test of $b's type might be detected as
- loop-invariant and hoisted out of the inner loop, but have not tried
- to implement such a thing, and the implementation looks decidedly
- non-trivial. If we change the loop to
- for {set x $a} {+$x < +$b} {incr x} { ... }
- the problem, while still non-trivial, is somewhat easier. There's no
- possibility of unexpected comparison semantics. There will still be
- unneeded code checking whether unary '+' needs to throw
- an error; again, I have notes on how this could be avoided, by
- unrolling the loop once.
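- The effect of the unary '+' idiom can be seen at the script level. This
- is a small sketch, assuming Tcl 8.5 or later; the variable names are
- illustrative, and the exact error text is left unspecified because it
- varies between releases.

```tcl
set x 0x10
set y 0y

# Plain '<' quietly falls back to string comparison when $y is not numeric.
set plain [expr {$x < $y}]                  ;# 1

# Unary '+' requires numeric operands, so the same test now raises an error.
set failed [catch {expr {+$x < +$y}} msg]   ;# 1

# With two numeric strings the comparison is numeric: 16 < 1 is false.
set y 1
set forced [expr {+$x < +$y}]               ;# 0

# The loop form from the text, with the comparison forced to be numeric.
set a 0
set b 3
set trips 0
for {set i $a} {+$i < +$b} {incr i} { incr trips }

puts "plain=$plain failed=$failed forced=$forced trips=$trips"
```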
- As an alternative, we might want to adopt a second syntax for numeric
- comparisons. Something like [if {$a :<: $b} {...}] for forcing the
- comparison to be numeric would work. This wouldn't save many
- keystrokes, but might make the intent more obvious than a somewhat
- arcane use of unary '+'. I'll let others argue about notation, but
- take as a requirement that it has to be lexically compatible with
- whatever we do in today's arithmetic expressions.
- A more radical proposal might be, once 'lt' and friends are in place,
- to deprecate the use of '<', '<=', etc. for non-numeric comparisons.
- I've actually learnt to be fairly careful in Tcl code with such
- things, and in my own code will usually write explicitly:
- if {[string compare $a $b] < 0} { ... }
- if string comparison is intended, precisely to avoid surprises if $a
- and $b both look like numbers. (That is, I already reserve <, <=, etc.
- for numeric comparisons.) Given our conservatism about breaking
- working code, no matter how weird the behaviour that it depends on, I
- suspect this last idea is a bridge too far, even though I'd like to
- see it.
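- The kind of surprise that the [string compare] habit avoids shows up
- whenever both operands happen to look like numbers. A minimal
- illustration, with values chosen only for the example:

```tcl
set a 10
set b 9

# Both operands look like integers, so '<' compares 10 with 9 numerically.
set numeric_lt [expr {$a < $b}]                     ;# 0

# [string compare] always compares the strings: "1" sorts before "9".
set string_lt [expr {[string compare $a $b] < 0}]   ;# 1

puts "numeric: $numeric_lt  string: $string_lt"
```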
- So, what do people think? I'd be happy to draft a TIP to reflect a
- consensus - consider this to be the skeleton of such a TIP.