Posted to tcl by kbk at Mon Jan 23 19:25:25 GMT 2017

  1. Some work that I've been messing with recently in the tclquadcode
  2. compiler raises an issue that has been a niggling itch to me for
  3. nearly the entire 25+ years that I've been using Tcl: the fact that
  4. our comparison operations do not define an ordering.
  5.  
  6. The issue is, of course, that we use the same notation for string and
  7. numeric comparisons, which yields anomalies when comparing operands of
  8. mixed type. In particular, we have the cycle:
  9.  
  10. "0x10" < "0y" (string comparison)
  11. "0y" < "1" (string comparison again)
  12. "1" < "0x10" (numeric comparison)
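
The cycle can be reproduced at a Tcl prompt. This is a minimal sketch, assuming a Tcl 8.5 or later interpreter; the variable names a, b and c are only for illustration.

```tcl
# Illustrative names; the three values are the ones from the cycle above.
set a "0x10"
set b "0y"
set c "1"

# "0y" is not a number, so any comparison involving it falls back to
# string comparison. "0x10" and "1" are both numbers (16 and 1).
set ab [expr {$a < $b}]   ;# string comparison
set bc [expr {$b < $c}]   ;# string comparison
set ca [expr {$c < $a}]   ;# numeric comparison: 1 < 16

puts "a<b: $ab  b<c: $bc  c<a: $ca"
```

All three results are true, so '<' cannot be used to put these three values in a consistent order.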
  13.  
  14. I'd like to get a TIP rolling to introduce some means of declaring
  15. what style of comparison is meant. We already have introduced one
  16. disambiguation (which was a start, but not nearly enough), with the
  17. 'eq' and 'ne' operators. {$x eq $y} and {$x ne $y} are always string
  18. comparisons.
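
As a small illustration of the difference between '==' and 'eq' (the variable names are hypothetical, chosen only for this sketch):

```tcl
set x "1"
set y "1.0"

# '==' sees two numeric-looking operands and compares 1 with 1.0.
set numeric_equal [expr {$x == $y}]

# 'eq' always compares the string representations "1" and "1.0".
set string_equal [expr {$x eq $y}]

puts "==: $numeric_equal  eq: $string_equal"
```

The first result is true and the second is false, which is exactly the kind of explicit choice that the ordering operators currently lack.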
  19.  
  20. At the very least, I want to propose 'lt', 'le', 'gt' and 'ge', to
  21. complete the set of string comparison operators.
  22.  
  23. That alone is not enough. I also want to have some way of forcing a
  24. numeric comparison.
  25.  
  26. One possibility is that the desired semantics can actually be achieved
  27. in current code, by writing
  28.  
  29. [expr {+$x < +$y}]
  30.  
  31. Since unary '+' requires a numeric argument, this expression will have
  32. my desired effect of making sure that $x and $y are both numeric. If
  33. we go this route, we should probably document this as an idiom in the
  34. manual pages that discuss arithmetic expressions. It can be expected
  35. to have a significant impact on the performance, for instance, of code
  36. compiled by tclquadcode, because in a form like:
  37.  
  38. for {set x $a} {$x < $b} {incr x} { ... }
  39.  
  40. it may be impossible for the compiler to prove that $b is
  41. numeric. Without such a guarantee, the compiler is forced to generate
  42. code for a possible string comparison on each trip through the loop. I
  43. have notes on how the test of $b's type might be detected as
  44. loop-invariant and hoisted out of the inner loop, but have not tried
  45. to implement such a thing, and the implementation looks decidedly
  46. non-trivial. If we change the loop to
  47.  
  48. for {set x $a} {+$x < +$b} {incr x} { ... }
  49.  
  50. the problem, while still non-trivial, is somewhat easier. There's no
  51. possibility of unexpected comparison semantics. There will still be
  52. unneeded type-checking code checking whether unary '+' needs to throw
  53. an error; again, I have notes on how this could be avoided, by
  54. unrolling the loop once.
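
The behaviour of the unary '+' idiom discussed above can be checked directly. This is only a sketch: the proc name numeric_lt and the loop body are hypothetical and are not part of any proposal.

```tcl
# Hypothetical helper, for illustration only.
proc numeric_lt {x y} {
    # Unary + raises an error if its operand is not numeric,
    # so the < here can only ever be a numeric comparison.
    expr {+$x < +$y}
}

set ok [numeric_lt 1 0x10]                ;# numeric: 1 < 16
set failed [catch {numeric_lt 0y 1} msg]  ;# "0y" is not a number, so this errors

# The same idiom inside a loop condition, which is itself an expression.
set a 1
set b 5
set total 0
for {set x $a} {+$x < +$b} {incr x} {
    incr total $x
}

puts "ok=$ok failed=$failed total=$total"
```

With a non-numeric $b the loop condition raises an error on the first test instead of silently switching to string comparison.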
  55.  
  56. As an alternative, we might want to adopt a second syntax for numeric
  57. comparisons. Something like [if {$a :<: $b} {...}] for forcing the
  58. comparison to be numeric would work. This wouldn't save many
  59. keystrokes, but might make the intent more obvious than a somewhat
  60. arcane use of unary '+'. I'll let others argue about notation, but
  61. take as a requirement that it has to be lexically compatible with
  62. whatever we do in today's arithmetic expressions.
  63.  
  64. A more radical proposal might be, once 'lt' and friends are in place,
  65. to deprecate the use of '<', '<=', etc. for non-numeric comparisons.
  66. I've actually learnt to be fairly careful in Tcl code with such
  67. things, and in my own code will usually write explicitly:
  68.  
  69. if {[string compare $a $b] < 0} { ... }
  70.  
  71. if string comparison is intended, precisely to avoid surprises if $a
  72. and $b both look like numbers. (That is, I already reserve <, <=, etc.
  73. for numeric comparisons.) Given our conservatism about breaking
  74. working code, no matter how weird the behaviour that it depends on, I
  75. suspect this last idea is a bridge too far, even though I'd like to
  76. see it.
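
To illustrate the kind of surprise that the explicit [string compare] form avoids, here is a minimal sketch with two values that both happen to look like numbers (the variable names are only for illustration):

```tcl
set a "200"
set b "1e3"

# '<' sees two numbers and compares 200 with 1000.0.
set by_operator [expr {$a < $b}]

# [string compare] compares the characters: "2" sorts after "1".
set by_string [expr {[string compare $a $b] < 0}]

puts "operator: $by_operator  string compare: $by_string"
```

The two forms disagree here: '<' reports that $a is less than $b, while the string comparison reports the opposite.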
  77.  
  78. So, what do people think? I'd be happy to draft a TIP to reflect a
  79. consensus - consider this to be the skeleton of such a TIP.
  80.