Posted to tcl by lm at Fri Nov 02 21:02:30 GMT 2007

  1. <miguel> foo{"a"} =~ s/b/c/;
  2. <miguel> L Error: type error, want L_TYPE_STRING, got L_TYPE_POLY
  3. <miguel> what am I doing wrong?
  4. <miguel> (string) foo{"a"} =~ s/b/c/;
  5. <miguel> Program terminated with signal 11, Segmentation fault.
  6. <miguel> #0 0x080da839 in HashStringKey (tablePtr=0x82e7e18, keyPtr=0x0) at /home/TGZ/bitkeeper/tcl-l/unix/../generic/tclHash.c:912
  7. <miguel> 912 for (c=*string++ ; c ; c=*string++) {
  8. <miguel> (gdb) bt 5
  9. <miguel> #0 0x080da839 in HashStringKey (tablePtr=0x82e7e18, keyPtr=0x0) at /home/TGZ/bitkeeper/tcl-l/unix/../generic/tclHash.c:912
  10. <miguel> #1 0x080d9eb1 in Tcl_CreateHashEntry (tablePtr=0x82e7e18, key=0x0, newPtr=0x0) at /home/TGZ/bitkeeper/tcl-l/unix/../generic/tclHash.c:284
  11. <miguel> #2 0x080d9e25 in Tcl_FindHashEntry (tablePtr=0x82e7e18, key=0x0) at /home/TGZ/bitkeeper/tcl-l/unix/../generic/tclHash.c:234
  12. <miguel> #3 0x08165687 in L_get_symbol (name=0x82ecf48, error_p=1) at /home/TGZ/bitkeeper/tcl-l/unix/../generic/Lcompile.c:2362
  13. <miguel> #4 0x081655f7 in L_get_local_symbol (name=0x82ecf48, error_p=1) at /home/TGZ/bitkeeper/tcl-l/unix/../generic/Lcompile.c:2344
  14. <miguel> (
  15. <miguel> -poly ...
  16. <mcvoy> back.
  17. <miguel> hi
  18. <mcvoy> hashes are all type poly, they are like tcl strings. Try (string)foo{"a"} =~ ....
  19. <miguel> (old) paper says they are strings, threw me out
  20. <miguel> heh - that one caused a segfault
  21. <miguel> not hunting it right now; do we have a bug db?
  22. <mcvoy> Use bk sendbug
  23. <miguel> ok
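
Frames #0 to #2 of the backtrace above show a key of 0x0 reaching the string hash, where the loop gdb printed dereferences it on the first iteration. The following is only a sketch of that failure mode and of one way to guard against it. The names `hash_string_key` and `checked_hash` are hypothetical, the hash arithmetic is picked for illustration, and none of this is the code in tclHash.c or Lcompile.c.

```c
#include <stddef.h>

/* Hypothetical stand-in for a string-key hash; not the code from tclHash.c.
 * The loop has the same shape as the line gdb printed, so a NULL key would
 * be dereferenced on the very first iteration. */
unsigned int hash_string_key(const char *string)
{
    unsigned int result = 0;
    char c;

    for (c = *string++; c; c = *string++) {
        /* Illustrative mixing step only. */
        result += (result << 3) + (unsigned char)c;
    }
    return result;
}

/* Hypothetical guarded entry point: reject a NULL name before hashing it.
 * Returns 0 and stores the hash in *out on success, -1 on a NULL argument. */
int checked_hash(const char *name, unsigned int *out)
{
    if (name == NULL || out == NULL) {
        return -1;
    }
    *out = hash_string_key(name);
    return 0;
}
```

Whether the real fix belongs in the caller or in the code that produced the NULL key cannot be told from the trace alone; the sketch only shows the caller-side check.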
  24. <miguel> lm mcvoy here?
  25. <miguel> Q: do we care about the precise stringRep of (say) arrays in L?
  26. <lm> I'm not sure what the question means?
  27. <miguel> where it comes in: is it a problem (please say "no") if ... just a sec
  28. <lm> We want to be able to pass stuff between L and tcl. At least I believe we do.
  29. <lm> Seems limiting if not.
  30. <miguel> sure: in Tcl you can have different string reps for "the same" list: {1 2 3} and { 1 2 3} are the same list
  31. <lm> OK, so long as a list is something tcl can parse then I don't care. Or don't know why I should. What difference does it make?
  32. <miguel> the thing is: assume I have the second rep, try to do (say) "a[4] +=1", get the error ... but now a has the first string rep. Is that OK?
  33. <lm> So we "normalize" "weird" lists? Yeah, seems fine. L doesn't know how arrays are implemented.
  34. <miguel> I do not think we care in L; in Tcl we are extremely careful about not changing/spoiling the string rep in the case of errors
  35. * Damonc (n=Damon@c-98-194-24-194.hsd1.tx.comcast.net) has joined ##l
  36. <Damonc> What up?
  37. <miguel> Diff could only be (maybe) in error messages
  38. <miguel> hey damon
  39. <Damonc> Hey miguel. 0-]
  40. <Damonc> So, this testing is proving to be rather depressing.
  41. <lm> So Damon, I called jeff and offered to bribe him with mac store goodies if he looked at the gets stuff.
  42. <lm> Oh, it's getting better, hang on.
  43. <lm> Test  Perl  Python  Ruby  Tcl   L     L vs perl
  44. <lm> cat   0.9   0.8     1.8   2.7   2.7   3.0x
  45. <lm> grep  0.9   3.0     2.1   3.9   3.9   4.3x
  46. <lm> hash  1.4   1.0     3.5   2.5   2.4   1.7x
  47. <lm> 9.3M 8.1M 20.3M 12.3M (data size was 3.3M)
  48. <lm> loop  0.3   0.3     0.52  0.13  0.18  0.6x (whoohoo)
  49. <lm> proc  0.7   0.4     1.0   0.8   0.8   1.1x
  50. <lm> sort  4.9   4.4     10.6  11.1  9.4   1.9x
  51. <lm> 2.1x on average
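
The "2.1x on average" figure matches the arithmetic mean of the six per-test L/Perl ratios in the table. A small sketch of that calculation, assuming a plain arithmetic mean was what was meant (the log does not say); `mean_ratio` is a hypothetical helper name.

```c
#include <stddef.h>

/* Arithmetic mean of l[i] / perl[i] over n tests.
 * Assumes every perl[i] is non-zero; returns 0.0 when n is 0. */
double mean_ratio(const double *perl, const double *l, size_t n)
{
    double sum = 0.0;
    size_t i;

    if (n == 0) {
        return 0.0;
    }
    for (i = 0; i < n; i++) {
        sum += l[i] / perl[i];
    }
    return sum / (double)n;
}
```

Fed the Perl and L columns for cat, grep, hash, loop, proc and sort, this comes out a little above 2.1, which agrees with the quoted figure once rounded.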
  52. <Damonc> Oh, Perl smokes us fo' sho'.
  53. <Damonc> I dug in and learned a little Perl for the first time this week to do some testing of my own. That's what's depressing.
  54. <lm> Yes but it is mostly gets and some regexp.
  55. <lm> Oh, yeah, perl is _fast_. Usually only somewhat worse than C.
  56. <Damonc> Well, when I wrote the whole thing in C with stdio and just Tcl_* APIs to set variables, we're still at like .8 seconds to do the same thing in Perl in .3 or .4.
  57. <Damonc> Which is just wholly bumming me out. 0-]
  58. <lm> Good, share those tests with me & jeff. I think that's useful.
  59. <Damonc> Nod... hang on.
  60. <miguel> the key to perf is to avoid variables (especially if called by name) and return values
  61. <Damonc> I wrote a forlines proc that accepts a var name and a body. Basically a foreach for lines in a file.
  62. <Damonc> The key to performance SHOULD NOT be "don't use variables." That's lame. 0-]
  63. <miguel> varname as Tcl_Obj, I hope?
  64. <Damonc> Yes.
  65. <Damonc> I wrote it in C.
  66. <Damonc> Using stdio as the file handling.
  67. <Damonc> One call to Tcl_ObjSetVar2() and Tcl_EvalObjEx() per line.
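
Damon's forlines code is not shown in the log. As a rough sketch of the shape he describes (stdio for the file, one unit of work per line), here is a line loop in plain C. The names `for_lines`, `line_fn` and `add_length` are hypothetical, and the callback merely stands in for the Tcl_ObjSetVar2() plus Tcl_EvalObjEx() pair, which is left out so the example compiles without tcl.h.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-line callback; stands in for "set the variable, then
 * evaluate the body" in the command described above. */
typedef void (*line_fn)(const char *line, size_t len, void *ctx);

/* Example callback: adds each line's length to the size_t behind ctx. */
void add_length(const char *line, size_t len, void *ctx)
{
    (void)line;
    *(size_t *)ctx += len;
}

/* Calls fn once per line of the file at path, with the newline stripped.
 * fn may be NULL to just count. Returns the line count, or -1 on failure. */
long for_lines(const char *path, line_fn fn, void *ctx)
{
    FILE *fp = fopen(path, "r");
    size_t cap = 256, len = 0;
    long count = 0;
    char *buf;

    if (fp == NULL) {
        return -1;
    }
    buf = malloc(cap);
    if (buf == NULL) {
        fclose(fp);
        return -1;
    }
    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] != '\n' && !feof(fp)) {
            /* The line did not fit: grow the buffer and keep reading. */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                fclose(fp);
                return -1;
            }
            buf = tmp;
            cap *= 2;
            continue;
        }
        if (len > 0 && buf[len - 1] == '\n') {
            buf[--len] = '\0';
        }
        if (fn != NULL) {
            fn(buf, len, ctx);
        }
        count++;
        len = 0;
    }
    /* A final line with no newline that exactly filled the buffer. */
    if (len > 0 && !ferror(fp)) {
        if (fn != NULL) {
            fn(buf, len, ctx);
        }
        count++;
    }
    free(buf);
    fclose(fp);
    return count;
}
```

Calling `for_lines(path, NULL, NULL)` corresponds to the null loop body mentioned later in the log: it measures only the cost of reading and splitting lines.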
  68. <lm> Yeah, I know you can make things go fast with tcl tricks but that is cheating. We need to fix the infrastructure and hold our heads high. Stay the course, a thousand points of light, we will find the weapons of mass destruction :)
  69. <Damonc> .8 seconds to process over a million lines. Which isn't horrible, but it ain't as fast as Perl, which did it in .3 seconds.
  70. <Damonc> NO MORE WIRE HANGERS!
  71. <Damonc> Oh, wait. What?
  72. <lm> Damon, was that both input and output?
  73. <Damonc> That's just input. The loop body was null.
  74. <miguel> the idiom 'while {![eof $f]} {set x [gets $f]; ...}' is way faster than 'while {[gets $f x] >= 0} {...}'
  75. <Damonc> miguel: But why?
  76. <miguel> (if within a proc)
  77. <Damonc> A normal user would not write it that way.
  78. <lm> BTW, where are rlogin/rsh in ports?
  79. <Damonc> I never have.
  80. <lm> Damon, exactly.
  81. <Damonc> 514
  82. <Damonc> I think
  83. <lm> 514?
  84. <miguel> because set has access to the var by indexing into a table, and gets does not
  85. <lm> Can we fix gets?
  86. <lm> Jeff was trying to make that byte coded and hit some wall w/ error handling.
  87. <Damonc> miguel: But why must it be so slow when all done in C? I'm not even at the Tcl level here.
  88. <Damonc> And I'm not even getting into Tcl's channels. I'm using straight stdio for the file handling. When I wrote it with Tcl channels, it got way worse.
  89. <miguel> because you are accessing the vars by name (Tcl_ObjSetVar*), and set has access to a faster api
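
miguel's point is that a store through a pre-resolved table index skips the lookup that a store by name has to repeat on every call. The toy model below only illustrates that difference. `struct frame`, `set_by_index` and `set_by_name` are hypothetical, and this is not how Tcl lays out its variable tables.

```c
#include <stddef.h>
#include <string.h>

#define NUM_SLOTS 4

/* Toy call frame: parallel arrays of variable names and values.
 * Unused slots have a NULL name. */
struct frame {
    const char *names[NUM_SLOTS];
    long values[NUM_SLOTS];
};

/* By index: the slot was resolved ahead of time, so the store is a bounds
 * check plus one array write. Returns 0 on success, -1 on a bad argument. */
int set_by_index(struct frame *f, size_t slot, long value)
{
    if (f == NULL || slot >= NUM_SLOTS) {
        return -1;
    }
    f->values[slot] = value;
    return 0;
}

/* By name: every call first has to find the slot by comparing strings.
 * Returns the slot index that was written, or -1 if the name is unknown. */
int set_by_name(struct frame *f, const char *name, long value)
{
    size_t i;

    if (f == NULL || name == NULL) {
        return -1;
    }
    for (i = 0; i < NUM_SLOTS; i++) {
        if (f->names[i] != NULL && strcmp(f->names[i], name) == 0) {
            f->values[i] = value;
            return (int)i;
        }
    }
    return -1;
}
```

A real interpreter would use a hash table rather than a linear scan for the by-name path, but the extra work per call relative to the indexed store is the same kind of cost.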
  90. <lm> Damon, I think you are hitting the reasons why it is slow. The channels are bad but variables are also bad.
  91. <Damonc> Tcl channels doubled the time.
  92. <Damonc> 1.4 seconds or so.
  93. <miguel> variable handling will be at most 10-20%, I'd guess
  94. <lm> Miguel, in L at least, we could optimize the variable handling, could we not?
  95. <Damonc> Which, being a guy who doesn't do crazy file handling, I'm not necessarily complaining, but it still sucks when I look at how fast it COULD be (with other languages).
  96. <miguel> I could get something done in Tcl, I think. [gets], [regexp], [regsub] ...
  97. <Damonc> miguel: Can we expose the faster command that set is using?
  98. <Damonc> Or, is it just looking it up by-hand and not with a function?
  99. <miguel> ... although varName caching got better, it can't match the other stuff
  100. <Damonc> miguel: We set out to just open a file, read it line-by-line and close the file.
  101. <miguel> I am not really familiar with the [gets] part ... just talking about how the result is communicated back to the scripts
  102. <Damonc> When I write it pure Tcl, I get almost 2 seconds.
  103. <Damonc> Blech.
  104. * hobb1 (n=jeffh@209.17.146.129) has joined ##l
  105. <lm> Hey Jeff.
  106. <hobb1> yo
  107. <lm> Lemme cut and paste the backlog.