Posted to tcl by lm at Fri Nov 02 21:02:30 GMT 2007
- <miguel> foo{"a"} =~ s/b/c/;
- <miguel> L Error: type error, want L_TYPE_STRING, got L_TYPE_POLY
- <miguel> what am I doing wrong?
- <miguel> (string) foo{"a"} =~ s/b/c/;
- <miguel> Program terminated with signal 11, Segmentation fault.
- <miguel> #0 0x080da839 in HashStringKey (tablePtr=0x82e7e18, keyPtr=0x0) at /home/TGZ/bitkeeper/tcl-l/unix/../generic/tclHash.c:912
- <miguel> 912 for (c=*string++ ; c ; c=*string++) {
- <miguel> (gdb) bt 5
- <miguel> #0 0x080da839 in HashStringKey (tablePtr=0x82e7e18, keyPtr=0x0) at /home/TGZ/bitkeeper/tcl-l/unix/../generic/tclHash.c:912
- <miguel> #1 0x080d9eb1 in Tcl_CreateHashEntry (tablePtr=0x82e7e18, key=0x0, newPtr=0x0) at /home/TGZ/bitkeeper/tcl-l/unix/../generic/tclHash.c:284
- <miguel> #2 0x080d9e25 in Tcl_FindHashEntry (tablePtr=0x82e7e18, key=0x0) at /home/TGZ/bitkeeper/tcl-l/unix/../generic/tclHash.c:234
- <miguel> #3 0x08165687 in L_get_symbol (name=0x82ecf48, error_p=1) at /home/TGZ/bitkeeper/tcl-l/unix/../generic/Lcompile.c:2362
- <miguel> #4 0x081655f7 in L_get_local_symbol (name=0x82ecf48, error_p=1) at /home/TGZ/bitkeeper/tcl-l/unix/../generic/Lcompile.c:2344
- <miguel> (
- <miguel> -poly ...
- <mcvoy> back.
- <miguel> hi
- <mcvoy> hashes are all type poly, they are like tcl strings. Try (string)foo{"a"} =~ ....
- <miguel> (old) paper says they are strings, threw me off
- <miguel> heh - that one caused a segfault
- <miguel> not hunting it right now; do we have a bug db?
- <mcvoy> Use bk sendbug
- <miguel> ok
- <miguel> lm mcvoy here?
- <miguel> Q: do we care about the precise stringRep of (say) arrays in L?
- <lm> I'm not sure what the question means?
- <miguel> where it comes in: is it a problem (please say "no") if ... just a sec
- <lm> We want to be able to pass stuff between L and tcl. At least I believe we do.
- <lm> Seems limiting if not.
- <miguel> sure: in Tcl you can have different string reps for "the same" list: {1 2 3} and { 1 2 3} are the same list
- <lm> OK, so long as a list is something tcl can parse then I don't care. Or don't know why I should. What difference does it make?
- <miguel> the thing is: assume I have the second rep, try to do (say) "a[4] +=1", get the error ... but now a has the first string rep. Is that OK?
- <lm> So we "normalize" "weird" lists? Yeah, seems fine. L doesn't know how arrays are implemented.
- <miguel> I do not think we care in L; in Tcl we are extremely careful about not changing/spoiling the string rep in the case of errors
- * Damonc (n=Damon@c-98-194-24-194.hsd1.tx.comcast.net) has joined ##l
- <Damonc> What up?
- <miguel> Diff could only be (maybe) in error messages
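The string-rep point discussed above can be reproduced in plain Tcl. This is a minimal sketch rather than anything from the log: the variable names are illustrative, and the `{*}` expansion syntax assumes Tcl 8.5 or later.

```tcl
# Two different string reps of "the same" list.
set a {1 2 3}
set b { 1  2 3}

# As strings they differ; as lists they hold the same elements.
puts [string equal $a $b]                    ;# prints 0
puts [expr {[list {*}$a] eq [list {*}$b]}]   ;# prints 1

# A failing list operation (index 10 is out of range for a
# 3-element list) is caught here; print the value afterwards to
# see whether the original spacing survived the error.
catch {lset b 10 x}
puts "<$b>"

# A successful list modification regenerates the canonical string rep.
lappend b 4
puts "<$b>"                                  ;# prints <1 2 3 4>
```

The question in the log is whether L needs to preserve the pre-error spacing the way Tcl does; the answer given is that normalizing is fine.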
- <miguel> hey damon
- <Damonc> Hey miguel. 0-]
- <Damonc> So, this testing is proving to be rather depressing.
- <lm> So Damon, I called jeff and offered to bribe him with mac store goodies if he looked at the gets stuff.
- <lm> Oh, it's getting better, hang on.
- <lm> Test  Perl  Python  Ruby  Tcl    L     L vs perl
- <lm> cat   0.9   0.8     1.8   2.7    2.7   3.0x
- <lm> grep  0.9   3.0     2.1   3.9    3.9   4.3x
- <lm> hash  1.4   1.0     3.5   2.5    2.4   1.7x
- <lm> 9.3M 8.1M 20.3M 12.3M (data size was 3.3M)
- <lm> loop  0.3   0.3     0.52  0.13   0.18  0.6x (whoohoo)
- <lm> proc  0.7   0.4     1.0   0.8    0.8   1.1x
- <lm> sort  4.9   4.4     10.6  11.1   9.4   1.9x
- <lm> 2.1x on average
- <Damonc> Oh, Perl smokes us fo' sho'.
- <Damonc> I dug in and learned a little Perl for the first time this week to do some testing of my own. That's what's depressing.
- <lm> Yes but it is mostly gets and some regexp.
- <lm> Oh, yeah, perl is _fast_. Usually only somewhat worse than C.
- <Damonc> Well, when I wrote the whole thing in C with stdio and just Tcl_* APIs to set variables, we're still at like .8 seconds to do the same thing in Perl in .3 or .4.
- <Damonc> Which is just wholly bumming me out. 0-]
- <lm> Good, share those tests with me & jeff. I think that's useful.
- <Damonc> Nod... hang on.
- <miguel> the key to perf is to avoid variables (especially if called by name) and return values
- <Damonc> I wrote a forlines proc that accepts a var name and a body. Basically a foreach for lines in a file.
- <Damonc> The key to performance SHOULD NOT be "don't use variables." That's lame. 0-]
- <miguel> varname as Tcl_Obj, I hope?
- <Damonc> Yes.
- <Damonc> I wrote it in C.
- <Damonc> Using stdio as the file handling.
- <Damonc> One call to Tcl_ObjSetVar2() and Tcl_EvalObjEx() per line.
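Damon's C source is not in the log, so the following is only a pure-Tcl approximation of the `forlines` idea he describes: set a variable in the caller and evaluate a body once per line. The argument order (`varName path body`) is an assumption, and `try`/`finally` and `file tempfile` assume Tcl 8.6 or later.

```tcl
# Hypothetical pure-Tcl stand-in for the C-coded forlines command.
# varName: name of the caller's variable that receives each line
# path:    file to read
# body:    script evaluated in the caller's scope once per line
proc forlines {varName path body} {
    upvar 1 $varName line
    set f [open $path r]
    try {
        while {[gets $f line] >= 0} {
            # break/continue raised by the body are handled by this
            # enclosing while loop, like in a normal foreach.
            uplevel 1 $body
        }
    } finally {
        close $f
    }
}

# Usage: count the lines of a small temporary file.
set out [file tempfile tmpPath]
puts $out "one\ntwo\nthree"
close $out

set n 0
forlines line $tmpPath { incr n }
puts "read $n lines"
file delete $tmpPath
```

The C version in the log does the equivalent of the `gets`/`uplevel` pair with one `Tcl_ObjSetVar2()` and one `Tcl_EvalObjEx()` call per line.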
- <lm> Yeah, I know you can make things go fast with tcl tricks but that is cheating. We need to fix the infrastructure and hold our heads high. Stay the course, a thousand points of light, we will find the weapons of mass destruction :)
- <Damonc> .8 seconds to process over a million lines. Which isn't horrible, but it ain't as fast as Perl, which did it in .3 seconds.
- <Damonc> NO MORE WIRE HANGERS!
- <Damonc> Oh, wait. What?
- <lm> Damon, was that both input and output?
- <Damonc> That's just input. The loop body was null.
- <miguel> the idiom 'while {![eof $f]} {set x [gets $f]; ...}' is way faster than 'while {[gets $f x] >= 0} {...}'
- <Damonc> miguel: But why?
- <miguel> (if within a proc)
- <Damonc> A normal user would not write it that way.
- <lm> BTW, where are rlogin/rsh in ports?
- <Damonc> I never have.
- <lm> Damon, exactly.
- <Damonc> 514
- <Damonc> I think
- <lm> 514?
- <miguel> because set has access to the var by indexing into a table, and gets does not
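The two idioms miguel compares can be put side by side as below. This is a sketch for measuring on your own build, not a reproduction of anyone's benchmark from the log: the proc names, the generated test file, and the iteration counts are all made up for illustration, and `file tempfile` assumes Tcl 8.6 or later. The relative timings may well differ on current Tcl versions.

```tcl
# Idiom 1: test eof, then use the one-argument form of gets and
# store the result with set (a compiled local-variable access).
# Note: this form also handles the final empty result returned by
# the gets call that hits end of file.
proc count_eof {path} {
    set f [open $path r]
    set n 0
    while {![eof $f]} {
        set x [gets $f]
        incr n
    }
    close $f
    return $n
}

# Idiom 2: two-argument gets, which is handed the variable by name.
proc count_gets {path} {
    set f [open $path r]
    set n 0
    while {[gets $f x] >= 0} {
        incr n
    }
    close $f
    return $n
}

# Build a throwaway input file and time both procs.
set out [file tempfile benchPath]
for {set i 0} {$i < 100000} {incr i} {
    puts $out "line $i"
}
close $out

puts "eof + set : [time {count_eof $benchPath} 5]"
puts "gets var  : [time {count_gets $benchPath} 5]"
file delete $benchPath
```

Both loops live inside procs because, as noted in the log, the difference only applies there: that is where `set` works on an indexed local variable.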
- <lm> Can we fix gets?
- <lm> Jeff was trying to make that byte coded and hit some wall w/ error handling.
- <Damonc> miguel: But why must it be so slow when all done in C? I'm not even at the Tcl level here.
- <Damonc> And I'm not even getting into Tcl's channels. I'm using straight stdio for the file handling. When I wrote it with Tcl channels, it got way worse.
- <miguel> because you are accessing the vars by name (Tcl_ObjSetVar*), and set has access to a faster api
- <lm> Damon, I think you are hitting the reasons why it is slow. The channels are bad but variables are also bad.
- <Damonc> Tcl channels doubled the time.
- <Damonc> 1.4 seconds or so.
- <miguel> variable handling will be at most 10-20%, I'd guess
- <lm> Miguel, in L at least, we could optimize the variable handling, could we not?
- <Damonc> Which, being a guy who doesn't do crazy file handling, I'm not necessarily complaining, but it still sucks when I look at how fast it COULD be (with other languages).
- <miguel> I could get something done in Tcl, I think. [gets], [regexp], [regsub] ...
- <Damonc> miguel: Can we expose the faster command that set is using?
- <Damonc> Or, is it just looking it up by-hand and not with a function?
- <miguel> ... although varName caching got better, it can't match the other stuff
- <Damonc> miguel: We set out to just open a file, read it line-by-line and close the file.
- <miguel> I am not really familiar with the [gets] part ... just talking about how the result is communicated back to the scripts
- <Damonc> When I write it pure Tcl, I get almost 2 seconds.
- <Damonc> Blech.
- * hobb1 (n=jeffh@209.17.146.129) has joined ##l
- <lm> Hey Jeff.
- <hobb1> yo
- <lm> Lemme cut and paste the backlog.