Posted to tcl by aspect at Tue Jul 08 05:26:25 GMT 2014

The bad combination is tls::socket + [fconfigure -blocking 0] + [chan push].  The result is [flush] raising EINVAL on the first attempt to write to the socket.

Here's a minimal test case:

  puts Tcl:[info patchlevel]
  puts tls:[package require tls]

  namespace eval nopchan {
    proc initialize args {info procs}
    proc finalize args {}
    proc clear args {}
    proc write {h data} {
      puts [list ::nopchan::write $data]
      return $data
    }
    namespace export *
    namespace ensemble create
  }

  set sock [tls::socket google.com 443]
  chan push $sock nopchan
  chan configure $sock -blocking 0
  puts -nonewline $sock "GET"
  puts "Calling flush"
  flush $sock
  # flush: error writing "sockb41ef0": invalid argument

Reproduced with tcl trunk + tls trunk, both with and without dgp's patches at http://core.tcl.tk/tcl/info/c31ca233cacd7d6184877c54182f4b3a6ccb30f1
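
As a control, here is a sketch of the same no-op transform pushed onto a plain OS pipe instead of a tls::socket, to check that [chan push] + non-blocking mode works without tls in the picture. The use of [chan pipe] and the variable names rd, wr and got are my own choices for illustration, not part of the original test case, and this needs Tcl 8.6.

```tcl
# Same no-op transform as in the test case above.
namespace eval nopchan {
  proc initialize args {info procs}
  proc finalize args {}
  proc clear args {}
  proc write {h data} {
    puts [list ::nopchan::write $data]
    return $data
  }
  namespace export *
  namespace ensemble create
}

# Control: a plain pipe takes the place of tls::socket.
# [chan pipe] returns the read side first, then the write side.
lassign [chan pipe] rd wr
chan push $wr nopchan
chan configure $wr -blocking 0
puts -nonewline $wr "GET"
puts "Calling flush"
flush $wr

# Read the three bytes back from the other end of the pipe.
set got [read $rd 3]
puts "read back: $got"
```

If this flushes cleanly, the transform and the non-blocking setting are fine on their own and the failure needs the tls channel underneath.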


Building tls with -NDEBUG yields interesting results.  All failing cases reduce to this:

  Calling flush
  ::nopchan::write GET
  BIO_write(0xdc11b0, 15)
  WaitForConnect(0xdc11b0)
  BioWrite(0xdc1b30, <buf>, 150) [0xdc13b0]
  [0xdc13b0] BioWrite(150) -> 150 [0.0]
  BioCtrl(0xdc1b30, 0x6, 0x0, 0xdc1830)
  BioRead(0xdc1b30, <buf>, 5) [0xdc13b0]
  [0xdc13b0] BioRead(5) -> -1 [0.11]ERR(5, 11) 
  Output(15) -> -1
  error writing "sockdc1330": invalid argument


Trying to decouple WaitForConnect, I made a test case using [tls::socket -async]:

  # using same nopchan from above:
  proc test {} {
    set sock [tls::socket -async google.com 443]
    chan push $sock nopchan
    fconfigure $sock -blocking 0
    fileevent $sock writable [info coroutine]
    yield
    puts "CONNECTED"
    fileevent $sock writable {}
    puts -nonewline $sock GET
    puts "Calling flush"
    flush $sock
  }

  fconfigure stdout -buffering line
  coroutine coro test
  vwait forever

This exhibits exactly the same behaviour on tls trunk - including WaitForConnect only being called after [write]:

  Tcl:8.6.1
  tls:1.6.3
  TlsWatchProc(0x4)
  CONNECTED
  TlsWatchProc(0x0)
  Calling flush
  nopchan::write GET

  BIO_write(0x28c8c30, 3)
  WaitForConnect(0x28c8c30)
  BioWrite(0x28c94b0, <buf>, 150) [0x28c8d30]
  [0x28c8d30] BioWrite(150) -> 150 [0.0]
  BioCtrl(0x28c94b0, 0x6, 0x0, 0x28c91b0)
  BioRead(0x28c94b0, <buf>, 5) [0x28c8d30]
  [0x28c8d30] BioRead(5) -> -1 [0.11]ERR(5, 11) 
  Output(3) -> -1TlsWatchProc(0x0)
  error flushing "sock28c8cb0": invalid argument


But with dgp's patches, it succeeds:

  Tcl:8.6.1
  tls:1.6.3
  TlsWatchProc(0x4)
 
  WaitForConnect(0x28ccc10)
  BioWrite(0x28cd490, <buf>, 150) [0x28ccd10]
  [0x28ccd10] BioWrite(150) -> 150 [0.0]
  BioCtrl(0x28cd490, 0x6, 0x0, 0x28cd190)
  BioRead(0x28cd490, <buf>, 5) [0x28ccd10]
  [0x28ccd10] BioRead(5) -> -1 [0.11]ERR(5, 11)
: WaitForConnect(0x28ccc10)
: BioRead(0x28cd490, <buf>, 5) [0x28ccd10]
: [0x28ccd10] BioRead(5) -> -1 [0.11]ERR(5, 11)

 (lines marked : repeated several hundred times)

  WaitForConnect(0x28ccc10)
  BioRead(0x28cd490, <buf>, 5) [0x28ccd10]
  [0x28ccd10] BioRead(5) -> 5 [0.0]
  BioRead(0x28cd490, <buf>, 1) [0x28ccd10]
  [0x28ccd10] BioRead(1) -> 1 [0.0]
  BioRead(0x28cd490, <buf>, 5) [0x28ccd10]
  [0x28ccd10] BioRead(5) -> 5 [0.0]
  BioRead(0x28cd490, <buf>, 60) [0x28ccd10]
  [0x28ccd10] BioRead(60) -> 60 [0.0]
  BioCtrl(0x28cd490, 0x7, 0x0, 0x28cd190)
  BioCtrl(0x28cd490, 0xb, 0x0, 0x0)BIO_CTRL_FLUSH
  R0! CONNECTED
  TlsWatchProc(0x0)
  Calling flush
  nopchan::write GET

  BIO_write(0x28ccc10, 3)
  BioWrite(0x28cd490, <buf>, 28) [0x28ccd10]
  [0x28ccd10] BioWrite(28) -> 28 [0.0]
  BIO_write(0x28ccc10, 3) -> [3]
  Output(3) -> TlsWatchProc(0x0)



Possibly another clue, if it's not clear from the above: only the first attempt to flush on a newly connected, async (non-blocking) tls channel fails.  The following code, which connects and writes once before making the socket async, succeeds:

  set sock [tls::socket google.com 443]
  puts -nonewline $sock G
  flush $sock    ;# succeeds - socket is not async
  fconfigure $sock -blocking 0
  puts -nonewline $sock ET
  flush $sock    ;# succeeds - socket is fully connected
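
A further experiment this suggests, which has not been run here: force the TLS handshake with tls::handshake while the socket is still blocking, and only then push the transform and switch to non-blocking. If the unfinished handshake is what trips the first flush, this ordering might avoid the EINVAL, but that is a guess. try_handshake_first is a hypothetical helper name, and host and port are whatever server you test against.

```tcl
# Same no-op transform as in the minimal test case.
namespace eval nopchan {
  proc initialize args {info procs}
  proc finalize args {}
  proc clear args {}
  proc write {h data} {
    puts [list ::nopchan::write $data]
    return $data
  }
  namespace export *
  namespace ensemble create
}

# Hypothetical helper: complete the handshake first, then push the
# transform and go non-blocking. Untested against the bug above.
proc try_handshake_first {host port} {
  # Loaded here so that defining the proc does not itself need tls.
  package require tls
  set sock [tls::socket $host $port]
  # On a blocking channel this returns once the handshake has finished,
  # or raises an error if the handshake fails.
  tls::handshake $sock
  chan push $sock nopchan
  chan configure $sock -blocking 0
  puts -nonewline $sock "GET"
  puts "Calling flush"
  flush $sock
  return $sock
}
```

Usage would be something like [try_handshake_first google.com 443]; comparing the debug trace with the failing ones above should show whether WaitForConnect still runs inside the first write.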