ahead of processing data arrived on the same socket, with EnqueueKill
that adds to same queue from which data's taken. So if device dies
immediately after sending data there won't be a race between closing
the cref (if this is the last open socket) and handling the data. I'm
still dying with assert fails when running 100 games at once, but much
less frequently
another wanted to operate on them. The root problem is that you can't
dispose of a mutex while somebody's blocking on it. So now the
locking mutexes live inside the cref class. When the lock owner
realizes the cref needs to die, it sets a flag and it's moved to a
recycled list. A thread blocking on the mutex will then get it, but
checks the flag and releases it immediately if it's being recycled.
(Also improve the http interface a bit.) With these changes I've run
31K (and counting) games against the relay without a crash or deadlock
(using sim_real.sh.) The main problem that remains is that sometimes
two games using the same cookie wind up with two crefs (and so never
connect.)
change the set listened on. There's still some debugging to do but
nothing that worked before is broken. Also begin to accept unique
prefixes (e.g. g for get) for commands and attributes on the control
port. Note that relay-related code in comms seems broken now, but is
without this checkin.
scheme where cookie is used only to connect, and is replaced for
reconnects by a relay-generated name that's supposed to be unique
across all games on all relays and includes a hostname read in from
config file; relay assign non-servers' hostIDs.
kill crefs via state machine, and protect access to a cref so it can
die without another thread being in it; do timers via timeout to
poll() rather than interrupt (and integrate into state machine);
detect when all players are present and change state so new
connections on that cookie will get a new cref.