Summary: | test many simultaneous connections | ||
---|---|---|---|
Product: | Telepathy | Reporter: | Simon McVittie <smcv> |
Component: | fargo | Assignee: | David Laban <david.laban> |
Status: | RESOLVED FIXED | QA Contact: | Simon McVittie <smcv> |
Severity: | enhancement | ||
Priority: | high | ||
Version: | git master | ||
Hardware: | Other | ||
OS: | All | ||
Whiteboard: | milestone4.9; done=20h; est=30h | ||
Bug Depends on: | 26142 | ||
Bug Blocks: | 26257, 26277 |
Description
Simon McVittie
2010-01-18 07:28:14 UTC
Using this bug to represent the actual testing, est=5h. I'll open another bug for using a pool of CMs, which might be necessary for scalability or robustness, or might be descoped.

This is taking considerably longer than I'd hoped; reasonable numbers of parallel connections end up hitting what appear to be race conditions in the stress-test script.

done+=5h est+=5h... little progress to show for today's work on this.

The manual loopback test has some problems. Since it's the building block for the multi-loopback stress test, I'm trying to make it more stable. One symptom is as follows:

* run the manual loopback test: it passes
* delete everything from the parameters table in the database
* run the manual loopback test (it registers its user pair, which is needed this time): it fails
* run the manual loopback test again (it registers its user pair, which is a no-op this time): it passes

It appears that the receiver doesn't get the incoming channel, although there's currently no debug of NewChannels so it's hard to get a good idea of where the failure is. My first job for tomorrow is to add more debug.

A note for anyone else hacking on this: it turns out that

* telepathy-sofiasip doesn't reliably die with with-session-bus.sh
* openser can get confused in the aftermath of a failed test (?)

so it's worth killing all related processes and restarting openser if in doubt.

Still not making significant progress here, unfortunately...
Today's progress:

* Discovered that for "a while" (~ a minute) after startup, the loopback test script doesn't work; possibly ejabberd and Fargo handshaking
* Many test runs with 10-25 users
* Improved debug logging
* Disabled the registration part of the test, and instead registered users in the database directly, to get rid of a point of fragility
* Disabled reconnecting (3 calls x 3 connections) and just did 10 calls on a single connection, to get rid of another point of fragility
* The usual failure mode seems to be that session-initiate isn't sent to the receiving XMPP client, possibly because SetRemoteCodecs hasn't been emitted by the receiver's telepathy-sofiasip instance (?)

Unaddressed review complaints for smcv/stress2 for reference:

18:39 < alsuren> +echo "delete from parameters;" | psql ${DB:-tpfargo} -- could you just make DB be $4 or tpfargo when you set it at the top of the file?

18:34 < alsuren> + (the script will exit successfully after//+ CALLS_PER_CONNECTION * CONNECTIONS calls) -- do you mean that * (n_max - n_min)?

http://git.collabora.co.uk/?p=user/smcv/telepathy-fargo.git;a=commitdiff;h=86cea12ed22104b4cc521024613affb978e852a5 -- would be clearer if you didn't have to glue DelayedCall objects onto WaitState object, or if you had helpers to do it for you.

otherwise, stress3 and stress4++

Swapping assignee, David's better at profiling Python than I am.

Merging stress3 and stress4 with one unaddressed review complaint, which I think is addressed by <http://git.collabora.co.uk/?p=user/smcv/telepathy-fargo.git;a=shortlog;h=refs/heads/waiter>; please review?

Merged smcv/waiter too.

Seems to cope easily with a sustained rate of 1 call setup/teardown per second, now that we fixed http://bugs.freedesktop.org/show_bug.cgi?id=26698
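
For reference, the 18:39 review comment about `${DB:-tpfargo}` could be addressed roughly as sketched below. This is only an illustration, not the code that was merged: `resolve_db` is a hypothetical helper name, and it assumes the database name really is the fourth positional argument of the script, as the comment suggests. The idea is to apply the default once at the top of the file so later commands can use a plain `$DB`.

```shell
#!/bin/sh
# Hypothetical helper (not from the actual script): pick the database name once.
# Assumption: the database name is the 4th positional argument.
resolve_db() {
    # Use $4 if it is set and non-empty, otherwise fall back to tpfargo.
    printf '%s\n' "${4:-tpfargo}"
}

# At the top of the script: resolve DB from the script's own arguments.
DB=$(resolve_db "$@")

# Later uses no longer need to repeat the default, e.g.:
#   echo "delete from parameters;" | psql "$DB"
```

With this shape, `resolve_db a b c mydb` prints `mydb`, and calling it with fewer than four arguments prints `tpfargo`.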