UPDATE 2011-05-05 16:17 PST: It looks like this has been resolved!
UPDATE 2011-01-16 16:26 PST: I’ve updated the data collection gist to show conversion to CSV in case that’s useful. I’ve also included the code I used to calculate the statistics for each data set.
I use Ruby at work for systems administration tasks on a daily basis. One particularly useful tool in my arsenal is Capistrano. However, as I began to transition my daily taks to Ruby 1.9 I noticed Capistrano was running slowly. Tasks that normally took on the order of a couple seconds could take tens of seconds and sometimes even minutes. I do not yet know what the root cause of the problem is, and I’m not even sure anyone else is having such a problem. After some hours of Googling, I have yet to see a single page describing anything remotely like this. I’m posting this in case anyone else has noticed something similar.
The problem appears to only occur when I am using an ssh gateway, defined in my Capfile like so:
With that line, Capistrano connects via ssh to host.example.com first and from there connects via ssh to the target host. (Without the gateway defined, Capistrano will connect via ssh directly to the target host.) With an ssh gateway defined in my Capfile tasks run under Ruby 1.9.2 are painfully slow. I collected some statistics quantifying the problem.
I started with the following dead simple Capistrano task:
This task runs the uptime command on five different servers running Debian Linux. I ran the task 100 times in succession on a 2.6 GHz Core 2 Duo MacBook Pro running OS X 10.6.6 using Ruby Enterprise Edition 2010.02 and Ruby 1.9.2p136 both installed via RVM. I ran the tests with and without an ssh gateway. In each case I had capistrano 2.5.19, net-ssh 2.0.24 and net-ssh-gateway 1.0.1 installed. I timed each run.
The raw data for all the runs are available as a gist. I used this code to calculate the stats. Below is the analysis of those figures. First, running Ruby Enterprise Edition through an ssh gateway:
| real | user | sys | |
| min | 2.50s | 0.50s | 0.15s |
| max | 3.93s | 0.60s | 0.21s |
| mean | 2.61s | 0.56s | 0.17s |
| stddev | 0.17s | 0.02s | 0.01s |
| median | 2.56s | 0.56s | 0.17s |
And running the same task under Ruby Enterprise Edition without an ssh gateway:
| real | user | sys | |
| min | 2.54s | 0.47s | 0.13s |
| max | 3.13s | 0.62s | 0.18s |
| mean | 2.61s | 0.51s | 0.14s |
| stddev | 0.11s | 0.02s | 0.01s |
| median | 2.58s | 0.51s | 0.14s |
And now the same task under Ruby 1.9.2 without an ssh gateway:
| real | user | sys | |
| min | 2.29s | 0.64s | 0.21s |
| max | 3.35s | 0.72s | 0.23s |
| mean | 2.41s | 0.66s | 0.21s |
| stddev | 0.11s | 0.01s | 0.00s |
| median | 2.42s | 0.66s | 0.21s |
And finally the same task under Ruby 1.9.2 through an ssh gateway:
| real | user | sys | |
| min | 3.54s | 0.70s | 0.23s |
| max | 279.17s | 1.83s | 1.11s |
| mean | 66.80s | 0.96s | 0.43s |
| stddev | 58.61s | 0.22s | 0.18s |
| median | 50.64s | 0.92s | 0.39s |
So REE with and without a gateway and 1.9.2 without a gateway each took about 2.5 seconds to complete the task. That seems reasonable to me. But cripes, the average runtime for that task under 1.9.2 with a gateway was 66.80 seconds, and the peak runtime was 279.17 seconds! The shortest runtime was an almost forgiveable 3.54 seconds. If you look at the data for that run you will see the substantial swings in runtime. The standard deviation was a moody 58.61 seconds. With a median runtime of 50.64 seconds, it would seem most of the runs are shorter than average, but there are several exceptionally slow runs. Fourteen runs took two minutes or longer.
Again, apart from knowing it has something to do with using an ssh gateway, I haven’t identified the actual problem yet. If you know what I might be doing wrong, I’d love to know. If you don’t know what’s going on, but you’re having a similar problem, well, at least you’re not alone. In the meanwhile, I’ve modified my Capfile to let me conditionally enable the ssh gateway:
This allows me to continue using Ruby 1.9 for daily systems administration tasks while I’m on a trusted network. When working from an untrusted network, I can throw
© 2012 John Parker | Valid XHTML 1.0 Strict | Valid CSS 2.1