I recently sunk the better part of a day into running down strange behavior in Ruby’s Net::HTTP. I wound up staring at tcpdump output, reading OpenSSL’s C sources, patching my local Ruby installation and finally ended up at a nearly 2 year old bug report. Let’s dive in.

At Kenna, some of our client integrations require setting up a virtual tunnel into their data center. Traffic sent to a known port on a specific host inside of our VPC gets securely forwarded to our client’s network, where it’s routed on to its final destination. I was working on integrating a new client, and to ease the development and debugging process, I opened a second tunnel to send traffic from my laptop to the correct host and port in our VPC, which would then be forwarded on to our client’s network.

It took awhile to get my head around all of the tunneling and port forwarding, but once the tunnels were up, I could show that there was connectivity. A bogus request to the XML RPC API returned a reasonable error code:

$ > curl -k -H "Host: Ap00463.example.com" https://127.0.0.1:9000/api2xmlrpc
<?xml version="1.0" encoding="utf-8"?>
<methodResponse><fault><value>
<struct>
<member><name>faultCode</name><value><int>1120</int></value></member>
<member><name>faultString</name><value><string>Method not implemented: </string></value></member>
</struct>
</value></fault></methodResponse>

There were only two hiccups along the way:

  • The host’s certificate was expired, so I needed to pass the -k flag to cURL, to skip certificate checks.
  • I needed to spoof the Host header, presumably because the destination serves multiple virtual hosts.

I expected that configuring the tunnels would be the hard part, and that integrating with the API via Ruby’s built in XMLRPC::Client would be straightforward.

Other than its multiple constructors, the Client object exposes a simple interface and convienently allows you direct access to the underlying Net::HTTP object. That turned out to be necessary, because although you can set extra headers on the Client object directly, the directive skip certificate verification is given to the Net::HTTP object. The necessary configuration is thus:

irb(main):001:0> require "xmlrpc/client"
=> true
irb(main):002:0> client = XMLRPC::Client.new("127.0.0.1", "/api2xmlrpc", 9000, nil, nil, nil, nil, true)
=> #<XMLRPC::Client:0x0055f732cd1a48 @http_header_extra=nil, @http_last_response=nil, @cookie=nil, @host="127.0.0.1", @path="/api2xmlrpc", @proxy_host=nil, @proxy_port=nil, @use_ssl=true, @timeout=30, @port=9000, @password=nil, @user=nil, @auth=nil, @http=#<Net::HTTP 127.0.0.1:9000 open=false>, @parser=nil, @create=nil>
irb(main):003:0> client.http.verify_mode = OpenSSL::SSL::VERIFY_NONE
=> 0
irb(main):004:0> client.http_header_extra = { "Host" => "Ap00463.example.com", "Location" => "Ap00463.example.com" }
=> {"Host"=>"Ap00463.example.com", "Location"=>"Ap00463.example.com"}

(Unrelated to the greater narrative, I discovered that requests originating from Ruby needed to specify both Host and Location. I didn’t investigate why.)

However, an attempt to login generated an HTTP error:

irb(main):005:0> client.call("login", 2, 0)
RuntimeError: HTTP-Error: 400 Bad Request

Suspecting that Ruby was somehow overriding the Host header, I modified my local /etc/hosts file so that Ap00463.example.com would resolve to 127.0.0.1. With that change in place, everything started working as expected:

irb(main):001:0> require "xmlrpc/client"
=> true
irb(main):002:0> client = XMLRPC::Client.new("Ap00463.example.com", "/api2xmlrpc", 9000, nil, nil, nil, nil, true)
=> #<XMLRPC::Client:0x0055f732daa690 @http_header_extra=nil, @http_last_response=nil, @cookie=nil, @host="Ap00463.example.com", @path="/api2xmlrpc", @proxy_host=nil, @proxy_port=nil, @use_ssl=true, @timeout=30, @port=9000, @password=nil, @user=nil, @auth=nil, @http=#<Net::HTTP Ap00463.example.com:9000 open=false>, @parser=nil, @create=nil>
irb(main):003:0> client.http.verify_mode = OpenSSL::SSL::VERIFY_NONE
=> 0
irb(main):004:0> client.http_header_extra = { "Host" => "Ap00463.example.com", "Location" => "Ap00463.example.com" }
=> {"Host"=>"Ap00463.example.com", "Location"=>"Ap00463.example.com"}
irb(main):005:0> client.call("login", 2, 0)
XMLRPC::FaultException: Login expects 4 - 5 parameters, got: 2

Assuming that I was right about the Host header being overwritten, I dug into the Net::HTTP source. A little bit of code reading convinced me that the Host header was not being overwritten (In retrospect, that would have been very unusual behavior by Ruby.)

The XMLRPC client should have been sending the same data as the cURL command, so I spent a long time monkeypatching methods and injecting a fake socket to capture the client’s output. This wound up being a waste of time, because I later learned that the Net::HTTP#set_debug_output method did exactly what I needed. In any event, I was able to establish that the same data was being sent by cURL and Ruby, but still had no explanation for why the Ruby requests were being rejected.

It was during the second or third time of explaining the problem to a coworker that I idly remarked “The Net::HTTP code sets hostname on the socket, maybe I can override that somehow.” I never got that to work – the assignment happens in the middle of a 50 line method – but it did get me wondering why it was even necessary. It was clear that the hostname had to be affecting the connection somehow and my last good idea was to figure out if the hostname got sent across the socket after being set.

I had been avoiding tcpdump since the connection was using TLS and would be encrypted, but didn’t see any other option by this point. The command I used to show the full packet was sudo tcpdump -nnvvXSs 1514 -i lo port 9000. Nearly the entire output was unintelligible, but there was an early packet that included the hostname in plaintext!

21:28:43.777727 IP (tos 0x0, ttl 64, id 30775, offset 0, flags [DF], proto TCP (6), length 267)
    127.0.0.1.58744 > 127.0.0.1.9000: Flags [P.], cksum 0xfeff (incorrect -> 0x0d89), seq 1179504795:1179505010, ack 874090358, win 342, options [nop,nop,TS val 169519279 ecr 169519279], length 215
	0x0000:  4500 010b 7837 4000 4006 c3b3 7f00 0001  E...x7@.@.......
	0x0010:  7f00 0001 e578 2328 464d d09b 3419 8f76  .....x#(FM..4..v
	0x0020:  8018 0156 feff 0000 0101 080a 0a1a a8af  ...V............
	0x0030:  0a1a a8af 1603 0100 d201 0000 ce03 0358  ...............X
	0x0040:  6329 a4f7 fa36 e4e5 8f7b d78a 5bbd 4583  c)...6...{..[.E.
	0x0050:  3c0f 1b9e 1f8c 2698 7f9f a669 7630 f900  <.....&....iv0..
	0x0060:  0044 c02b c02f c02c c030 009e 00a2 009f  .D.+./.,.0......
	0x0070:  00a3 c023 c027 c009 c013 c024 c028 c00a  ...#.'.....$.(..
	0x0080:  c014 0067 006b 0033 0039 0040 006a 0032  ...g.k.3.9.@.j.2
	0x0090:  0038 009c 009d 003c 003d 002f 0035 c007  .8.....<.=./.5..
	0x00a0:  c011 0005 00ff 0100 0061 0000 001a 0018  .........a......
	0x00b0:  0000 1541 7030 3034 3633 2eXX XXXX XXXX  ...Ap00463.examp
	0x00c0:  XXXX XXXX XXXX XXXX 000b 0004 0300 0102  le.com..........
	0x00d0:  000a 000a 0008 0017 0019 0018 0016 0023  ...............#
	0x00e0:  0000 000d 0020 001e 0601 0602 0603 0501  ................
	0x00f0:  0502 0503 0401 0402 0403 0301 0302 0303  ................
	0x0100:  0201 0202 0203 000f 0001 01              ...........

Requests to 127.0.0.1 from Ruby included the dotted IP address in roughly the same position, but none of my requests from cURL included a plaintext hostname. This was the most encouraging discovery of the day, I just needed to figure out why Ruby was sending the hostname when cURL wasn’t.

I knew that Net::HTTP#connect called SSLSocket#hostname= and since I was already off the deep end, digging into the source for SSLSocket to figure out how it used the hostname seemed like a reasonable thing to do. I pulled down the ruby/openssl repo from GitHub and started poking around. I couldn’t find a hostname= method definition for the SSLSocket class in openssl/ssl.rb, but a quick grep uncovered ext/openssl/ossl_ssl.c. It turns out that most of the SSLSocket class is defined in that C file, not in the Ruby file, and the ossl_ssl_set_hostname method held the clue to what was going on:

#ifdef HAVE_SSL_SET_TLSEXT_HOST_NAME
/*
 * call-seq:
 *    ssl.hostname = hostname -> hostname
 *
 * Sets the server hostname used for SNI. This needs to be set before
 * SSLSocket#connect.
 */
static VALUE
ossl_ssl_set_hostname(VALUE self, VALUE arg)
{
  ...
}

OpenSSL uses SNI to tell the server which certificate to use for the request. cURL doesn’t perform SNI when certificate checking is disable (because it doesn’t matter what cert is being used,) which explains why requests worked with cURL and not with Ruby.

I was convinced that this was the root cause (partly because I had no idea where else to look if it wasn’t) but still wanted to actually prove it. Naturally, the easiest way to do that was to patch my local ruby installation. Because I use rbenv, the ruby source on my machine lives at /home/bschmeck/.rbenv/versions/2.3.1/lib/ruby/2.3.0/. I edited net/http.rb, commenting out the line that assigns the hostname and adding a debugging output line to verify my changes were being used. That change was enough to solve my problem and requests sent to localhost were handled properly by the remote server. (Incidentally, this investigation would have been a lot shorter had I noticed the comment on the line above the hostname assignment: # Server Name Indication (SNI) RFC 3546.)

I am not the first person bitten by SNI in Ruby, an issue (with a proposed patch) was created in December of 2014. I’m hopeful that Ruby will eventually give us some way to toggle SNI, but for the time being we resorted to skipping the SSLSocket#hostname= call by monkeypatching the Net::HTTP#connect method for all connections to this host. It’s an ugly fix, but it lets us talk to the server!