Sunday, November 16, 2014

HTTP Request Timings with cURL

When a client makes an HTTP request, the following things happen:

  1. DNS name resolution
  2. SSL/SSH/etc connect/handshake to the remote host if applicable
  3. TCP connection to the remote host
  4. Negotiations specific to the particular protocol(s) involved
  5. Redirects if applicable
  6. Content generation on remote host
  7. Content transfer

Have you ever wanted to know how long each phase of an HTTP request takes?

It turns out this is relatively straightforward to do with the command line tool cURL and its powerful "Write Out" option.

Write Out Option

cURL comes with an option to print out a lot of useful information related to a request.  The following is taken directly from the cURL man page:

-w, --write-out

Defines what to display on stdout after a completed and successful operation. The format is a string that may contain plain text mixed with any number of variables. The string can be specified as "string", to get read from a particular file you specify it "@filename" and to tell curl to read the format from stdin you write "@-".

The variables present in the output format will be substituted by the value or text that curl thinks  fit, as described below. All variables are specified as %{variable_name} and to output a normal % you just write them as %%. You can output a newline by using \n, a carriage return with \r and a tab space with \t.

NOTE: The %-symbol is a special symbol in the win32-environment, where all occurrences of % must be doubled when using this option.

The variables available are:

VariableDescriptionSince
content_typeThe Content-Type of the requested document, if there was any.
filename_effectiveThe ultimate filename that curl writes out to. This is only meaningful if curl is told to write to a file with the --remote-name or --output option. It's most useful in combination with the --remote-header-name option.7.25.1
ftp_entry_pathThe initial path curl ended up in when logging on to the remote FTP server.7.15.4
http_codeThe numerical response code that was found in the last retrieved HTTP(S) or FTP(s) transfer. In 7.18.2 the alias response_code was added to show the same info.
http_connectThe numerical code that was found in the last response (from a proxy) to a curl CONNECT request.7.12.4
local_ipThe IP address of the local end of the most recently done connection - can be either IPv4 or IPv6.7.29.0
local_portThe local port number of the most recently done connection.7.29.0
num_connectsNumber of new connects made in the recent transfer.7.12.3
num_redirectsNumber of redirects that were followed in the request.7.12.3
redirect_urlWhen an HTTP request was made without -L to follow redirects, this variable will show the actual URL a redirect would take you to.7.18.2
remote_ipThe remote IP address of the most recently done connection - can be either IPv4 or IPv6.7.29.0
remote_portThe remote port number of the most recently done connection.7.29.0
size_downloadThe total amount of bytes that were downloaded.
size_headerThe total amount of bytes of the downloaded headers.
size_requestThe total amount of bytes that were sent in the HTTP request.
size_uploadThe total amount of bytes that were uploaded.
speed_downloadThe average download speed that curl measured for the complete download. Bytes per second.
speed_uploadThe average upload speed that curl measured for the complete upload. Bytes per second.
ssl_verify_resultThe result of the SSL peer certificate verification that was requested. 0 means the verification was successful.7.19.0
time_appconnectThe time, in seconds, it took from the start until the SSL/SSH/etc connect/handshake to the remote host was completed.7.19.0
time_connectThe time, in seconds, it took from the start until the TCP connect to the remote host (or proxy) was completed.
time_namelookupThe time, in seconds, it took from the start until the name resolving was completed.
time_pretransferThe time, in seconds, it took from the start until the file transfer was just about to begin. This includes all pre-transfer commands and negotiations that are specific to the particular protocol(s) involved.
time_redirectThe time, in seconds, it took for all redirection steps include name lookup, connect, pretransfer and transfer before the final transaction was started. time_redirect shows the complete execution time for multiple redirections.7.12.3
time_starttransferThe time, in seconds, it took from the start until the first byte was just about to be transferred. This includes time_pretransfer and also the time the server needed to calculate the result.
time_totalThe total time, in seconds, that the full operation lasted. The time will be displayed with millisecond resolution.
url_effectiveThe URL that was fetched last. This is most meaningful if you've told curl to follow location: headers.

Here's a sample cURL command to show each one of these properties while making a request to Google's home page:

curl -w '\ncontent_type=%{content_type}\nfilename_effective=%{filename_effective}\nftp_entry_path=%{ftp_entry_path}\nhttp_code=%{http_code}\nhttp_connect=%{http_connect}\nlocal_ip=%{local_ip}\nlocal_port=%{local_port}\nnum_connects=%{num_connects}\nnum_redirects=%{num_redirects}\nredirect_url=%{redirect_url}\nremote_ip=%{remote_ip}\nremote_port=%{remote_port}\nsize_download=%{size_download}\nsize_header=%{size_header}\nsize_request=%{size_request}\nsize_upload=%{size_upload}\nspeed_download=%{speed_download}\nspeed_upload=%{speed_upload}\nssl_verify_result=%{ssl_verify_result}\ntime_appconnect=%{time_appconnect}\ntime_connect=%{time_connect}\ntime_namelookup=%{time_namelookup}\ntime_pretransfer=%{time_pretransfer}\ntime_redirect=%{time_redirect}\ntime_starttransfer=%{time_starttransfer}\ntime_total=%{time_total}\nurl_effective=%{url_effective}\n\n' -o /dev/null -s 'https://www.google.com/'

Where:
  • -w shows which properties to write out
  • -o /dev/null redirects the output of the request to /dev/null
  • -s tells cURL not to show a progress bar
  • http://www.cnn.com/ is the URL we are requesting

Timing

From the options above, the following are of interest when it comes to timing the phases of a request.

time_appconnectThe time, in seconds, it took from the start until the SSL/SSH/etc connect/handshake to the remote host was completed.7.19.0
time_connectThe time, in seconds, it took from the start until the TCP connect to the remote host (or proxy) was completed.
time_namelookupThe time, in seconds, it took from the start until the name resolving was completed.
time_pretransferThe time, in seconds, it took from the start until the file transfer was just about to begin. This includes all pre-transfer commands and negotiations that are specific to the particular protocol(s) involved.
time_redirectThe time, in seconds, it took for all redirection steps include name lookup, connect, pretransfer and transfer before the final transaction was started. time_redirect shows the complete execution time for multiple redirections.7.12.3
time_starttransferThe time, in seconds, it took from the start until the first byte was just about to be transferred. This includes time_pretransfer and also the time the server needed to calculate the result.
time_totalThe total time, in seconds, that the full operation lasted. The time will be displayed with millisecond resolution.


If we consider a simple request to a non-SSL page that involves no redirects, there would be 4 main phases:

HTTP Request Phases: DNS Lookup, TCP Connection, Content Generation, and Content Transfer
HTTP Request Phases



Let's run the following command to gather timings while loading CNN's home page:


curl -w '\ntime_namelookup=%{time_namelookup}\ntime_appconnect=%{time_appconnect}\ntime_connect=%{time_connect}\ntime_redirect=%{time_redirect}\ntime_pretransfer=%{time_pretransfer}\ntime_starttransfer=%{time_starttransfer}\ntime_total=%{time_total}\n\n' -o /dev/null -s 'http://www.cnn.com/'

Where:
  • -w shows which timing properties to write out
  • -o /dev/null redirects the output of the request to /dev/null
  • -s tells cURL not to show a progress bar
  • http://www.cnn.com/ is the URL we are requesting

This results in the following output:

time_namelookup=0.029
time_appconnect=0.000
time_connect=0.095
time_redirect=0.000
time_pretransfer=0.095
time_starttransfer=0.166
time_total=0.530

From this, we can compute the time taken in each of the 4 phases as follows:

DNS Lookup = DNS Lookup (29 ms) - Start (0 ms) = 29 milliseconds
TCP Connection = Pre Transfer (95 ms) - DNS Lookup (29 ms) =  66 millseconds
Content Generation = Start Transfer (166 ms) - Pre Transfer (95 ms) =  71 milliseconds
Content Transfer = Total (530 ms) - Start Transfer (166 ms) = 364 milliseconds

HTTP Request to http://www.cnn.com/
HTTP Request Timings for http://www.cnn.com/
So there you have it!  Next time a request seems slow, you can use cURL to help find out why. You can write cron-invoked scripts that invoke the cURL command, collect the timing information, and send the results to Splunk, Amazon Cloudwatch, or your favorite logging framework to get pretty graphs of data over time.




3 comments:

Unknown said...

Thanks for these infos, really useful to understand the different timings !

Anonymous said...

Thats so very useful....Thank you very much

Anonymous said...

Thank you for the detailed advice! However I face the following problem: I get different values for Content Generation if I measure parameters from different points of the world. For example this parameter almost equal when I measure from US and UE, but it in hundred times more when I measure from Australia. So I suppose this parameter includes not only content generation, but also something that depends on connectivity as well. Could you please help me to understand what it is and is it possible to measure pure content generation time?
Thank you!