

Applications

Latency Across Networks
Throughput
Making CNS Pie
Mechanics
Nostalgia

Applications

A Novice's Look at Application Performance Analysis

Latency Across Networks

Visualize traffic flowing through client, network, server, and storage elements.


Throughput Measurement

Download the Myth Busting Toolkit, Version 1.3.3 (released 2017-01-23), for *nix and for Windows. See the Myth Busting Homework for installation instructions.

What is the Myth Busting Toolkit? In this Toolkit, I have wrapped iPerf, wget, and xcopy (Windows) / cp -a (*nix), along with gnuplot, in scripts which automate the process of running these tools and produce charts summarizing the results.
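
None of this is the Toolkit's actual code, but the wrapping pattern is easy to sketch: run a measurement command N times, collect one number per run, then chart the series with gnuplot. The sketch below assumes iPerf version 2 and gnuplot are on the path, assumes a server already running 'iperf -s', and invents the filenames runs.dat and runs.png.

#!/bin/sh
# Sketch of the wrapping pattern: run a measurement command N times,
# collect one number per run, then chart the series with gnuplot.
RUNS=20
MEASURE="iperf -c server.company.com -t 10 -y C"   # hypothetical example; -y C emits CSV, last field = bits/sec
: > runs.dat
i=1
while [ $i -le $RUNS ]; do
    bps=$($MEASURE | tail -1 | awk -F, '{print $NF}')
    echo "$i $(awk "BEGIN {printf \"%.2f\", $bps/8/1048576}")" >> runs.dat
    i=$((i + 1))
done
gnuplot <<'EOF'
set terminal png size 800,600
set output 'runs.png'
set xlabel 'Run'
set ylabel 'Throughput (MB/s)'
plot 'runs.dat' using 1:2 with linespoints title 'throughput per run'
EOF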

Cycle-iPerf

Typically, I start with a baseline, in this case, two machines connected via a single cable, delivering a Mean Throughput of ~75 MB/s (74.59). Inserting an HP Ethernet switch drops Mean Throughput to ~59 MB/s (59.19). Enabling 9K Jumbo Frames drops throughput further to ~13 MB/s (13.53) -- ouch -- while enabling 4K Jumbo Frames increases throughput to ~103 MB/s (102.67). Replacing the HP switch with a MikroTik RB750GL supporting 4K Jumbo Frames delivers ~63 MB/s (63.02), which suggests that the HP 1810 handles jumbos more effectively than the RB750GL does.
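
For reference, a baseline of this kind needs nothing more than two hosts and iPerf. The commands below are a sketch, assuming iPerf version 2, a Linux client whose interface is named eth0, and a server named server.company.com; on Windows, set the MTU through the adapter's advanced properties instead of 'ip link'. Jumbo frames only help if every device in the path, including the switch, is configured to carry them.

server> iperf -s
client> iperf -c server.company.com -f M -t 30    # -f M reports MBytes/sec
client> ip link set dev eth0 mtu 9000             # enable 9K jumbo frames (Linux, as root)
                                                  # repeat the MTU change on the server and on the switch ports
client> iperf -c server.company.com -f M -t 30    # re-measure with jumbos enabled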

In this example, I combined numerous runs of the Toolkit plus a custom gnuplot config file to produce a meta-study: the effect of frame loss on throughput between two Windows 7 machines communicating over a simulated 'network nightmare' (lossy network). Each of the data points in this meta-study is backed by a thousand runs of iPerf (i.e. cycle-iperf run with "-i 1000" to specify a thousand iterations). This particular study suggests that, for the Windows 7 TCP/IP stack (summer of 2014), throughput begins to degrade substantially at 0.7% loss. To perform your own analysis, download the raw data.
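
The 'network nightmare' here came from a WAN emulation box of the kind described under 'How to Build a WAN Emulation Box' below. If you want to reproduce the loss axis yourself, Linux tc/netem can inject a fixed loss rate on a box forwarding traffic between the two test machines; a sketch, assuming the emulation box forwards between interfaces eth0 and eth1 and that you run these as root:

host> tc qdisc add dev eth0 root netem loss 0.7%    # drop 0.7% of frames leaving eth0
host> tc qdisc add dev eth1 root netem loss 0.7%    # and the same in the other direction
host> tc qdisc show dev eth0                        # verify the qdisc is in place
host> tc qdisc del dev eth0 root                    # remove when finished (repeat for eth1)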

iPerf tends to deliver variable results during these runs, perhaps because the Client and the Server are not dedicated to this task and are intermittently distracted by other functions. Ideally, one deploys dedicated tools: here is an example of a NetScout Optiview XG running Performance Tests across the HP switch (see p. 11 for the XG's version of cycle-iperf's Throughput chart). Notice how throughput remains rock-solid at 1000 Mb/s, i.e. line-rate 1GigE, with no variance, likely because the XG / LinkRunner combination dedicates hardware to this task. Notice also how commercial tools report on all five parameters which together define a network: Throughput, Latency, Jitter, Loss, and Availability. In fact, the XG's Performance Test is NetScout's implementation of ITU-T Y.1564, a standardized test methodology for Ethernet services. The other set of examples in this directory revolves around validating a wide-area circuit (a 10G Ethernet circuit linking our main building to the co-location site where we hop onto the Internet).

cycle-iperf, on the other hand, conflates Client, Network, and Server contributions when it produces its Throughput chart. Depending on what you're trying to do, this may or may not meet your needs. Mostly, I use the Toolkit for quick & dirty measurements, turning to tools like the Optiview XG when I need robustness.

In this WiFi meta-example, I used cycle-iperf to gather throughput measurements at 4-6 locations on each floor of my company's building (~260,000 sq ft), at both 2.4GHz and 5GHz. I manually copied the resulting output files into a directory tree and ran in-house code to summarize the results into per-location changes, 2.4GHz-specific changes, 5GHz-specific changes, and overall changes. As the reports show, migrating from an engineered WiFi network (each WAP manually configured for channel & power settings) to an automatic WiFi network (letting the WAP controller dynamically instruct each WAP on channel & power settings) improved 2.4GHz performance modestly (~25% overall) and cut the number of lousy locations in half. For 5GHz coverage, the migration to full-auto did not affect the count of lousy locations but dramatically slashed overall throughput. We speculate that this occurred because the earlier sum of throughput numbers was dominated by the occasional measurement which achieved 80MHz channel bonding, whereas, as part of our migration to 'fully automatic', we disabled channel bonding, eliminating those occasional bursts of spectacular throughput.
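
The in-house summarization code isn't published here, but the flavor is straightforward. A sketch, assuming a made-up layout in which each survey lives under engineered/ or auto/, each location gets its own directory, and each band's file holds one MB/s figure per cycle-iperf run:

#!/bin/sh
# Compare mean per-location throughput before (engineered) and after (auto)
for locdir in engineered/*/; do
    loc=$(basename "$locdir")
    for band in 2.4GHz 5GHz; do
        before=$(awk '{s += $1; n++} END {if (n) printf "%.1f", s/n}' "engineered/$loc/$band.dat")
        after=$(awk  '{s += $1; n++} END {if (n) printf "%.1f", s/n}' "auto/$loc/$band.dat")
        echo "$loc $band: $before MB/s -> $after MB/s"
    done
done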


Cycle-Wget

cycle-wget takes the same approach as cycle-iperf but uses wget to iteratively download a URL, calculating and charting throughput for each iteration. It is intended as a tool for measuring throughput across upper-layer devices, like firewalls and load-balancers.
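
Again, a minimal sketch of the idea rather than the Toolkit's code: it downloads a hypothetical URL N times, discards the payload, and derives MB/s from the byte count and the wall-clock time (GNU date, GNU stat, and bc assumed).

#!/bin/sh
# Sketch of the cycle-wget idea: fetch the same URL repeatedly, log MB/s per pass
URL=http://server.company.com/testfile.bin   # hypothetical test URL
RUNS=20
: > wget-throughput.dat
i=1
while [ $i -le $RUNS ]; do
    tmp=$(mktemp)
    start=$(date +%s.%N)
    wget -q -O "$tmp" "$URL"
    end=$(date +%s.%N)
    bytes=$(stat -c %s "$tmp")               # GNU stat; BSD uses 'stat -f %z'
    rm -f "$tmp"
    echo "$i $(echo "$bytes / 1048576 / ($end - $start)" | bc -l)" >> wget-throughput.dat
    i=$((i + 1))
done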


Cycle-File-Copy

Similarly, cycle-file-copy takes the same approach but uses xcopy (Windows) or cp -a (*nix) to iteratively copy one directory to another, typically from a directory on your client to a file server. That directory might contain a single large file, as in the example below, or it might contain many files and subdirectories.
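
A comparable sketch for the *nix flavor, assuming a hypothetical source directory and file-server mount; note that client and server caches will flatter the repeat runs unless you take steps to defeat them.

#!/bin/sh
# Sketch of the cycle-file-copy idea: copy a directory to the file server N times
SRC=/data/testdir                # hypothetical source directory on the client
DST=/mnt/fileserver/testdir      # hypothetical destination on the file server
RUNS=10
bytes=$(du -sb "$SRC" | awk '{print $1}')   # GNU du: total size in bytes
: > copy-throughput.dat
i=1
while [ $i -le $RUNS ]; do
    rm -rf "$DST"
    start=$(date +%s.%N)
    cp -a "$SRC" "$DST"
    end=$(date +%s.%N)
    echo "$i $(echo "$bytes / 1048576 / ($end - $start)" | bc -l)" >> copy-throughput.dat
    i=$((i + 1))
done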


Making Client / Network / Server Pie

I find that drawing the client-network-server pie can help focus my attention during application performance analysis. Tasting Client / Network / Server Pie is a shorter version, published in the February 2012 issue of ;login.

Code

Data


Mechanics

Send-UDP-Msg

Send-UDP-Msg is a Perl script for sending custom text in a UDP frame, useful for annotating trace files. Under Windows, fping includes the '-d {message}' switch for this purpose, but the Linux version does not, so I wrote my own.

host> send-udp-msg -h server.company.com -m "Starting NFS mount now"

Since writing this, I've realized that the standard Unix 'netcat' command does the same and more:

host> echo 'Starting NFS mount now' | nc -4 -w 1 -u server.company.com 678

where '-u' specifies UDP and '678' specifies the UDP port. Or, more suitable for an NFS example (where we might be filtering on TCP port 2049):

host> echo 'Starting NFS mount now' | nc -4 -w 1 server.company.com 2049

See Definitive Diagnostic Data for a sketch of how and why this approach becomes useful.


How to Build a WAN Emulation Box

My notes around installing Linux and then WANsim onto a PC Engines apu1c, using Ubuntu booting from an mSATA card.

My notes around installing WANem onto a PC Engines apu1c, using a USB stick booting Knoppix.

Sean Hinde's notes around installing WANem onto a PC Engines apu1c, booting Debian from an mSATA card: you'll want his wanem-files collection, along with his apache config collection.
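
WANem and WANsim wrap this sort of thing in a friendlier interface; underneath, it is Linux tc/netem. A bare-bones sketch of what the box ends up doing, assuming the apu1c's two NICs are eth0 and eth1, that you want it to behave as a transparent bridge imposing a rough WAN profile, and that you run these as root (netem's 'rate' option needs a reasonably recent kernel; on older kernels, pair netem with tbf instead):

host> ip link add name br0 type bridge        # bridge the two NICs so the box is transparent
host> ip link set eth0 master br0
host> ip link set eth1 master br0
host> ip link set eth0 up; ip link set eth1 up; ip link set br0 up
host> tc qdisc add dev eth0 root netem delay 40ms 5ms loss 0.1% rate 10mbit
host> tc qdisc add dev eth1 root netem delay 40ms 5ms loss 0.1% rate 10mbit
# 40ms is applied in each direction, so expect ~80ms RTT; 'tc qdisc del dev eth0 root' removes it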


Nostalgia

Server Benching

Nostalgia: My first performance analysis paper, 1993, written in conjunction with my boss and mentor Dr. Stephen M. Erde, benching a beta version of AppleShare 4.0 against NetWare 3.11.



Last modified: 2017-06-29