Load Testing Apache with AB (Apache Bench)

The only productive way to load test an Apache or WAMP (such as WampDeveloper Pro) web-server is to test a real-world page that itself performs -

  • Loading and processing of multiple PHP files.
  • Establishment of multiple MySQL connections, and performing multiple table reads.

This is the minimum, because the test of an almost empty and static page (used by most examples) tells us nothing about how the different parts of a web-server hold up under stress, nor how that web-server setup will handle real-world concurrent connections to websites running on web-apps such as WordPress.

* Ideally, this test would also a) perform GETs of all page assets (css, js, images) and b) simulate traffic of which 10% is DB writes (we’ll skip this because it’s more complicated to set up).

Using AB

Luckily, this type of test is very easy to do in a quick (and somewhat dirty) way by using Apache’s ab (Apache Bench) application (that’s included with each Apache version in its \bin directory).

This ab test won’t be the most extensive test, and it comes with its own caveats, but it will quickly show you -

  • If there is an immediate problem with the setup (this problem will manifest itself in Apache crashing).
  • How far you can push the Apache, PHP, and MySQL web-server (with concurrent connections and page request load).
  • And what Apache and PHP settings you should modify to get better performance and eliminate the crashes.

AB Issues

There are some problems with ab to be aware of -

  • ab will not parse HTML to get the additional assets of each page (css, images, etc).
  • ab can start to error out and break the test as the number of requests grows: more connections are established but never returned, the load climbs, and more time passes (see ab -h for an explanation of the -r switch).
  • ab is an HTTP/1.0 client, not an HTTP/1.1 client, so “Connection: KeepAlive” requests (the ab -k switch) for dynamic pages will not work: dynamic pages don’t have a predetermined “Content-Length: value”, and “Transfer-Encoding: chunked” is not possible with HTTP/1.0 clients.

More on AB and the KeepAlive issue -

KeepAlive – Apache Directive

A Keep-Alive connection with an HTTP/1.0 client can only be used when the length of the content is known in advance. This implies that dynamic content will generally not use Keep-Alive connections to HTTP/1.0 clients.

Compatibility with HTTP/1.0 Persistent Connections – Hypertext Transfer Protocol HTTP/1.1 Standard

A persistent connection with an HTTP/1.0 client cannot make use of the chunked transfer-coding, and therefore MUST use a Content-Length for marking the ending boundary of each message.

Chunked transfer encoding – Wikipedia

Chunked transfer encoding allows a server to maintain an HTTP persistent connection for dynamically generated content. In this case the HTTP Content-Length header cannot be used to delimit the content and the next HTTP request/response, as the content size is as yet unknown.
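
The rule described in these three excerpts can be sketched in a few lines of Python. This is a simplified illustration, not how Apache or ab is actually implemented; the function name can_keep_alive and the dictionary-of-headers input are assumptions made for the example.

```python
def can_keep_alive(client_http_version, response_headers):
    """Simplified check: can the connection stay open after this response?

    Hypothetical helper for illustration only. A real server also looks at
    the Connection header, the status code, and the request method.
    """
    headers = {name.lower(): value.lower() for name, value in response_headers.items()}
    major, minor = (int(part) for part in client_http_version.split("."))

    # A known length marks the end of the body for any client.
    if "content-length" in headers:
        return True

    # Chunked encoding also marks the end of the body, but only
    # HTTP/1.1 (and later) clients are able to decode it.
    if "chunked" in headers.get("transfer-encoding", ""):
        return (major, minor) >= (1, 1)

    # Otherwise the body ends when the server closes the connection.
    return False


# Static file: the length is known, so an HTTP/1.0 client such as ab is fine.
print(can_keep_alive("1.0", {"Content-Length": "5120"}))        # True
# Dynamic page sent with chunked encoding: only HTTP/1.1 can persist.
print(can_keep_alive("1.0", {"Transfer-Encoding": "chunked"}))  # False
print(can_keep_alive("1.1", {"Transfer-Encoding": "chunked"}))  # True
```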

Request Floods

ab will flood the Apache server with requests – as fast as it can generate them (not unlike in a DDoS attack). AB has no option to set a delay between these requests.

And given that these requests are generated from the same local system they are going to (i.e., the network layer is bypassed), this will create a peak level of requests that will cause Apache to stop responding and the OS to start blocking/dropping additional requests. Especially if the requested page is a simple PHP file that can be processed within a millisecond.
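
For contrast, a client that pauses between requests could look roughly like the Python sketch below. It is sequential (the equivalent of a concurrency of 1) and is not a replacement for ab; the names fetch_status and paced_requests, their parameters, and any delay value you pass in are assumptions made for this example.

```python
import time
import urllib.request


def fetch_status(url):
    """Fetch a URL and return the HTTP status code.

    Note: urlopen raises an exception for 4xx/5xx responses.
    """
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.status


def paced_requests(url, total, delay_seconds, fetch=fetch_status, sleep=time.sleep):
    """Send `total` sequential requests, sleeping between each pair.

    `fetch` and `sleep` are parameters so they can be swapped out
    (for example, in a dry run that does not touch the network).
    """
    statuses = []
    for i in range(total):
        statuses.append(fetch(url))
        if i < total - 1:
            sleep(delay_seconds)  # the delay that ab cannot add
    return statuses
```

Calling paced_requests("http://www.example.com/blog/", total=20, delay_seconds=0.1) would request the page 20 times with a 0.1 second pause between requests, instead of sending them back to back.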

In this context, with ab, the bigger the -c (concurrent number of requests to do at the same time) is, the lower your -n (total number of requests to perform) should be… Even with a -c of 5, -n should not be more than 200.

Expect ab tests to behave non-deterministically under higher concurrent loads: they will fail and succeed at random. Even a -c of 2 can cause issues.

These are the error messages displayed by ab -

apr_socket_recv: An existing connection was forcibly closed by the remote host. (730054)
apr_pollset_add(): Not enough space (12)

And the dialog displayed by Windows -
[Image: Windows crash dialog for Apache (httpd)]

When this happens (a message is displayed that Apache has crashed), just ignore it (Apache is still running) and keep repeating the test until “Failed requests:” is reported as “0” AND the “Percentage of the requests served within a certain time (ms)” values differ by about 2-20x between the 50% and 99% marks (and not 200x). Otherwise, the test is not reliable, because of the issues that arise when ab floods Apache on loopback (and because of how the OS responds to that flood).

This is what you should see on a good test of a simple index.php page…

C:\WampDeveloper> ab -l -r -n 100 -c 10 -k -H "Accept-Encoding: gzip, deflate" http://www.example.com/

Benchmarking www.example.com (be patient).....done

Server Software:        Apache/2.4.10
Server Hostname:        www.example.com
Server Port:            80

Document Path:          /
Document Length:        Variable

Concurrency Level:      10
Time taken for tests:   0.046 seconds
Complete requests:      100
Failed requests:        0
Keep-Alive requests:    100
Total transferred:      198410 bytes
HTML transferred:       167500 bytes
Requests per second:    2173.91 [#/sec] (mean)
Time per request:       4.600 [ms] (mean)
Time per request:       0.460 [ms] (mean, across all concurrent requests)
Transfer rate:          4212.17 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.2      0       2
Processing:     1    4   5.9      3      33
Waiting:        1    4   5.9      3      32
Total:          1    4   6.0      3      33

Percentage of the requests served within a certain time (ms)
  50%      3
  66%      4
  75%      4
  80%      5
  90%      6
  95%     22
  98%     32
  99%     33
 100%     33 (longest request)
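
The reliability rule given earlier (zero failed requests, and a 99% time that is roughly 2-20x the 50% time) can be checked against a report like this one with a short Python sketch. The function name check_ab_run and the max_spread parameter are assumptions; the cut-off of 20 is simply the upper end of the range mentioned above.

```python
import re


def check_ab_run(ab_output, max_spread=20.0):
    """Return (failed_requests, spread, reliable) for one ab report."""
    failed = int(re.search(r"Failed requests:\s+(\d+)", ab_output).group(1))

    # Percentile table lines look like "  50%      3".
    percentiles = {
        int(pct): int(ms)
        for pct, ms in re.findall(r"^\s*(\d+)%\s+(\d+)", ab_output, flags=re.MULTILINE)
    }
    p50, p99 = percentiles[50], percentiles[99]

    # A 0 ms median would make the ratio undefined; treat it as 1 ms.
    spread = p99 / max(p50, 1)
    reliable = failed == 0 and spread <= max_spread
    return failed, spread, reliable


# The relevant lines from the report above.
sample = """\
Failed requests:        0
Percentage of the requests served within a certain time (ms)
  50%      3
  99%     33
 100%     33 (longest request)
"""
print(check_ab_run(sample))  # (0, 11.0, True)
```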

Before Performing The Load Test

Make sure that -

  • You’ve rebooted the system and don’t have anything extra open/running (i.e., YouTube videos playing in your Browser).
  • None of these extra PHP extensions is loaded: Zend OPcache, APC, or XDebug.
  • You wait 4 minutes before performing another ab test, to avoid TCP/IP Port Exhaustion (also known as ephemeral port exhaustion).
  • And in a test where KeepAlive works (it doesn’t in ab tests getting dynamic pages), the number of Apache Worker Threads is set to be greater than the number of concurrent users/visitors/connections.
  • If Apache or PHP crashes, you’ve rebooted the computer or VM before performing another test (some things get stuck and continue to persist after Apache and/or mod_fcgid’s PHP processes are restarted).

Start The AB Test

1. Install WordPress as http://www.example.com/blog

2. Open the command-line (cmd.exe).

3. Restart Apache and MySQL, and prime the web-server (with 1 request):
ab -n 1 -c 1 -k -H "Accept-Encoding: gzip, deflate" http://www.example.com/blog/

4. Run the Apache Bench program to simulate -

1 concurrent user doing 100 page hits

This is 100 sequential page loads by a single user:
ab -l -r -n 100 -c 1 -k -H "Accept-Encoding: gzip, deflate" http://www.example.com/blog/

This shows you how well the web-server will handle a simple load of 1 user doing a number of page loads.

5 concurrent users each doing 10 page hits

This is 50 page loads by 5 different concurrent users, with each user doing 10 sequential page loads.
ab -l -r -n 50 -c 5 -k -H "Accept-Encoding: gzip, deflate" http://www.example.com/blog/

This represents a peak load of a website that gets about 50,000+ hits a month. Congratulations, your website / business / idea has made it (and no doubt is on its way up).

10 concurrent users each doing 10 page hits

This is 100 page loads by 10 different concurrent users, with each user doing 10 sequential page loads.
ab -l -r -n 100 -c 10 -k -H "Accept-Encoding: gzip, deflate" http://www.example.com/blog/

This is where the load starts to really stress test the web-server, as 10 concurrent (simultaneous) users is a lot of traffic. Most websites will be lucky to see 1 or 2 users (visitors) a minute… So let me say it again, 10 users per second is a lot of traffic!

30 concurrent users each doing 20 page hits

This is 600 page loads by 30 different concurrent users, with each user doing 20 sequential page loads.
ab -l -r -n 600 -c 30 -k -H "Accept-Encoding: gzip, deflate" http://www.example.com/blog/

This is the edge of what a non-cached WordPress setup will be able to handle without crashing or timing-out the web-server (and/or ab itself). This type of load represents an extremely active website or forum, the top 1%.

90 concurrent users each doing 30 page hits

This is 2700 page loads by 90 different concurrent users, with each user doing 30 sequential page loads.
ab -l -r -n 2700 -c 90 -k -H "Accept-Encoding: gzip, deflate" http://www.example.com/blog/

Only a fully cached (using mod_cache) Apache setup will be able to handle this type of load. This represents some of the busiest sites on the net, and with a non-cached WordPress setup there is no hope of this not maxing out the web-server (and crashing it, if your settings are not just right).
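
The arithmetic behind these command lines is the same each time: -c is the number of concurrent users, and -n is users multiplied by page hits per user. A small Python sketch that builds the command for each scenario is shown below. The helper name ab_command is an assumption, and so is the choice to pass -l and -r in every scenario.

```python
import subprocess


def ab_command(users, hits_per_user, url, keep_alive=True):
    """Build the ab argument list for `users` concurrent users,
    each doing `hits_per_user` sequential page loads."""
    total = users * hits_per_user  # -n is the total number of requests
    args = ["ab", "-l", "-r", "-n", str(total), "-c", str(users)]
    if keep_alive:
        args.append("-k")
    args += ["-H", "Accept-Encoding: gzip, deflate", url]
    return args


# (concurrent users, page hits per user) for the scenarios in this section.
SCENARIOS = [(1, 100), (5, 10), (10, 10), (30, 20), (90, 30)]
URL = "http://www.example.com/blog/"

for users, hits in SCENARIOS:
    # list2cmdline quotes the header value the way cmd.exe expects.
    print(subprocess.list2cmdline(ab_command(users, hits, URL)))
```

To actually run a scenario, the list returned by ab_command can be passed to subprocess.run, keeping in mind the 4 minute wait between tests mentioned earlier.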

Analyze the AB Results

We only care about 3 things:

1. How many Requests Per Second are we seeing? The other metrics are not really useful, as they are not representative of anything real in this ab context. * This value will remain somewhat the same regardless of the concurrency level used.

2. Are there any errors in the website’s or Apache’s (general) error logs, or in the PHP logs? * When things start to choke, PHP memory issues will start coming up. A lot of PHP scripts also begin to crash (and take out Apache + PHP processes) if they are not written with concurrency in mind.

3. At what concurrency level does Apache crash and/or time-out? * If this is happening at a lower concurrency level, something is wrong and you need to adjust these settings either lower or higher…
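
Points 1 and 3 can be pulled out of saved ab reports with a short Python sketch like the one below. The function names and the dictionary layout (concurrency level mapped to report text) are assumptions, and the two reports in the example are invented purely to show the mechanics. A run in which ab itself times out produces no report at all, so that case has to be noted by hand.

```python
import re


def summarize(ab_output):
    """Return (requests_per_second, failed_requests) from one ab report."""
    rps = float(re.search(r"Requests per second:\s+([\d.]+)", ab_output).group(1))
    failed = int(re.search(r"Failed requests:\s+(\d+)", ab_output).group(1))
    return rps, failed


def first_failing_level(reports):
    """Return the lowest concurrency level with failed requests, or None.

    `reports` maps a concurrency level (the -c value) to ab's output text.
    """
    for level in sorted(reports):
        _, failed = summarize(reports[level])
        if failed > 0:
            return level
    return None


# Invented reports, trimmed to the two lines this sketch reads.
reports = {
    10: "Failed requests:        0\nRequests per second:    41.50 [#/sec] (mean)\n",
    30: "Failed requests:        7\nRequests per second:    39.80 [#/sec] (mean)\n",
}
for level in sorted(reports):
    print(level, summarize(reports[level]))
print("first failing -c:", first_failing_level(reports))
```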

Adjust Settings to Gain Stability Under Load

Apache:
C:\WampDeveloper\Config\Apache\extra\httpd-mpm.conf

# The number of Apache threads (workers) to deploy. Each worker can handle a separate concurrent connection or request.
# This should never be set more than the expected traffic can bring in, otherwise it will waste server resources.
# Setting this value DOWN from the default 64-128 is a good idea.
ThreadsPerChild 64
# ThreadLimit should be about 50% more than ThreadsPerChild, as this has a side-effect of allotting more memory and enabling Apache to use that extra memory during peak load times.
ThreadLimit 96

# The number of requests (or connections if KeepAlive is On) after which to recycle all Apache threads (to help control memory leaks and process bloat).
# Renamed to MaxConnectionsPerChild under Apache 2.4 ("MaxRequestsPerChild" is still valid).
MaxRequestsPerChild 16384

# The default stack size on Windows is less than or equal to 1MB, and 8MB on Linux.
# Increase size to help prevent crashes from segmentation faults / stack overflows due to PHP scripts needing more stack size.
# Can decrease value to lower memory consumption by Apache when PHP is run via mod_php (PHP as an Apache thread).
# Can decrease value even more when PHP is run via mod_fcgid (PHP as a process outside of Apache).
# Values are in bytes: 8MB is 8388608, 4MB is 4194304, 2MB is 2097152, 1MB is 1048576, 0.5MB is 524288
<IfVersion >= 2.2>
ThreadStackSize 2097152
</IfVersion>

# The maximum number of free Kbytes that every Apache thread is allowed to hold without attempting to give it back to the OS.
# Apache 2.0 and 2.2 default to 0 / unlimited (bad), Apache 2.4 to 2MB (better).
# This might prevent the Apache process from growing too large as this will typically restrict its process max size to threads * MaxMemFree.
MaxMemFree 2048

# Backlog queue when all threads/workers are taken up.
# Increase to handle peak loads, and during TCP SYN flood attacks (default is 511).
ListenBacklog 2711

# TCP receive buffer size (in bytes). 0 specifies to use the OS default.
ReceiveBufferSize 0

# TCP send buffer size (in bytes). 0 specifies to use the OS default.
SendBufferSize 0
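
To get a feel for what the values above add up to, the byte arithmetic can be worked through in a few lines of Python. This is a rough upper-bound estimate under stated assumptions: it treats every thread's stack as fully used (in practice the stack size is mostly reserved address space), and it takes the "threads * MaxMemFree" rule of thumb from the comment above at face value.

```python
MB = 1024 * 1024
KB = 1024

# Values copied from the settings above.
threads_per_child = 64        # ThreadsPerChild
thread_stack_size = 2097152   # ThreadStackSize, in bytes
max_mem_free_kb = 2048        # MaxMemFree, in KBytes

# Stack size per thread, converted from bytes to MB.
print(thread_stack_size / MB)        # 2.0

# Upper bound on stack space across all worker threads.
stack_total = threads_per_child * thread_stack_size
print(stack_total / MB)              # 128.0

# Free memory the threads may hold on to (threads * MaxMemFree).
free_total = threads_per_child * max_mem_free_kb * KB
print(free_total / MB)               # 128.0
```

Lowering ThreadsPerChild or ThreadStackSize shrinks these totals proportionally, which is the reasoning behind setting ThreadsPerChild no higher than the traffic you expect.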

Apache:
C:\WampDeveloper\Config\Apache\extra\wampd-default.conf

# Turn BufferedLogs On to buffer logs for multiple requests instead of writing them out individually to the log files
# Good for performance, but inconvenient for trying to detect or debug issues
BufferedLogs Off

# Use the OS's abilities to speed up memory access and file reading
# These settings are OFF to improve stability during concurrent and peak loads
# Note - EnableSendfile should be Off if a website's DocumentRoot is a network mounted location
# Note - EnableSendfile is set to Off under the default configuration of Apache 2.4
EnableMMAP Off
EnableSendfile Off

# Fixes issues but does disable a faster way of accepting network connections on Windows
# Implemented for the following known issues -
# A) The network layer (winsock) is often broken due to network, firewall, anti-virus, etc, software (s/w that adds its own filters to winsock)
# B) Apache 2.4 is being used (general issue on Windows with 2.4?...)
# C) Some requests do not start/complete (initial req is broken but sequential reqs, when performed within a 3-second window, complete)
<IfVersion < 2.3>
Win32DisableAcceptEx
</IfVersion>
<IfVersion >= 2.3.3>
AcceptFilter http none
AcceptFilter https none
</IfVersion>

PHP (for crashes):
C:\WampDeveloper\Config\Php\php.ini

; Determines the size of the realpath cache to be used by PHP. This value should
; be increased on systems where PHP opens many files to reflect the quantity of
; the file operations performed.
; http://php.net/realpath-cache-size
;realpath_cache_size = 16k
realpath_cache_size = 1M

For stability, also make sure to test both PHP and PHP-FCGI. The difference is that PHP (mod_php) runs inside of Apache, while PHP-FCGI (a separate process via mod_fcgid) runs outside of Apache. PHP-FCGI might be more stable under some circumstances, and more fickle under others.

Performance Gains

For top performance gains use -

1. Apache’s mod_cache module to cache page requests/results. This will produce 5-10x the performance gains over all other methods combined.

2. PHP’s Zend OPcache extension to cache PHP scripts as compiled objects. This will produce a 3-5x Requests Per Second speed up.

3. memcached + php_memcache setup to cache PHP script’s or web-app’s internal data and results. This can produce a good 50%-100% performance gain.

4. Cache plugins and/or setting adjustments specific to the web-app: Cache plugins for WordPress, Speedup tips for PrestaShop, etc.

5. mod_expires to make the client’s (visitor’s) Browser cache pages and page assets for a given time, instead of re-getting those pages and assets on each page load.

* Some of these are more difficult to configure and set up than others.

Also, in my experience, the switch from 32 bit to 64 bit Apache, PHP, and MySQL versions only provides limited/marginal performance gains (and in some cases it’s even negative).

To sum everything up, 99% of all performance gains will come from utilizing Apache’s caching mechanisms (via mod_cache) and using PHP’s Zend OPcache extension. Afterwards, once the bottleneck has moved from Apache with PHP to MySQL, the gains come from improving MySQL performance: tuning my.ini settings, and optimizing/restructuring MySQL queries with the help of MySQL’s Slow Query log (to see what the problem is).

Having said that, there are also performance-robbing issues that can exist on the OS, in the Apache/MySQL/PHP settings, and even in the client’s Browser, that are covered here -
http://www.devside.net/wamp-server/wamp-is-running-very-slow

3 thoughts on “Load Testing Apache with AB (Apache Bench)”

  1. admin Post author

    ListenBacklog might be the critical directive to set for absorbing peak loads (on the low latency loopback test). Unfortunately, the maximum value that can be used seems to be around 200, and/or the OS’s SYN attack detection and prevention function screws everything up. To get around this, we’d have to use something other than ab, that is able to introduce latency into each request, or use another LAN system to perform the test.

    WinSock server applications use the listen() call to establish a socket for listening for incoming connections. This call’s second parameter, backlog, is defined as the maximum length to which the queue of pending connections may grow.

    http://tangentsoft.net/wskfaq/advanced.html#backlog

    The traditional value for listen()’s backlog parameter is 5. This is actually the limit on the home and workstation class versions of Windows. On Windows Server, the maximum connection backlog size is 200, unless the dynamic backlog feature is enabled. (More info on dynamic backlogs below.) Because the stack will use its maximum backlog value if you pass in a larger value, you can pass a special constant, SOMAXCONN, to listen() which tells it to use whatever the platform maximum is, since the constant’s value is 0x7FFFFFFF. There is no standard way to find out what backlog value the stack chose to use if it overrides your requested value.

    http://www.linuxjournal.com/files/linuxjournal.com/linuxjournal/articles/023/2333/2333s2.html

    The backlog has an effect on the maximum rate at which a server can accept new TCP connections on a socket. The rate is a function of both the backlog value and the time that connections stay on the queue of partially open connections.

    http://blogs.technet.com/b/nettracer/archive/2010/08/11/where-have-those-afd-driver-related-registry-dynamicbackloggrowthdelta-enabledynamicbacklog-maximumdynamicbacklog-minimumdynamicbacklog-keys-gone.aspx

    Since SYN attack protection was built-in on Windows Vista, 2008, 2008 R2 or Windows 7 (and even couldn’t be disabled – please see this blog post for more information on TCP SYN attack protection on Windows Vista/2008/2008 R2/7), it wasn’t required to deal with SYN attacks at Winsock layer and as a result of that, the logic and the registry keys were removed from AFD driver.

    http://blogs.technet.com/b/nettracer/archive/2010/06/01/syn-attack-protection-on-windows-vista-windows-2008-windows-7-and-windows-2008-r2.aspx

    As of Windows Vista and onwards (Vista/2008/Win 7/2008 R2/Windows 8/Windows 2012/Windows 2012 R2), syn attack protection algorithm has been changed in the following ways: SynAttack protection is enabled by default and cannot be disabled!

  2. Paizo

    How comes that “-n 50 -c 10” is a “5 concurrent users each doing 10 page hits”?
    I think it should be “10 concurrent users each doing 5 page hits”

    1. admin Post author

      AB is going to send 10 requests at a time until all 50 requests are done, so…

      50 page requests total, done 10 at a time is equivalent to 5 visitors requesting 10 pages each.

