Load Testing: Finding Your Web Application's Breaking Point

What load testing actually means for web applications

Load testing simulates realistic user traffic against your web application to see how it performs under pressure. Rather than waiting for a product launch, traffic spike, or marketing campaign to expose performance problems, you create a controlled test that reveals exactly where your application starts to struggle, how it fails, and what the user experience looks like when demand exceeds capacity.

The goal is not to break your application for fun. It is to understand the limits of your current setup before real users experience slow pages, failed transactions, or complete outages. A well-executed load test gives you data to make informed decisions about infrastructure, scaling strategy, and application optimisation.

If you are responsible for a business website or web application in the UK, load testing should be part of your standard deployment process. The cost of discovering performance problems after launch is almost always higher than investing time in testing beforehand.

Why most web applications are not tested until it is too late

Performance testing often gets skipped because it feels like extra work, requires specialist knowledge, or simply does not make the cut when release deadlines are tight. Teams assume the hosting plan is sufficient, the code is efficient enough, or traffic will not spike unexpectedly.

That assumption tends to hold until it does not. Common triggers for discovering load problems the hard way include viral social media posts, seasonal traffic spikes, marketing campaigns, and slow but steady growth that quietly pushes infrastructure past its comfort zone.

The consequences are familiar: pages that time out, checkout flows that fail, APIs that return errors, and users who leave before completing their intended action. For a business website, that is lost revenue and damaged trust. For a critical web application, it can mean service outages affecting customers directly.

Load testing does not eliminate all risk, but it significantly reduces the chance of a performance-related incident catching you off guard.

Key types of performance testing

Load testing is one part of a broader discipline that includes several different test types, each answering a specific question about application behaviour.

  • Load testing: Simulates expected user traffic to verify the application performs acceptably under normal and peak conditions.
  • Stress testing: Pushes the application beyond expected load to find the breaking point and understand how it fails.
  • Spike testing: Tests sudden, dramatic increases in traffic to see how the application reacts to rapid changes in demand.
  • Endurance testing: Runs a sustained load over an extended period to identify memory leaks, resource depletion, or degradation over time.

Most web applications benefit from at least load testing and stress testing. Load testing confirms the application handles normal traffic, while stress testing reveals what happens when traffic exceeds expectations.
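
The four test types differ mainly in the shape of the load over time. As a rough sketch, here is how each shape might be written as a k6-style stages array (k6 is introduced later in this article). Every duration and user count is an assumed placeholder rather than a recommendation, and peakUsers is a hypothetical helper added for illustration.

```javascript
// Illustrative load shapes for each test type, written as k6-style stages.
// Every duration and target below is an assumed placeholder value.
const profiles = {
  // Load test: ramp to the expected peak, hold, ramp down.
  load: [
    { duration: '2m', target: 100 },
    { duration: '10m', target: 100 },
    { duration: '2m', target: 0 },
  ],
  // Stress test: keep stepping beyond the expected peak.
  stress: [
    { duration: '2m', target: 100 },
    { duration: '5m', target: 200 },
    { duration: '5m', target: 400 },
    { duration: '2m', target: 0 },
  ],
  // Spike test: jump quickly, hold briefly, drop quickly.
  spike: [
    { duration: '1m', target: 20 },
    { duration: '10s', target: 500 },
    { duration: '1m', target: 500 },
    { duration: '10s', target: 20 },
    { duration: '1m', target: 0 },
  ],
  // Endurance test: moderate load held for hours.
  endurance: [
    { duration: '5m', target: 80 },
    { duration: '4h', target: 80 },
    { duration: '5m', target: 0 },
  ],
};

// Highest number of virtual users a profile reaches.
function peakUsers(stages) {
  return Math.max(...stages.map((s) => s.target));
}
```

In a k6 script, one of these arrays would be assigned to the stages property of the exported options object.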

Apache Bench: a quick way to test HTTP endpoints

Apache Bench, commonly abbreviated to ab after the name of its command, is a lightweight command-line tool that ships with Apache HTTP Server. It is straightforward to use and works well for getting a quick baseline measurement of how an endpoint responds under concurrent load.

On Debian and Ubuntu systems, you can install it with the package manager. Other Linux distributions usually provide it under a similar package name.

sudo apt install apache2-utils

Once installed, you run a basic test by specifying the total number of requests and the number of concurrent requests.

ab -n 1000 -c 50 https://yourwebsite.com/

This sends 1,000 total requests to the target URL with 50 concurrent connections at any given time. The output includes useful metrics such as the average time per request, the number of requests served per second, and the percentage of requests completed within a given time threshold.
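
The headline figures in the output are related by simple arithmetic, which is worth knowing when you compare runs. The sketch below derives them from the request count, the concurrency level, and the elapsed time. The function name abSummary and the 12.5-second elapsed time are assumptions made for illustration.

```javascript
// Derive Apache Bench's headline figures from three inputs.
function abSummary(totalRequests, concurrency, elapsedSeconds) {
  // Requests completed per second over the whole run.
  const requestsPerSecond = totalRequests / elapsedSeconds;
  // Mean time a request took as seen by a single connection.
  const timePerRequestMs = (concurrency * elapsedSeconds * 1000) / totalRequests;
  // Mean time per request across all concurrent connections combined.
  const timePerRequestOverallMs = (elapsedSeconds * 1000) / totalRequests;
  return { requestsPerSecond, timePerRequestMs, timePerRequestOverallMs };
}

// Assumed example: 1,000 requests over 50 connections finishing in 12.5 s.
console.log(abSummary(1000, 50, 12.5));
```

With these assumed inputs, the run works out to 80 requests per second, 625 ms per request for each connection, and 12.5 ms per request across all connections.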

Here is what a typical output summary might look like after running Apache Bench against a basic endpoint.

Server Software:        nginx/1.18.0
Server Hostname:        yourwebsite.com
Server Port:            443
SSL/TLS Protocol:       TLSv1.3,TLS_AES_256_GCM_SHA384,2048,256
Document Path:          /
Document Length:        8456 bytes

Concurrency Level:      50
Time taken for tests:   12.345 seconds
Complete requests:      1000
Failed requests:        0
Non-2xx responses:      0
Requests per second:    81.00 [#/sec] (mean)
Time per request:       617.256 [ms] (mean)
Time per request:       12.345 [ms] (mean, across all concurrent requests)
Percentage of the requests served within a certain time (ms)
  50%    580
  66%    610
  90%    720
  95%    850
  98%    980
  99%   1050
 100%   1200 (longest request)

The latency percentiles are particularly useful. If 95% of your requests complete within 850 milliseconds during a test with 50 concurrent connections, you have a baseline to compare against as your application grows or changes.
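
If you collect raw response times yourself, a percentile is easy to calculate. The sketch below uses the nearest-rank method, which is one common definition. Load testing tools may interpolate between samples, so their figures can differ slightly. The latencies array is made-up sample data.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical response times in milliseconds.
const latencies = [120, 135, 150, 160, 180, 210, 250, 400, 900, 2300];
console.log('p50:', percentile(latencies, 50));
console.log('p95:', percentile(latencies, 95));
```

In this sample the median is 180 ms while the 95th percentile is 2,300 ms, which shows how an average or median can hide a slow tail.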

Apache Bench is useful for quick checks, but it has limitations. It tests a single endpoint at a time, does not handle complex authentication or session management well, and does not simulate realistic user journeys. For more comprehensive testing, a scriptable tool like k6 is a better choice.

k6: writing realistic load test scripts

k6 is an open-source load testing tool that lets you write test scripts in JavaScript. Unlike Apache Bench, k6 lets you define complex user journeys, handle authentication, pass dynamic data between requests, and simulate realistic browsing patterns rather than just hammering a single URL.

Installing k6 is straightforward on most systems.

sudo gpg -k
sudo gpg --no-default-keyring --keyring /usr/share/keyrings/k6-archive-keyring.gpg --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys C5AD17C747E3415A3642D57D77C6C491D6AC1D69
echo "deb [signed-by=/usr/share/keyrings/k6-archive-keyring.gpg] https://dl.k6.io/deb stable main" | sudo tee /etc/apt/sources.list.d/k6.list
sudo apt update
sudo apt install k6

A simple k6 test script starts with defining the test options, then writing the main test function that runs for each virtual user.

import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '30s', target: 20 },
    { duration: '1m', target: 50 },
    { duration: '30s', target: 0 },
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'],
  },
};

export default function () {
  const res = http.get('https://yourwebsite.com/');
  check(res, {
    'status is 200': (r) => r.status === 200,
    'response time under 500ms': (r) => r.timings.duration < 500,
  });
  sleep(Math.random() * 3 + 1);
}

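The stages section ramps the test up to 20 virtual users over 30 seconds, continues ramping to 50 over the next minute, and then ramps back down to zero. Each stage moves gradually towards its target, which simulates real traffic more accurately than a flat concurrent load. To hold a steady level, add a stage with the same target as the one before it. The thresholds section lets you define pass or fail conditions based on your performance targets.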
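
To see what load a stages array produces at a given moment, you can model it as a series of linear ramps. The sketch below is an approximation written for this article, not part of k6. The helper names toSeconds and usersAt are hypothetical, the parser only handles single-unit durations, and starting the ramp from zero users is a simplifying assumption.

```javascript
// Convert a duration such as '30s', '2m', or '1h' to seconds.
// Only single-unit values are handled in this sketch.
function toSeconds(duration) {
  const match = /^(\d+)(s|m|h)$/.exec(duration);
  if (!match) throw new Error(`unsupported duration: ${duration}`);
  const unitSeconds = { s: 1, m: 60, h: 3600 };
  return Number(match[1]) * unitSeconds[match[2]];
}

// Approximate number of virtual users at a given second, assuming a
// linear ramp from the previous target (starting at 0) to each new one.
function usersAt(stages, second) {
  let start = 0;
  let previousTarget = 0;
  for (const stage of stages) {
    const length = toSeconds(stage.duration);
    if (second <= start + length) {
      const progress = (second - start) / length;
      return Math.round(previousTarget + (stage.target - previousTarget) * progress);
    }
    start += length;
    previousTarget = stage.target;
  }
  return previousTarget; // after the final stage
}

// The same stages as the script above.
const stages = [
  { duration: '30s', target: 20 },
  { duration: '1m', target: 50 },
  { duration: '30s', target: 0 },
];
console.log(usersAt(stages, 60)); // halfway through the second stage
```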

A more complete workflow example shows how to test a multi-step user journey. Consider a script that loads the homepage, browses a product listing, adds an item to the cart, and proceeds to checkout.

import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '2m', target: 100 },
    { duration: '5m', target: 100 },
    { duration: '2m', target: 0 },
  ],
  thresholds: {
    http_req_duration: ['p(95)<2000'],
    http_req_failed: ['rate<0.01'],
  },
};

const BASE_URL = 'https://yourwebsite.com';

export default function () {
  // Browse homepage
  let res = http.get(BASE_URL);
  check(res, { 'homepage loaded': (r) => r.status === 200 });
  sleep(1);

  // Browse product listing
  res = http.get(`${BASE_URL}/products`);
  check(res, { 'product listing loaded': (r) => r.status === 200 });
  sleep(2);

  // Add item to cart
  const payload = { product_id: 123, quantity: 1 };
  res = http.post(`${BASE_URL}/cart/add`, JSON.stringify(payload), {
    headers: { 'Content-Type': 'application/json' },
  });
  check(res, { 'item added to cart': (r) => r.status === 200 });
  sleep(1);

  // Proceed to checkout
  res = http.get(`${BASE_URL}/checkout`);
  check(res, { 'checkout page loaded': (r) => r.status === 200 });
}

This script simulates a realistic user flow, measures performance across multiple endpoints, and uses an http_req_failed threshold to track error rates alongside response times.
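
The journey above does not authenticate. As a hedged sketch of passing dynamic data between requests, the function below logs in, reads a token from the response, and sends it with the next request. The /api/login and /api/orders paths, the token field, and the function name are all hypothetical. The http client is passed in as a parameter so that it can be k6's http module inside a k6 script or a stub elsewhere.

```javascript
// Log in, extract a token, and reuse it on a follow-up request.
// `http` is expected to behave like k6's http module: post(url, body, params)
// and get(url, params) return a response with `status` and `json()`.
// Endpoint paths and the `token` field are hypothetical.
function loginAndListOrders(http, baseUrl, credentials) {
  const loginRes = http.post(`${baseUrl}/api/login`, JSON.stringify(credentials), {
    headers: { 'Content-Type': 'application/json' },
  });
  if (loginRes.status !== 200) {
    return { ok: false, step: 'login', status: loginRes.status };
  }

  // Carry the token from the login response into the next request.
  const token = loginRes.json().token;
  const ordersRes = http.get(`${baseUrl}/api/orders`, {
    headers: { Authorization: `Bearer ${token}` },
  });
  return { ok: ordersRes.status === 200, step: 'orders', status: ordersRes.status };
}
```

Inside a k6 default function, this would be called with the imported http module and a test account's credentials.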

Scripts that follow real user journeys produce far more meaningful results than arbitrary endpoint checks. Write and maintain them with the same discipline you apply to your deployment workflow, so they stay clear and keep matching how users actually behave.

Defining meaningful test scenarios

A load test is only as useful as the scenario it reproduces. Testing the homepage with 100 concurrent users tells you something, but it does not tell you how your application behaves when users actually use it.

Start by identifying the most critical user journeys in your application. These are typically the paths that directly affect revenue or user satisfaction.

  • Product browsing and search for an e-commerce site
  • Checkout and payment processing
  • User registration and login flows
  • API endpoints that power mobile apps or integrations
  • Dashboard and data retrieval for internal tools

For each journey, identify the key requests, expected response times, and the number of concurrent users you expect during normal and peak periods. Peak periods might include a product launch, a seasonal sale, or a marketing campaign you have planned.
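
If your analytics report sessions rather than concurrent users, Little's Law gives a reasonable first estimate: average concurrency equals the arrival rate multiplied by the average time each user spends on the site. The sketch below applies it. The function name and the traffic figures are assumptions for illustration.

```javascript
// Little's Law: average concurrency = arrival rate x average time in system.
function estimateConcurrentUsers(sessionsPerHour, avgSessionSeconds) {
  // sessionsPerHour / 3600 is the arrival rate in sessions per second.
  return (sessionsPerHour * avgSessionSeconds) / 3600;
}

// Assumed figures: 6,000 sessions in the peak hour, 3 minutes per session.
console.log(estimateConcurrentUsers(6000, 180));
```

With these assumed figures, the estimate is 300 concurrent users during the peak hour. Real traffic arrives in bursts, so treat the result as a starting point.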

When setting concurrency targets, err on the side of testing higher than your current peak. If your busiest day sees 200 concurrent users, test at 300, 400, and beyond. Understanding how your application degrades helps you plan scaling responses before you are under pressure.
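
One way to test at several levels in a single run is to step through them, holding each level long enough to measure it. The sketch below builds a k6-style stages array for that purpose. The function name stepStages and the one-minute ramp and three-minute hold are assumed values.

```javascript
// Build k6-style stages that step through increasing user counts.
// Each level gets a ramp followed by a hold so results can be compared.
function stepStages(levels, rampDuration, holdDuration) {
  const stages = [];
  for (const target of levels) {
    stages.push({ duration: rampDuration, target }); // ramp to the level
    stages.push({ duration: holdDuration, target }); // hold at the level
  }
  stages.push({ duration: rampDuration, target: 0 }); // ramp down
  return stages;
}

// Current peak of 200 users, then 300 and 400 as in the text above.
const stages = stepStages([200, 300, 400], '1m', '3m');
console.log(stages);
```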

Consider the duration of your tests as well. A 30-second test might show that an endpoint handles load initially, but it will not reveal memory leaks or resource depletion that only appear after sustained use. Endurance tests that run for hours can surface problems that short bursts miss entirely.

Setting up an appropriate test environment

Where you run your load tests matters almost as much as what you test. Ideally, your test environment should mirror your production setup as closely as possible, including server specifications, network configuration, database size, and third-party service connections.

Testing directly against production can yield the most accurate results, but it carries real risk. A load test that overwhelms production infrastructure causes the same downtime you are trying to prevent. If you test in production, schedule it during low-traffic periods, have monitoring in place, and ensure you can stop the test quickly if something goes wrong.
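
k6 can stop a run automatically when a threshold is crossed, which is a useful safety net for production tests. The sketch below shows threshold settings that use the abortOnFail and delayAbortEval options. The limits of 5% errors and a 3-second 95th percentile, and the 30-second grace period, are assumed values that you would adjust to your own targets.

```javascript
// Threshold settings that stop a k6 run early when things go wrong.
// The limits (5% errors, 3 s p95) and the 30 s grace period are assumed values.
const safetyThresholds = {
  http_req_failed: [
    { threshold: 'rate<0.05', abortOnFail: true, delayAbortEval: '30s' },
  ],
  http_req_duration: [
    { threshold: 'p(95)<3000', abortOnFail: true, delayAbortEval: '30s' },
  ],
};
```

In a k6 script, this object would be used as the thresholds property of the exported options. The delay gives the test time to collect enough samples before the abort condition is evaluated.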

Always create a backup before running load tests against any environment that matters. While the risk of data loss during a properly configured load test is low, having a recent backup means you can restore quickly if something unexpected occurs.

A staging environment that matches production is often the safest choice. This lets you push the application hard without risking real user impact. The main requirement is that the staging setup reflects production accurately enough that test results are meaningful.

Some teams use dedicated performance testing environments that are similar but not identical to production. This approach works if you understand the differences and account for them when interpreting results.

Interpreting load test results

Raw numbers from a load test are only useful when you know what to look for. The most important metrics to examine are response time, error rate, and throughput.

Response time tells you how long each request takes to complete. Look at both the average and percentiles. A 500ms average might sound acceptable, but if the 99th percentile sits at 5 seconds, 1 in every 100 requests takes at least that long. Because a single visit usually involves many requests, a significant portion of your users will experience serious delays.

Error rate shows how many requests fail or return unexpected status codes. A 1% error rate means 1 in every 100 requests fails. If 1,000 concurrent users each have a request in flight, roughly 10 of them are encountering failures at any given moment. At scale, that compounds quickly.

Throughput measures how many requests your system processes per second. This metric helps you understand the actual capacity of your infrastructure and compare it against your expected traffic requirements.
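
As a minimal sketch of how these metrics relate to raw data, the function below summarises a list of request records. The record shape, the function name, and the rule that a status of 400 or above (or 0 for no response) counts as a failure are assumptions; your tool may define failures differently.

```javascript
// Summarise a list of request records into error rate and throughput.
// Each record is assumed to look like { status: 200, durationMs: 123 }.
function summarise(records, testDurationSeconds) {
  if (records.length === 0) throw new Error('no records');
  const total = records.length;
  // Treat HTTP 4xx/5xx and status 0 (no response) as failures.
  const failed = records.filter((r) => r.status >= 400 || r.status === 0).length;
  const meanMs = records.reduce((sum, r) => sum + r.durationMs, 0) / total;
  return {
    total,
    errorRate: failed / total,
    throughput: total / testDurationSeconds, // requests per second
    meanMs,
  };
}
```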

As you review results, look for patterns. Does response time degrade gradually as concurrent users increase, or does it hold steady before suddenly spiking? Does performance degrade consistently across all endpoints, or is a specific page or API call the bottleneck?

These patterns point you toward the root cause and help you decide where to focus optimisation efforts.
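
If you run the test at several load levels, you can locate the breaking point by finding the first level that misses your targets. The sketch below assumes one summary per level, ordered by user count. The field names, the function name, and the sample figures are hypothetical.

```javascript
// Find the first load level where a target is missed.
// `results` is assumed to be ordered by increasing user count.
function findBreakingPoint(results, limits) {
  for (const r of results) {
    if (r.p95Ms > limits.maxP95Ms || r.errorRate > limits.maxErrorRate) {
      return r.users;
    }
  }
  return null; // no level in this run missed the targets
}

// Hypothetical results from a stepped test.
const results = [
  { users: 100, p95Ms: 420, errorRate: 0.0 },
  { users: 200, p95Ms: 480, errorRate: 0.001 },
  { users: 300, p95Ms: 650, errorRate: 0.004 },
  { users: 400, p95Ms: 2900, errorRate: 0.06 },
];
console.log(findBreakingPoint(results, { maxP95Ms: 1000, maxErrorRate: 0.01 }));
```

In this made-up data, response time holds fairly steady up to 300 users and then spikes at 400, which is the sudden-failure pattern described above.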

Monitoring during load tests

Load test results tell you the outcome of your test, but server metrics tell you why performance changed. Combining application-level measurements with infrastructure monitoring gives you the complete picture.

Key server-side metrics to track during a load test include CPU usage, memory consumption, disk I/O, and network throughput. If CPU usage hits 100% during a test while response times climb, the application is CPU-bound and optimisation or additional resources will help. If memory usage grows continuously throughout an endurance test, you likely have a memory leak that needs investigation.
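
To judge whether memory is growing continuously rather than just fluctuating, you can fit a straight line to the samples and look at its slope. The sketch below uses an ordinary least-squares fit. The sample shape and the function name are assumptions, and a positive slope is only a hint to investigate, not proof of a leak.

```javascript
// Least-squares slope of memory usage over time (MB per minute).
// A clearly positive slope across a long, steady-load run hints at a leak.
function memorySlope(samples) {
  const n = samples.length;
  if (n < 2) throw new Error('need at least two samples');
  const meanX = samples.reduce((s, p) => s + p.minute, 0) / n;
  const meanY = samples.reduce((s, p) => s + p.memoryMb, 0) / n;
  let numerator = 0;
  let denominator = 0;
  for (const p of samples) {
    numerator += (p.minute - meanX) * (p.memoryMb - meanY);
    denominator += (p.minute - meanX) ** 2;
  }
  return numerator / denominator;
}

// Hypothetical samples taken every ten minutes under steady load.
const samples = [
  { minute: 0, memoryMb: 500 },
  { minute: 10, memoryMb: 520 },
  { minute: 20, memoryMb: 540 },
  { minute: 30, memoryMb: 560 },
];
console.log(memorySlope(samples)); // growth in MB per minute
```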

Database monitoring is equally important. Track query execution times, connection pool usage, and slow query logs. A sudden increase in database response times during your load test often points to missing indexes, connection pool exhaustion, or queries that perform acceptably with small datasets but poorly under load.

Application-level profiling during load tests can identify which specific functions or database calls consume the most time. Without this visibility, you risk optimising the wrong parts of your codebase based on assumptions rather than evidence.

Identifying and addressing bottlenecks

A bottleneck is any component that limits your application's ability to handle more load. Common culprits include the application code itself, the database, the web server configuration, and the underlying server resources.

Application-level bottlenecks often come from inefficient database queries, missing caching layers, synchronous operations that block requests, or memory leaks that accumulate over time. Profiling your application under load reveals where the code spends the most time.

A simple example: if a product listing page fetches every product without a limit, adding pagination reduces the data returned per request, and eager loading related data avoids running a separate query for each product. Both can reduce database response times dramatically. The difference between one query and ten queries per page load becomes significant under concurrent load.
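
The sketch below illustrates the query-count difference with a fake in-memory database that counts the queries it receives. The data, the function names, and the database interface are all hypothetical; a real application would use its own ORM or query layer.

```javascript
// A fake in-memory database that counts how many queries it receives.
// Table contents and function names are hypothetical.
function createFakeDb() {
  const categories = { 1: 'Books', 2: 'Games' };
  const products = [
    { id: 1, name: 'Atlas', categoryId: 1 },
    { id: 2, name: 'Chess set', categoryId: 2 },
    { id: 3, name: 'Novel', categoryId: 1 },
  ];
  const db = {
    queries: 0,
    listProducts(limit, offset) {
      db.queries += 1;
      return products.slice(offset, offset + limit);
    },
    categoryById(id) {
      db.queries += 1;
      return { id, name: categories[id] };
    },
    categoriesByIds(ids) {
      db.queries += 1;
      return ids.map((id) => ({ id, name: categories[id] }));
    },
  };
  return db;
}

// One query per product for its category (the pattern to avoid).
function pageLazy(db, limit, offset) {
  return db.listProducts(limit, offset).map((p) => ({
    ...p,
    category: db.categoryById(p.categoryId).name,
  }));
}

// One extra query in total for all categories on the page (eager loading).
function pageEager(db, limit, offset) {
  const items = db.listProducts(limit, offset);
  const ids = [...new Set(items.map((p) => p.categoryId))];
  const byId = new Map(db.categoriesByIds(ids).map((c) => [c.id, c.name]));
  return items.map((p) => ({ ...p, category: byId.get(p.categoryId) }));
}
```

For a page of three products, the lazy version issues four queries and the eager version issues two. The gap widens as the page size grows.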

Database bottlenecks frequently appear as slow query times, connection pool exhaustion, or replication lag. If your load test shows requests waiting

Practical load testing checklist before you start

A useful load test starts with a realistic target, not a random number of virtual users. Before running the test, confirm which journey matters most, such as viewing a product, submitting a form, signing in, searching, booking, or completing checkout.

Check the server baseline first. CPU, memory, disk I/O, database response time, PHP worker usage, queue depth, and cache hit rate tell you whether the bottleneck is the application, database, web server, or hosting layer.

  • Set a realistic scenario: Test the user journey that creates business value instead of only requesting the homepage.
  • Warm the cache separately: A cold-cache test and a warm-cache test answer different questions.
  • Watch error rates: A page can look fast while forms, API calls, or database writes are failing in the background.
  • Record the breaking point: Note the traffic level where response time, errors, or server resources become unacceptable.

How to interpret load testing results

The useful part of a load test is not the biggest traffic number. The useful part is understanding which part of the system starts to struggle first and whether that failure mode is acceptable for the business.

If response time climbs slowly while error rates stay low, the application may simply need capacity tuning, caching, query optimisation, or a better queue strategy. If errors appear suddenly, the issue may be a connection limit, exhausted PHP workers, database locks, external API timeouts, or a server rule that only triggers under pressure.

Look at the result in layers. Browser-facing response time tells you what users feel. Web server logs show status codes and request patterns. Application logs show exceptions and failed jobs. Database metrics show slow queries, locks, and connection pressure. Server metrics show whether CPU, memory, disk, or network limits are involved.

  • Good result: Response times stay within target, errors remain low, and resource use rises predictably.
  • Warning result: The site works, but one layer is close to a limit and needs tuning before a campaign, launch, or seasonal traffic spike.
  • Failed result: Users see errors, checkout or form submissions fail, or the server becomes unstable before the expected traffic level.
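
These three outcomes can be turned into a simple automated check. The sketch below is one possible encoding. The field names, the function name, and the 80% cut-off for "close to a limit" are assumptions to adapt to your own targets.

```javascript
// Classify a run as 'good', 'warning', or 'failed'.
// The 80% "close to a limit" cut-off is an assumed rule of thumb.
function classifyRun(run, targets) {
  const failed =
    run.p95Ms > targets.maxP95Ms || run.errorRate > targets.maxErrorRate;
  if (failed) return 'failed';

  const nearLimit =
    run.p95Ms > targets.maxP95Ms * 0.8 ||
    run.peakCpuPercent > 80 ||
    run.peakMemoryPercent > 80;
  return nearLimit ? 'warning' : 'good';
}

// Hypothetical targets and a run that passes but has a busy CPU.
const targets = { maxP95Ms: 1000, maxErrorRate: 0.01 };
const run = { p95Ms: 600, errorRate: 0.002, peakCpuPercent: 92, peakMemoryPercent: 55 };
console.log(classifyRun(run, targets));
```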

What to improve after a failed load test

The right fix depends on the bottleneck. A slow database query needs a different response from a saturated web server, a missing cache layer, or an external payment API that starts timing out under load.

For a small business website, the first improvements are often practical: reduce heavy plugins, cache static pages, optimise expensive queries, compress images, move background work into queues, increase PHP worker capacity, and make sure monitoring alerts trigger before customers notice the issue.

For a custom web application, the next step may be profiling the slow route, checking indexes, reviewing session storage, measuring API calls, and testing whether horizontal scaling actually helps. Adding a bigger server can hide the symptom for a while, but it will not fix inefficient code or unreliable dependencies.

Frequently Asked Questions

How many users should a load test simulate?
Start with the number of visitors or transactions the site realistically needs to handle, then test above that level to find headroom. Guessing a large number without business context can create noisy results.

Should load testing be done on a live website?
Live testing can be useful, but only with care. For many business websites, it is safer to test a staging environment first, then run a smaller controlled live test during a quiet period.

What should I monitor during a load test?
Monitor response times, error rates, CPU, memory, database queries, cache behaviour, web server logs, and any third-party services used by the tested journey.