What is an Apdex score? Measuring user satisfaction in performance testing

A load test can come back full of green percentiles and still not tell you whether the people behind them were satisfied or quietly giving up. An Apdex score answers that in one number from 0 to 1: you set a target response time, and it reports what share of requests left users satisfied rather than merely tolerating or frustrated.

Written by: Evaluat Staff

Figure: how an Apdex score is built. Every request is sorted against a target time T into satisfied (at or under T, counted in full), tolerating (between T and 4T, counted as half), or frustrated (over 4T or errored, counted as zero). Satisfied plus half the tolerating, divided by the total, gives one user-satisfaction score between 0 and 1; the example in the figure comes to 0.875.

What is an Apdex score?

An Apdex score (Application Performance Index) is a single number between 0 and 1 that summarizes how satisfied users were with response time. You set a target time, called T. The score counts how many requests came back fast enough to satisfy users, gives half credit to the ones users merely tolerated, and none to the rest.

A score of 1 means everyone was satisfied, and 0 means no one was. Think of it as a pass rate with partial credit: instead of listing every individual response time, or even a handful of percentiles, Apdex reports one figure, the share of requests that kept users happy, with the near misses counted at half. A product manager, an SRE, and an executive can all read 0.92 and agree on what it means.

Apdex is an open standard, not a vendor metric. It was defined by the Apdex Alliance, a group formed in 2004 by Peter Sevcik of NetForecast, and it is published as a free technical specification. That is why the same score turns up across monitoring and testing tools that otherwise share nothing: they are all computing the same formula.

The thing being measured is response time, the gap from a request being made to the response arriving. Apdex does not care whether that request is an API call, a page load, or a checkout step. You decide what to measure and what target to hold it to, and the formula does the rest.

Why measure user satisfaction with one number?

Because a distribution of response times is hard for a team to act on, and because response time maps to satisfaction in fairly predictable steps. A latency histogram means a lot to an engineer and little to a product owner. Apdex compresses it into one figure the whole team can track from release to release.

The “satisfaction” part is not marketing. Decades of usability research, summarized by the Nielsen Norman Group (1993, updated 2014), put firm limits on how long people will wait before a delay changes their behavior. About one second is the limit for a user’s flow of thought to stay uninterrupted. About ten seconds is the limit for keeping their attention on the task at all. Below a tenth of a second, an action feels instant. Those thresholds are why a target time means anything: cross them and satisfaction does not fade gently, it drops in ways users feel and act on.

That behavior has a price. Slower pages bounce more visitors and convert fewer of them, an effect documented across retail and B2B studies, and the post on eight metrics every report should include covers the revenue side in detail. Apdex is the metric that turns “the page got slower” into “user satisfaction dropped from 0.95 to 0.88 this release,” which is a sentence the whole organization can act on.

How is an Apdex score calculated?

Apdex sorts every measured request into three buckets against your target time T, weights them, and divides by the total. Satisfied requests count fully, tolerating requests count half, and frustrated requests count zero. The result always lands between 0 and 1, and it rises as more requests fall into the satisfied bucket.

The formula

The formula is:

Apdex = (satisfied count + (tolerating count / 2)) / total samples

Half credit for tolerating requests is the whole idea. A user who waited a little longer than ideal but stayed is not as happy as one served instantly, and not as unhappy as one who gave up. Counting them at one half puts the score between those two extremes.
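The formula is small enough to write down directly. The sketch below is a minimal Python version that takes the bucket counts as inputs; the function name `apdex_from_counts` and its argument checks are illustrative choices, not part of the Apdex specification.

```python
def apdex_from_counts(satisfied: int, tolerating: int, total: int) -> float:
    """Apdex = (satisfied + tolerating / 2) / total."""
    if total <= 0:
        raise ValueError("total must be positive")
    if satisfied < 0 or tolerating < 0 or satisfied + tolerating > total:
        raise ValueError("bucket counts must be non-negative and fit within total")
    # Satisfied requests get full credit, tolerating requests get half,
    # and whatever is left over (frustrated) contributes nothing.
    return (satisfied + tolerating / 2) / total


print(apdex_from_counts(10, 0, 10))  # everyone satisfied  -> 1.0
print(apdex_from_counts(0, 10, 10))  # everyone tolerating -> 0.5
print(apdex_from_counts(0, 0, 10))   # everyone frustrated -> 0.0
```

The three calls mark the two extremes and the midpoint that half credit creates.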

Satisfied, tolerating, frustrated

The three buckets are defined by T and by four times T:

  • Satisfied: response time at or under T.
  • Tolerating: response time over T, up to and including 4T.
  • Frustrated: response time over 4T.

Most implementations also count a request that errored as frustrated, regardless of how fast it failed. New Relic, for example, treats any server-side error as frustrated. A fast error still leaves the user stuck, so it belongs in the unhappy bucket.
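As a rough illustration, the bucket rules, including the error rule, fit in one small function. The `Sample` type and the `bucket` name are assumptions made for this sketch; a real tool will have its own data model and its own definition of what counts as an error.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    seconds: float          # measured response time
    errored: bool = False   # True if the request failed


def bucket(sample: Sample, t: float) -> str:
    """Classify one request against target time t (in seconds)."""
    if sample.errored:
        return "frustrated"      # a fast error still leaves the user stuck
    if sample.seconds <= t:
        return "satisfied"       # at or under T
    if sample.seconds <= 4 * t:
        return "tolerating"      # over T, up to and including 4T
    return "frustrated"          # over 4T


print(bucket(Sample(0.05, errored=True), t=1.0))  # frustrated
```

Checking the error flag first is what keeps a 50-millisecond failure out of the satisfied bucket.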

Here is a worked example. Suppose a checkout step is measured 1,000 times during a load test, with T set to 1 second. 800 responses came back at or under 1 second (satisfied), 150 came back between 1 and 4 seconds (tolerating), and 50 took over 4 seconds or errored (frustrated). The score is (800 + 150 / 2) divided by 1,000, which is (800 + 75) divided by 1,000, or 0.875.
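The same worked example can be reproduced from raw samples. In the sketch below the individual response times (0.6 s, 2.5 s, 6.0 s) and the split of the 50 frustrated requests into 30 slow and 20 errored are invented for illustration; only the bucket totals come from the example above.

```python
def apdex(samples, t):
    """samples: iterable of (seconds, errored) pairs; t: target time in seconds."""
    satisfied = tolerating = total = 0
    for seconds, errored in samples:
        total += 1
        if errored:
            continue              # errors are frustrated: zero credit
        if seconds <= t:
            satisfied += 1
        elif seconds <= 4 * t:
            tolerating += 1
        # anything slower than 4T is frustrated: zero credit
    if total == 0:
        raise ValueError("no samples to score")
    return (satisfied + tolerating / 2) / total


checkout = (
    [(0.6, False)] * 800    # satisfied
    + [(2.5, False)] * 150  # tolerating
    + [(6.0, False)] * 30   # frustrated: too slow
    + [(0.2, True)] * 20    # frustrated: errored, even though fast
)
print(apdex(checkout, t=1.0))  # 0.875
```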

Why 4T?

The standard fixes the frustration boundary at four times the satisfied threshold. It is a single multiplier meant to approximate the point where waiting tips into abandonment, so you only have to choose one number, T, and the other follows. It is a convention, not a measurement of your own users, and for an interaction that should feel instant, real frustration often arrives well before 4T. Treat the 4T rule as a sensible default, and revisit it if your action’s tolerance is genuinely tighter.
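To see how much the multiplier matters for a near-instant interaction, the sketch below makes it a parameter. This is a deliberate departure from the standard, which fixes the boundary at 4T; the `frustration_multiplier` argument and the type-ahead timings are hypothetical, and a score computed with anything other than 4 should not be reported as a standard Apdex.

```python
def apdex(times, t, frustration_multiplier=4.0):
    """Score response times (seconds) against target t.

    frustration_multiplier=4.0 is the standard 4T rule; any other
    value is a non-standard variant used here only for comparison.
    """
    if not times:
        raise ValueError("no samples to score")
    satisfied = sum(1 for s in times if s <= t)
    tolerating = sum(1 for s in times if t < s <= frustration_multiplier * t)
    return (satisfied + tolerating / 2) / len(times)


# Hypothetical type-ahead with T = 0.1 s: 70 fast responses, 30 at 0.3 s.
typeahead = [0.08] * 70 + [0.3] * 30

print(apdex(typeahead, t=0.1))                              # 0.85 under the 4T rule
print(apdex(typeahead, t=0.1, frustration_multiplier=2.0))  # 0.7 with a tighter boundary
```

Under 4T the 0.3-second responses earn half credit; if users of a type-ahead actually give up at twice T, those same requests are failures and the score says so.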

How do you choose the Apdex T threshold?

Set T to the response time at which users stop feeling an action is fast, for that specific action, then keep it fixed. T is the one input you control, so the score is only as meaningful as the target behind it. The same application can look excellent or poor depending solely on where you draw the line.

Anchor T in what the action is. A tap or a type-ahead that should feel immediate deserves a sub-second T, near the one-second flow-of-thought limit or below it. A heavy report that everyone expects to grind for a moment can carry a T of a few seconds. One global T for every interaction is the most common way to make the score lie. New Relic defaults its Apdex T to 0.5 seconds for application servers, a reasonable starting point for a backend, but a default is a starting point, not an answer.

The rule that matters most is the simplest: keep T fixed across runs. If you tighten or loosen T between builds, a change in the score tells you nothing about the application, only about your bookkeeping. Set T once per action, write it down, and hold it, so that a falling Apdex always means the same thing: it got slower for users.
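One way to hold T fixed is to keep the targets in a single table that lives alongside the tests. The sketch below assumes hypothetical action names and target values; the point is only that the score for each action is always computed against the same written-down T.

```python
# Hypothetical per-action targets, in seconds. Set once, keep under version control.
TARGETS = {
    "typeahead": 0.3,
    "checkout": 1.0,
    "monthly_report": 3.0,
}


def apdex(times, t):
    """Standard Apdex for a list of response times (seconds) against target t."""
    if not times:
        raise ValueError("no samples to score")
    satisfied = sum(1 for s in times if s <= t)
    tolerating = sum(1 for s in times if t < s <= 4 * t)
    return (satisfied + tolerating / 2) / len(times)


def score_run(run):
    """run maps action name -> response times. Unknown actions raise KeyError,
    so a new action cannot be scored until someone chooses a T for it."""
    return {action: apdex(times, TARGETS[action]) for action, times in run.items()}


build = {
    "typeahead": [0.1, 0.2, 0.5, 2.0],
    "checkout": [0.7, 0.8, 0.9, 3.0],
}
print(score_run(build))  # {'typeahead': 0.625, 'checkout': 0.875}
```

Letting an unknown action fail loudly is a design choice: it prevents a silent fallback to some global default T.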

What is a good Apdex score?

As a rule of thumb, 0.94 and above is treated as excellent and anything below about 0.5 as unacceptable, but a score only means something against a stated T. A 0.95 measured with a generous 10-second target can describe a slower experience than a 0.85 measured with a strict 1-second target. Read the score and its T together, always.
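A small numeric sketch makes the point concrete. Both sets of response times below are invented: one service is scored against a generous 10-second target, the other against a strict 1-second target.

```python
from statistics import median


def apdex(times, t):
    """Standard Apdex for a list of response times (seconds) against target t."""
    if not times:
        raise ValueError("no samples to score")
    satisfied = sum(1 for s in times if s <= t)
    tolerating = sum(1 for s in times if t < s <= 4 * t)
    return (satisfied + tolerating / 2) / len(times)


slow_service = [8.0] * 95 + [50.0] * 5              # scored with T = 10 s
fast_service = [0.8] * 75 + [2.0] * 20 + [5.0] * 5  # scored with T = 1 s

print(apdex(slow_service, t=10.0), median(slow_service))  # 0.95 8.0
print(apdex(fast_service, t=1.0), median(fast_service))   # 0.85 0.8
```

The higher score belongs to the service whose typical response is ten times slower, which is why a score quoted without its T carries no information.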

The Apdex standard itself defines only the 0-to-1 index. The familiar five-tier rating scale is a convention popularized by monitoring vendors; the version below is Dynatrace’s, updated in 2026.

Apdex score | Common rating
0.94 to 1.00 | Excellent
0.85 to 0.93 | Good
0.70 to 0.84 | Fair
0.49 to 0.69 | Poor
below 0.49 | Unacceptable

Exact boundaries vary slightly between vendors; some place the poor-to-unacceptable break at 0.50 rather than 0.49. The labels are useful shorthand, but do not let them stand in for a target you set deliberately. “Good” against a lazy T is not good.
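If a report needs to attach one of these labels automatically, the bands can be encoded as lower bounds. The sketch below uses the boundaries from the table above; treating each band's lower edge as inclusive, so that a score such as 0.935 falls into "Good", is an assumption, since the table only lists values to two decimal places.

```python
# Lower bound of each band, highest first (boundaries taken from the table above).
BANDS = [
    (0.94, "Excellent"),
    (0.85, "Good"),
    (0.70, "Fair"),
    (0.49, "Poor"),
]


def rating(score: float) -> str:
    """Map an Apdex score to a vendor-style label."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("an Apdex score must be between 0 and 1")
    for lower_bound, label in BANDS:
        if score >= lower_bound:
            return label
    return "Unacceptable"


print(rating(0.875))  # Good
```

Swapping 0.49 for 0.50 in `BANDS` reproduces the variant some vendors use.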

Apdex vs response time percentiles

Apdex and response time percentiles answer different questions, and a good report carries both. Apdex gives you one satisfaction number for the whole run, easy to track and to report upward. Percentiles describe the shape of the distribution behind it, including the slow tail where regressions live.

A percentile is the value that a given share of requests came in at or under: p95 is the response time that 95% of requests met or beat, with only the slowest 5% taking longer.
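For reference, here is one simple way to compute a percentile, the nearest-rank method. Tools differ in how they interpolate between samples, so treat this as a sketch of the idea rather than the definition any particular product uses; the `percentile` helper is written for this example.

```python
import math


def percentile(times, p):
    """Nearest-rank percentile: the smallest sample with at least
    p percent of all samples at or below it."""
    if not times:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(times)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based position in the sorted list
    return ordered[rank - 1]


times = list(range(1, 101))   # 1, 2, ..., 100 (seconds, for illustration)
print(percentile(times, 50))  # 50
print(percentile(times, 95))  # 95
print(percentile(times, 99))  # 99
```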

 | Apdex score | Response time percentiles
Answers | How satisfied were users overall? | How slow was it, across the distribution?
Output | One number from 0 to 1 | A value per percentile (p50, p95, p99)
Best for | A headline the whole team can track | Finding and diagnosing the slow tail
Blind spot | Two different distributions can score the same | No single number to rally around

Apdex is the headline; percentiles are the diagnosis. The same logic applies to Core Web Vitals, which are not a rival to Apdex but a different lens. Apdex scores the response time of an action against a target you pick, while Core Web Vitals are Google’s fixed-threshold measures of what the page actually did in the browser. Track an Apdex on the actions that matter and Core Web Vitals on the rendered experience; they sit side by side.
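The blind spot is easy to demonstrate. In the sketch below, two invented runs against T = 1 second earn the same Apdex while their p95 values differ by a factor of fifteen; the percentile uses the nearest-rank method, which is one convention among several.

```python
import math


def apdex(times, t):
    """Standard Apdex for a list of response times (seconds) against target t."""
    satisfied = sum(1 for s in times if s <= t)
    tolerating = sum(1 for s in times if t < s <= 4 * t)
    return (satisfied + tolerating / 2) / len(times)


def p95(times):
    """Nearest-rank 95th percentile."""
    ordered = sorted(times)
    return ordered[math.ceil(95 * len(ordered) / 100) - 1]


run_a = [0.5] * 80 + [2.0] * 20    # 80 satisfied, 20 tolerating
run_b = [0.5] * 90 + [30.0] * 10   # 90 satisfied, 10 badly frustrated

print(apdex(run_a, 1.0), p95(run_a))  # 0.9 2.0
print(apdex(run_b, 1.0), p95(run_b))  # 0.9 30.0
```

Run A has a mildly slow fifth of requests; run B has a tenth of users waiting half a minute. The headline cannot tell them apart, and the percentile can.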

Common mistakes with Apdex scores

Most Apdex mistakes come from trusting the single number without the context that produced it. Five recur often enough to name.

  • Moving T between runs. Change the target and the score moves for reasons that have nothing to do with the application. Set T once per action and keep it fixed, or the trend is meaningless.
  • Trusting one number. Two very different distributions can produce the same Apdex, so the score can hold steady while the experience changes underneath it. Keep the percentiles, and the individual sessions, behind it.
  • Ignoring errors. A run full of fast failures can still post a healthy average response time. Make sure errors land in the frustrated bucket, and read the error rate beside the score; New Relic notes that a high error rate can leave a satisfying average response time but a poor Apdex.
  • Treating 4T as gospel. The four-times multiplier is a default, not a fact about your users. For interactions that should feel instant, frustration arrives long before 4T, a known limitation of a fixed threshold.
  • Reading one site-wide score. A healthy overall Apdex can hide a single revenue-critical page sitting in the frustrated bucket. Score per URL and per transaction, not just per site.
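The last mistake in the list is worth seeing in numbers. The sketch below scores a set of invented requests both site-wide and per URL; the URLs, targets, and timings are hypothetical, and each request is judged against its own URL's T before being pooled.

```python
from collections import defaultdict

TARGETS = {"/home": 1.0, "/checkout": 1.0}  # hypothetical per-URL T, in seconds


def credit(seconds, t):
    """Apdex credit for one request: 1 satisfied, 0.5 tolerating, 0 frustrated."""
    if seconds <= t:
        return 1.0
    if seconds <= 4 * t:
        return 0.5
    return 0.0


def scores(requests):
    """requests: iterable of (url, seconds). Returns (site_wide, per_url)."""
    by_url = defaultdict(list)
    for url, seconds in requests:
        by_url[url].append(credit(seconds, TARGETS[url]))
    pooled = [c for credits in by_url.values() for c in credits]
    if not pooled:
        raise ValueError("no requests to score")
    per_url = {url: sum(c) / len(c) for url, c in by_url.items()}
    return sum(pooled) / len(pooled), per_url


traffic = (
    [("/home", 0.4)] * 900
    + [("/checkout", 0.8)] * 40
    + [("/checkout", 6.0)] * 60
)
site, per_url = scores(traffic)
print(site)     # 0.94
print(per_url)  # {'/home': 1.0, '/checkout': 0.4}
```

The site-wide figure sits in the top band while most checkout requests are frustrated, because the high-volume home page dominates the pool.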

Read with these caveats, one honest number is genuinely useful, which is why Apdex is still in active use as of 2025. It is now best treated as the headline on a fuller report rather than the whole story.

How Evaluat reports Apdex

Evaluat is a real-browser performance testing platform, and every test report includes an Apdex score against thresholds you set. The score is the headline. What makes it trustworthy is everything the report keeps underneath it.

Because every virtual user runs in its own real browser, one report carries the satisfaction number alongside the detail a single figure cannot show: response time percentiles from p50 to p99, a per-URL and per-transaction breakdown, Core Web Vitals (Largest Contentful Paint, Interaction to Next Paint, Cumulative Layout Shift), and the evidence behind every number, namely session video, a network log, and a console log for each virtual user. When the Apdex on a page falls between builds, you do not just watch the number move. You open the slowest sessions and see what slowed. Forensic detail beats aggregate percentiles.

That is the honest case both for a single score and against leaning on it alone. Apdex is a headline, not a diagnosis, which is why it belongs on top of the distribution and the sessions, never instead of them. And if you are load-testing a pure API or chasing extreme request-per-second numbers with no page to render, a protocol-level tool like k6 or JMeter is the right instrument, and it will score Apdex on server response time perfectly well. When the question is what your users actually experienced under load, that takes a real browser. A failure at peak isn’t a percentile. It’s a session.

An Apdex score is the one number that tells a whole team whether users were satisfied, as long as you set an honest target and hold it fixed. Read it as the headline of a report, with the percentiles and the sessions behind it to explain why it moved, and a dropping score becomes a fix instead of a mystery.

Test in real browsers. Debug in real sessions. Book a demo.

Common questions

What does Apdex stand for?

Apdex stands for Application Performance Index. It is an open standard, defined by the Apdex Alliance in 2004, for turning response time measurements into a single user-satisfaction score between 0 and 1. The aim is to give a whole organization one comparable number instead of a scatter of percentiles.

How is an Apdex score calculated?

Sort every measured request into three buckets against a target time T: satisfied (at or under T), tolerating (over T up to 4T), and frustrated (over 4T or errored). The score is the satisfied count plus half the tolerating count, divided by the total number of requests. For example, 800 satisfied, 150 tolerating, and 50 frustrated out of 1,000 gives (800 + 75) divided by 1,000, or 0.875.

What is a good Apdex score?

A common rating scale treats 0.94 and above as excellent, 0.85 to 0.93 as good, 0.70 to 0.84 as fair, 0.49 to 0.69 as poor, and below 0.49 as unacceptable. These bands are a vendor convention rather than part of the Apdex standard, and exact boundaries vary. A score only means something against a stated target T, so always read the two together.

How do you choose the Apdex T threshold?

Set T to the response time at which users stop feeling an action is fast, for that specific action, then keep it fixed. An interaction that should feel instant deserves a sub-second T, while a heavy report can carry a few seconds. New Relic uses 0.5 seconds as a default for application servers, but a default is a starting point, not an answer. Keep T fixed across runs so a falling score means the app slowed, not that you moved the goalposts.

Does Apdex account for errors?

In most implementations, yes. A request that returns a server-side error is counted as frustrated regardless of how quickly it failed, because a fast error still leaves the user stuck. This matters because a high error rate can otherwise hide behind a healthy average response time. Always read the error rate alongside the Apdex score.

Is Apdex still used?

Yes. Apdex remains in active use in monitoring and performance testing as of 2025, and the standard is still maintained, though it is now typically read as one of a broader set of metrics rather than on its own. It works best as a headline that sits on top of response time percentiles and per-URL detail, not as a single number read in isolation.

What is the difference between Apdex and response time percentiles?

Apdex compresses a run into one satisfaction score between 0 and 1; response time percentiles describe the shape of the distribution, including the slow tail. Apdex is the headline a whole team can track release over release, while percentiles such as p95 and p99 are where you diagnose a regression. They are complementary, so report both rather than choosing one.

What is the difference between Apdex and Core Web Vitals?

Apdex scores the response time of an action against a target T you choose, producing one number from 0 to 1. Core Web Vitals (Largest Contentful Paint, Interaction to Next Paint, Cumulative Layout Shift) are Google's fixed-threshold measures of page experience in the browser. They measure different things, so you can track both: an Apdex on the actions that matter, and Core Web Vitals on what the page rendered.
