Evaluat is in private access. Demos open through June. Book a slot
How it works

One browser per user. The numbers come out honest.

Evaluat runs performance tests with real browsers. One isolated browser per virtual user. That's the architectural decision everything else follows from. Here's what that means in practice and why it changes the numbers.

[Figure: One browser per virtual user. Evaluat fans one test plan out to 1,000 isolated browser instances, each with its own memory, cache, cookies and network, all loading your production server.]
The architectural principle

1,000 users means 1,000 browsers.

When you start a 1,000-user test in Evaluat, the platform provisions 1,000 isolated browser instances. Each instance has its own memory. Its own CPU. Its own cache. Its own cookies. Its own network stack. Nothing crosses between them.

This sounds obvious until you notice that most "browser-based" load testing tools don't do it. They share one browser across many simulated users, because it's cheaper to run. The contention they're measuring isn't real. The numbers come out optimistic. Production then surprises you.

Performance under load is mostly about contention. If your test tool simulates 100 concurrent users inside one browser process, you're not measuring 100 concurrent experiences. You're measuring one browser doing 100 things in a row. The LCP you record under that model has nothing to do with what real users see at peak.

Evaluat's design forces the contention to be real. Browser instances are independent. The numbers come out matching what your customer's Chrome would record on their machine, against your production-shaped server, with the load you're testing.
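
To see why sharing a browser flatters the numbers, here is a toy model in Python. It is a sketch for illustration only, not Evaluat's implementation: the two load timings are made-up assumptions, and the only shared resource it models is the cache.

```python
# Toy model, not Evaluat's implementation. Both timings are assumed values.
COLD_LOAD_MS = 1800   # assumed page load with an empty cache
WARM_LOAD_MS = 400    # assumed page load when assets are already cached


def simulate_shared_browser(users: int) -> list[int]:
    """All simulated users share one browser, and therefore one cache."""
    cache_is_warm = False
    timings = []
    for _ in range(users):
        timings.append(WARM_LOAD_MS if cache_is_warm else COLD_LOAD_MS)
        cache_is_warm = True  # the first user warms the cache for everyone
    return timings


def simulate_isolated_browsers(users: int) -> list[int]:
    """Each user gets a fresh browser, so every first visit is a cold load."""
    return [COLD_LOAD_MS for _ in range(users)]


def mean(values: list[int]) -> float:
    return sum(values) / len(values)


shared = simulate_shared_browser(100)
isolated = simulate_isolated_browsers(100)
print(f"shared browser mean load:    {mean(shared):.0f} ms")
print(f"isolated browsers mean load: {mean(isolated):.0f} ms")
```

In this model the shared browser reports a mean close to the warm-cache time, because only the first simulated user pays for a cold load. The isolated run reports the cold-load time for everyone, which is what 100 first-time visitors would actually experience.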


What gets measured

Every virtual user gives you the full picture.

For every virtual user in every test, Evaluat captures the five things below. Aggregated across the run, addressable per session, and exportable as raw data if you need it.

Core Web Vitals

Largest Contentful Paint (LCP), Interaction to Next Paint (INP) and Cumulative Layout Shift (CLS), plus First Contentful Paint (FCP) as a supporting metric. Captured natively by every real browser in the run.
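
If you export the raw per-user values, one way to summarise them is sketched below in Python. The thresholds are the ones Google publishes for the three Core Web Vitals, which are assessed at the 75th percentile. The sample data, the function names and the nearest-rank percentile method are assumptions made for this example and do not describe how Evaluat aggregates its reports.

```python
import math

# Published Core Web Vitals thresholds: (good up to, poor above).
THRESHOLDS = {
    "LCP": (2500, 4000),   # milliseconds
    "INP": (200, 500),     # milliseconds
    "CLS": (0.1, 0.25),    # unitless score
}


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def rate(metric, value):
    """Classify a value as good, needs improvement, or poor."""
    good, poor = THRESHOLDS[metric]
    if value <= good:
        return "good"
    if value <= poor:
        return "needs improvement"
    return "poor"


# Hypothetical LCP readings, one per virtual user, in milliseconds.
lcp_samples_ms = [1900, 2100, 2300, 2400, 2600, 2800, 3100, 4500]
p75 = percentile(lcp_samples_ms, 75)
print(f"LCP p75 = {p75} ms ({rate('LCP', p75)})")
```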

Network activity

Every HTTP request the browser made: method, URL, status, timing breakdown, size, MIME type, originating page. Searchable across millions of rows per run.

Console output

Every console.log, warning, error, exception, and resource load failure. Deduplicated with counts so the loud ones surface first.
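
Deduplicating with counts is simple to picture. The Python sketch below groups identical console entries and lists the most frequent first. The event tuples are invented sample data, and grouping on the exact (level, message) pair is an assumption about what counts as a duplicate.

```python
from collections import Counter

# Hypothetical console entries collected across many sessions: (level, message).
console_events = [
    ("error", "TypeError: cart is undefined"),
    ("warning", "Deprecated API used"),
    ("error", "TypeError: cart is undefined"),
    ("error", "Failed to load resource: /img/hero.webp"),
    ("error", "TypeError: cart is undefined"),
    ("warning", "Deprecated API used"),
]

# Counter collapses identical entries; most_common() sorts by count, descending.
counts = Counter(console_events)
for (level, message), count in counts.most_common():
    print(f"{count:>3}x  [{level}] {message}")
```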

Session recording

A full video of the browser viewport for every virtual user's session. Scrub through it. Find the broken moment. No reconstruction from logs.

Step playback

Every scripted action (navigate, click, type, wait) timestamped to the millisecond, with the CSS selector targeted and pass/fail outcome.


Building a test

Four pieces. None of them require code.

An Evaluat test is composed from four reusable parts. Build them once, recombine them for performance tests, smoke tests, and monitors.

01

Test scenarios

A scenario is a user journey: a sequence of steps like "navigate to the homepage, click the product category, click a product, add to cart, proceed to checkout." Scenarios are reusable building blocks. A single test can run many scenarios in parallel with weighted distribution.
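
Weighted distribution can be pictured with a few lines of Python. This is a minimal sketch, assuming each virtual user is assigned a scenario at random in proportion to its weight. The scenario names and weights are hypothetical, and Evaluat may allocate users differently.

```python
import random
from collections import Counter

# Hypothetical scenarios and their weights (relative shares of traffic).
scenario_weights = {"browse": 70, "search": 20, "checkout": 10}


def assign_scenarios(users: int, weights: dict[str, int], seed: int = 42) -> list[str]:
    """Pick one scenario per virtual user, proportionally to its weight."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=users)


assignment = assign_scenarios(1000, scenario_weights)
print(Counter(assignment))
```

Random assignment lands close to the 70/20/10 split without matching it exactly. A tool that needs exact proportions would allocate users deterministically instead.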

02

Datasets

Datasets inject variable data into scenarios. 1,000 different UTM combinations. 1,000 different search terms. 1,000 different user records. Each virtual user picks a row, so no two users follow exactly the same path. Cache effects can't pretend to be performance.
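
As an illustration of "each virtual user picks a row", the Python sketch below reads a small CSV dataset and hands out rows round-robin by user index. The column names, the sample rows and the round-robin rule are assumptions made for the example. How Evaluat actually assigns rows is not specified here.

```python
import csv
import io

# Hypothetical dataset; in practice this would be an uploaded file.
DATASET_CSV = """search_term,utm_source
red shoes,newsletter
winter jacket,google
gift card,instagram
"""

rows = list(csv.DictReader(io.StringIO(DATASET_CSV)))


def row_for_user(user_index: int) -> dict[str, str]:
    """Round-robin: wrap around when there are more users than rows."""
    return rows[user_index % len(rows)]


for user_index in range(4):
    row = row_for_user(user_index)
    print(user_index, row["search_term"], row["utm_source"])
```

With as many rows as virtual users, no two users share a row. With fewer rows, the wrap-around means some users repeat the same data, which brings back the cache effects the dataset was meant to avoid.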

04

Test plan

The test plan ties everything together. Region. Timezone. Locale. Viewport. Browser speed. Load shape (Duration or Sessions). Ramp-up profile. Which scenarios run at what weight. The plan is what you click "run" on.
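
You build the plan in the UI, not in code, but it can help to picture it as a single record. The Python sketch below is purely illustrative: every field name and value is hypothetical, and the rule that weights add up to 100 is an assumption, not a documented Evaluat constraint.

```python
from dataclasses import dataclass, field


@dataclass
class PlanSketch:
    """Hypothetical shape of a test plan; not Evaluat's actual schema."""
    region: str
    timezone: str
    locale: str
    viewport: tuple[int, int]          # width, height in CSS pixels
    speed: str                         # assumed values: "human" or "bot"
    load_shape: str                    # "duration" or "sessions"
    ramp_up_seconds: int
    scenario_weights: dict[str, int] = field(default_factory=dict)

    def validate(self) -> None:
        if self.load_shape not in ("duration", "sessions"):
            raise ValueError("load_shape must be 'duration' or 'sessions'")
        # Assumption for this sketch: weights are percentages.
        if sum(self.scenario_weights.values()) != 100:
            raise ValueError("scenario weights must add up to 100")


plan = PlanSketch(
    region="eu-west",                  # hypothetical region identifier
    timezone="Europe/Amsterdam",
    locale="nl_NL",
    viewport=(375, 812),
    speed="human",
    load_shape="duration",
    ramp_up_seconds=60,
    scenario_weights={"browse": 70, "search": 20, "checkout": 10},
)
plan.validate()
print(plan)
```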


Configurable conditions

The dials you actually need.

Every test plan controls these. None of the dials are gated behind a "contact sales" tier. Plan limits (regions, domains, concurrency) scale with the tier; the configuration surface doesn't.

Region

Pick the geographic origin for your virtual users. Latency from London is different to latency from Frankfurt. The test should know which one you care about.

Timezone & locale

Match the segment you're modelling. A Dutch checkout should run with nl_NL locale and Europe/Amsterdam timezone, not en_US.

Viewport

Set the browser dimensions. Mobile (375×812) to desktop (1920×1080) and beyond. The viewport changes the layout, which changes LCP and CLS.

Speed

How fast actions happen between steps. Bot-speed clicking exposes different bugs than human-paced clicking. Pick the one that matches your test goal.

Load shape

Ramp-up duration, steady-state duration, ramp-down duration. Combined with target concurrency, this is how you describe load tests, stress tests, spike tests, and soak tests through one configuration.
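
The three durations plus a target concurrency define a trapezoid. The Python sketch below computes how many virtual users should be active at a given second, assuming linear ramps. The function name, parameter names and example profiles are illustrative assumptions, and Evaluat's actual ramp profile may differ.

```python
def target_concurrency(t: float, peak: int, ramp_up: float,
                       steady: float, ramp_down: float) -> int:
    """Active virtual users at second t for a linear ramp-up/steady/ramp-down shape."""
    if t < 0:
        return 0
    if t < ramp_up:
        return round(peak * t / ramp_up)            # climbing towards peak
    if t < ramp_up + steady:
        return peak                                 # holding at peak
    end = ramp_up + steady + ramp_down
    if t < end:
        return round(peak * (end - t) / ramp_down)  # winding down
    return 0


# The same function describes different test types (example values only).
load_test = dict(peak=1000, ramp_up=60, steady=300, ramp_down=60)
spike_test = dict(peak=1000, ramp_up=5, steady=30, ramp_down=5)

for t in (0, 30, 60, 200, 390, 420):
    print(t, target_concurrency(t, **load_test))
```

A spike test is the same shape with a very short ramp-up. A soak test is the same shape with a very long steady state.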

Get a demo

Test in real browsers.
Debug in real sessions.

Want to see the report on your site?

30 minutes. We'll build a scenario on your real customer journey, run a small test live, and walk you through the five report views with your data in them.
