How to “Grade” a page tested with

One of the more significant challenges in accessibility testing with an automated tool is getting an idea of how the page actually performs for users. The truth is, the only truly reliable thing a tool can claim is that it found X instances where the code failed the tool’s specified tests. Even assuming all of those tests are reliable, the end user is still left with little more than a list of issues. Recently one of our customers asked, “How do we know if we’re compliant or not?” Although we’ve already discussed compliance in a previous post, we recognize that this is something users are often concerned with. Our gut reaction is to say that you’re not “compliant” if you have any errors, but that’s admittedly not very practical. Determining compliance is a lot trickier than that, and in some cases there may even be exemptions that apply that no tool can be aware of.

In the very early days, Tenon did generate a “Grade” of sorts. We stripped the feature, and for good reason. In our opinion, grading is not something an automated tool should do, because getting a good grade is likely to mislead the user into believing that their system is accessible. In reality, a good grade from a tool merely means you’ve passed those tests which the automated tool can perform. It doesn’t mean that you’ve passed all of the other testing, such as manual testing, that is also required. An automated tool’s job is to find errors. That’s it. Anything beyond that is overreaching.

That being said, determining a grade can be valuable in prioritization efforts. It makes sense to focus your remediation efforts on pages that have the lowest grade. In that context, let’s discuss how you can use Tenon’s API response data to “Grade” a page.

Every response from Tenon includes a globalStats node. (See: Overview of the Tenon API Response) This section includes two important values: allDensity, which is the global average percentage of errors per KB of document source, and stdDev, which is the standard deviation of allDensity across all tested pages. These data points exist to inform the grade calculation which, ultimately, will be a score of how your page performs against all other tested pages on the web.

Here’s what the relevant globalStats information looks like:

"globalStats": {
        "errorDensity": "152",
        "warningDensity": "12",
        "allDensity": "164",
        "stdDev": "396"
}

Using the information above, here’s the math behind generating a percentage grade:

    // score is assumed to be the issue density of the page being graded
    max = allDensity + (3*stdDev);
    min = 0;
    if(score >= max){
      return 0;
    }
    else if(score <= min){
      return 100;
    }
    else{
      return 100 - ((score / max) * 100);
    }
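
To make the calculation concrete, here is a minimal runnable sketch in JavaScript. The function name gradePercent and the pageDensity parameter are ours, chosen for illustration; we are assuming that the “score” in the math above is the density of the page being graded. Because the globalStats values shown earlier are strings, the sketch converts them to numbers before adding.

```javascript
// Sketch of the percentage grade described above.
// pageDensity: density of the page being graded (assumed to be what the
// formula calls "score"). globalStats: the node shown earlier in this post.
function gradePercent(pageDensity, globalStats) {
  // The globalStats values are strings, so convert them before doing math;
  // otherwise "+" would concatenate instead of add.
  const allDensity = Number(globalStats.allDensity);
  const stdDev = Number(globalStats.stdDev);

  const max = allDensity + 3 * stdDev;
  const min = 0;

  if (pageDensity >= max) {
    return 0;
  }
  if (pageDensity <= min) {
    return 100;
  }
  return 100 - (pageDensity / max) * 100;
}

// Worked example with the sample globalStats from this post:
// max = 164 + (3 * 396) = 1352
const sampleStats = { allDensity: "164", stdDev: "396" };
console.log(gradePercent(0, sampleStats));    // 100
console.log(gradePercent(676, sampleStats));  // 50
console.log(gradePercent(1352, sampleStats)); // 0
```

With these sample numbers, a page whose density is exactly half of max scores 50, and any page at or beyond three standard deviations above the global average scores 0.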

These percentages can then be used to provide a letter grade. This table is based upon common letter grades in the United States:

Percent Letter Grade
98 – 100 A+
94 – 97 A
90 – 93 A-
87 – 89 B+
83 – 86 B
80 – 82 B-
77 – 79 C+
73 – 76 C
70 – 72 C-
67 – 69 D+
63 – 66 D
60 – 62 D-
Below 60 F
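
The table can be applied in code with a simple lookup. The sketch below is one way to do it, not part of Tenon itself: the names GRADE_BANDS and letterGrade are hypothetical, and we are assuming that a fractional percentage (say, 97.5) belongs to the band whose lower bound it meets, and that anything under 60 is an F.

```javascript
// Lower bound of each band from the table above, highest first.
const GRADE_BANDS = [
  [98, "A+"], [94, "A"], [90, "A-"],
  [87, "B+"], [83, "B"], [80, "B-"],
  [77, "C+"], [73, "C"], [70, "C-"],
  [67, "D+"], [63, "D"], [60, "D-"],
];

// Return the first band whose lower bound the percentage meets.
// Assumption: anything below the lowest band (60) is an F.
function letterGrade(percent) {
  for (const [lowerBound, letter] of GRADE_BANDS) {
    if (percent >= lowerBound) {
      return letter;
    }
  }
  return "F";
}

console.log(letterGrade(100));  // A+
console.log(letterGrade(97.5)); // A
console.log(letterGrade(50));   // F
```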

What about a gut-check?

At this point, Tenon has tested almost 600,000 distinct URLs. This is a high-enough number for the information below to be statistically significant. The chart below represents the distribution of issue density among tested pages. As a pure gut-check, if your page’s density falls at the higher end of these ranges, it is performing significantly worse than most other pages on the web, and if most of your pages have high density, you’re probably facing higher-than-normal risk.

Tenon Global Density Chart, described in table below

Global Issue Density by Range
Stat # of Pages Pct. of Pages
Pages with 0% error Density 33417 8%
Pages with 1-10% error Density 5928 1%
Pages with 11-20% error Density 96077 23%
Pages with 21-30% error Density 61665 15%
Pages with 31-40% error Density 53234 13%
Pages with 41-50% error Density 38011 9%
Pages with 51-60% error Density 34042 8%
Pages with 61-70% error Density 24763 6%
Pages with 71-80% error Density 20379 5%
Pages with 81-90% error Density 19336 5%
Pages with 91-100% error Density 0 0%
Pages with 100%+ error Density 0 0%

Why Density?

Some may wonder why we’ve used Density as our measurement. This is because errors-per-kilobyte is going to be more reliable than a count of raw issues. A raw issue count can be deceiving: it treats a small page with 10 errors the same as a much larger page with 10 errors. Measured by density, the small page is assumed to be performing worse than the larger page with the same number of errors.
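
A quick illustration of the difference, using a plain issues-per-kilobyte ratio. This is only a sketch of the idea: the helper name issuesPerKb and the page sizes are made up for the example, and it is not meant to reproduce exactly how Tenon computes the density values in its response.

```javascript
// Hypothetical helper: number of issues per kilobyte of document source.
function issuesPerKb(issueCount, sourceBytes) {
  return issueCount / (sourceBytes / 1024);
}

// Two made-up pages with the same raw issue count but different sizes.
const smallPage = issuesPerKb(10, 5 * 1024);  // 2 issues per KB
const largePage = issuesPerKb(10, 50 * 1024); // 0.2 issues per KB

console.log(smallPage > largePage); // true
```

Both pages report 10 issues, but the 5 KB page packs ten times as many issues into each kilobyte of source as the 50 KB page does.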
