

Limitations of the WAI-ARIA

As a follow-up to my previous post, I wanted to learn more about the WAI-ARIA spec, so I went ahead and read it. I won’t claim to be an expert, or to have more than a basic understanding of the spec at this point, but I was not impressed with what I saw.

Unfortunately, the spec doesn’t seem particularly forward-thinking. I have two main issues with it: first, it’s a very limited extension of the DOM and of what we already have on the web; second, there is no programmatic interface to the accessibility system.

The most obvious downside of assuming that the DOM will be the basic building block of future web applications is that it presumes all web apps will always be structured with DOM trees. But this already isn’t true today. Look at Mozilla’s Bespin editor, which renders text inside a canvas element, or Sun’s Lively Kernel, which implements an entire widget set in SVG. Because there is no underlying DOM structure, these two programs cannot be made accessible under the WAI-ARIA spec without substantial hacks.

What about elements that aren’t part of the visual structure at all? For example, in Cappuccino, every application has a CPApplication object. This object contains information that would undoubtedly be useful to the accessibility system, but because it’s an abstract object and isn’t part of the render tree in any way, it will never be visible to WAI-ARIA compliant browsers.

This leads us to the second problem: not having programmatic access to the accessibility system. This spec is designed to target web applications, but it has taken the same document-based approach as HTML. Applications, by definition, require a programming language, which in the browser is usually JavaScript. It seems shortsighted to develop a standard for applications that uses a fundamentally different technology than the application does.

Even beyond the fact that elements in your program that don’t exist in the DOM tree can’t become part of the accessibility system, not having a programmatic interface means that you can’t dynamically compute accessibility values. Imagine some user interface element that changes frequently, or perhaps is a composite of several other objects. Under the current WAI-ARIA spec, every change to this user interface element needs to be immediately reflected in the DOM. This means a potentially substantial performance hit that is unnecessary for the vast majority of users, and even for people requiring the accessibility feature if they aren’t focused on that element.
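To make the cost concrete, here is a minimal sketch of one possible mitigation: buffering ARIA attribute changes in script and writing only the latest value of each attribute to the DOM. This is not something the ARIA spec provides. `AriaMirror` and `AttributeTarget` are hypothetical names invented for this illustration, and the sketch assumes the element only needs the standard `setAttribute` method.

```typescript
// Minimal surface needed from a DOM element; a real Element satisfies it.
// (Hypothetical interface name, introduced only for this sketch.)
interface AttributeTarget {
  setAttribute(name: string, value: string): void;
}

// Hypothetical helper: buffers ARIA attribute changes in memory and writes
// only the most recent value of each attribute when flush() is called.
class AriaMirror {
  private target: AttributeTarget;
  private pending: Record<string, string> = {};
  private written: Record<string, string> = {};

  constructor(target: AttributeTarget) {
    this.target = target;
  }

  // Record a new value. This touches no DOM, so it is cheap to call often.
  set(name: string, value: string): void {
    if (this.written[name] === value) {
      // The DOM already holds this value; drop any buffered change.
      delete this.pending[name];
    } else {
      this.pending[name] = value;
    }
  }

  // Write the buffered values to the element and return the number of
  // DOM writes that were actually performed.
  flush(): number {
    const names = Object.keys(this.pending);
    for (const name of names) {
      this.target.setAttribute(name, this.pending[name]);
      this.written[name] = this.pending[name];
    }
    this.pending = {};
    return names.length;
  }
}
```

In a browser, `flush()` could be called from a `requestAnimationFrame` callback so that many changes per frame collapse into one write. Assistive technology still only sees what reaches the DOM, so this reduces the cost rather than removing it.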

Lack of an actual API also limits the potential uses of ARIA. I can’t, for example, implement my own accessibility tool within the browser. It also causes issues with the event system, and doesn’t add any enhanced functionality for simulating events using accessibility APIs, which could have enabled a lot of advanced automated testing.

These thoughts are the result of my very brief research into ARIA so far, and if anything is technically incorrect, I’d appreciate that feedback. At this point, I’m not sure where to go with ARIA and accessibility. As far as the Cappuccino implementation is concerned, there is enough in ARIA to significantly enhance Cappuccino’s accessibility, even if we can’t reach 100%. I suspect the performance implications will be minimal for most UI elements (though I am concerned about text fields), but it’s impossible to say until we actually have ARIA implemented.

I’m interested to hear other opinions on ARIA, especially from those who are developing modern web applications. It would be interesting to know if anyone shares my concerns, and if there is any way to actually address some of them within the current framework of ARIA or perhaps another active project I’m not aware of. I don’t claim to have any answers, just a lot of questions, so please share your thoughts.

16 Responses to “Limitations of the WAI-ARIA”

  1. Cameron Westland Says:

    And here I thought the only reason they were releasing WAI-ARIA was to give developers programmatic access to the accessibility APIs. I guess I should have read it like you did! Thanks Ross.

  2. Victor Says:

    Hi Ross,
    I think you raise a lot of valid concerns; however, please keep in mind that ARIA was not designed or intended to solve API-related issues. You are correct in suggesting that ARIA does not fix the interoperability challenges between web apps and the accessibility layer but, again, it was not designed to do that. Accessibility clients still have to rely on the particular browser to understand and parse ARIA roles and states and make them available to the accessibility layer, such as MSAA, ATK or UA. Does ARIA solve all our web 2.0 accessibility problems? No. It does allow us, however, to master web apps that rely on the DOM and make them accessible.

  3. Ross Says:

    Victor: I wouldn’t say “master”, but yes, it does solve several problems specifically related to the DOM.

    But, if ARIA isn’t designed to fix the problems of web applications, who is working on that problem, if anybody? Perhaps some of the energy directed at individual projects like Cappuccino needs to be redirected at solving the larger problem.

  4. Alex Surkov Says:

    The Mozilla accessibility team is considering a way to make custom applications (like Bespin, which is based on html:canvas) accessible via JS. Application authors should be able to create accessible objects in JS and embed them into the existing accessible hierarchy.

  5. Lloyd Says:

    I am certainly not an expert in this domain, but am a little more acquainted after this post.

    What does this missing api look like? Are we primarily talking about guiding the screen reader with text to utter? What other high level classes of functionality are there?

    More importantly, other than building support into frameworks, how do we ensure that it gets used? I’d posit that it need be mind numbingly simple…

    Ok, so now we’ve designed it. Can the implementation be grafted on *top* of a Dom based soln? Something like off screen or hidden nodes that are not visibly rendered, but are picked up by screen readers?

    Pardon my thinkin’ aloud…

  6. Ross Says:

    @Alex: Are you saying there’s a group in Mozilla working on this? Do they have any info published?

    @Lloyd: It definitely needs to be simple. I can’t say with any certainty how well it could be faked, but if it can, it’s going to be extremely difficult, which obviously goes against the entire purpose of ARIA.

  7. Isofarro Says:

    DOM is a data object. A browser takes a DOM and creates a visual rendering from it. It also maps DOM into MSAA updates.

    Now since DOM is an object representation of a document (or webapp) it aggregates HTML, CSS and JavaScript into one ‘application object’.

    ARIA then offers a few extra attributes to better define certain areas of that DOM. Like this sub-tree here is navigation. This node here is dynamically updated, and suggest that these changes are communicated in a polite non-important manner. Here we have a multi-pane tab widget, the active node is this node here, and the active pane is right over there.

    Because DOM is largely created from an HTML page, the semantic structure of HTML also plays a very important part in accessibility. The ability to determine headers (because they use h1-h6 elements) means that it is possible to determine that a certain piece of text is a header, and also allows a screenreader user to navigate the page by header – like an outliner. This is one small example of the benefits of using structured HTML properly, and how that directly improves the accessibility of a page. It is also why it is important to consider semantic structure when building pages or web applications.

    DOM is just a data structure. And the WAI-ARIA introduces more data to that structure to better improve the flow of information between a web page and assistive technologies using the supported accessibility architecture.

    If you are really interested in API hooks, then you should spend time looking at Microsoft Active Accessibility (MSAA), and how browsers communicate with MSAA. That’s where those API hooks are. Then you can take the source of Firefox, for example, and alter it to do your required changes, and when you are done, feed them back to the Mozilla teams as implemented features. Yes, what you are asking for is down in the browser plumbing and the MSAA architecture.

    Think about it for a second: you don’t use JavaScript to write your own graphics primitives, because graphics cards do a far better job. These things are abstracted away for a reason. And it points to why canvas is such a bad surface for building typical GUI-like applications. Reimplementing GUI widgets in canvas may seem innovative and cool, but basically the end result is a three-sided wheel: it’s better than the four-sided wheel because it eliminates one bump.

  8. lloyd hilaiel Says:

    @ross my request to you would be that you then break the accessibility implementation you need in cappuccino into 2 parts, the cappuccino specific part, and the “scriptable accessibility interface”.. this is perhaps a meaningful bootstrap of the design work necessary.

    then to the extent that this api can be faked, we have a zero plugin, works everywhere subset. some sorta plugin (uh, browserplus?) could implement the api for real, and when present could provide the full functionality… all theoretical, of course — but a path to filling the hole for real.

  9. Ross Says:

    @Isofarro That’s an incredibly limiting viewpoint. Canvas is an amazing piece of technology in the browser — the ability to finally create *exactly* what you want, rather than an abysmal approximation limited by the features of HTML and CSS.

    The DOM doesn’t “aggregate JavaScript”. It provides hooks into the document for JavaScript. Sure, event handlers have references stored in the DOM, but the rest of the code does not sit in the DOM at all.

    Ultimately, the DOM is about pages, about static documents. Applications are more than that — they are, almost by definition, anything you can dream up. By making ARIA specifically limited to the DOM, you’re limiting its scope, which is unfortunate.

    As far as digging into browsers to implement the API I think we really need (this is relevant to @lloyd as well), that’s far beyond what my job should be. We’re just three people, trying to make great products, and sharing a lot of what we learn with the community. If we had to fix every browser shortcoming we deal with, our company would fail. All the browsers have full time employees available to work on this problem, but it doesn’t seem like anyone is pushing them to do so.

  10. Matt May Says:

    It’s all well and good that canvas allows visual fidelity, but as far as accessibility goes, it’s several generations behind Flash, which started supporting MSAA in 2002.

    But it’s not the job of the group specifying ARIA to make that accessible, it’s (now) the job of the HTML5 spec. So far, I’m not aware that work has even begun. I agree it needs to be done, but take it up with the HTML WG.

    I would still like to see what you’re doing that you can’t do with ARIA and the DOM, specifically.

  11. James Craig Says:

    Hi Ross, my response is almost as long as your original article, so I’m going to split up each point and counterpoint into separate comments. To begin:

    ARIA 1.0 is intended to allow existing web applications to be made accessible using today’s technologies, and the DOM is the lowest common denominator for web applications. You can have a document without CSS or without JavaScript, but you can’t have a script that executes outside the context of a document. The programming interfaces for ARIA are standard DOM methods.

  12. James Craig Says:

    You wrote:
    “The most obvious downside of assuming that the DOM will be the basic building block of future web applications is that it presumes all web apps will always be structured with DOM trees. But this already isn’t true today. Look at Mozilla’s Bespin editor, which renders text inside a canvas element, or SUN’s Lively Kernel, which implements an entire widget set in SVG. Because there is no underlying DOM structure, these two programs can not be made accessible under the WAI-ARIA spec without substantial hacks.”

    SVG is nothing but a DOM structure, and ARIA markup can be used inside SVG, completely hack-free, although I am unaware of an SVG viewer implementation that supports ARIA at this time.

    Before going into how Bespin could be made accessible via ARIA, I feel compelled to state that the canvas element is/was intended for drawing graphics. Using a drawing tool for a document goes against the WCAG 2 Principle that “content must be robust” and the HTML 5 spec even states, “Authors should not use the canvas element in a document when a more suitable element is available.” That said, I am aware of the drawbacks to using “more suitable elements” for something like a text editor, and applaud the Bespin creators for their experimentation and continued attempts to make a project like Bespin more accessible.

    Since you seem very familiar with Apple technologies, using canvas here is sort of like using a Quartz composition. Quartz compositions are most often useful in the context of an application, which can be made accessible, although it will take significantly more work since the application author is using a custom UI layer instead of using standard Cocoa controls. Likewise, the canvas element is always used in the context of a larger DOM that can be made accessible, although it will take significantly more work since the web application author is using a custom UI layer instead of using standard HTML elements and controls.

    Bespin could be made accessible in a number of ways, but one way that comes to mind is by managing focus of the entire application via the “roaming tabindex” technique, and dynamically inserting DOM elements for ARIA roles as needed. Todd Kloots (from YUI) has some blog posts discussing the “roaming tabindex” technique for specific widgets, but there is nothing to stop you from using the technique for an entire application. Although it is not the ideal solution because it would not allow for additional UI shortcuts that most AT has into the rest of the DOM (via methods like an item chooser or navigation by header), the Bespin editor could be made accessible by only allowing access to a single element at a time: whatever text is currently selected, or whatever menu item is currently focused. For example, updating a DOM element that represented the current line of the text editor would not cause a substantial performance hit.
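The “roaming tabindex” technique mentioned in comment 12 can be sketched as follows. This is only an illustration under stated assumptions, not code from Bespin or YUI: exactly one item in a group carries `tabindex="0"`, every other item carries `tabindex="-1"`, and key presses move the `0` around. The function names, the set of handled keys, and the wrap-around behaviour at the ends are choices made for this sketch.

```typescript
// Keys this sketch responds to (an assumption; real widgets may differ).
type NavKey = "ArrowLeft" | "ArrowRight" | "Home" | "End";

// Compute which item should become the single tabbable one after a key
// press. Wrapping around at either end is a design choice of this sketch.
function nextActiveIndex(current: number, count: number, key: NavKey): number {
  if (key === "ArrowRight") {
    return (current + 1) % count;
  }
  if (key === "ArrowLeft") {
    return (current - 1 + count) % count;
  }
  if (key === "Home") {
    return 0;
  }
  return count - 1; // "End"
}

// Compute the tabindex attribute value for every item in the group:
// "0" for the active item, "-1" for all the others.
function tabindexValues(count: number, active: number): string[] {
  const values: string[] = [];
  for (let i = 0; i < count; i++) {
    values.push(i === active ? "0" : "-1");
  }
  return values;
}
```

In a page, the values from `tabindexValues` would be applied with `setAttribute("tabindex", ...)` and the newly active element would receive `focus()`, so the Tab key always lands on one element per group while the arrow keys move within it.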

  13. James Craig Says:

    You wrote:
    “What about elements that aren’t part of the visual structure at all? For example, in Cappuccino, every application has a CPApplication object. This object contains information that would undoubtedly be useful to the accessibility system, but because it’s an abstract object and isn’t part of the render tree in any way, it will never be visible to WAI-ARIA compliant browsers.”

    If it’s never perceivable to WAI-ARIA compliant browsers, how is it perceivable to sighted users?

    Yes, JavaScript can have arbitrary data structures, but users don’t have any direct access to any of that data and, with the exception of a few specialized dialogs (alert, confirm, prompt), JavaScript cannot display anything to a user without inserting it into the DOM. Accessibility is about the user, not about the API.

  14. James Craig Says:

    You wrote:
    “This leads us to the second problem, not having programmatic access to the accessibility system. This spec is designed to target web applications, but it has taken the same document based approach as HTML. Applications are defined by requiring a programming language, which in the case of the browser is usually JavaScript. It seems short sighted to develop a standard for applications that uses a fundamentally different technology than the application does.”

    Although most ARIA controls require interaction with JavaScript, many of the ARIA roles are static semantic roles that require no JavaScript, for example, the landmark roles can provide additional navigation for user agents and assistive technology.

    Web applications are always defined by the DOM, because you can’t have a script that executes outside the context of a document. Likewise SVG, or any other implementing host language will have a DOM and may or may not be accessed via JavaScript, another programming language, or directly by the user agent or assistive technology. The DOM is still the lowest common denominator.

  15. James Craig Says:

    You wrote:
    “Even beyond the fact that elements in your program that don’t exist in the DOM tree can’t become part of the accessibility system, not having a programmatic interface means that you can’t dynamically compute accessibility values. …snip… Lack of an actual API also limits the potential uses of ARIA. I can’t, for example, implement my own accessibility tool within the browser. It also causes issues with the event system, and doesn’t add any enhanced functionality for simulating events using accessibility APIs, which could have enabled a lot of advanced automated testing.”

    I don’t agree with any of those statements, but if you have specific concerns about the draft and send them in during the last call response period, the working group is required to address the issues formally. We have several issues deferred until a later version of ARIA, so they may already be addressed, but all constructive feedback is welcome.

  16. Max Design - standards based web design, development and training » Some links for light reading (3/3/09) Says:

    [...] Limitations of the WAI-ARIA [...]
