Reflecting back on my career, I realised that this year marks a personal milestone for me. I have been working with HTML for 25 years now. I have seen the markup language significantly evolve and mature over that time.
HTML has become a semantic language that is more than just scaffolding for our page; it has become increasingly descriptive, providing meaning and context to our content.
Today is Blue Beanie Day, an annual celebration of web standards. I thought it would be a good opportunity to take a look at the benefits of writing semantic HTML and briefly examine the state of today's web.
In the beginning
In the early days of the Web (we're talking early 90s here), HTML was a relatively simple language. It existed to allow us to structure basic text documents through headings, paragraphs, lists and links. The <img> element existed, but embedding images in a page wasn't a thing until the Mosaic browser was released in 1993.
HTML provided a number of now obsolete elements and attributes that we could style content with. Developers would use tags like <font> and the notorious <marquee>; the latter was non-standard and browser-specific, creating cross-browser issues in addition to the accessibility problems these elements posed.
Gone are the elements that were introduced specifically for styling; in their place came elements like <nav>, which are more descriptive of their content than a plain old <div>.
According to the Oxford English Dictionary, semantics is:
the meaning of words, phrases, or systems
For HTML, that means that the tags we use give meaning to the content they wrap. This helps contribute to the end users' understanding of the content beyond just styling.
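As a rough illustration of that difference, here is the same set of links marked up twice. The class name and link targets are invented for the example.

```html
<!-- Generic markup: the class name hints at the purpose,
     but only to someone reading the source -->
<div class="nav">
  <a href="/">Home</a>
  <a href="/blog">Blog</a>
</div>

<!-- Semantic markup: the element itself says
     "this is a navigation region" -->
<nav>
  <ul>
    <li><a href="/">Home</a></li>
    <li><a href="/blog">Blog</a></li>
  </ul>
</nav>
```

Both versions can be styled to look identical; only the second tells browsers and assistive technologies what the links are for.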
The benefits of semantics
We've established that HTML is semantic, but so what? Visually we can produce great looking websites using a limited number of elements and CSS. We can surely just use a <div> rather than one of the Content Sectioning elements, or a <span> instead of an Inline Text one.
Looks aren't everything!
Writing semantic HTML provides a good foundation for building accessible content.
Using content sectioning elements, for example <nav>, provides a means for those with visual impairments to better understand the layout of a page. Inline text semantics like <strong> allow screen readers to change the tone of what is being read out.
Navigation can also be improved. For example, many users of assistive technologies will use headings as a way of quickly navigating a page.
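A minimal sketch of what a navigable heading structure might look like. The heading text and the warning sentence are placeholders, not taken from a real page.

```html
<main>
  <h1>Semantic HTML</h1>

  <h2>The benefits of semantics</h2>

  <h3>Accessibility</h3>
  <!-- <strong> marks importance, not just bold styling -->
  <p><strong>Warning:</strong> avoid skipping heading levels.</p>

  <h3>Search engines</h3>
  <p>Crawlers can read the same outline.</p>

  <h2>State of the web</h2>
  <p>Each level nests under the one above it.</p>
</main>
```

Because the levels descend without gaps, someone jumping from heading to heading gets something close to a table of contents for the page.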
Where there are no appropriate HTML elements to describe things, we can use ARIA. It provides a set of roles and attributes to enhance the user experience for those using assistive technologies. However, it needs to be used with caution. WebAIM's survey of over one million websites found that homepages using ARIA had significantly more errors than those without.
Developers need to remember the first rule of ARIA:
If you can use a native HTML element or attribute with the semantics and behavior you require already built in, instead of re-purposing an element and adding an ARIA role, state or property to make it accessible, then do so.
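To make the rule concrete, here is a sketch comparing a re-purposed element with the native one. The save() function is hypothetical and stands in for whatever the control should do.

```html
<!-- Re-purposed element: needs a role and a tabindex, and still
     will not respond to Enter or Space without extra scripting -->
<div role="button" tabindex="0" onclick="save()">Save</div>

<!-- Native element: focusable, keyboard operable and
     exposed as a button by default -->
<button type="button" onclick="save()">Save</button>
```

The native version is shorter and has less to get wrong, which is the point of the rule.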
Improving the accessibility of a website often has a positive impact on the general user experience too. A good example is using the appropriate interactive elements for forms and navigation.
Filling in forms online can be a tiresome process. However, a well constructed form can make the user experience much better. Associating input fields with their labels ensures larger clickable areas, which is particularly useful for radio buttons and checkboxes; and using the semantically correct elements ensures users have a choice of how they interact with the form, whether that's touch, mouse or keyboard.
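A small sketch of a form built this way. The field names, the action URL and the wording are assumptions made for the example.

```html
<form action="/subscribe" method="post">
  <!-- for/id pairing: clicking the label focuses the input -->
  <label for="email">Email address</label>
  <input type="email" id="email" name="email" required>

  <!-- fieldset/legend give the group of radio buttons a name -->
  <fieldset>
    <legend>How often should we email you?</legend>
    <!-- Wrapping an input in its label also associates the two,
         so the whole line becomes clickable -->
    <label>
      <input type="radio" name="frequency" value="weekly" checked>
      Weekly
    </label>
    <label>
      <input type="radio" name="frequency" value="monthly">
      Monthly
    </label>
  </fieldset>

  <button type="submit">Subscribe</button>
</form>
```

Because every control here is a native element, the form can be completed by touch, mouse or keyboard without any scripting.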
Much as they help those using assistive technologies, Content Sectioning elements give the content an understandable structure that can help with search engine indexing and page ranking.
Search engine crawlers are essentially 'blind users', so a good use of semantic HTML helps them understand what they can't see from the styled visuals. For example, good heading structures are very helpful for SEO.
Similar to how accessibility has ARIA, for SEO we have Schema.org. Often shortened to just Schema, it provides a semantic vocabulary of microdata that you can add to your HTML. This can help search engines better understand what the content on a page describes. For example, we could be describing a movie with information on the title, director and genre.
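Here is a sketch of how that movie might be marked up with microdata. The film title, director and genre are placeholders; the Movie and Person types and the name, director and genre properties come from the Schema.org vocabulary.

```html
<div itemscope itemtype="https://schema.org/Movie">
  <h1 itemprop="name">Example Movie</h1>

  <p>
    Directed by
    <!-- A nested item: the director is a Person with its own name -->
    <span itemprop="director" itemscope itemtype="https://schema.org/Person">
      <span itemprop="name">Jane Doe</span>
    </span>
  </p>

  <p>Genre: <span itemprop="genre">Science fiction</span></p>
</div>
```

The visible content stays the same; the attributes add a machine-readable layer on top of it.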
My final benefit is that it leads to cleaner code.
Well-written code is self-explanatory. Using semantic elements helps make the code more readable and easier to maintain.
I remember early on in my career, when working with Drupal, templates were made up of so many <div> elements that they were described as having divitis. It could be frustrating working out where in the HTML code you were looking when one element type was so overused. In fairness, this was before HTML5 had introduced the many content sectioning elements now available to us. I would hope things are now better.
Like many developers, I would utilise comments in my markup to make it clearer what was going on. Where once I would comment what the code was doing, now I can use semantic elements that describe it. This leads to much more readable code.
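A before-and-after sketch of that shift. The class names and text are invented for the example and do not come from a real template.

```html
<!-- Before: comments explain what each generic block is -->
<div class="post"> <!-- blog post -->
  <div class="post-top"> <!-- post header -->
    <div class="title">Post title</div>
  </div>
  <div class="post-body">Post content goes here.</div>
  <div class="post-bottom">Written by the author</div> <!-- post footer -->
</div> <!-- end blog post -->

<!-- After: the elements describe themselves -->
<article>
  <header>
    <h2>Post title</h2>
  </header>
  <p>Post content goes here.</p>
  <footer>Written by the author</footer>
</article>
```

In the second version the closing tags alone tell you which block you are leaving.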
State of the web
Intrigued by how well adopted the semantic HTML elements are today, I've been analysing 200 major websites (I appreciate this is a very small sample size).
There are currently over 100 valid HTML5 elements. I found that on average the homepages I analysed used just 29. In fairness, some of the available elements in the HTML standard are obscure. However, when I started to look at which elements were used, it highlighted that many do not utilise what you'd expect to be common elements of an average homepage.
I would expect most homepages to consist of a header, main content area and footer; and in addition to these a primary navigation. Yet, the utilisation of the relevant Content Sectioning elements for these page components was far from universal.
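For reference, the kind of page skeleton I have in mind might look something like this. It is a minimal sketch with placeholder content, not markup taken from any of the sites analysed.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Example homepage</title>
</head>
<body>
  <header>
    <a href="/">Site name</a>
    <!-- Primary navigation -->
    <nav>
      <ul>
        <li><a href="/about">About</a></li>
        <li><a href="/contact">Contact</a></li>
      </ul>
    </nav>
  </header>

  <!-- The main content area of the page -->
  <main>
    <h1>Welcome</h1>
    <p>Main content goes here.</p>
  </main>

  <footer>
    <p>Footer content goes here.</p>
  </footer>
</body>
</html>
```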
Just over half the homepages I analysed used the <main> element (~54%). Much better was the use of the <footer> elements at 70% and 79% respectively. These numbers still fall short of what I'd expect considering these elements have been widely supported by browsers for many years now.
I plan on exploring these findings further in a later blog post.
HTML has come a long way since I first started working with it in the 90s. Today, it is a semantic language that provides meaning and context to web content.
Well-written semantic HTML provides many benefits, including more accessible content and a better user experience. It also helps with SEO and leads to cleaner code.
Despite the benefits, some basic analysis of 200 major homepages indicates that there's some way to go for the power of semantic HTML to be fully unlocked.