Syntax highlighting: Usage of lang attribute on code blocks
If we have a JavaScript code block like:
console.log('1')
it will be rendered as:
<pre data-canonical-lang="js" ... lang="javascript">
<code>
<span ... lang="javascript">...</span>
</code>
</pre>
Note: This also seems to affect our syntax highlighting on Blob Views.
Google Chrome Lighthouse warns against using the lang attribute in this manner. The lang attribute seems to be meant mainly for marking which "natural" language (e.g. en or de) a certain part of the page is in:
"The purpose of these requirements is primarily to allow assistive technologies such as screen readers to invoke the correct pronunciation." From: https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/lang#accessibility
The question is whether this trips up accessibility tooling, and whether we should reconsider the usage of the lang attribute in favor of some data attributes.
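To make the data-attribute option concrete, here is a minimal sketch of what the output could look like. The helper names renderCodeBlock and escapeHtml and the data-code-lang attribute name are hypothetical and are not taken from our actual renderer; only data-canonical-lang appears in the current markup.

```javascript
// Hypothetical helper, for illustration only: escapes text for safe use
// in HTML element content and double-quoted attribute values.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical renderer: carries the programming language in a data
// attribute (assumed name: data-code-lang) instead of the lang attribute,
// so lang stays reserved for natural languages such as en or de.
function renderCodeBlock(code, canonicalLang, langName) {
  return (
    `<pre data-canonical-lang="${escapeHtml(canonicalLang)}" ` +
    `data-code-lang="${escapeHtml(langName)}">` +
    `<code>${escapeHtml(code)}</code></pre>`
  );
}

console.log(renderCodeBlock("console.log('1')", "js", "javascript"));
```

Any CSS or JavaScript that currently selects on the lang attribute (for example the highlighting on Blob Views) would need to switch to the new data attribute, so this is a sketch of the target markup rather than a drop-in change.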
I raised this in Slack with @digitalmoksha. He also shared this SO entry: https://stackoverflow.com/questions/5134242/semantics-standards-and-using-the-lang-attribute-for-source-code-in-markup