Tangled web

A List Apart: Articles: How to Read W3C Specs

"BNF stands for Backus Naur Form, or Backus Normal Form. It’s a compact way to represent the grammar of computer languages, and it’s been around, well, forever. Different specifications use different flavors of BNF, but they all translate long English descriptions into symbolic form. Take this example of what constitutes a sandwich:

A sandwich consists of a lower slice of bread, mustard or mayonnaise; optional lettuce, an optional slice of tomato; two to four slices of either bologna, salami, or ham (in any combination); one or more slices of cheese, and a top slice of bread.

This translates to:

sandwich ::=
lower_slice
[ mustard | mayonnaise ]
lettuce? tomato?
[ bologna | salami | ham ] {2,4}
cheese+
top_slice

The constituent parts of a definition are listed in order, separated by blanks. Items are grouped with square brackets, and choices within a group are separated by a vertical bar.

If an item is followed by a question mark, that means “one or none”; if followed by a plus sign, that means “one or more”; if followed by an asterisk, that means “zero or more”; and if followed by numbers inside braces, those numbers give the lower and upper limits for how many times the item can occur.

Parentheses, or more square brackets, are used to group items in more complex definitions. Sometimes a generic item (like a “color”) is enclosed in < and >, or fixed items will be enclosed in quote marks."
-- posted by Palema on 12/31/2006
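
One way to get a feel for the sandwich grammar quoted above is to transcribe it into a regular expression, since the ?, +, and {2,4} quantifiers carry over almost unchanged. The following is only a rough sketch: it assumes each ingredient is a single word with no spaces, and the function name is_sandwich is made up for illustration.

```python
import re

# Sketch only: each ingredient is assumed to be one space-free token,
# and the ingredient list is joined with single spaces before matching.
SANDWICH = re.compile(
    r"lower_slice "
    r"(?:mustard|mayonnaise) "            # [ mustard | mayonnaise ]
    r"(?:lettuce )?"                      # lettuce?
    r"(?:tomato )?"                       # tomato?
    r"(?:(?:bologna|salami|ham) ){2,4}"   # [ bologna | salami | ham ] {2,4}
    r"(?:cheese )+"                       # cheese+
    r"top_slice"
)

def is_sandwich(ingredients):
    """Return True if the ingredients, in order, match the grammar."""
    return SANDWICH.fullmatch(" ".join(ingredients)) is not None
```

For example, is_sandwich(["lower_slice", "mustard", "ham", "salami", "cheese", "top_slice"]) is accepted, while the same list with only one slice of meat is rejected because {2,4} requires at least two.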

A List Apart: Articles: How to Read W3C Specs

"You know those <!DOCTYPE ...> declarations that you put in your documents to tell the browser which version of HTML or XHTML you're using? Those declarations refer to a Document Type Definition, or DTD, which defines which combinations of elements are legal in a document.

While learning to read a DTD is difficult, it's not an impossible task. And it's worth learning, because the DTD is the ultimate authority for what is and is not syntactically correct for a particular markup language.

A full explanation of how to read a DTD is well beyond the scope of this article, but it can be found in Elizabeth Castro’s XML for the World Wide Web Visual Quickstart Guide, or in Erik Ray’s Learning XML. Here's a brief example of something you might see in a DTD:

<!ENTITY % fontstyle '(tt | i | b)'>
<!ENTITY % inline '(#PCDATA | %fontstyle;)'>
<!ELEMENT div (p | %inline;)+>
<!ATTLIST div align (left | right | center) #IMPLIED>

And here’s what it means in English:

The font style elements are <tt>, <i>, and <b>. Inline elements consist of text or font style elements. A <div> can contain one or more <p> or inline elements in any order. A <div> has an optional align attribute with values of left, right, or center."
-- posted by Palema on 12/31/2006
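
To make the English reading of the DTD fragment above concrete, here is a hand-written check of the same rules for a single <div>. It is not a DTD validator, just a sketch under some assumptions: the input is well-formed XML, only the direct children of the <div> are inspected, whitespace-only text is not counted as content, and the names div_is_valid, FONTSTYLE, ALLOWED_CHILDREN, and ALIGN_VALUES are invented for this example.

```python
import xml.etree.ElementTree as ET

FONTSTYLE = {"tt", "i", "b"}              # %fontstyle;
ALLOWED_CHILDREN = {"p"} | FONTSTYLE      # (p | %inline;); text is handled below
ALIGN_VALUES = {"left", "right", "center"}

def div_is_valid(xml_text):
    """Check one <div> against the rules read off the DTD fragment."""
    div = ET.fromstring(xml_text)
    if div.tag != "div":
        return False
    # align is #IMPLIED (optional); if present it must be an enumerated value.
    align = div.get("align")
    if align is not None and align not in ALIGN_VALUES:
        return False
    # No attribute other than align is declared for div in the fragment.
    if set(div.attrib) - {"align"}:
        return False
    children = list(div)
    if any(child.tag not in ALLOWED_CHILDREN for child in children):
        return False
    # The trailing "+" means at least one item: a child element or some text.
    # Assumption: whitespace-only text does not count.
    has_text = bool((div.text or "").strip()) or any(
        (child.tail or "").strip() for child in children
    )
    return bool(children) or has_text
```

For example, div_is_valid('<div align="left">text <b>bold</b><p>para</p></div>') passes, while a <div> with align="justify" or with a <span> child fails.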