Sunday, May 17, 2009

HTML name and id tokens

The HTML 4.01 spec says:

ID and NAME tokens must begin with a letter ([A-Za-z]) and may be followed by any number of letters, digits ([0-9]), hyphens ("-"), underscores ("_"), colons (":"), and periods (".").
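
The rule quoted above maps directly onto a regular expression. Here is a minimal sketch in Python, assuming the quoted sentence is the whole rule; the names TOKEN_RE and is_valid_token are my own for illustration and do not come from the spec.

```python
import re

# The quoted rule: one letter, then any number of letters, digits,
# hyphens, underscores, colons, and periods.
TOKEN_RE = re.compile(r"[A-Za-z][A-Za-z0-9\-_:.]*")


def is_valid_token(value: str) -> bool:
    """Return True if value satisfies the quoted ID/NAME token rule."""
    # fullmatch requires the whole string to fit the pattern,
    # so an empty string or a trailing stray character is rejected.
    return TOKEN_RE.fullmatch(value) is not None
```

Under this reading, is_valid_token("section-1.2") holds, while is_valid_token("1section") does not because the first character is a digit.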

But it also says:

Use id or name? Authors should consider the following issues when deciding whether to use id or name for an anchor name:

  • The id attribute can act as more than just an anchor name (e.g., style sheet selector, processing identifier, etc.).
  • Some older user agents don't support anchors created with the id attribute.
  • The name attribute allows richer anchor names (with entities).

It is strange that name attributes are supposed to allow richer anchor names (with entities) when none of the characters necessary for specifying entities, other than letters and digits, is allowed in name tokens: the ampersand (&), without which one cannot specify an entity, is not allowed in a name token, nor is the semicolon (;), nor the hash/pound sign (#).
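
To make the conflict concrete, the sketch below lists which characters of a candidate name fall outside the quoted token rule. It is only an illustration under the assumption that the quoted rule applies to the name as written in the markup; the helper name disallowed_chars is hypothetical.

```python
import string

# Characters permitted by the quoted rule.
ALLOWED_FIRST = set(string.ascii_letters)
ALLOWED_REST = ALLOWED_FIRST | set(string.digits) | set("-_:.")


def disallowed_chars(value: str) -> list[str]:
    """List the characters in value that the quoted rule does not permit,
    in order of first appearance, without repeats."""
    bad = []
    for index, ch in enumerate(value):
        # The first character must be a letter; later ones have more options.
        allowed = ALLOWED_FIRST if index == 0 else ALLOWED_REST
        if ch not in allowed and ch not in bad:
            bad.append(ch)
    return bad


print(disallowed_chars("caf&eacute;"))  # ['&', ';']
print(disallowed_chars("caf&#233;"))    # ['&', '#', ';']
print(disallowed_chars("cafe"))         # []
```

Both the named and the numeric form of an entity reference fail, and they fail on exactly the characters that make them entity references.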

Some browsers, but not all, accept many characters other than those specified, including a non-letter initial character. No doubt the browsers that ignore anchors with disallowed characters in their name tokens are attempting to conform strictly to the specification.
