(I'm not sure if we already talked about this some years back.)
What is one to make of a UTF-8 encoded bitstream, for example? Or a UTF-16
bitstream, or whatever. What does it mean?
In the OpenURL work we deliberately introduced the notion of a "Format" as a
triple:
"A Format is a method to represent information constructs as character
strings.
Each Format consists of a Serialization, a Constraint Language, and a
Constraint Definition expressed using the Constraint Language. In this
Standard, the set of three items defining a Format is called a triple
and is represented by a short-hand notation as in:
{ Serialization, Constraint Language, Constraint Definition }
"
We also had character encoding as an orthogonal construct. And both formats
and encodings were registered items.
So, the emphasis here was very much on validation at the semantic level.
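To make the separation concrete, here is a rough sketch in Python of a Format triple with the character encoding carried alongside it rather than inside it. The class names, field names and example values ("XML", "XML Schema", "example.xsd") are placeholders of mine, not entries from the OpenURL registry or anything defined in the standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Format:
    """The triple from the quoted definition (field names are assumed)."""
    serialization: str
    constraint_language: str
    constraint_definition: str

    def shorthand(self) -> str:
        # Mirrors the short-hand notation quoted above.
        parts = (self.serialization,
                 self.constraint_language,
                 self.constraint_definition)
        return "{ " + ", ".join(parts) + " }"


@dataclass(frozen=True)
class Representation:
    """A bitstream plus the two independent things needed to read it."""
    format: Format   # what the character string means and how to validate it
    encoding: str    # how the characters were turned into bytes
    payload: bytes

    def text(self) -> str:
        # Undo the encoding to recover the character string the Format describes.
        return self.payload.decode(self.encoding)


# Placeholder values, for illustration only.
fmt = Format("XML", "XML Schema", "example.xsd")
as_utf8 = Representation(fmt, "utf-8", "café".encode("utf-8"))
as_utf16 = Representation(fmt, "utf-16", "café".encode("utf-16"))
```

The two representations differ byte for byte, yet they share the same Format and decode to the same character string, which is the sense in which the encoding is orthogonal.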
I wonder if the notion of HVTE is focussed too much on the physical
representation of an arbitrary bitstream, rather than the semantics of the
carrier serialization and of the values that might be carried. Encoding
seems to be the least interesting facet.
On 13/5/08 16:40, "Christophe Blanchi" <cblanchi@cnri.reston.va.us> wrote: