[Handle-info] D'oh! Handle URI Scheme: It's That Easy
Hi All:
I just realized today that defining a native handle URI scheme is pretty
much trivial. I think some previous efforts (more in terms of defining a
"doi:" scheme it's true to say) got bogged down with application semantics.
Away with the application semantics, away with the problems. (Spring
cleaning comes early in this part of the world.)
RFC 3986 (Sect. 3) defines a URI thusly:
URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
(Note that, unlike in RFC 2396, the fragment is now part of the URI
production, and that RFC 2396 has no single URI production.)
Now, RFC 2616 (Sect. 3.2.2) defines an HTTP URL as
http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]
(Note here that 2616 references 2396 and so doesn't include a fragment
explicitly.)
So, it's really just that simple. :)
And from RFC 3651 (Sect. 2) here's the syntax for handle:
###
<Handle> = <NamingAuthority> "/" <LocalName>
<NamingAuthority> = *(<NamingAuthority> ".") <NAsegment>
<NAsegment> = 1*(%x00-2D / %x30-3F / %x41-FF )
; any octets that map to UTF-8 encoded
; Unicode 2.0 characters except
; octets '0x2E' and '0x2F' (which
; correspond to the ASCII characters '.',
; and '/').
<LocalName> = *(%x00-FF)
; any octets that map to UTF-8 encoded
; Unicode 2.0 characters
(Side note: In the above productions from 3651, the "<",">" chars are
unnecessary, and maybe hyphenated lowercase is anyway a better fit with ABNF
than camelcase.)
###
This can be used to build the handle URI spec as
hdl-uri = "hdl" ":" handle [ "?" query ] [ "#" fragment ]
Well, it's just the above productions with one important caveat: that
<NamingAuthority> and <LocalName> are based on RFC 3986 "pchar" productions,
i.e.
handle = naming-authority "/" local-name
naming-authority = 1*pchar
local-name = 1*(pchar / "/")
That is, the handle characters must be %-encoded and that <naming-authority>
cannot contain a "/" character.
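As a rough illustration of how little is needed, here is a minimal Python sketch of the hdl-uri productions above. The names parse_hdl_uri and to_hdl_uri are hypothetical (they are not part of any handle library), the pchar, query and fragment character sets are taken from RFC 3986, and the scheme is matched case-insensitively as RFC 3986 allows. Treat it as a sketch under those assumptions, not a reference implementation.

```python
import re
from urllib.parse import quote

# RFC 3986: pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
PCHAR = r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})"

# hdl-uri = "hdl" ":" handle [ "?" query ] [ "#" fragment ]
# handle  = naming-authority "/" local-name
HDL_URI = re.compile(
    r"(?i:hdl):"
    rf"(?P<naming_authority>{PCHAR}+)"            # 1*pchar, so no "/"
    r"/"
    rf"(?P<local_name>(?:{PCHAR}|/)+)"            # 1*(pchar / "/")
    rf"(?:\?(?P<query>(?:{PCHAR}|[/?])*))?"       # RFC 3986 query, left opaque
    rf"(?:#(?P<fragment>(?:{PCHAR}|[/?])*))?"     # RFC 3986 fragment, left opaque
)


def parse_hdl_uri(uri):
    """Return the parts of a handle URI as a dict, or None if it does not match."""
    match = HDL_URI.fullmatch(uri)
    return match.groupdict() if match else None


def to_hdl_uri(naming_authority, local_name):
    """Build a handle URI from raw handle parts, %-encoding where needed."""
    # safe="" also encodes "/" so it can never appear in the naming authority
    return "hdl:" + quote(naming_authority, safe="") + "/" + quote(local_name, safe="/")
```

For example, to_hdl_uri("1234", "a b#c") gives "hdl:1234/a%20b%23c", and parse_hdl_uri("hdl:1234") returns None because there is no "/" separating a local name.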
(One could be nicer about this and reflect the subnaming authority
structures by recognizing the role of "." as per the 3651 productions.
Essentially, this means modifying production for unreserved characters in
3986 to exclude the period, i.e. instead of
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
we would have
unreserved = ALPHA / DIGIT / "-" / "_" / "~"
And pchar builds on that. One could do this prettily, I'm sure, and maintain
alignment with 3986.)
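One way the "nicer" variant might look, sketched in Python: the naming authority becomes one or more segments separated by ".", where each segment uses pchar with the period removed from the unreserved set. The segment production and the name split_naming_authority are my own assumptions for illustration; neither 3651 nor 3986 spells this out in this form.

```python
import re

# pchar with "." removed from the unreserved set (assumed production)
NA_CHAR = r"(?:[A-Za-z0-9\-_~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})"
NA_SEGMENT = rf"{NA_CHAR}+"

# naming-authority = na-segment *( "." na-segment )
NAMING_AUTHORITY = re.compile(rf"{NA_SEGMENT}(?:\.{NA_SEGMENT})*")


def split_naming_authority(naming_authority):
    """Return the list of naming authority segments, or None if malformed."""
    if NAMING_AUTHORITY.fullmatch(naming_authority) is None:
        return None
    return naming_authority.split(".")
```

With this, split_naming_authority("10.1000") gives ["10", "1000"], while an empty segment as in "10..1000" is rejected.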
*** Now ***
Still with me? The important thing is that the querystring SHOULD NOT be
defined; indeed, it MUST NOT be defined. That's where application semantics
creeps in.
So instead of trying to define a syntax such as
hdl:1234/567?index=3&type=URL
Forget it. Out of scope. That's application stuff.
HTTP does not presume to direct users on how selectors (querystrings or
fragments) are composed. Nor should handle.
Benefits. Way simpler URI spec. Just a handful (3 or 4) of productions as
above. Allows localization of querystring parameters or fragment anchors.
Allows applications to define their own syntaxes:
hdl:1234/567?foo=3&bar=URL
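To make the division of labour concrete, here is a small Python sketch in which the scheme layer only splits the URI into its generic parts and hands the querystring on untouched. The function names are hypothetical, and the key=value reading of the query is just one application's choice (here via the standard library's parse_qs), not something the scheme would mandate.

```python
from urllib.parse import parse_qs


def split_hdl_uri(uri):
    """Scheme layer: split into (scheme, handle, query, fragment); no interpretation."""
    rest, _, fragment = uri.partition("#")   # fragment starts at the first "#"
    rest, _, query = rest.partition("?")     # query starts at the first "?"
    scheme, _, handle = rest.partition(":")
    return scheme, handle, query, fragment


def app_interpret_query(query):
    """Application layer (one possible convention): read the query as key=value pairs."""
    return parse_qs(query)
```

Both hdl:1234/567?index=3&type=URL and hdl:1234/567?foo=3&bar=URL split to the same handle, 1234/567; only the application-level reading of the query differs.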
Of course, a *canonical* querystring syntax could be defined and recommended
(but doesn't have to be). It could even be mentioned in the RFC as one
*example* of how to do it.
IMO there seems to be something flawed with attempting to hijack the
querystring syntax for exclusive purposes. It also removes any
futureproofing. And is anyway not required. So, why do it?
And then think about it. One para (with three or four productions) in an
appended subsection to "2. Handle System Namespace" and that's it. All done.
"hdl:"
Tony
_______________________________________________
Handle-Info mailing list
Handle-Info@cnri.reston.va.us
http://www.handle.net/mailman/listinfo/handle-info