Hi Dan:

Thanks for sharing that. On a quick glance I don’t see any substantial differences, apart from its being based on 2396 and being rather more expository than an RFC needs to be.

As for “//” - that needs to be determined. A job for the technology authors, I would imagine. I guess there could be politics there as regards network authorities. (No getting away either from the IESG notes attached to the RFCs.)

> My read of 2396 was that query and fragment were part of the URI scheme, and you couldn't get away without them, but don't know if this has changed in 3986.

I don’t understand. Of course, they are an integral part of the URI. I included them explicitly in the BNF. The only point I made was that the spec should not seek to define a querystring syntax. That should be left to handle applications. Same thing with format of fragment identifiers.



On 14/2/08 16:07, "Daniel Rehak" <daniel.rehak@gmail.com> wrote:

My two top issues would also be hdl: vs. hdl:// and the query string.

Here's a draft that I put together 3 years ago based on 2396, not 3986.

My read of 2396 was that query and fragment were part of the URI scheme, and you couldn't get away without them, but don't know if this has changed in 3986.

    - Dan

On Thu, Feb 14, 2008 at 7:44 AM, Hammond, Tony <t.hammond@nature.com> wrote:

Following up yesterday's post on URI scheme, here's one cut at a formal BNF.
(Needs to go under the microscope - and crib sheet attached at end with
productions from RFC 3986 and hex table for reference.)

Here are the two BNFs side-by-side for a) handle string and b) handle URI.
(The handle string is the same as the current RFC bar capitalization and a
couple of punctuation changes.)

The big thing - which I quite forgot yesterday - is the "//" question.
Should a handle URI have an explicit network authority (essentially 3986
<reg-name>)? I guess strictly it should, but would point out that for DOI we
have been in the custom of citing without any authority component, i.e. as
an opaque string. Are the two practices mutually compatible? So e.g. would
it be conceivable to see




Is that feasible? With DOI being treated as a name alone, while the handle
is treated as a network protocol with retrieval actions? Or should handle be
opaque too? Or did we get it wrong with DOI?

The minor thing is the appearance of the hierarchical delimiter char "/"
within a handle string. Of course, this plays a hierarchical role up front,
but what about elsewhere in the string? According to 3986 if it's not
hierarchical it should be %-encoded. In the "info" spec we worded it thus:

" The "info"
   URI scheme is supportive of hierarchical processing as indicated by
   the presence of the slash "/" character, although the slash "/"
   character SHOULD NOT be interpreted as a strict hierarchy delimiter."

How does that look? I am still very much convinced that querystring and
fragments are application-level constructs and should not be mandated in a
protocol document.



       ;; ABNF for a Handle string

       handle             = naming-authority "/" local-name

       naming-authority   = *(naming-authority  ".") na-segment

       na-segment         = 1*(%x00-2D / %x30-3F / %x41-FF )
                            ; any octets that map to UTF-8 encoded
                            ; Unicode 2.0 characters except
                            ; octets "%x2E" and "%x2F" (which
                            ; correspond to the ASCII characters "."
                            ; and "/")

       local-name         = *(%x00-FF)
                            ; any octets that map to UTF-8 encoded
                            ; Unicode 2.0 characters
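
To make the handle-string grammar concrete, here is a rough Python sketch of how a handle might be split into its naming-authority segments and local name. The function name split_handle is hypothetical. It assumes the split happens at the first "/", and it follows the comments above in rejecting only empty segments (the literal octet ranges for na-segment also leave out %x40, which this sketch does not enforce).

```python
from typing import List, Tuple


def split_handle(handle: bytes) -> Tuple[List[bytes], bytes]:
    """Split a handle string into (na-segments, local-name).

    Hypothetical helper: assumes the first "/" separates the naming
    authority from the local name, and that "." separates na-segments.
    """
    authority, sep, local = handle.partition(b"/")
    if not sep:
        raise ValueError("missing '/' between naming authority and local name")
    segments = authority.split(b".")
    # na-segment is 1*(...), so every segment needs at least one octet.
    if any(len(segment) == 0 for segment in segments):
        raise ValueError("each na-segment needs at least one octet")
    # local-name is *(%x00-FF), so it may be empty or contain further "/".
    return segments, local
```

For example, split_handle(b"10.1000/abc/def") gives the segments b"10" and b"1000" with local name b"abc/def".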

       ;; ABNF for a Handle URI

       hdl-uri            = "hdl" ":" "//" u-handle [ "?" query ]
                                [ "#" fragment ]

       ; All productions not starting "u-" are from RFC 3986

       u-handle           = u-naming-authority "/" u-local-name

       u-naming-authority = *(u-naming-authority  ".") u-na-segment

       u-na-segment       = *( u-unreserved / pct-encoded / sub-delims )
                             ; native handle chars for <na-segment> other
                             ; than characters in the range
                             ; (%x21 / %x24 / %x26-2D / %x30-39 / %x3B
                             ;  / %x3D / %x41-5A / %x5F /%x61-7A / %x7E)
                             ; which must be %-encoded

       u-local-name       = *( pchar / "/" )
                             ; native handle chars for <local-name> other
                             ; than characters in the range
                             ; (%x21 / %x24 / %x26-2F / %x30-39 / %x3B
                             ;  / %x3D / %x41-5A / %x5F /%x61-7A / %x7E)
                             ; which must be %-encoded

       u-unreserved       = ALPHA / DIGIT / "-" / "_" / "~"
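
As a sanity check on the mapping from handle string to handle URI, here is a minimal Python sketch that %-encodes a handle using the octet ranges given in the comments above. The name handle_to_uri is hypothetical, the "hdl://" prefix simply follows the draft production (the "//" question is still open), and the sketch follows the comment ranges rather than <pchar>, so ":" and "@" in a local name are escaped.

```python
from urllib.parse import quote

# sub-delims from RFC 3986. Letters, digits, "-", "_", "~" and "." are
# never escaped by quote(); "." cannot occur inside a segment because
# the naming authority is split on it first.
NA_SAFE = "!$&'()*+,;="
# The local name additionally keeps "/" unescaped.
LOCAL_SAFE = NA_SAFE + "/"


def handle_to_uri(handle: str) -> str:
    """Map a handle string to the draft hdl: URI form (illustrative only)."""
    authority, sep, local = handle.partition("/")
    if not sep or not authority:
        raise ValueError("expected naming-authority '/' local-name")
    segments = authority.split(".")
    if any(segment == "" for segment in segments):
        raise ValueError("empty naming-authority segment")
    # quote() encodes the text as UTF-8 and escapes every octet that is
    # outside the permitted set as %XX.
    encoded_authority = ".".join(quote(s, safe=NA_SAFE) for s in segments)
    return "hdl://" + encoded_authority + "/" + quote(local, safe=LOCAL_SAFE)
```

With this, handle_to_uri("10.1000/abc def") yields "hdl://10.1000/abc%20def", and any "?" or "#" inside the handle itself is escaped, so it cannot be mistaken for the start of a query or fragment.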


