[Handle-info] D'oh! (Take Two)
Hi:
Following up on yesterday's post about the URI scheme, here's one cut at a
formal BNF. (It needs to go under the microscope. A crib sheet is attached at
the end, with productions from RFC 3986 and a hex table for reference.)
Here are the two BNFs side-by-side for a) handle string and b) handle URI.
(The handle string is the same as in the current RFC, bar capitalization and
a couple of punctuation changes.)
The big thing - which I quite forgot yesterday - is the "//" question.
Should a handle URI have an explicit network authority (essentially 3986
<reg-name>)? I guess strictly it should, but would point out that for DOI we
have been in the custom of citing without any authority component, i.e. as
an opaque string. Are the two practices mutually compatible? So e.g. would
it be conceivable to see
hdl://10.1038/nature05509
alongside
doi:10.1038/nature05509
Is that feasible? With DOI being treated as a name alone, while the handle
is treated as a network protocol with retrieval actions? Or should handle be
opaque too? Or did we get it wrong with DOI?
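To make the "//" question concrete, here is a minimal sketch of how a generic
RFC 3986 style parser would split the two forms. It uses Python's standard
urllib.parse.urlsplit; the helper name split_parts is made up for this
illustration, and nothing here is part of any handle or DOI software.

```python
from urllib.parse import urlsplit


def split_parts(text):
    """Return (scheme, authority, path) as a generic URI parser sees them."""
    parts = urlsplit(text)
    return parts.scheme, parts.netloc, parts.path


for text in ("hdl://10.1038/nature05509", "doi:10.1038/nature05509"):
    scheme, authority, path = split_parts(text)
    print(f"{text}: scheme={scheme!r} authority={authority!r} path={path!r}")
```

With "//", the naming authority "10.1038" lands in the authority slot and only
"/nature05509" is left as the path. Without it, the authority is empty and the
whole "10.1038/nature05509" is the path, i.e. an opaque name.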
The minor thing is the appearance of the hierarchical delimiter char "/"
within a handle string. Of course, this plays a hierarchical role up front,
but what about elsewhere in the string? According to 3986 if it's not
hierarchical it should be %-encoded. In the "info" spec we worded it thus:
    "The "info" URI scheme is supportive of hierarchical processing as
    indicated by the presence of the slash "/" character, although the
    slash "/" character SHOULD NOT be interpreted as a strict hierarchy
    delimiter."
How does that look? I am still very much convinced that query strings and
fragments are application-level constructs and should not be mandated in a
protocol document.
Tony
###
;; ABNF for a Handle string
handle = naming-authority "/" local-name
naming-authority = *(naming-authority ".") na-segment
na-segment = 1*(%x00-2D / %x30-3F / %x41-FF )
; any octets that map to UTF-8 encoded
; Unicode 2.0 characters except
; octets "%x2E" and "%x2F" (which
; correspond to the ASCII characters "."
; and "/")
local-name = *(%x00-FF)
; any octets that map to UTF-8 encoded
; Unicode 2.0 characters
;; ABNF for a Handle URI
hdl-uri = "hdl" ":" "//" u-handle [ "?" query ]
[ "#" fragment ]
; All productions not starting "u-" are from RFC 3986
u-handle = u-naming-authority "/" u-local-name
u-naming-authority = *(u-naming-authority ".") u-na-segment
u-na-segment = *( u-unreserved / pct-encoded / sub-delims )
; native handle chars for <na-segment> other
; than characters in the range
; (%x21 / %x24 / %x26-2D / %x30-39 / %x3B
; / %x3D / %x41-5A / %x5F /%x61-7A / %x7E)
; which must be %-encoded
u-local-name = *( pchar / "/" )
; native handle chars for <local-name> other
; than characters in the range
; (%x21 / %x24 / %x26-2F / %x30-39 / %x3B
; / %x3D / %x41-5A / %x5F /%x61-7A / %x7E)
; which must be %-encoded
u-unreserved = ALPHA / DIGIT / "-" / "_" / "~"
###
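As a rough sketch of the mapping the two ABNFs imply, the following converts a
handle string to the "hdl://" form and back. The function names handle_to_uri
and uri_to_handle are hypothetical, and several choices are assumptions: the
handle is taken as already-decoded text that gets UTF-8 encoded, naming
authority segments are required to be non-empty (as in <na-segment>), and
query and fragment parts are ignored.

```python
from urllib.parse import quote, unquote

SUB_DELIMS = "!$&'()*+,;="  # sub-delims from RFC 3986


def handle_to_uri(handle: str) -> str:
    """Percent-encode a handle string into the draft hdl:// URI form."""
    # <na-segment> cannot contain "/", so the first "/" ends the authority.
    naming_authority, sep, local_name = handle.partition("/")
    if not sep:
        raise ValueError("handle needs '/' between naming authority and local name")
    segments = naming_authority.split(".")
    if any(segment == "" for segment in segments):
        raise ValueError("naming-authority segments must be non-empty")
    # u-na-segment: u-unreserved and sub-delims stay literal, the rest is %-encoded.
    u_na = ".".join(quote(segment, safe=SUB_DELIMS) for segment in segments)
    # u-local-name: pchar and "/" stay literal, the rest is %-encoded.
    u_local = quote(local_name, safe=SUB_DELIMS + ":@/")
    return f"hdl://{u_na}/{u_local}"


def uri_to_handle(uri: str) -> str:
    """Reverse of handle_to_uri for URIs without query or fragment."""
    prefix = "hdl://"
    if not uri.startswith(prefix):
        raise ValueError("not an hdl:// URI")
    return unquote(uri[len(prefix):])


print(handle_to_uri("10.1038/nature05509"))
print(handle_to_uri("10.1000/a b#c?d"))
```

Note that "/" inside the local name is left unencoded here because
<u-local-name> allows it; whether it should instead be %-encoded when it is
not hierarchical is exactly the open question above.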
###
Crib Sheet
00 nul  01 soh  02 stx  03 etx  04 eot  05 enq  06 ack  07 bel
08 bs   09 ht   0a nl   0b vt   0c np   0d cr   0e so   0f si
10 dle  11 dc1  12 dc2  13 dc3  14 dc4  15 nak  16 syn  17 etb
18 can  19 em   1a sub  1b esc  1c fs   1d gs   1e rs   1f us
20 sp   21 !    22 "    23 #    24 $    25 %    26 &    27 '
28 (    29 )    2a *    2b +    2c ,    2d -    2e .    2f /
30 0    31 1    32 2    33 3    34 4    35 5    36 6    37 7
38 8    39 9    3a :    3b ;    3c <    3d =    3e >    3f ?
40 @    41 A    42 B    43 C    44 D    45 E    46 F    47 G
48 H    49 I    4a J    4b K    4c L    4d M    4e N    4f O
50 P    51 Q    52 R    53 S    54 T    55 U    56 V    57 W
58 X    59 Y    5a Z    5b [    5c \    5d ]    5e ^    5f _
60 `    61 a    62 b    63 c    64 d    65 e    66 f    67 g
68 h    69 i    6a j    6b k    6c l    6d m    6e n    6f o
70 p    71 q    72 r    73 s    74 t    75 u    76 v    77 w
78 x    79 y    7a z    7b {    7c |    7d }    7e ~    7f del
pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
query = *( pchar / "/" / "?" )
fragment = *( pchar / "/" / "?" )
pct-encoded = "%" HEXDIG HEXDIG
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
reserved = gen-delims / sub-delims
gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
/ "*" / "+" / "," / ";" / "="
###
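The hex ranges quoted in the ABNF comments can be cross-checked against the
productions in the crib sheet. The sketch below is only an aid for that check:
it builds the literal (non-%-encoded) character sets for <u-na-segment> and
<u-local-name> from the productions as written and prints them as ranges. The
names hex_ranges, NA_SEGMENT_LITERALS and LOCAL_NAME_LITERALS are invented for
this illustration.

```python
import string

ALPHA = set(string.ascii_letters)
DIGIT = set(string.digits)
SUB_DELIMS = set("!$&'()*+,;=")
U_UNRESERVED = ALPHA | DIGIT | set("-_~")   # unreserved without "."
UNRESERVED = U_UNRESERVED | {"."}
PCHAR_LITERALS = UNRESERVED | SUB_DELIMS | set(":@")

# Characters each production allows without %-encoding.
NA_SEGMENT_LITERALS = U_UNRESERVED | SUB_DELIMS
LOCAL_NAME_LITERALS = PCHAR_LITERALS | {"/"}


def hex_ranges(chars):
    """Format a set of ASCII characters as ABNF-style %x ranges."""
    runs = []
    for code in sorted(ord(c) for c in chars):
        if runs and code == runs[-1][1] + 1:
            runs[-1][1] = code          # extend the current run
        else:
            runs.append([code, code])   # start a new run
    return " / ".join(
        f"%x{lo:02X}" if lo == hi else f"%x{lo:02X}-{hi:02X}" for lo, hi in runs
    )


print("u-na-segment:", hex_ranges(NA_SEGMENT_LITERALS))
print("u-local-name:", hex_ranges(LOCAL_NAME_LITERALS))
```

For <u-na-segment> the result matches the range list in the comment. For
<u-local-name> the computed set also contains ":" (%x3A) and "@" (%x40), which
come in through pchar but are not in the comment's list, so one of the two
probably needs adjusting.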