[handle-dev] About right this time: Handle URI parameters
Larry:
Please ignore most of the previous message - I confused handle value and
"object" parameters - must have been thinking about the Harcourt sale :-)
[But not all of it should be ignored - URI is right up there with IP - and
in order to put handle "on the web" a URI encoding is a must have. Also
support for general user agents should probably be expedited over providing
support to browsers via plugins, Javascript, etc. (Users - who they?). Just
need to SWIG (or XS) that C client library. If scripting languages could
talk natively to handle and could further index into the handle...]
I guess I was a little confused with some of the terms used and also with
your original proposed syntax.
> hdl!handle param?returned value param#returned object param
This ordering does not conform to RFC 2396. The string that the resource
is to act on follows the "?" character, not the "!" character.
I would say that there are only two generic syntax cases from a _client_
perspective
1. hdl:1234/5678?
2. hdl:1234/5678#
Case 1. is for querying into the handle and is "a string of information to
be interpreted by the resource" (or handle server in this case). This
refines the handle return set. This is the interesting bit.
Case 2. is a string held by the client and is "additional reference
information to be interpreted by the user agent". This corresponds to your
"object parameter" although I don't accept this terminology. The only
question here is not syntax but what action a client takes. Presumably the
client would just add on this URI fragment to any URI (or URL) returned by
the handle server.
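
A minimal Python sketch of the Case 2 client behaviour described above: the
fragment never goes to the handle server, and the client tacks it back on to
whatever comes back. The function names and the example.org URL are
hypothetical, and the sketch does not decide what to do when a returned URI
already carries a fragment of its own.

```python
def split_fragment(hdl_uri):
    """Separate the client-held fragment from the part sent to the server."""
    request_part, _, fragment = hdl_uri.partition("#")
    return request_part, fragment


def reattach_fragment(returned_uris, fragment):
    """Append the held fragment to each URI returned by the resolver."""
    if not fragment:
        return list(returned_uris)
    return [f"{uri}#{fragment}" for uri in returned_uris]


request_part, fragment = split_fragment("hdl:1234/5678#p12")
# Assumed resolver result, for illustration only.
resolved = ["http://example.org/article"]
print(request_part)
print(reattach_fragment(resolved, fragment))
```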
So, Case 1. deals with both handle parameters and handle value parameters.
(Now I understand.) So handle parameters would be expressed as key/value
pairs, eg
hdl:1234/5678?index=3&type=URL&index=5
would return data values for indexes 3 and 5, and all data values of type
URL. An application would interpret these values and take appropriate
action. I don't think one can assume any ordering as a user agent may
simply parse the key/value pairs straight to a hash.
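
To make the point about ordering and repeated keys concrete, here is a rough
Python sketch. It assumes the "hdl:" scheme and the index/type keys from the
example above; the function names and the list-of-dicts shape for handle
values are invented for illustration and are not part of any handle client
library.

```python
from urllib.parse import parse_qs


def parse_handle_params(hdl_uri):
    """Return (handle, params) for a URI such as hdl:1234/5678?index=3&type=URL."""
    scheme, _, rest = hdl_uri.partition(":")
    if scheme.lower() != "hdl":
        raise ValueError("not an hdl: URI")
    rest, _, _fragment = rest.partition("#")  # fragment stays with the client
    handle, _, query = rest.partition("?")
    # parse_qs maps each key to a list, so a repeated key such as
    # index=3&index=5 is kept rather than overwritten.
    return handle, parse_qs(query)


def select_values(values, params):
    """Keep values matching any requested index or any requested type."""
    indexes = {int(i) for i in params.get("index", [])}
    types = set(params.get("type", []))
    if not indexes and not types:
        return list(values)
    return [v for v in values if v["index"] in indexes or v["type"] in types]


# Hypothetical handle values, for illustration only.
values = [
    {"index": 1, "type": "URL", "data": "http://example.org/a"},
    {"index": 3, "type": "EMAIL", "data": "someone@example.org"},
    {"index": 5, "type": "PUBKEY", "data": "..."},
    {"index": 7, "type": "DESC", "data": "not requested"},
]
handle, params = parse_handle_params("hdl:1234/5678?index=3&type=URL&index=5")
print(handle, [v["index"] for v in select_values(values, params)])
```

A plain hash would silently drop one of the two index keys, which is why the
sketch keeps a list per key and treats the pairs as an unordered, cumulative
selection.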
Now for handle value parameters. These are write-through parameters. (Not
the fragment identifiers I said they were earlier.) We only need to
separate them from the handle parameter key/value pairs and to ensure that
they are opaque to the URI - ie they are fully hex escaped, especially with
regard to ampersands. Suggest that one possible separator is the bang
character "!", as it's not reserved. One can then have something like
hdl:1234/5678?index=3!%20dsy%32dasyy%26&index=4!atet
Note that this does not support grouping but this may anyway be too
complex. This would also mean that
hdl:1234/5678?type=URL!tdtfi
would return all handle value URLs with the querystring "?tdtfi" appended.
(There is no way to pick out any specific URL other than by its index
value.)
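
A sketch of how a client might take the "!" form apart, in Python. The
function names are hypothetical. It also assumes the value parameter is
appended in its escaped form after a "?", as in the "?tdtfi" example above;
what to do when the returned URL already has a query component is left open
here.

```python
from urllib.parse import unquote


def parse_bang_query(query):
    """Split 'index=3!%20abc&type=URL' into (key, value, raw_param) triples."""
    selectors = []
    for pair in query.split("&"):
        if not pair:
            continue
        selector, _, raw_param = pair.partition("!")
        key, _, value = selector.partition("=")
        selectors.append((key, value, raw_param))
    return selectors


def append_value_param(url, raw_param):
    """Write the (still escaped) value parameter through onto a returned URL."""
    return f"{url}?{raw_param}" if raw_param else url


for key, value, raw in parse_bang_query(
    "index=3!%20dsy%32dasyy%26&index=4!atet"
):
    # unquote() shows what the opaque parameter decodes to.
    print(key, value, repr(unquote(raw)))
```

Because the ampersand inside the value parameter is escaped as %26, splitting
the query on "&" first is safe.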
An alternative solution - which could support grouping - would be to follow
the OpenURL-type packaging (now that's why OpenURL was mentioned, doh!
Sorry again Herbert!) and to employ a double hex encoding, where the handle
value parameter is fully hex escaped and the [...] groups below - the "[]"
are just for clarity - are further fully hex escaped (including the
ampersands),
hdl:1234/5678? [index=3&param=%20dsy%32dasyy%26] &
[index=4&param=atet&type=URL]
[...] indicates a hex escaped string.
This is more general at the risk of being more complex to implement. The
above syntax could allow for the second handle value parameter at index "4"
to be returned if and only if it is of type URL and with the querystring
"?atet" appended. (It is of course up to the application to do whatever
with the actual URI returned - resolve it, display it, etc.)
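
The double encoding can be sketched in Python as a round trip. This is only
one reading of the idea: the function names and the "param" key are taken as
assumptions, and urllib's quote() leaves unreserved characters such as
letters and digits alone, so it is not "fully" hex escaped in the strict
sense used above (the "2" stays "2" rather than becoming %32).

```python
from urllib.parse import parse_qsl, quote, unquote, urlencode


def encode_groups(groups):
    """Encode a list of dicts as a query string of doubly escaped groups."""
    encoded = []
    for group in groups:
        # First escaping: values inside the group (spaces become %20).
        inner = urlencode(group, quote_via=quote, safe="")
        # Second escaping: the whole group, including its "&" and "=".
        encoded.append(quote(inner, safe=""))
    return "&".join(encoded)


def decode_groups(query):
    """Reverse encode_groups: split on the outer "&", then unescape twice."""
    groups = []
    for chunk in query.split("&"):
        if chunk:
            inner = unquote(chunk)                  # undo second escaping
            groups.append(dict(parse_qsl(inner)))   # undo first escaping
    return groups


groups = [
    {"index": "3", "param": " dsy2dasyy&"},
    {"index": "4", "param": "atet", "type": "URL"},
]
query = encode_groups(groups)
print("hdl:1234/5678?" + query)
print(decode_groups(query) == groups)
```

After the second escaping the only literal ampersands left are the ones
between groups, which is what makes grouping possible.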
Think it's about right this time.
Tony
Larry Lannom <llannom@cnri.reston.va.us> on 30/10/2000 02:04:14
To: tony_hammond@harcourt.com
cc:
Subject: Re: [handle-dev] Re: [Crtwg] Handle URI parameters
Thanks Tony. I will try to extract a serious response from Sam, Sean, et
al.
tony_hammond@harcourt.com wrote:
>
> Larry:
>
> Since nobody else seems to be running with this, I wanted to revisit.
> (Feeling flameproof today.)
>
> "The trailing !?# delimiters approach seems to have several advantages -
> mostly from fitting better into the URL legacy world with which we have
> to live."
>
> That's a good start point.
>
> The internet is the triumph of "IP over everything". The web as a
> universal information space is premised on "URI over everything". If we
> want to empower handle then we need to create handle as a first class
> object in this information space instead of gatewaying into a foreign
> protocol. Hence the need to encode handles within URI syntax. (BTW I
> disagree with the notion of "imitating" URI syntax. RFC 2396 is
> prescriptive.) This of course requires an "hdl:" URI scheme to be
> registered.
>
> [So let's keep this squarely on handle and put DOI aside. And then I can
> safely assert that the "legacy" world is the fragmented soup of URN, URL,
> etc., etc. Let's just focus on this URI thing. It's much cleaner.]
>
> Now for the 3 parameter types:
>
> 1. Handle parameter. Since this amounts to indexing into the handle (by
> type, index, or whatever) then a standard URI query component seems the
> appropriate syntax with key/value pairs. [RFC 2396 - "The query component
> is a string of information to be interpreted by the resource."] Example:
>
> hdl:1234/5678?type=<handle_type>&index=<handle_index>&....
> eg hdl:1234/5678?type=URL
>
> These would be cumulative.
>
> 2. Handle value parameter. I suggest that the requirement is much
> simpler, ie just a write-through parameter which handle doesn't mess
> with. [RFC 2396 - "When a URI reference is used to perform a retrieval
> action on the identified resource, the optional fragment identifier,
> separated from the URI by a crosshatch (#) character, consists of
> additional reference information to be interpreted by the user agent
> after the retrieval action has been successfully completed."] Example:
>
> hdl:1234/5678#<data_string>
> eg hdl:1234/5678#AB3276D056FCAB
>
> The handle server returns the resolved handle value(s) and the user agent
> (as a direct handle client) holds on to the parameter and tacks it back
> on when returning the values to the user.
>
> 3. Object parameter. Sorry - this has to be out of scope. AFAIK handle
> doesn't deal with objects but with resources :) A given handle value
> resolves to a piece of data. That's it.
>
> Now if we had some of these building blocks - URI scheme, parameter
> syntax, return value syntax - then we could quickly build support into a
> number of general classes of user agents, which is probably way more
> useful in the first instance than attempting to turn around mainstream
> browsers (ie personal user agents). I would really like to see simple
> lowlife scripting languages (Perl, Python, etc.) capable of talking
> directly to handle, ie something like
>
> % lwp-request hdl:1234/5678 | my_do_something_useful_program
>
> or as simple calls from a CGI script, say. (We only need to implement
> handle reads.)
>
> As for the OpenURL thing. OpenURL is just (sorry Herbert ;-) a vehicle
> for ferrying bibliographic metadata around the web. (Although it has the
> makings of being a really topnotch lingua franca to communicate with
> diverse bibliographic services. SFX being one such service type.) Since
> it's built on top of the HTTP protocol it's already hot plugged into the
> universal information space through its use of URIs. But I don't see what
> the connexion is with handle which is purely a name resolution service
> (operating on the internet - though not as yet on the web), other than a
> handle (or handle value) as an item of bibliographic metadata could be
> trundled around within an OpenURL structure.
>
> Tony
>
> Larry Lannom <llannom@cnri.reston.va.us>@cnri.reston.va.us on 09/10/2000
> 02:41:29
>
> Sent by: crtwg-admin@cnri.reston.va.us
>
> To: CrossRef Technical Working Group <crtwg@cnri.reston.va.us>,
> handle-dev@cnri.reston.va.us, Herbert Van de Sompel
> <herbertv@CS.Cornell.EDU>, "Paskin, Norman" <n.paskin@doi.org>, Tim
> Ingoldsby <TINGOLDSBY@AIP.ORG>
> cc:
>
> Subject: [Crtwg] Handle URI parameters
>
> CNRI is proposing an extension to handle URI syntax to define and encode
> various types of parameters. Our initial proposal divides the parameters
> into three categories, in addition to which we have a number of
> different syntax proposals which differ on the positioning of the
> parameters and the delimiters used. The little exposition below
> describes the proposed parameter types first, followed by the syntaxes.
>
> A small group saw this a few weeks ago but had little comment. One
> significant set of comments, however, recommended that we try to follow
> the OpenURL syntax (one description of which is here -
> http://sfx1.exlibris-usa.com/OpenURL/openurl.html) which uses a
> controlled set of tags in tag/value pairs in a fairly standard URI query
> syntax. (I hope Herbert will correct me if I've misrepresented this.)
>
> I send this out somewhat hesitantly, as I am just leaving town for two
> weeks and will be reading mail only intermittently, but also know that a
> resolution of this issue is overdue.
>
> All comments appreciated. Thanks.
>
> Larry
>
> ======================================================
>
> 1. Handle parameter -- applies to the whole handle and is expressed
> as attribute=value pairs. A common use of this would be to specify a
> particular handle type in a resolution request. Thus, adding type=EMAIL
> to a URI for a handle would serve as an instruction to a handle client
> to request only EMAIL types as returned values from the resolution of
> the given handle. This could also be used to parameterize the
> resolution, e.g., perform an authoritative query (don't ask a secondary
> server).
>
> 2. Handle value parameter -- applies to the returned handle value(s).
> Implementations would be driven by the returned values. Thus, while it
> is easy to imagine stating these parameters as attribute=value pairs,
> this would not be a requirement. A common use of this would be to append
> a given string to a returned URL, e.g., "source=JournalABC" would append
> the string "source=JournalABC" to the end of any returned handle values
> according to some further specification that would be implemented by the
> handle client. The further specification would probably be along the
> lines of append this only to values of type URL and separate with
> question marks.
>
> 3. Object parameter -- applies to objects referenced by a handle or by
> specific handle values. Again, implementation would be driven by the
> type of the returned object and the syntax specification would be silent
> on the specifics. A common use would be to identify a section in a
> returned object, e.g., "chapter23".
>
> So that leaves the question of syntax. As you probably know, we've
> experimented with and implemented a syntax for handle parameters. This
> is of the form
>
> (attribute=value)@hdl
>
> Thus
>
> (type=EMAIL)@200/1
>
> will return a handle value of type email. For this particular handle,
> this is equivalent to
>
> (index=4)@200/1
>
> This is currently implemented, for resolution in both the http/handle
> proxy and the handle resolver extension for web browsers, although not
> to our knowledge used in any production processes.
>
> We initially had two ideas for extending this to include the other two
> parameters. The first was to simply come up with two other sets of
> delimiters, e.g.,
>
> (handle parameter){returned value parameter}[returned object
> parameter]@hdl
>
> The second was to nest them, e.g.,
>
> (handle param(returned value param(returned obj param)))@hdl
>
> We didn't really like either. The discussions ranged around which would
> be easier to parse, which would be more confusing, what to do with
> compound statements, e.g., two handle parameters and one value
> parameter, and how well it would fit into current and potential future
> URI syntax.
>
> The third suggestion, and currently the local favorite, is to more
> closely imitate current URI syntax and delimiters as follows
>
> hdl!handle param?returned value param#returned object param
>
> thus
>
> hdl:10.123/456!type=email?subject=Request for Account
>
> or
>
> http://dx.doi.org/10.123/654?src=JournalABC#p12
>
> note that this last case is equivalent to
>
> http://dx.doi.org/10.123/654!type=URL?src=JournalABC#p12
>
> as URL is the default type for the proxy.
>
> If we wanted to allow multiple instances, we would probably define the &
> as the internal delimiter, e.g.,
>
> hdl:200/1!type=EMAIL&type=PUBKEY
>
> This example assumes, of course, that you have a client that you expect
> to do something specific with an email address and a public key.
>
> The trailing !?# delimiters approach seems to have several advantages -
> mostly from fitting better into the URL legacy world with which we have
> to live. Leading with the parameters had some appeal both in the way it
> scanned for human consumption and for encoding issues, but the trailing
> approach using common URI delimiters will probably give less trouble
> overall. The nested parens have a certain elegance, but require the two
> outer layers to get to the third layer even if you have nothing to say
> in the outer two.
>
> Note that these proposals deal only with URI syntax and with client
> functionality. They do not affect in any way the core handle system
> protocols or data model, but instead provide a way to leverage existing
> features, such as handle value typing, and provide a common approach to
> solving some current problems, e.g., passing along the source of a link.
>
> _______________________________________________
> CRTWG mailing list
> CRTWG@cnri.reston.va.us
> http://www.cnri.reston.va.us/mailman/listinfo/crtwg
>
> _______________________________________________
> handle-dev mailing list
> handle-dev@cnri.reston.va.us
> http://www.cnri.reston.va.us/mailman/listinfo/handle-dev