[handle-dev] Re: [Crtwg] Handle URI parameters
Larry:
Since nobody else seems to be running with this, I wanted to revisit.
(Feeling flameproof today.)
"The trailing !?# delimiters approach seems to have several advantages -
mostly from fitting better into the URL legacy world with which we have
to live."
That's a good starting point.
The internet is the triumph of "IP over everything". The web as a universal
information space is premised on "URI over everything". If we want to
empower handle then we need to create handle as a first class object in
this information space instead of gatewaying into a foreign protocol. Hence
the need to encode handles within URI syntax. (BTW I disagree with the
notion of "imitating" URI syntax. RFC 2396 is prescriptive.) This of course
requires an "hdl:" URI scheme to be registered.
[So let's keep this squarely on handle and put DOI aside. And then I can
safely assert that the "legacy" world is the fragmented soup of URN, URL,
etc., etc. Let's just focus on this URI thing. It's much cleaner.]
Now for the 3 parameter types:
1. Handle parameter. Since this amounts to indexing into the handle (by
type, index, or whatever) then a standard URI query component seems the
appropriate syntax with key/value pairs. [RFC 2396 - "The query component
is a string of information to be interpreted by the resource."] Example:
hdl:1234/5678?type=<handle_type>&index=<handle_index>&....
eg hdl:1234/5678?type=URL
These would be cumulative.
2. Handle value parameter. I suggest that the requirement is much simpler,
ie just a write-through parameter which handle doesn't mess with. [RFC 2396
- "When a URI reference is used to perform a retrieval action on the
identified resource, the optional fragment identifier, separated from the
URI by a crosshatch (#) character, consists of additional reference
information to be interpreted by the user agent after the retrieval action
has been successfully completed."] Example:
hdl:1234/5678#<data_string>
eg hdl:1234/5678#AB3276D056FCAB
The handle server returns the resolved handle value(s) and the user agent
(as a direct handle client) holds on to the parameter and tacks it back on
when returning the values to the user.
3. Object parameter. Sorry - this has to be out of scope. AFAIK handle
doesn't deal with objects but with resources :) A given handle value
resolves to a piece of data. That's it.
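
A minimal Python sketch of items 1 and 2, to make the division of labour concrete. Everything named here (split_hdl_uri, resolve_uri, the lookup callable and the SAMPLE table) is hypothetical and not part of any handle library. Two assumptions are baked in: "cumulative" is read as "each key narrows the selection further", and "tacks it back on" is read as "re-appends the fragment after a crosshatch".

```python
from urllib.parse import parse_qs

# Hypothetical stand-in for a handle server: handle -> (index, type, data) values.
SAMPLE = {
    "1234/5678": [
        (1, "URL", "http://example.org/doc"),
        (2, "EMAIL", "someone@example.org"),
    ],
}


def sample_lookup(handle):
    """Return the stored values for a handle, or an empty list."""
    return SAMPLE.get(handle, [])


def split_hdl_uri(uri):
    """Split an hdl: URI into (handle, query parameters, fragment)."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() != "hdl":
        raise ValueError("not an hdl: URI: %r" % uri)
    rest, _, fragment = rest.partition("#")  # fragment stays with the user agent
    handle, _, query = rest.partition("?")   # query is what the resolver sees
    return handle, parse_qs(query), fragment


def resolve_uri(uri, lookup):
    """Resolve a handle, filter by type/index, then re-attach the fragment."""
    handle, params, fragment = split_hdl_uri(uri)
    wanted_types = params.get("type")
    wanted_indexes = params.get("index")
    results = []
    for index, vtype, data in lookup(handle):
        # Assumption: keys are cumulative, i.e. every key given must match.
        if wanted_types and vtype not in wanted_types:
            continue
        if wanted_indexes and str(index) not in wanted_indexes:
            continue
        # Assumption: the write-through parameter is appended after '#'.
        results.append(data + "#" + fragment if fragment else data)
    return results
```

With the sample table, resolve_uri("hdl:1234/5678?type=URL#AB3276D056FCAB", sample_lookup) selects only the URL value and hands it back with the data string re-attached. The lookup never sees the fragment.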
Now if we had some of these building blocks - URI scheme, parameter syntax,
return value syntax - then we could quickly build support into a number of
general classes of user agents, which is probably way more useful in the
first instance than attempting to turn around mainstream browsers (ie
personal user agents). I would really like to see simple lowlife scripting
languages (Perl, Python, etc.) capable of talking directly to handle, ie
something like
% lwp-request hdl:1234/5678 | my_do_something_useful_program
or as simple calls from a CGI script, say. (We only need to implement
handle reads.)
As for the OpenURL thing. OpenURL is just (sorry Herbert ;-) a vehicle for
ferrying bibliographic metadata around the web. (Although it has the
makings of being a really topnotch lingua franca to communicate with
diverse bibliographic services. SFX being one such service type.) Since
it's built on top of the HTTP protocol it's already hot plugged into the
universal information space through its use of URIs. But I don't see what
the connexion is with handle which is purely a name resolution service
(operating on the internet, though not as yet on the web), other than a
handle (or handle value) as an item of bibliographic metadata could be
trundled around within an OpenURL structure.
Tony
Larry Lannom <llannom@cnri.reston.va.us>@cnri.reston.va.us on 09/10/2000
02:41:29
Sent by: crtwg-admin@cnri.reston.va.us
To: CrossRef Technical Working Group <crtwg@cnri.reston.va.us>,
handle-dev@cnri.reston.va.us, Herbert Van de Sompel
<herbertv@CS.Cornell.EDU>, "Paskin, Norman" <n.paskin@doi.org>, Tim
Ingoldsby <TINGOLDSBY@AIP.ORG>
cc:
Subject: [Crtwg] Handle URI parameters
CNRI is proposing an extension to handle URI syntax to define and encode
various types of parameters. Our initial proposal divides the parameters
into three categories, in addition to which we have a number of
different syntax proposals which differ on the positioning of the
parameters and the delimiters used. The little exposition below
describes the proposed parameter types first, followed by the syntaxes.
A small group saw this a few weeks ago but had little comment. One
significant set of comments, however, recommended that we try to follow
the OpenURL syntax (one description of which is here -
http://sfx1.exlibris-usa.com/OpenURL/openurl.html) which uses a
controlled set of tags in tag/value pairs in a fairly standard URI query
syntax. (I hope Herbert will correct me if I've misrepresented this.)
I send this out somewhat hesitantly, as I am just leaving town for two
weeks and will be reading mail only intermittently, but also know that a
resolution of this issue is overdue.
All comments appreciated. Thanks.
Larry
======================================================
1. Handle parameter -- applies to the whole handle and is expressed
as attribute=value pairs. A common use of this would be to specify a
particular handle type in a resolution request. Thus, adding type=EMAIL
to a URI for a handle would serve as an instruction to a handle client
to request only EMAIL types as returned values from the resolution of
the given handle. This could also be used to parameterize the
resolution, e.g., perform an authoritative query (don't ask a secondary
server).
2. Handle value parameter -- applies to the returned handle value(s).
Implementations would be driven by the returned values. Thus, while it
is easy to imagine stating these parameters as attribute=value pairs,
this would not be a requirement. A common use of this would be to append
a given string to a returned URL, e.g., "source=JournalABC" would append
the string "source=JournalABC" to the end of any returned handle values
according to some further specification that would be implemented by the
handle client. The further specification would probably be along the
lines of "append this only to values of type URL and separate with
question marks".
3. Object parameter -- applies to objects referenced by a handle or by
specific handle values. Again, implementation would be driven by the
type of the returned object and the syntax specification would be silent
on the specifics. A common use would be to identify a section in a
returned object, e.g., "chapter23".
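
As an illustration of the handle value parameter in item 2, here is a minimal Python sketch of a client applying "source=JournalABC" to returned values. The function name and the (type, data) tuple shape are assumptions made for the example. The "further specification" does not exist yet, so the rule coded here (append only to URL values, separate with a question mark, and fall back to an ampersand when the URL already carries a query) is one possible reading and nothing more.

```python
def apply_value_parameter(values, value_param):
    """Append value_param to every returned value of type URL.

    values is a list of (type, data) tuples as a hypothetical client
    might hold them after resolution. Other types pass through untouched.
    """
    result = []
    for vtype, data in values:
        if value_param and vtype == "URL":
            # Assumption: use '&' if the URL already has a query component.
            separator = "&" if "?" in data else "?"
            data = data + separator + value_param
        result.append((vtype, data))
    return result


returned = [
    ("URL", "http://example.org/article"),
    ("EMAIL", "someone@example.org"),
]
rewritten = apply_value_parameter(returned, "source=JournalABC")
```

Here only the URL value is rewritten, to http://example.org/article?source=JournalABC. The EMAIL value is returned unchanged.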
So that leaves the question of syntax. As you probably know, we've
experimented with and implemented a syntax for handle parameters. This
is of the form
(attribute=value)@hdl
Thus
(type=EMAIL)@200/1
will return a handle value of type email. For this particular handle,
this is equivalent to
(index=4)@200/1
This is currently implemented for resolution in both the HTTP/handle
proxy and the handle resolver extension for web browsers, although it is
not, to our knowledge, used in any production processes.
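
For readers who want to experiment, the leading form can be picked apart with a short Python sketch. This is not the code used in the proxy or the resolver extension. The function name and the regular expression are assumptions, and the sketch accepts exactly one attribute=value pair, as in the examples above.

```python
import re

# Assumption: one (attribute=value) group, no nested parentheses.
LEADING_FORM = re.compile(r"^\((?P<attr>[^=()]+)=(?P<value>[^()]*)\)@(?P<handle>.+)$")


def parse_leading(text):
    """Parse '(attribute=value)@hdl' into (attribute, value, handle).

    A string with no leading parenthesis is treated as a bare handle.
    """
    if not text.startswith("("):
        return None, None, text
    match = LEADING_FORM.match(text)
    if match is None:
        raise ValueError("malformed handle parameter: %r" % text)
    return match.group("attr"), match.group("value"), match.group("handle")


by_type = parse_leading("(type=EMAIL)@200/1")
by_index = parse_leading("(index=4)@200/1")
```

Both calls yield the handle 200/1. Which stored value each one selects is then up to the client's resolution request.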
We initially had two ideas for extending this to include the other two
parameters. The first was to simply come up with two other sets of
delimiters, e.g.,
(handle parameter){returned value parameter}[returned object parameter]@hdl
The second was to nest them, e.g.,
(handle param(returned value param(returned obj param)))@hdl
We didn't really like either. The discussions ranged around which would
be easier to parse, which would be more confusing, what to do with
compound statements, e.g., two handle parameters and one value
parameter, and how well it would fit into current and potential future
URI syntax.
The third suggestion, and currently the local favorite, is to more
closely imitate current URI syntax and delimiters as follows
hdl!handle param?returned value param#returned object param
thus
hdl:10.123/456!type=email?subject=Request for Account
or
http://dx.doi.org/10.123/654?src=JournalABC#p12
note that this last case is equivalent to
http://dx.doi.org/10.123/654!type=URL?src=JournalABC#p12
as URL is the default type for the proxy.
If we wanted to allow multiple instances, we would probably define the &
as the internal delimiter, e.g.,
hdl:200/1!type=EMAIL&type=PUBKEY
This example assumes, of course, that you have a client that you expect
to do something specific with an email address and a public key.
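
The trailing form can be split with plain string operations, as this minimal Python sketch shows. The function name and the returned dictionary keys are assumptions for illustration only. The sketch also assumes that the handle itself contains no unescaped !, ? or # characters, which a real specification would have to settle, and it covers only hdl: URIs and bare handles, not the proxy URLs.

```python
def parse_trailing(uri):
    """Split 'hdl:<handle>!<handle params>?<value param>#<object param>'.

    Every part after the handle is optional. Handle parameters are split
    on '&' so that multiple instances are allowed.
    """
    rest = uri[4:] if uri.lower().startswith("hdl:") else uri
    # Work from the right-hand delimiter inwards: '#', then '?', then '!'.
    rest, _, object_param = rest.partition("#")
    rest, _, value_param = rest.partition("?")
    handle, _, handle_part = rest.partition("!")
    handle_params = []
    for pair in handle_part.split("&"):
        if pair:
            key, _, value = pair.partition("=")
            handle_params.append((key, value))
    return {
        "handle": handle,
        "handle_params": handle_params,
        "value_param": value_param,
        "object_param": object_param,
    }


mail_request = parse_trailing("hdl:10.123/456!type=email?subject=Request for Account")
two_types = parse_trailing("hdl:200/1!type=EMAIL&type=PUBKEY")
```

In the second example the two type parameters come back as separate pairs in order, leaving it to the client to decide what to do with an email address and a public key.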
The trailing !?# delimiters approach seems to have several advantages -
mostly from fitting better into the URL legacy world with which we have
to live. Leading with the parameters had some appeal both in the way it
scanned for human consumption and for encoding issues, but the trailing
approach using common URI delimiters will probably give less trouble
overall. The nested parens have a certain elegance, but require the two
outer layers to get to the third layer even if you have nothing to say
in the outer two.
Note that these proposals deal only with URI syntax and with client
functionality. They do not affect in any way the core handle system
protocols or data model, but instead provide a way to leverage existing
features, such as handle value typing, and provide a common approach to
solving some current problems, e.g., passing along the source of a link.