Wilbur: RDF Parser
- Using the RDF Parser
- RDF Parser API
- Condition Classes
- URI Constants
1. Using the RDF Parser
The RDF parser is invoked through either the function parse-db-from-file
or the function parse-db-from-stream.
Underneath, the parser itself is implemented by an instance of the class
rdf-parser; an API is provided for extending this class, in case one
wants to add syntactic features or other functionality (the Wilbur DAML
parser is implemented as a subclass of the RDF parser). During parsing,
the parser may signal various conditions to indicate that something
unexpected has happened (a syntax error, for example). All RDF-related
errors are continuable, and can be caught and silently ignored if desired;
XML-related errors typically are not continuable.
parse-db-from-file (file &rest options &key parser-class error-handling) [Function]
This function reads the specified file (the parameter file should
be a string or a pathname, generally anything acceptable to the Common
Lisp function open) and parses its contents. The parser is created
from the class passed in the parameter parser-class (it defaults
to rdf-parser). The function returns three values: the parser's triple
database, the node denoting the source of the generated triples, and a
list of error instances. The list is populated when the parameter
error-handling has the value :collect; the parameter defaults to :signal,
indicating that errors should be signaled and execution should break.
Other keyword parameters (parameter options) are passed to make-instance
when the parser is instantiated. This function is implemented using
parse-db-from-stream.
Normally, instantiating an RDF Parser (instance of class rdf-parser)
will create a new triple database (instance of class db). However,
passing the :db option with an existing database to parse-db-from-file
will prevent the creation of a new database.
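As an illustration of the three return values, the following minimal sketch wraps parse-db-from-file and prints any collected errors. It assumes Wilbur is loaded and that the code is evaluated in a package that uses "WILBUR"; the wrapper name load-rdf-collecting-errors and the file path in the usage comment are hypothetical.

```lisp
;; Assumption: Wilbur is loaded and the current package uses "WILBUR",
;; so PARSE-DB-FROM-FILE is accessible without a package prefix.
(defun load-rdf-collecting-errors (file)
  "Parse FILE with :error-handling :collect; return the database and the errors."
  (multiple-value-bind (db source-node errors)
      (parse-db-from-file file :error-handling :collect)
    (format t "~&Parsed ~A (source node: ~S)~%" file source-node)
    ;; With :collect, errors are gathered into a list instead of breaking.
    (dolist (e errors)
      (format t "~&  collected error: ~A~%" e))
    (values db errors)))

;; Hypothetical usage; the path is an assumption:
;; (load-rdf-collecting-errors "/tmp/example.rdf")
```

Collecting errors this way suits batch processing, where one malformed description should not interrupt the run.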
parse-db-from-stream (stream locator &rest options &key parser-class error-handling) [Function]
This function is similar to parse-db-from-file,
except that it takes an open input stream and a document locator
(a URI string) instead of a file pathname.
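A minimal sketch of calling parse-db-from-stream directly, assuming Wilbur is loaded and the current package uses "WILBUR". The helper name parse-rdf-via-stream, the path and the locator URI are hypothetical.

```lisp
;; Assumption: Wilbur is loaded and the current package uses "WILBUR".
(defun parse-rdf-via-stream (path locator)
  "Open PATH for input and parse it, identifying the document by LOCATOR."
  (with-open-file (stream path :direction :input)
    ;; LOCATOR is a URI string naming the document being read.
    (parse-db-from-stream stream locator)))

;; Hypothetical usage:
;; (parse-rdf-via-stream "/tmp/example.rdf" "http://example.org/example.rdf")
```

This form is useful when the document's URI differs from the place the bytes are read from, for example a locally cached copy of a remote file.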
2. RDF Parser API
The following generic functions form the core API of RDF parsers (base
class rdf-parser, described below).
parser-db (rdf-parser) [Generic function]
This function accesses the triple database
of an RDF parser.
make-container (parser elements &optional container-uri container-type-uri) [Generic function]
This function creates an RDF container from the nodes or other objects
passed as the list elements. The optional parameter container-uri
will be the name of the container node, and container-type-uri (a
string) will name the node that will be the value of the rdf:type
property of the container node (the parameter defaults to -rdf-bag-uri-).
The container node is returned.
attach-to-parent (parser parent child) [Generic function]
This function will create a triple which "attaches" a parent node to a
child node. It is up to the parser to determine which predicate is used
(typically this is evident from the parser's dynamic state). The default
method will merely create one triple and use the most recently parsed property
as the predicate.
parse-using-parsetype (parser node property parsetype) [Generic function]
This function will handle the parsing of the different parse types (indicated
using different values to the attribute rdf:parseType). The idea
of this function is that by overriding it one can introduce more parsetypes.
The parameter node is the current description, the parameter property
is the element where the rdf:parseType attribute was encountered,
and parsetype is the value of that attribute (a string).
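The following sketch shows one way such an override might look, assuming Wilbur is loaded, the current package uses "WILBUR", and the generic function has the four-argument lambda list documented here. The class example-rdf-parser, the parse type "Example" and the helper handle-example-parsetype are hypothetical; what the method should return is not specified in this manual, so the return value here is only a placeholder.

```lisp
;; Assumption: Wilbur is loaded and the current package uses "WILBUR".
(defclass example-rdf-parser (rdf-parser) ())

(defun handle-example-parsetype (parser node property)
  "Hypothetical placeholder for application-specific parsing."
  (declare (ignore parser))
  (format *trace-output* "~&parseType \"Example\" on ~S (node ~S)~%" property node)
  node)

(defmethod parse-using-parsetype ((parser example-rdf-parser) node property parsetype)
  ;; PARSETYPE is the attribute value, a string.
  (if (string= parsetype "Example")
      (handle-example-parsetype parser node property)
      ;; Every other parse type is left to the inherited behavior.
      (call-next-method)))

;; Hypothetical usage:
;; (parse-db-from-file "/tmp/example.rdf" :parser-class 'example-rdf-parser)
```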
defer-task (parser type node &rest args) [Generic function]
Creates a deferred task (an instance of the structure class task),
to be executed later (when the enclosing XML element is closed, during
the execution of the nox:end-element
method). Tasks are executed by methods of execute-deferred-task.
See the description of the structure class task for a definition of the
type, node and args parameters. Note that a task is only added to the
deferred tasks if it is distinct from the tasks already in the queue
(tasks are considered not to be distinct if their type components are eq
and their node components are eq).
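The distinctness rule can be illustrated with a small self-contained sketch. This is not Wilbur's implementation: demo-task and demo-defer-task are stand-ins invented here, and the queue is a plain list.

```lisp
;; Stand-in for the task structure; not Wilbur's own definition.
(defstruct demo-task type node)

(defun demo-defer-task (queue type node)
  "Return QUEUE, with a new task added only if no queued task has an EQ type and node."
  (if (find-if (lambda (task)
                 (and (eq (demo-task-type task) type)
                      (eq (demo-task-node task) node)))
               queue)
      queue                                   ; duplicate: queue is unchanged
      (cons (make-demo-task :type type :node node) queue)))

;; Deferring the same type for the same node twice queues only one task:
;; (let* ((node (list :n))
;;        (q (demo-defer-task '() :container node)))
;;   (eq q (demo-defer-task q :container node)))
```

Because the comparison uses eq, two tasks count as the same only if they name the very same node object, not merely an equal-looking one.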
execute-deferred-task (parser task type) [Generic function]
Executes a deferred task (an instance of the structure class task
created by calling defer-task).
The type parameter gets passed the value of (task-type task), so
methods can be eql-specialized for various task types. The
default method executes tasks of type :container (for checking
that a node is an RDF container), :abouteach (for "exploding"
an "rdf:aboutEach" reference), and :bagid (for reifying
statements following an "rdf:bagID" declaration). Subclasses of
rdf-parser can add new task types.
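A sketch of adding a task type in a subclass, assuming Wilbur is loaded and the current package uses "WILBUR". The class logging-rdf-parser and the task type :log-node are hypothetical names chosen for illustration.

```lisp
;; Assumption: Wilbur is loaded and the current package uses "WILBUR".
(defclass logging-rdf-parser (rdf-parser) ())

;; The third argument receives (task-type task), so an EQL specializer
;; selects this method only for tasks of the hypothetical type :log-node.
(defmethod execute-deferred-task ((parser logging-rdf-parser) task
                                  (type (eql :log-node)))
  (let ((node (task-node task)))
    (format *trace-output* "~&deferred ~S task ran for node ~S~%" type node)
    node))

;; A method of the subclass would queue such a task with, for example:
;; (defer-task parser :log-node (parser-node parser))
```

Tasks of the built-in types still reach the default method, since this method applies only when the type is :log-node.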
rdf-parser [Class]
:db [Initarg]
This is the base class of RDF parsers, and a subclass of nox:sax-consumer.
The initarg :db can be used to initialize a parser with an existing
triple database; otherwise the parser will create an empty database.
close-rdf-element [Condition class]
The parser will signal this condition when encountering a closing rdf:RDF
tag. By handling this condition and not continuing, one can stop the parser
after the first <rdf:RDF>...</rdf:RDF> element has been
parsed. This is useful, for example, when scanning metadata from XHTML
files; otherwise the parser will keep reading until the end of the file
is reached.
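One way to stop after the first rdf:RDF element is a handler that performs a non-local exit, sketched below. It assumes Wilbur is loaded, the current package uses "WILBUR", and the caller supplies an existing triple database through the :db option so that triples added before the exit remain available. The function name parse-first-rdf-element is hypothetical.

```lisp
;; Assumption: Wilbur is loaded and the current package uses "WILBUR".
(defun parse-first-rdf-element (file db)
  "Parse FILE into the existing database DB, stopping at the first closing rdf:RDF tag."
  (handler-case (parse-db-from-file file :db db)
    ;; HANDLER-CASE unwinds out of the parser instead of continuing,
    ;; which is what stops it from reading the rest of the file.
    (close-rdf-element ()
      nil))
  db)
```

Passing the database in matters here because the values normally returned by parse-db-from-file are lost when the handler unwinds.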
task [Structure class]
Instances of this structure class represent deferred tasks inside the parser.
Tasks have a type (typically some keyword symbol), an associated
node
(thought of as being the target of the task when executed) and parameters
(these are task-dependent, and are identified using keyword symbols). Instances
of task are created by calling defer-task.
task-type (task) [Structure accessor]
Accesses the type component of a task.
task-node (task) [Structure accessor]
Accesses the node component of a task.
task-parameter (task parameter) [Macro]
Accesses a parameter of a task, with parameter as the name of the
parameter, typically a keyword symbol. Although this is a macro it is intended
to be used just like an access function.
parser-node (parser) [Function]
Accesses the current node of the parser (i.e., the target node of the current
state of the parser).
parser-property (parser) [Function]
Accesses the current property of the parser (i.e., the target property
of the current state of the parser). For example, this function is useful
in any implementation of attach-to-parent.
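As a sketch of how parser-property might be used alongside attach-to-parent, the following :before method traces each attachment without changing the default behavior. It assumes Wilbur is loaded and the current package uses "WILBUR"; the class tracing-rdf-parser is hypothetical.

```lisp
;; Assumption: Wilbur is loaded and the current package uses "WILBUR".
(defclass tracing-rdf-parser (rdf-parser) ())

;; A :BEFORE method runs ahead of the inherited primary method, so the
;; triple is still created by the default implementation.
(defmethod attach-to-parent :before ((parser tracing-rdf-parser) parent child)
  (format *trace-output* "~&attach ~S -[~S]-> ~S~%"
          parent (parser-property parser) child))
```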
rdf-syntax-normalizer [Class]
This class, a subclass of nox:sax-consumer
and nox:sax-producer,
will normalize RDF syntax by "opening" the abbreviated syntax to full syntax.
3. Condition Classes
The class hierarchy of RDF condition classes is shown in a
figure in the XML Parser manual.
rdf-error [Condition class]
Subclass of nox:xml-error,
abstract base class of all RDF-related errors. All concrete subclasses
of this class are used to signal continuable errors (using cerror).
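The continuable-error pattern can be sketched in plain Common Lisp without Wilbur. Everything prefixed demo- below is a stand-in invented for illustration; with Wilbur loaded, the same handler-bind shape would presumably apply to rdf-error and its subclasses.

```lisp
;; Stand-in condition class; not part of Wilbur.
(define-condition demo-rdf-error (error)
  ((thing :initarg :thing :reader demo-error-thing))
  (:report (lambda (c s)
             (format s "Problem with ~S" (demo-error-thing c)))))

(defun demo-parse-step (attributes)
  "Signal a continuable error if both :id and :about are present, then prefer :about."
  (when (and (getf attributes :id) (getf attributes :about))
    (cerror "Use the value of rdf:about." 'demo-rdf-error :thing attributes))
  (or (getf attributes :about) (getf attributes :id)))

(defun demo-parse-ignoring-errors (attributes)
  "Run DEMO-PARSE-STEP, recording and then continuing from every DEMO-RDF-ERROR."
  (let ((seen '()))
    (handler-bind ((demo-rdf-error
                     (lambda (c)
                       (push c seen)
                       ;; Invoke the CONTINUE restart established by CERROR.
                       (continue c))))
      (values (demo-parse-step attributes) (reverse seen)))))

;; (demo-parse-ignoring-errors '(:id "a" :about "b"))
;; returns "b" and a list holding one condition object.
```

handler-bind is used rather than handler-case because it runs the handler without unwinding, which keeps the continue restart available.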
feature-not-supported [Condition class]
Subclass of rdf-error.
Signals a condition where an unsupported feature is encountered during
parsing. The function nox:error-thing
accesses the missing feature. Note that this is not the same class as nox:feature-not-supported.
about-and-id-both-present [Condition class]
Subclass of rdf-error.
Signals a condition where a description element is encountered which has
both the rdf:ID and rdf:about attributes. Continuing
from this error causes the value of rdf:about to be used.
unknown-parsetype [Condition class]
Subclass of rdf-error.
Signals a condition where an unknown value of the rdf:parseType
attribute is encountered. Continuing means ignoring the existence of any
parsetype declaration. The function nox:error-thing
accesses the unknown parsetype.
illegal-character-content [Condition class]
Subclass of rdf-error.
Signals a condition where XML character content is encountered where,
according to RDF syntax rules, there should not be any (such as in the
presence of an rdf:resource attribute). The function nox:error-thing
accesses the illegal content string.
container-required [Condition class]
Subclass of rdf-error.
Signaled by the function is-container-p;
continuing causes this function to return false. The function nox:error-thing
accesses the tested node.
out-of-sequence-index [Condition class]
Subclass of rdf-error.
Signals a condition where an attempt was made to create index URIs out
of sequence (using function index-uri).
Continuing causes index-uri
to return nil.
duplicate-namespace-prefix [Condition class]
Subclass of rdf-error.
Signals a condition where an attempt was made to rename a namespace prefix
to one that already exists (using the function dictionary-rename-namespace).
Continuing causes the rename attempt to be ignored. The function nox:error-thing
accesses the new prefix.
cannot-merge [Condition class]
Subclass of rdf-error.
Signals a condition where an attempt was made to merge two databases but
the function db-allow-merge-p
returned false. The function nox:error-thing
accesses a string which gives the reason for the failure.
4. URI Constants
The system defines a set of constants for all useful RDF URIs (strings).
In the following table, the prefix rdf is assumed to denote the
value of -rdf-uri- and the prefix rdfs to denote the
value of -rdfs-uri-. Note that the URI constants are defined by
the XML parser, exported (from the package "NOX"), imported into
the package "WILBUR" and then re-exported.
Constant | Value
-rdf-uri- | "http://www.w3.org/1999/02/22-rdf-syntax-ns#" (in the current implementation)
-rdfs-uri- | "http://www.w3.org/2000/01/rdf-schema#" (in the current implementation)
-rdf-id-uri- | rdf:ID
-rdf-resource-uri- | rdf:resource
-rdf-about-uri- | rdf:about
-rdf-abouteach-uri- | rdf:aboutEach
-rdf-abouteachprefix-uri- | rdf:aboutEachPrefix
-rdf-bagid-uri- | rdf:bagID
-rdf-parsetype-uri- | rdf:parseType
-rdf-description-uri- | rdf:Description
-rdf-type-uri- | rdf:type
-rdf-rdf-uri- | rdf:RDF
-rdf-li-uri- | rdf:li
-rdf-statement-uri- | rdf:Statement
-rdf-subject-uri- | rdf:subject
-rdf-predicate-uri- | rdf:predicate
-rdf-object-uri- | rdf:object
-rdf-bag-uri- | rdf:Bag
-rdf-seq-uri- | rdf:Seq
-rdf-alt-uri- | rdf:Alt
-rdfs-resource-uri- | rdfs:Resource
-rdfs-class-uri- | rdfs:Class
-rdfs-subclassof-uri- | rdfs:subClassOf
-rdfs-subpropertyof-uri- | rdfs:subPropertyOf
-rdfs-seealso-uri- | rdfs:seeAlso
-rdfs-isdefinedby-uri- | rdfs:isDefinedBy
-rdfs-constraintresource-uri- | rdfs:ConstraintResource
-rdfs-constraintproperty-uri- | rdfs:ConstraintProperty
-rdfs-range-uri- | rdfs:range
-rdfs-domain-uri- | rdfs:domain
-rdfs-comment-uri- | rdfs:comment
-rdfs-label-uri- | rdfs:label
-rdfs-literal-uri- | rdfs:Literal
-rdfs-container-uri- | rdfs:Container
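The prefixed names in the table stand for the namespace URI followed by the local name. The self-contained sketch below spells out that expansion; *demo-namespaces* and demo-expand-qname are illustrative stand-ins, not part of Wilbur, and the namespace strings are copied from the first two rows of the table.

```lisp
;; Stand-in namespace table; the URIs are those given for -rdf-uri-
;; and -rdfs-uri- in the current implementation.
(defparameter *demo-namespaces*
  '(("rdf" . "http://www.w3.org/1999/02/22-rdf-syntax-ns#")
    ("rdfs" . "http://www.w3.org/2000/01/rdf-schema#")))

(defun demo-expand-qname (qname)
  "Expand a prefixed name such as \"rdf:type\" into a full URI string."
  (let ((colon (position #\: qname)))
    (unless colon
      (error "No prefix in ~S" qname))
    (let ((namespace (cdr (assoc (subseq qname 0 colon) *demo-namespaces*
                                 :test #'string=))))
      (unless namespace
        (error "Unknown prefix in ~S" qname))
      ;; Namespace URI followed by the local name.
      (concatenate 'string namespace (subseq qname (1+ colon))))))

;; (demo-expand-qname "rdf:type")
;; returns "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
```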
Copyright © 2001 Nokia. All Rights Reserved.
Subject to the NOKOS License version 1.0
Author: Ora Lassila (ora.lassila@nokia.com)