Wilbur: RDF Parser

  1. Using the RDF Parser
  2. RDF Parser API
  3. Condition Classes
  4. URI Constants

1. Using the RDF Parser

The RDF parser is invoked with either the function parse-db-from-file or the function parse-db-from-stream. Underneath, the parser is implemented by an instance of the class rdf-parser; an API allows this class to be extended to add syntactic features or other functionality (the Wilbur DAML parser is implemented as a subclass of the RDF parser). During parsing, the parser may signal various conditions to indicate that something unexpected has happened (a syntax error, for example). All RDF-related errors are continuable and can be caught and silently ignored if desired; XML-related errors typically are not continuable.
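The idea of a continuable error that is caught and silently ignored can be illustrated without Wilbur itself. The following is a stand-alone sketch in standard Common Lisp, not Wilbur code: demo-rdf-error, demo-parse-items and demo-parse-collecting-errors are hypothetical names invented for the illustration; only cerror, handler-bind and continue are standard operators.

```lisp
;; Hypothetical stand-in for an RDF-related error; not a Wilbur class.
(define-condition demo-rdf-error (error)
  ((thing :initarg :thing :reader demo-error-thing))
  (:report (lambda (condition stream)
             (format stream "Unexpected item: ~S"
                     (demo-error-thing condition)))))

;; Hypothetical "parser": keeps the integers in ITEMS and signals a
;; continuable error (via CERROR) for everything else.
(defun demo-parse-items (items)
  (let ((accepted '()))
    (dolist (item items (nreverse accepted))
      (if (integerp item)
          (push item accepted)
          (cerror "Skip this item and keep parsing."
                  'demo-rdf-error :thing item)))))

;; Runs the stand-in parser, continuing past every DEMO-RDF-ERROR while
;; recording the conditions that were signaled.
;; Returns two values: the parsed items and the list of errors.
(defun demo-parse-collecting-errors (items)
  (let ((errors '()))
    (handler-bind ((demo-rdf-error
                     (lambda (condition)
                       (push condition errors)
                       (continue condition))))
      (let ((result (demo-parse-items items)))
        (values result (nreverse errors))))))
```

Calling (demo-parse-collecting-errors '(1 "x" 2)) returns the accepted integers as the first value and the recorded conditions as the second. Because the handler invokes the continue restart, parsing resumes right after the point where the error was signaled.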

parse-db-from-file (file &rest options &key parser-class error-handling) [Function]

This function reads the specified file and parses its contents. The parameter file should be a string or a pathname (generally, anything acceptable to the Common Lisp function open). The parser is created from the class passed in the parameter parser-class, which defaults to rdf-parser. The parameter error-handling defaults to :signal, indicating that errors should be signaled and the system should break execution; the value :collect causes errors to be collected instead. The function returns three values: the parser's triple database, the node denoting the source of the generated triples, and a list of error instances (when error-handling is :collect). Other keyword parameters (the parameter options) are passed to make-instance when the parser is instantiated. This function is implemented using parse-db-from-stream.

Normally, instantiating an RDF Parser (instance of class rdf-parser) will create a new triple database (instance of class db). However, passing the :db option with an existing database to parse-db-from-file will prevent the creation of a new database.

parse-db-from-stream (stream locator &rest options &key parser-class error-handling) [Function]

This function is similar to parse-db-from-file, except that it takes an open input stream and a document locator (a URI string) instead of a file pathname.

2. RDF Parser API

The following generic functions form the core API of RDF parsers (base class rdf-parser, described below).

parser-db (rdf-parser) [Generic function]

This function accesses the triple database of an RDF parser.

make-container (parser elements &optional container-uri container-type-uri) [Generic function]

This function creates an RDF container from the nodes or other objects passed as the list elements. The optional parameter container-uri will be the name of the container node, and container-type-uri (a string) will name the node that will be the value of the rdf:type property of the container node (the parameter defaults to -rdf-bag-uri-). The container node is returned.

attach-to-parent (parser parent child) [Generic function]

This function will create a triple which "attaches" a parent node to a child node. It is up to the parser to determine which predicate is used (typically this is evident from the parser's dynamic state). The default method will merely create one triple and use the most recently parsed property as the predicate.

parse-using-parsetype (parser node property parsetype) [Generic function]

This function will handle the parsing of the different parse types (indicated using different values to the attribute rdf:parseType). The idea of this function is that by overriding it one can introduce more parsetypes. The parameter node is the current description, the parameter property is the element where the rdf:parseType attribute was encountered, and parsetype is the value of that attribute (a string).

defer-task (parser type node &rest args) [Generic function]

Creates a deferred task (an instance of the structure class task) to be executed later, when the enclosing XML element is closed (during the execution of the nox:end-element method). Tasks are executed by methods of execute-deferred-task. See the description of the structure class task for a definition of the type, node and args parameters. Note that a task is only added to the deferred tasks if it is distinct from the tasks already in the queue (two tasks are considered not distinct if their type components are eq and their node components are eq).
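The distinctness rule can be sketched as follows. This is a stand-alone illustration under stated assumptions, not Wilbur's implementation: demo-task and demo-defer-task are hypothetical names, and the queue is modeled as a plain list that is passed in and returned.

```lisp
;; Hypothetical stand-in for the task structure; not Wilbur's definition.
(defstruct demo-task type node parameters)

;; Adds a task to QUEUE unless a task with an EQ type and an EQ node is
;; already queued. Returns the (possibly extended) queue.
(defun demo-defer-task (queue type node &rest args)
  (if (find-if (lambda (task)
                 (and (eq (demo-task-type task) type)
                      (eq (demo-task-node task) node)))
               queue)
      queue
      (cons (make-demo-task :type type :node node :parameters args)
            queue)))
```

With this sketch, deferring a :container task twice for the same node object leaves a single task in the queue, while the same type on a different node, or a different type on the same node, adds a new task.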

execute-deferred-task (parser task type) [Generic function]

Executes a deferred task (an instance of the structure class task created by calling defer-task). The type parameter is passed the value of (task-type task), so methods can be eql-specialized for various task types. The default method executes tasks of type :container (for checking that a node is an RDF container), :abouteach (for "exploding" an "rdf:aboutEach" reference), and :bagid (for reifying statements following an "rdf:bagID" declaration). Subclasses of rdf-parser can add new task types.
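The eql-specialization pattern can be sketched with stand-in definitions. Everything prefixed with demo- below is hypothetical and only mirrors the shape of the API described here (a parser, a task, and the task's type passed as a third argument); tasks are modeled as property lists, and :my-task is an invented task type.

```lisp
;; Hypothetical stand-ins for a parser class and a subclass of it.
(defclass demo-parser ()
  ((events :initform '() :accessor demo-parser-events)))

(defclass demo-extended-parser (demo-parser) ())

;; Mirrors the (parser task type) argument order; TYPE is dispatched on.
(defgeneric demo-execute-task (parser task type))

;; Base class method, specialized on one task type with an EQL specializer.
(defmethod demo-execute-task ((parser demo-parser) task
                              (type (eql :container)))
  (push (list :checked-container (getf task :node))
        (demo-parser-events parser)))

;; The subclass adds a new task type without touching the base methods.
(defmethod demo-execute-task ((parser demo-extended-parser) task
                              (type (eql :my-task)))
  (push (list :ran-my-task (getf task :node))
        (demo-parser-events parser)))

;; Passes the task's own type as the third argument, so that method
;; selection happens on the type value.
(defun demo-run-task (parser task)
  (demo-execute-task parser task (getf task :type)))
```

An instance of demo-extended-parser handles both :container tasks (through the inherited method) and :my-task tasks (through its own method), which is the extension style the paragraph above describes for subclasses.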

rdf-parser [Class]
:db [Initarg]

This is the base class of RDF parsers, and a subclass of nox:sax-consumer. The initarg :db can be used to initialize a parser with an existing triple database; otherwise the parser will create an empty database.

close-rdf-element [Condition class]

The parser will signal this condition when encountering a closing rdf:RDF tag. By handling this condition and not continuing, one can stop the parser after the first <rdf:RDF>...</rdf:RDF> element has been parsed (this is useful, for example, when scanning metadata from XHTML files, otherwise the parser will keep reading until the end of the file is reached).
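"Handling this condition and not continuing" amounts to a non-local exit out of the parser. The sketch below shows that pattern in isolation. It is a stand-alone analogy: demo-close-element, demo-scan and demo-scan-first are hypothetical names, and the use of signal in the stand-in is an assumption, since this manual does not state which operator Wilbur uses to signal close-rdf-element.

```lisp
;; Hypothetical stand-in for close-rdf-element; not the Wilbur class.
(define-condition demo-close-element (condition) ())

;; Hypothetical "parser": processes CHUNKS in order, calling SIGNAL after
;; each one. SIGNAL returns normally if no handler transfers control, so
;; without a handler every chunk is processed.
(defun demo-scan (chunks record)
  (dolist (chunk chunks)
    (funcall record chunk)
    (signal 'demo-close-element)))

;; Stops after the first chunk by handling the condition with a non-local
;; exit (HANDLER-CASE unwinds out of DEMO-SCAN instead of continuing).
(defun demo-scan-first (chunks)
  (let ((seen '()))
    (handler-case (demo-scan chunks (lambda (chunk) (push chunk seen)))
      (demo-close-element () nil))
    (nreverse seen)))
```

Here demo-scan-first records only the first chunk, whereas calling demo-scan directly, with no handler established, runs through all of them.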

task [Structure class]

Instances of this structure class represent deferred tasks inside the parser. Tasks have a type (typically some keyword symbol), an associated node (thought of as being the target of the task when executed) and parameters (these are task-dependent, and are identified using keyword symbols). Instances of task are created by calling defer-task.

task-type (task) [Structure accessor]

Accesses the type component of a task.

task-node (task) [Structure accessor]

Accesses the node component of a task.

task-parameter (task parameter) [Macro]

Accesses a parameter of a task, with parameter as the name of the parameter, typically a keyword symbol. Although this is a macro it is intended to be used just like an access function.

parser-node (parser) [Function]

Accesses the current node of the parser (i.e., the target node of the current state of the parser).

parser-property (parser) [Function]

Accesses the current property of the parser (i.e., the target property of the current state of the parser). For example, this function is useful in any implementation of attach-to-parent.

rdf-syntax-normalizer [Class]

This class, a subclass of nox:sax-consumer and nox:sax-producer, will normalize RDF syntax by "opening" the abbreviated syntax to full syntax.

3. Condition Classes

The class hierarchy of RDF condition classes is shown in a figure in the XML Parser manual.

rdf-error [Condition class]

Subclass of nox:xml-error, abstract base class of all RDF-related errors. All concrete subclasses of this class are used to signal continuable errors (using cerror).

feature-not-supported [Condition class]

Subclass of rdf-error. Signals a condition where an unsupported feature is encountered during parsing. The function nox:error-thing accesses the missing feature. Note that this is not the same class as nox:feature-not-supported.

about-and-id-both-present [Condition class]

Subclass of rdf-error. Signals a condition where a description element is encountered which has both the rdf:ID and rdf:about attributes. Continuing from this error causes the value of rdf:about to be used.

unknown-parsetype [Condition class]

Subclass of rdf-error. Signals a condition where an unknown value of the rdf:parseType attribute is encountered. Continuing means ignoring the existence of any parsetype declaration. The function nox:error-thing accesses the unknown parsetype.

illegal-character-content [Condition class]

Subclass of rdf-error. Signals a condition where XML character content is encountered where according to RDF syntax rules there shouldn't be any (such as in the presence of rdf:resource attribute). The function nox:error-thing accesses the illegal content string.

container-required [Condition class]

Subclass of rdf-error. Signaled by the function is-container-p; continuing causes this function to return false. The function nox:error-thing accesses the tested node.

out-of-sequence-index [Condition class]

Subclass of rdf-error. Signals a condition where an attempt was made to create index URIs out of sequence (using function index-uri). Continuing causes index-uri to return nil.

duplicate-namespace-prefix [Condition class]

Subclass of rdf-error. Signals a condition where an attempt was made to rename a namespace prefix to one that already exists (using the function dictionary-rename-namespace). Continuing causes the rename attempt to be ignored. The function nox:error-thing accesses the new prefix.

cannot-merge [Condition class]

Subclass of rdf-error. Signals a condition where an attempt was made to merge two databases but the function db-allow-merge-p returned false. The function nox:error-thing accesses a string which gives the reason for the failure.

4. URI Constants

The system defines a set of constants for all useful RDF URIs (strings). In the following table, the prefix rdf is assumed to denote the value of -rdf-uri- and the prefix rdfs to denote the value of -rdfs-uri-. Note that the URI constants are defined by the XML parser, exported (from the package "NOX"), imported into the package "WILBUR" and then re-exported.
 
Constant Value
-rdf-uri- "http://www.w3.org/1999/02/22-rdf-syntax-ns#" (in the current implementation)
-rdfs-uri- "http://www.w3.org/2000/01/rdf-schema#" (in the current implementation)
-rdf-id-uri- rdf:ID
-rdf-resource-uri- rdf:resource
-rdf-about-uri- rdf:about
-rdf-abouteach-uri- rdf:aboutEach
-rdf-abouteachprefix-uri- rdf:aboutEachPrefix
-rdf-bagid-uri- rdf:bagID
-rdf-parsetype-uri- rdf:parseType
-rdf-description-uri- rdf:Description
-rdf-type-uri- rdf:type
-rdf-rdf-uri- rdf:RDF
-rdf-li-uri- rdf:li
-rdf-statement-uri- rdf:Statement
-rdf-subject-uri- rdf:subject
-rdf-predicate-uri- rdf:predicate
-rdf-object-uri- rdf:object
-rdf-bag-uri- rdf:Bag
-rdf-seq-uri- rdf:Seq
-rdf-alt-uri- rdf:Alt
-rdfs-resource-uri- rdfs:Resource
-rdfs-class-uri- rdfs:Class
-rdfs-subclassof-uri- rdfs:subClassOf
-rdfs-subpropertyof-uri- rdfs:subPropertyOf
-rdfs-seealso-uri- rdfs:seeAlso
-rdfs-isdefinedby-uri- rdfs:isDefinedBy
-rdfs-constraintresource-uri- rdfs:ConstraintResource
-rdfs-constraintproperty-uri- rdfs:ConstraintProperty
-rdfs-range-uri- rdfs:range
-rdfs-domain-uri- rdfs:domain
-rdfs-comment-uri- rdfs:comment
-rdfs-label-uri- rdfs:label
-rdfs-literal-uri- rdfs:Literal
-rdfs-container-uri- rdfs:Container


Copyright © 2001 Nokia. All Rights Reserved.
Subject to the NOKOS License version 1.0
Author: Ora Lassila (ora.lassila@nokia.com)