:-$

Ryan's work blog

RDF Primer notes (part 1)

I read the RDF Primer on w3.org, and here are my notes, many of which are just things copied directly from the primer. I'm writing this down for three reasons: so I can understand what this all is, so everyone can read a shorter version of the primer (which is 104 pages), and so we have common terms defined and can talk about this stuff without horrid miscommunication. I've linked each term to the relevant place (or as close as I can) in the RDF Primer. Either this is going to be very helpful or a colossal waste of time.
RDF Statement: The statement of a fact using a subject, predicate, and object.
Sample RDF statement in English:
http://www.example.org/index.html has a creator whose value is John Smith
RDF breakdown of that sentence:
  • the subject is the URI http://www.example.org/index.html
  • the predicate is the word "creator"
  • the object is the phrase "John Smith"
Note that this specifies binary relationships only. This is used as a foundation to create N:N relationships.
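A minimal Python sketch of the same statement, treating it as nothing more than a three-part tuple. The names Statement and describe are made up for illustration, and the full predicate URI (the Dublin Core "creator" property) is an assumption standing in for the word "creator".

```python
from collections import namedtuple

# A statement is just a (subject, predicate, object) triple.
Statement = namedtuple("Statement", ["subject", "predicate", "object"])

stmt = Statement(
    subject="http://www.example.org/index.html",
    # Assumption: the Dublin Core URIref is used for "creator".
    predicate="http://purl.org/dc/elements/1.1/creator",
    object="John Smith",
)


def describe(s):
    """Render a statement back into the English form used above."""
    return f"{s.subject} has a {s.predicate} whose value is {s.object}"


print(describe(stmt))
```

Each statement relates exactly two things (subject and object) through one predicate, which is the "binary relationship" part.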
RDF Resource: anything that is identifiable by a URI reference. Subject, predicate, and object can all be RDF Resources.
URIref: Uniform Resource Identifier Reference. A URIref is made of a URI, and optionally a '#' and fragment identifier.
Sample URIref:
http://www.example.org/index.html#section2
Breakdown:
  • URI: http://www.example.org/index.html
  • Fragment Identifier: section2
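The same breakdown can be done with Python's standard library. This is only a sketch of the split; nothing here is RDF-specific.

```python
from urllib.parse import urldefrag

uriref = "http://www.example.org/index.html#section2"

# urldefrag splits a URI reference into the URI and the fragment identifier.
uri, fragment = urldefrag(uriref)

print(uri)       # http://www.example.org/index.html
print(fragment)  # section2
```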
These are defined in a completely arbitrary manner, so when making up URIs, keep in mind the words involved are meant to convey meaning to people, and the machine doesn't care as long as the URI is unique. There is already a list of common URIrefs defined and available for use at http://dublincore.org/documents/dces/. Those are good for use as predicates, since they clearly define what that thing means, the semantics of the predicate. For those that read Quicksilver or are familiar with early English scientists, this should remind you of the Philosophick Language that Wilkins was working on.
Namespace: To make things less verbose, you can map a prefix to a URI, using that prefix anywhere you would use that URI.
For example, rather than using http://www.example.org/index.html, http://www.example.org/index2.html, etc, you can define a namespace by mapping prefix ex to URI http://www.example.org/, and then refer to those as ex:index.html, ex:index2.html, etc. Those are called QNames.
QName: a Qualified Name, which is made of a namespace prefix and a local name, like ex:index.html
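A QName is just shorthand, so expanding one back into a full URIref is a string join. A small sketch, where the PREFIXES table and the expand function are hypothetical names for illustration:

```python
# Prefix-to-URI mapping; "ex" is the example namespace from the text above.
PREFIXES = {"ex": "http://www.example.org/"}


def expand(qname, prefixes=PREFIXES):
    """Turn a QName such as 'ex:index.html' into the full URIref."""
    prefix, local_name = qname.split(":", 1)
    return prefixes[prefix] + local_name


print(expand("ex:index.html"))   # http://www.example.org/index.html
print(expand("ex:index2.html"))  # http://www.example.org/index2.html
```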
Vocabulary: a set of URIrefs intended for a specific purpose. The URIrefs that make up the common Friend-of-a-friend (FOAF) RDF application would be a vocabulary. When we define how we want the types for the billing program, we'll be inventing our own URIrefs, and that will be another vocabulary.
Blank node: in some cases, you want the object of a statement to be structured data. The standard way to do this would be to have the object be a URIref, and then put more predicates on that. If for some reason you don't want to assign a URIref to it, you can use a blank node, which is written like a QName with a prefix of "_". Example: _:name. Blank nodes can be used as subjects and objects, but not as predicates. They are used to create anonymous intermediates for N:N relationships.
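Here is a rough sketch of a blank node holding a structured address. The staff and terms URIrefs, the address values, and the helper names is_blank and properties_of are all illustrative assumptions.

```python
# Triples as plain (subject, predicate, object) tuples.
# "_:johnaddress" is a blank node: it groups the address parts
# without being given a URIref of its own.
triples = [
    ("http://www.example.org/staffid/85740",
     "http://www.example.org/terms/address",
     "_:johnaddress"),
    ("_:johnaddress", "http://www.example.org/terms/street", "1501 Grant Avenue"),
    ("_:johnaddress", "http://www.example.org/terms/city", "Bedford"),
]


def is_blank(node):
    """Blank node identifiers are written with a '_:' prefix."""
    return node.startswith("_:")


def properties_of(node, graph):
    """Collect the predicate/object pairs hanging off one node."""
    return {p: o for s, p, o in graph if s == node}


address = properties_of("_:johnaddress", triples)
print(address["http://www.example.org/terms/city"])  # Bedford

# Blank nodes appear as subjects and objects here, never as predicates.
assert not any(is_blank(p) for _, p, _ in triples)
```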
Typed Literal: You can specify another URIref to go with each literal to give it a type. Standard types are defined at http://www.w3.org/2001/XMLSchema.xsd. Specifying a type is optional, and RDF itself doesn't care. You can have a literal of "pumpkin", say it's an integer, and it would still be valid RDF. Complying with the meaning of "integer" is the application's job, not RDF's.
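To make the "application's job" point concrete, here is a sketch where a typed literal is only a (text, datatype URIref) pair, and the checking happens in application code. TypedLiteral and to_python are hypothetical names, and the datatype URIref assumes the usual XML Schema namespace form.

```python
from collections import namedtuple

# Assumption: the XML Schema integer datatype is named by this URIref.
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

# A typed literal is just the text plus a datatype URIref.
TypedLiteral = namedtuple("TypedLiteral", ["lexical", "datatype"])

good = TypedLiteral("27", XSD_INTEGER)
bad = TypedLiteral("pumpkin", XSD_INTEGER)  # builds fine; nothing checks it


def to_python(literal):
    """Application-level interpretation of the datatype."""
    if literal.datatype == XSD_INTEGER:
        return int(literal.lexical)  # raises ValueError for "pumpkin"
    return literal.lexical


print(to_python(good))  # 27
try:
    to_python(bad)
except ValueError:
    print("the application rejects 'pumpkin' as an integer")
```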
RDF/XML: a way of specifying RDF statements in XML format. You define RDF namespaces as XML namespaces, with xmlns:prefix="URI", and then each XML element is either a node or a literal. It's pretty verbose, and not terribly human-friendly. See the examples in the primer and you'll get the picture pretty quickly. The primer also shows a slew of ways to specify the RDF namespaces and shorten the XML, which will be helpful when we want to minimize the size of RDF/XML files being transferred around.
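For a feel of what this looks like, here is a sketch that reads the simplest form of RDF/XML with Python's standard XML library and pulls the statements back out. The simple_triples function is a made-up helper that only understands rdf:Description elements with an rdf:about attribute and literal-valued child elements; it is not a real RDF/XML parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

doc = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://www.example.org/index.html">
    <dc:creator>John Smith</dc:creator>
  </rdf:Description>
</rdf:RDF>
""".strip()


def simple_triples(xml_text):
    """Extract (subject, predicate, literal) triples from the simplest RDF/XML form."""
    root = ET.fromstring(xml_text)
    out = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree names tags "{namespace}local"; dropping the braces
            # joins namespace and local name into the predicate URIref.
            predicate = prop.tag[1:].replace("}", "", 1)
            out.append((subject, predicate, prop.text))
    return out


for triple in simple_triples(doc):
    print(triple)
```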
RDF containers: there are some predefined types that can be used to make groups (following list stolen shamelessly):
  • rdf:Bag : A Bag (a resource having type rdf:Bag) represents a group of resources or literals, possibly including duplicate members, where there is no significance in the order of the members.
  • rdf:Seq : A Sequence or Seq (a resource having type rdf:Seq) represents a group of resources or literals, possibly including duplicate members, where the order of the members is significant.
  • rdf:Alt : An Alternative or Alt (a resource having type rdf:Alt) represents a group of resources or literals that are alternatives (typically for a single value of a property). For example, an Alt might be used to describe alternative language translations for the title of a book, or to describe a list of alternative Internet sites at which a resource might be found. An application using a property whose value is an Alt container should be aware that it can choose any one of the members of the group as appropriate.
These are implemented with blank nodes. The actual <rdf:Bag> element has no URIref, and really doesn't need to. Containers let us specify a set of objects A, but it does not specify that those objects are the only members of A. In other words, containers say what is in there, but might not be all inclusive. There might be another RDF graph somewhere that defines additional objects to set A.
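A sketch of how a Bag looks as plain triples, with numbered membership properties (rdf:_1, rdf:_2, and so on) hanging off the container node. The container and student URIrefs and the helper names are hypothetical. The container is given a URIref here rather than a blank node so that a second graph can refer to it, which shows the "not all inclusive" point.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def bag_triples(node, members):
    """Describe `node` as an rdf:Bag with numbered membership properties."""
    triples = [(node, RDF + "type", RDF + "Bag")]
    for i, member in enumerate(members, start=1):
        triples.append((node, f"{RDF}_{i}", member))
    return triples


def members_of(node, graph):
    """Every object attached to `node` through an rdf:_n property."""
    prefix = RDF + "_"
    return [o for s, p, o in graph if s == node and p.startswith(prefix)]


# Hypothetical URIrefs for illustration.
students = "http://example.org/courses/6.001/students"
graph_a = bag_triples(students, [
    "http://example.org/students/Amy",
    "http://example.org/students/Mohamed",
])

# Another graph can add a member; nothing in graph_a says the Bag was complete.
graph_b = [(students, RDF + "_3", "http://example.org/students/Johann")]

print(len(members_of(students, graph_a)))            # 2
print(len(members_of(students, graph_a + graph_b)))  # 3
```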
Collections: these are meant to solve the closure problems of containers. A collection is represented as a linked list in the RDF graph, the classic {first, rest} set we see everywhere. With this scheme, even if the collection is defined in two separate valid RDF graphs, the trail of URIrefs will work out. The primer lists some syntactic sugar for specifying this in RDF/XML.
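The linked-list structure can be sketched as triples using rdf:first, rdf:rest, and the rdf:nil terminator. The function names, the blank node labels, and the student URIrefs are illustrative assumptions.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FIRST, REST, NIL = RDF + "first", RDF + "rest", RDF + "nil"


def collection_triples(items):
    """Build the linked-list triples; returns (head node, triples)."""
    if not items:
        return NIL, []
    nodes = [f"_:c{i}" for i in range(len(items))]
    triples = []
    for i, item in enumerate(items):
        rest = nodes[i + 1] if i + 1 < len(nodes) else NIL
        triples.append((nodes[i], FIRST, item))
        triples.append((nodes[i], REST, rest))
    return nodes[0], triples


def walk(head, graph):
    """Follow rdf:rest links until rdf:nil, collecting each rdf:first."""
    items, node = [], head
    while node != NIL:
        items.append(next(o for s, p, o in graph if s == node and p == FIRST))
        node = next(o for s, p, o in graph if s == node and p == REST)
    return items


head, graph = collection_triples([
    "http://example.org/students/Amy",
    "http://example.org/students/Mohamed",
])
print(walk(head, graph))
```

The list ends explicitly at rdf:nil, so unlike a container there is a definite "this is everything" marker.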
Reification: this is the process of recording meta-data about RDF statements, in RDF. The example in the primer is recording who created an RDF statement. This is done using a "reification quad" of 4 RDF statements about a node that stands for the original statement:
  1. rdf:type (with the value rdf:Statement)
  2. rdf:subject
  3. rdf:predicate
  4. rdf:object
Then you add your meta-data to that quad. This seems mainly used to provide source information, and could be used to form a trust-network, where you use reification to find the source of a statement, and weight that statement based on how much you trust the source.
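A sketch of the quad plus one piece of meta-data, as plain triples. The reify helper, the "_:stmt1" node, and the staff URIref are hypothetical; the statement being described is the John Smith example from the top of these notes.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"


def reify(statement_node, subject, predicate, obj):
    """The four statements that describe one original statement."""
    return [
        (statement_node, RDF + "type", RDF + "Statement"),
        (statement_node, RDF + "subject", subject),
        (statement_node, RDF + "predicate", predicate),
        (statement_node, RDF + "object", obj),
    ]


quad = reify(
    "_:stmt1",
    "http://www.example.org/index.html",
    DC_CREATOR,
    "John Smith",
)

# Meta-data: who made the statement (hypothetical staff URIref).
graph = quad + [("_:stmt1", DC_CREATOR, "http://www.example.org/staffid/85740")]

for triple in graph:
    print(triple)
```

A trust-weighting application could look up the creator of "_:stmt1" and score the original statement accordingly.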
I stopped at Section 5, Defining RDF Vocabularies: RDF Schema. I glossed over an awful lot of RDF/XML syntactic sugar, trying to focus on concepts. I'm about halfway through the primer, and plan to finish out the damn thing in the next couple of days.

posted on Wednesday, October 27, 2004 5:28 PM