I read the RDF Primer on w3.org, and here are my summary notes, many of which are just things copied directly from the primer.
I'm writing this down for three reasons: so I can understand what this all is, so everyone can read a shorter version
of the primer (the original is 104 pages), and so we have common terms defined and can talk about this stuff
without horrid miscommunication. I've linked each term to the relevant place (or as close as I can get) in the RDF Primer.
Either this is going to be very helpful or a colossal waste of time.
RDF Statement: The statement of a fact using a subject, predicate, and object.
Sample RDF statement in English:
http://www.example.org/index.html has a creator whose value is John Smith
RDF breakdown of that sentence:
- the subject is the URI
http://www.example.org/index.html
- the predicate is the word "creator"
- the object is the phrase "John Smith"
Note that a single statement expresses a binary relationship only. Binary statements are the foundation
that N:N relationships get built on.
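To make the breakdown concrete, here's a minimal Python sketch of the sample statement as a three-part tuple. The `Statement` class name is made up for this sketch, and writing the predicate as the Dublin Core `creator` URIref is an assumption; the English sentence above only says "creator".

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """One RDF statement: a (subject, predicate, object) triple."""
    subject: str
    predicate: str
    object: str


# The sample sentence from above, split into its three parts.
stmt = Statement(
    subject="http://www.example.org/index.html",
    predicate="http://purl.org/dc/elements/1.1/creator",  # assumed URIref for "creator"
    object="John Smith",
)

print(stmt.subject, "has a", stmt.predicate, "whose value is", stmt.object)
```

Each statement links exactly two things (subject and object) through one predicate, which is the "binary" part.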
RDF Resource: anything that is identifiable by a URI reference. Subject, predicate, and object can all be RDF Resources.
URIref: Uniform Resource Identifier reference. A URIref is made of a URI and, optionally, a '#' and a fragment identifier.
Sample URIref:
http://www.example.org/index.html#section2
Breakdown:
- URI:
http://www.example.org/index.html
- Fragment Identifier:
section2
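The same breakdown can be done mechanically. This is a small sketch using `urldefrag` from Python's standard library, which splits off the fragment identifier; the variable names are just for illustration.

```python
from urllib.parse import urldefrag

uriref = "http://www.example.org/index.html#section2"

# urldefrag returns the URI without the fragment, plus the fragment itself.
uri, fragment = urldefrag(uriref)

print(uri)       # http://www.example.org/index.html
print(fragment)  # section2
```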
URIrefs can be made up in a completely arbitrary manner, so when inventing them, keep in mind that the words involved are meant
to convey meaning to people; the machine doesn't care as long as the URIref is unique. There is already a list of common URIrefs defined and
available for use at
http://dublincore.org/documents/dces/. Those are
good for use as predicates, since each one clearly defines the semantics of the predicate, i.e. what that thing means. For those who have read
Quicksilver or are familiar with early English scientists, this should remind you of the Philosophick Language that Wilkins was working on.
Namespace: To make things less verbose, you can map a prefix to a URI and then use that prefix anywhere you would use that URI.
For example, rather than using http://www.example.org/index.html, http://www.example.org/index2.html, etc., you
can define a namespace by mapping prefix ex to URI http://www.example.org/, and then
refer to those as ex:index.html, ex:index2.html, etc. Those are called QNames.
QName: a Qualified Name, which is made of a namespace prefix and a local name, like
ex:index.html
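Expanding a QName back into a full URIref is just a lookup plus string concatenation. Here's a minimal sketch; the `NAMESPACES` table and the `expand_qname` helper are hypothetical names, not something the primer defines.

```python
# Prefix-to-URI mapping, using the ex prefix from the example above.
NAMESPACES = {"ex": "http://www.example.org/"}


def expand_qname(qname: str, namespaces: dict) -> str:
    """Turn "prefix:local" into the full URIref by looking up the prefix."""
    prefix, local_name = qname.split(":", 1)
    return namespaces[prefix] + local_name


print(expand_qname("ex:index.html", NAMESPACES))
print(expand_qname("ex:index2.html", NAMESPACES))
```

An unknown prefix raises a `KeyError` here, which seems like reasonable behavior for a sketch: a QName with an unmapped prefix has no meaning.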
Vocabulary: a set of URIrefs intended for a specific purpose. The URIrefs that make up the common Friend-of-a-friend (FOAF) RDF application would be a vocabulary.
When we define the types we want for the billing program, we'll be inventing our own URIrefs, and that will be another vocabulary.
Blank node: in some cases, you want the object of a statement to be structured data. The standard way to do this would be
to have the object be a URIref, and then put more predicates on that. If for some reason you don't want to assign a URIref to it, you
can use a blank node, which is written like a QName with a prefix of "_". Example:
_:name. Blank nodes can be used as subjects and objects, but
not predicates. They are used to create anonymous intermediates for N:N relationships.
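A sketch of what that looks like as data, assuming statements are stored as plain tuples. The `ex:` names, the `_:addr` label, and the address values are all invented for illustration.

```python
# Hypothetical QNames (ex:...) and blank node label (_:addr), for illustration only.
triples = [
    ("ex:staff85740", "ex:address", "_:addr"),     # blank node as object
    ("_:addr", "ex:street", "1501 Grant Avenue"),  # blank node as subject
    ("_:addr", "ex:city", "Bedford"),
]


def is_blank(term: str) -> bool:
    """True if the term is written as a blank node, i.e. with the "_" prefix."""
    return term.startswith("_:")


def blank_predicates(statements):
    """Return any statements that (incorrectly) use a blank node as predicate."""
    return [s for s in statements if is_blank(s[1])]


def describe(node, statements):
    """Collect the predicate/value pairs hanging off one node."""
    return {p: o for s, p, o in statements if s == node}


print(blank_predicates(triples))    # empty list: no blank node is used as a predicate
print(describe("_:addr", triples))  # the structured data behind the blank node
```

The blank node `_:addr` is the anonymous intermediate: it ties the staff member to a street and a city without the address ever getting its own URIref.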
Typed Literal: You can specify another URIref to go with each
literal to give it a type. Standard types are defined at
http://www.w3.org/2001/XMLSchema.xsd.
Specifying a type is optional, and RDF itself doesn't care. You can have a literal of "pumpkin", say it's an integer, and it would still be
valid RDF. Complying with the meaning of "integer" is the application's job, not RDF's.
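Here's a rough sketch of that division of labor. The `TypedLiteral` class and the `application_accepts` check are hypothetical; the datatype URIref is the usual XML Schema one for integers, and the regular expression is a simplified take on what an integer looks like.

```python
import re
from typing import NamedTuple

# URIref conventionally used for the XML Schema integer datatype.
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


class TypedLiteral(NamedTuple):
    lexical: str   # the literal text itself
    datatype: str  # the URIref naming its type


def application_accepts(literal: TypedLiteral) -> bool:
    """Application-level check; RDF itself never runs anything like this."""
    if literal.datatype == XSD_INTEGER:
        return re.fullmatch(r"[+-]?[0-9]+", literal.lexical) is not None
    return True  # this sketch only knows how to check integers


good = TypedLiteral("27", XSD_INTEGER)
odd = TypedLiteral("pumpkin", XSD_INTEGER)  # still a perfectly valid RDF literal

print(application_accepts(good))  # True
print(application_accepts(odd))   # False
```

Both literals can be stored and passed around as RDF; only the application-side check objects to the pumpkin.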
RDF/XML: a way of specifying RDF statements in XML format. You
define RDF namespaces as XML namespaces, with
xmlns:prefix="URI", and then each XML element is either a node or a literal.
It's pretty verbose, and not terribly human-friendly. See the examples in the primer and you'll get the picture pretty quickly. There's a
slew of ways in the primer to specify the RDF namespaces and shorten the XML. That information will be helpful when we want to
minimize the size of RDF/XML files being transferred around.
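To show how the XML maps back to statements, here's a small sketch that reads the simplest shape of RDF/XML (one rdf:Description node with literal-valued properties) using Python's standard ElementTree. It is not a real RDF/XML parser, and it ignores all the abbreviations the primer describes; the sample document is the John Smith statement from earlier, with the Dublin Core namespace assumed for "creator".

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

doc = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://www.example.org/index.html">
    <dc:creator>John Smith</dc:creator>
  </rdf:Description>
</rdf:RDF>
"""

root = ET.fromstring(doc.strip())

triples = []
for node in root.findall(f"{{{RDF_NS}}}Description"):
    # The node element names the subject of the statements inside it.
    subject = node.get(f"{{{RDF_NS}}}about")
    for prop in node:
        # ElementTree spells qualified tags as "{namespace-uri}localname";
        # dropping the braces gives the full predicate URIref.
        predicate = prop.tag.replace("{", "").replace("}", "")
        triples.append((subject, predicate, prop.text))

for triple in triples:
    print(triple)
```

The XML namespace declarations do the same job as the RDF namespaces described above: `dc:creator` in the markup turns into the full predicate URIref once parsed.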
RDF containers: there are some predefined types that can be used to make
groups (following list stolen shamelessly):
rdf:Bag : A Bag (a resource having type rdf:Bag)
represents a group of resources or literals, possibly including duplicate
members, where there is no significance in the order of the members.
rdf:Seq : A Sequence or Seq (a resource having type
rdf:Seq) represents a group of resources or literals, possibly
including duplicate members, where the order of the members is significant.
rdf:Alt : An Alternative or Alt (a resource having
type rdf:Alt) represents a group of resources or literals that
are alternatives (typically for a single value of a
property). For example, an Alt might be used to describe
alternative language translations for the title of a book, or
to describe a list of alternative Internet sites at which a
resource might be found. An application using a property whose
value is an Alt container should be aware that it can choose
any one of the members of the group as appropriate.
These are implemented with blank nodes. The actual <rdf:Bag> element has
no URIref, and really doesn't need one. A container lets us say that certain objects are members of a set
A, but it does not say that those
objects are the only members of
A. In other words, containers say what is in there, but might not be all-inclusive. There
might be another RDF graph somewhere that adds more members to set
A.
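In the graph, a container boils down to a blank node with a type statement plus one numbered membership statement (rdf:_1, rdf:_2, ...) per member. A minimal sketch, with a hypothetical `container_triples` helper and made-up `ex:` members:

```python
def container_triples(node, kind, members):
    """Build the statements for a container hanging off one (blank) node."""
    triples = [(node, "rdf:type", kind)]
    for i, member in enumerate(members, start=1):
        # Membership properties are numbered: rdf:_1, rdf:_2, and so on.
        triples.append((node, f"rdf:_{i}", member))
    return triples


bag = container_triples("_:students", "rdf:Bag", ["ex:Amy", "ex:Mohamed", "ex:Johann"])

for triple in bag:
    print(triple)
```

Nothing in these statements says "and that's everyone", which is the lack of closure described above: more membership statements about the same container could always be added.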
Collections: these are meant to solve the closure problems of
containers. A collection is represented as a linked list in the RDF graph, the classic
{first, rest} pair we see everywhere, ending in rdf:nil so the list has a definite end.
With this scheme, even if the collection is defined in two separate valid RDF graphs, the trail of URIrefs will work out.
The primer lists some syntactic sugar for specifying this in RDF/XML.
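The linked-list structure can be sketched like this, using rdf:first and rdf:rest statements on blank nodes and rdf:nil as the terminator. The helper names, the `_:c0`-style labels, and the `ex:` members are invented for the sketch.

```python
def collection_triples(members):
    """Return (head, triples) for a collection written as a linked list."""
    if not members:
        return "rdf:nil", []
    nodes = [f"_:c{i}" for i in range(len(members))]
    triples = []
    for i, member in enumerate(members):
        # The last cell points at rdf:nil, which closes the list.
        rest = nodes[i + 1] if i + 1 < len(nodes) else "rdf:nil"
        triples.append((nodes[i], "rdf:first", member))
        triples.append((nodes[i], "rdf:rest", rest))
    return nodes[0], triples


def walk(head, triples):
    """Follow rdf:rest links from the head until rdf:nil, collecting members."""
    first = {s: o for s, p, o in triples if p == "rdf:first"}
    rest = {s: o for s, p, o in triples if p == "rdf:rest"}
    members = []
    node = head
    while node != "rdf:nil":
        members.append(first[node])
        node = rest[node]
    return members


head, triples = collection_triples(["ex:Amy", "ex:Mohamed", "ex:Johann"])
print(walk(head, triples))
```

Because the last cell explicitly points at rdf:nil, a reader of the graph knows where the membership ends, which is what a container can't promise.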
Reification: this is the process of recording meta-data about
RDF statements, in RDF. The example in the primer is recording who created an RDF statement. This is done using a "reification quad"
of 4 RDF statements, one for each of these predicates:
rdf:type (with the object rdf:Statement)
rdf:subject
rdf:predicate
rdf:object
Then you add your meta-data to that quad. This seems to be mainly used to provide source information,
and could be used to form a trust-network, where you use reification to find the source of a statement, and weight that statement based on
how much you trust the source.
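As a sketch of the quad plus one piece of meta-data, assuming statements are plain tuples again. The `reify` helper, the `ex:triple12345` identifier, and the `ex:staff85740` source are hypothetical names chosen for illustration.

```python
def reify(statement_id, subject, predicate, obj):
    """Return the four statements that describe one RDF statement."""
    return [
        (statement_id, "rdf:type", "rdf:Statement"),
        (statement_id, "rdf:subject", subject),
        (statement_id, "rdf:predicate", predicate),
        (statement_id, "rdf:object", obj),
    ]


# Describe the statement "ex:index.html dc:creator John Smith" ...
quad = reify("ex:triple12345", "ex:index.html", "dc:creator", "John Smith")

# ... and then hang meta-data off it: here, who made the statement.
metadata = [("ex:triple12345", "dc:creator", "ex:staff85740")]

for triple in quad + metadata:
    print(triple)
```

A trust-weighting scheme would look up the meta-data statement to find the source, then weight the reified statement by how much that source is trusted.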
I stopped at Section 5,
Defining RDF Vocabularies: RDF Schema. I glossed over
an awful lot of RDF/XML syntactic sugar, trying to focus on concepts. I'm about halfway through the primer, and plan to finish out the
damn thing in the next couple of days.