This is my effort at documenting the issues that have been raised during the design of the LOM RDF binding. It has grown longer than I had envisioned, but it contains many things necessary to understand the problems. Please read when in a good mood!
In a pure XML approach such as the LOM XML Binding, the structure of the XML instance is the result of choosing the most convenient syntax, creating the element hierarchy that best matches the structure of the LOM data model.
By contrast, in RDF the precise data model is not only syntactic, but has semantic consequences. RDF is a highly object-oriented "language" where objects have properties that relate them to other objects. The type of an object or property defines its interpretation, and is thus not simply a syntactic placeholder. In the pure XML version of LOM each structure is represented by an element. In RDF there are several different possibilities for representing a LOM element: you can use Properties, Resources, Classes, or even namespaces to reflect the structure of LOM. And the choice matters, as those constructs have fundamentally different semantics. All of these are used in the current draft.
Thus, a considerable amount of effort is needed to extract the desired semantic quality of each element in order to be able to represent it appropriately. If this reinterpretation is not done, you risk not only losing clarity for the human consumer, but also seriously damaging the usefulness of the model. Much of the effort that has gone into this binding has focused on creating such a well-formed (machine-interpretable) semantics of the model.
We therefore expect to see much richer structures on many levels in an RDF representation than in the corresponding XML binding instance. In this perspective, we should expect to find that meta-data expressed in RDF using this binding probably can be exported to XML format without many problems; however, an XML meta-data record cannot always be effortlessly translated to RDF, as the translation will depend on your setup.
As a consequence of this, we cannot expect the RDF binding to
fulfill the same purpose as the XML binding. The XML Binding defines an
exchange format for meta-data. The meta-data might be contained in a
database and an XML representation generated on demand, for export to
other tools and environments. Thus, an XML meta-data record is a
self-contained entity with a well-defined structure.
In RDF, the meta-data is not always self-contained, but rather forms
part of a global network of information, where anyone has the capability
of adding any kind of meta-data to any resource. It is not the case that
meta-data for one resource need be contained in a single RDF document.
Translations might be administered separately, and different categories
of meta-data might be separated. This dramatically strengthens the
incentive both to reuse identical structures that are used repeatedly
and to create decentralized descriptions of resources. Both of
these phenomena naturally lead to a fundamentally different approach to
meta-data modelling than that found in the XML binding.
One way of putting it is this: while XML is document-centric, RDF is
statement-centric. XML describes the structure of a complete document
instance. RDF describes the structure of a single metadata statement. The
RDF binding must therefore be designed one element at a time.
Another aspect is that of compatibility. In the XML binding, there is no standard way to reuse other meta-data standards. The statement-centric design of RDF leads to naturally reusable constructs. Metadata elements can be extended either structurally (by adding more information) or semantically (by adding refinements of elements). This binding has been designed to be directly compatible with Dublin Core (including the DC Qualifiers, DC Type and DC Education vocabularies) and with the vCard RDF binding. However, this compatibility comes at the price of modeling freedom - some modeling restrictions are imposed on us. Fortunately, much of this compatibility comes for free when taking the approach that the data should be modeled to maximally exploit the expressivity of RDF.
Finally, as RDF is intended to be processed by software, and in many cases software with no explicit knowledge of LOM, it is important to use explicit data typing. This will be seen below in the representation of languages and dates. We have tried to avoid using string literals with implicit typing. Thus, a goal of this binding has been to define a set of RDF constructs that facilitates introduction of LOM meta-data into the semantic web in the most convenient way.
This binding has been in development since approximately March 2001, when a first attempt at encoding IMS metadata in RDF was made. The first version of the binding was released with version 1.2 of the IMS metadata standard. Most of the important design decisions had been made at that point (some of the discussion can be seen here). The current binding is a development of that binding, consisting of some clarifications, several minor changes in encoding, updated namespaces and a new introduction, and an update to LOM 1.0 (from draft 6.1, used in IMS). The most important design decisions resulting from these efforts have been:
We will now discuss each of these points separately.
The RDF representation of LOM relies heavily on the Dublin Core
meta-data element set, and its representation in RDF. We try to model
LOM elements similarly to how the Dublin Core qualifiers are
represented. The Dublin Core RDF usage model is taken from the latest
DC-Architecture RDF draft, found here.
Understanding that work is helpful when trying to understand this
binding. The decision to extend Dublin Core was made early, and was
probably the single most important decision for this binding. This
decision is therefore well-aligned with the efforts to improve
interoperability between Dublin Core and LOM (see the memorandum of
understanding here).
The RDF binding is designed to be almost fully Dublin Core RDF compatible, in the sense that meta-data constructed according to this guide can be directly understood by Dublin Core-aware software. All the elements of the LOM Dublin Core mapping (in Appendix B of LOM 1.0) are represented in a way compatible with both LOM 1.0 and with Qualified Dublin Core.
It is, however, not currently possible to map an arbitrary Dublin Core construct (made without reference to this guide) to a LOM element without some effort, as LOM requires a more detailed structure in many elements. In short, this guide describes some restrictions on Qualified Dublin Core meta-data that are needed to be LOM compatible. The guiding principle has been that applying the "dumb-down" algorithm described in the Dublin Core Qualifiers in RDF draft should result in correct Dublin Core meta-data.
Please note that the Dublin Core Qualifiers work referred to above
has not yet reached its final version, so some constructs described here
might change.
See below for a more detailed description of the Dublin Core mapping.
This binding also makes use of the vCard RDF binding by the W3C
in a fairly straightforward manner.
Of fundamental importance for RDF is the usage of URIs (or strictly speaking, URI references). Using URIs for all terms in any RDF vocabulary makes it possible to add RDF metadata to the vocabulary terms themselves. Examples of vocabulary metadata could be machine semantics such as "this term is a refinement of dc:contributor" or human-readable information such as "this term is called 'Skapare' in Swedish". It was quickly decided that this binding must use URIs for all vocabulary terms used.
The Dublin Core metadata specifications deal with terms such as "Element", "Element refinement", and "Element Encoding", that have obvious counterparts in RDF:
| Dublin Core term | RDF term |
|---|---|
| Element | rdf:Property |
| Element refinement | rdfs:subPropertyOf |
| Element encoding | rdfs:Class (used as rdfs:range of the corresponding rdf:Property) |
Dublin Core therefore has well-defined semantics of each element, a
semantics that corresponds well with RDF semantics. One of the most
important problems for this binding has been that the LOM Data Model
does not seem to have an explicit semantics for its elements. It rather
seems that the term "Element" in LOM more closely corresponds to the
term "Element" in XML, representing a node in a hierarchy. LOM contains
no facilities for refining elements or using another element encoding.
It would therefore seem that LOM uses a structural model just like XML,
while Dublin Core and RDF use a more semantic model.
Thus, in order to encode the LOM 1.0 data model in RDF in a manner
compatible with Dublin Core, we have had to do some re-modeling of LOM,
trying to interpret the element hierarchy in terms of "properties" and
"values". This is discussed in more detail below. In the following, I
will refer to Dublin Core Elements with the term "DC Element", and to
LOM elements using "LOM Element" to disambiguate the term "element".
Close attention has been paid to the LOM data model and its XML binding, and no structure representable in LOM should be problematic to represent in the RDF binding. But there are some differences from the LOM XML binding in structure, naming, and representation. However, converting an RDF version of the LOM data to XML should be straightforward. Many users of the RDF binding will use highly customized versions of the binding, with many structural and semantic extensions, as well as application-specific conventions. It can therefore not be expected that a generic XML-to-RDF conversion tool will work for all situations.
The binding has been developed in a number of steps. Explaining this
process helps when trying to explain specific problems, so I have chosen
to include it here.
The first step involves extracting an object-oriented view of the
LOM data model. What LOM elements are objects, and which are relations
between objects?
The first kinds of LOM elements that were taken care of were the
nine LOM categories. As they do not in themselves carry information, but
only represent a context for other LOM elements, they were simply used
as namespaces for the properties in each category. So the binding
consists of nine category-specific namespaces plus one namespace for
general information.
Exceptions to the "a category is only a namespace" rule were the
categories 7. Relation, 8. Annotation and 9.
Classification. This can be seen in that the categories themselves
are repeatable, so that each occurrence of a category represents a
distinct value of some property of the learning object [What is up
with 5. Educational???]. However, these categories still use their
own namespaces.
The second kind of elements that can easily be discerned are obvious
objects: the basic data types. These include:
and so on. Generally, all leaf nodes in the LOM data model are
objects. Note that several of these objects may have several properties
of their own (e.g. dates may have descriptions).
"Vocabulary" items are sometimes, but not always, mapped to objects.
The LOM element 5.4 Educational.SemanticDensity uses a property,
called lom_edu:semanticDensity, with a value of type
lom_edu:SemanticDensity. Five instances of that type are defined, which
form the vocabulary given in the LOM data model, e.g.
lom_edu:HighDensity. So it is now easy to make statements of the form
"My resource has semantic density 'high'":

| Subject | Predicate | Object |
|---|---|---|
| <http://www.myresource.com/> | <lom_edu:semanticDensity> | <lom_edu:HighDensity> |

(This is a namespace-abbreviated N-TRIPLE RDF format.) It is also very
easy to make new vocabularies for this LOM element: just define your own
instances of the Class lom_edu:SemanticDensity.
(Note that this is a nice example of the statement-orientedness of RDF.
The statement above is in itself a complete piece of RDF, and can live
independently of any other metadata for that resource.)
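As a rough illustration of how self-contained such a statement is, the following Python sketch models it as a plain (subject, predicate, object) tuple and expands the abbreviated names into full URIs. The namespace URI bound to lom_edu is a placeholder assumption, since the actual namespace is not given in this text, and the helper names expand and to_ntriple are made up for this example.

```python
# Sketch only: the lom_edu namespace URI below is a made-up placeholder,
# not the namespace actually assigned by the binding.
PREFIXES = {
    "lom_edu": "http://example.org/lom-educational#",  # hypothetical
}


def expand(term, prefixes=PREFIXES):
    """Expand 'prefix:local' to a full URI; leave other terms untouched."""
    prefix, sep, local = term.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return term


def to_ntriple(triple, prefixes=PREFIXES):
    """Serialize one triple whose three parts are all URIs as an N-Triples line."""
    return " ".join("<%s>" % expand(part, prefixes) for part in triple) + " ."


# The single statement from the example: one triple, complete on its own.
statement = (
    "http://www.myresource.com/",
    "lom_edu:semanticDensity",
    "lom_edu:HighDensity",
)
print(to_ntriple(statement))
```

Because the triple carries everything needed to interpret it, it could be stored or published separately from any other metadata about the same resource.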
But take as another example the 7.1 Relation.Kind element.
This element corresponds very closely to an element refinement of the
dc:relation Dublin Core element. In RDF, such refinements are
represented as distinct properties, each marked as being refinements
(sub-properties) of dc:relation. So a vocabulary for 7.1 Relation.Kind
is actually a list of properties, not of values like in the Semantic
Density example. To say that "My resource is part of
http://www.w3.org/", one would simply say

| Subject | Predicate | Object |
|---|---|---|
| <http://www.myresource.com/> | <dcterms:isPartOf> | <http://www.w3.org/> |

where dcterms:isPartOf is a sub-property of dc:relation:

| Subject | Predicate | Object |
|---|---|---|
| <dcterms:isPartOf> | <rdfs:subPropertyOf> | <dc:relation> |

This fact is already recorded in the DC Qualifiers RDF schema, of
course. So, creating a vocabulary for the 7.1 Relation.Kind LOM
element boils down to defining sub-properties (=refinements) of
dc:relation.
Note the difference between this and the above example, where we
defined new instances of lom_edu:SemanticDensity.
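The practical effect of modeling a vocabulary as sub-properties can be sketched in Python. The function below is a simplified take on the "dumb-down" idea, not the algorithm from the Dublin Core draft itself: it replaces each predicate by its most general super-property, using a hand-written SUB_PROPERTY_OF table that stands in for the rdfs:subPropertyOf statements of a schema. The names dumb_down and SUB_PROPERTY_OF are illustrative assumptions.

```python
# Simplified stand-in for the rdfs:subPropertyOf statements of a schema.
SUB_PROPERTY_OF = {
    "dcterms:isPartOf": "dc:relation",
}


def dumb_down(triples, sub_property_of=SUB_PROPERTY_OF):
    """Replace every refined predicate by its most general super-property."""
    result = []
    for subject, predicate, obj in triples:
        seen = set()
        # Walk up the refinement chain; 'seen' guards against cyclic schemas.
        while predicate in sub_property_of and predicate not in seen:
            seen.add(predicate)
            predicate = sub_property_of[predicate]
        result.append((subject, predicate, obj))
    return result


refined = [
    ("http://www.myresource.com/", "dcterms:isPartOf", "http://www.w3.org/"),
]
print(dumb_down(refined))
```

Software that only knows dc:relation can still use the refined statement this way, which is why reusing or refining existing properties helps interoperability more than inventing unrelated ones.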
For the rest of the elements (quite a few), things are not always
obvious. I wish I could produce a complete listing of all the properties
and objects, but it would unfortunately be too long. I will only mention
those that actually cause problems.
As seen above in the 7.1 Relation.Kind example, some elements
have obvious Dublin Core counterparts. Nothing would stop us from
defining our own lom_rel:isPartOf property, and then saying

| Subject | Predicate | Object |
|---|---|---|
| <http://www.myresource.com/> | <lom_rel:isPartOf> | <http://www.w3.org/> |

But this would seem quite unnecessary, as the meaning would be
exactly the same as when using dcterms:isPartOf. It would
only cause interoperability problems, not solve any.
We have therefore tried to reuse Dublin Core vocabulary wherever
that has been feasible. This has actually been quite successful; only in
a few cases has it proven difficult. As has been mentioned, Dublin Core
elements come in two kinds: DC Elements (including DC Element
Refinements) and DC Element Encodings. A DC Element Encoding is a
specification of the type of value for a certain DC Element. For
example, the dc:language DC Element can include any kind
of string value, such as "English" or "Swedish". One DC Element Encoding
for that DC Element is dcterms:RFC1766, which specifies
that the string must be encoded using RFC1766 (two-letter ISO language
codes). Using this DC Element Encoding, we can say "The language of My
Resource is something of type RFC1766, and with the value 'en'":

| Subject | Predicate | Object |
|---|---|---|
| <http://www.myresource.com/> | <dc:language> | _:XXX |
| _:XXX | <rdf:type> | <dcterms:RFC1766> |
| _:XXX | <rdf:value> | "en" |

The _:XXX object (the "something" object) is a
so-called anonymous RDF node (it has no URI).
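To show how software might consume such a construct, here is a small Python sketch that follows the anonymous node to its rdf:type and rdf:value. The triple list mirrors the three statements of the example; the helper names objects and encoded_values are purely illustrative and not part of any RDF library.

```python
# The three statements from the example, with "_:XXX" as the anonymous node.
TRIPLES = [
    ("http://www.myresource.com/", "dc:language", "_:XXX"),
    ("_:XXX", "rdf:type", "dcterms:RFC1766"),
    ("_:XXX", "rdf:value", "en"),
]


def objects(triples, subject, predicate):
    """Return all objects of statements with the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def encoded_values(triples, resource, prop):
    """Return (encoding, literal) pairs for each value node of resource/prop."""
    pairs = []
    for node in objects(triples, resource, prop):
        types = objects(triples, node, "rdf:type")
        values = objects(triples, node, "rdf:value")
        if values:
            # The encoding is None when the node carries no rdf:type.
            pairs.append((types[0] if types else None, values[0]))
    return pairs


print(encoded_values(TRIPLES, "http://www.myresource.com/", "dc:language"))
```

The literal is never read without its type, so a consumer that does not know LOM can still tell which encoding scheme the string follows.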
These DC Element Encodings are very useful for specifying the
interpretation of a literal string, where this is not given by the
definition of the property. This kind of construct has been used in many
places in the LOM RDF binding, and many of those are direct reuses of DC
Element Encodings, for the very same interoperability reasons as above.
So this step of the process consisted of finding out what DC
Elements and Element Encodings were related to a given LOM element. In
some cases this caused reconsideration of the property-value status
(step 1 above) of the LOM element.
After having found the relevant Dublin Core elements, the precise
relation to the Dublin Core element needed to be defined. There are
essentially four ways in which a LOM element might be related to Dublin
Core:
Obviously, these relations could be clarified in the LOM standard. But the fact is that they are not, so we needed to specify them. This was done starting from the LOM-DC mapping in Appendix B of the LOM data model specification. As it (unfortunately) does not contain a mapping for the (many) terms in DC Qualifiers, DC Type vocabulary, or DC Education, this mapping had to be expanded to include those terms. The resulting list can be found in Appendix A.
When the precise relation to the relevant Dublin Core element had
been specified, we needed to make sure that the element could express
all the information in LOM. Such information includes:
With respect to the above, a number of design decisions, common for
many elements, were made.
In contrast to XML, RDF elements have no automatic ordering. Again,
we can see the XML-isms in LOM, which specifies whether an element
should be interpreted as ordered or not. In XML, elements are
automatically ordered, so this is simply a question of interpretation.
In RDF, ordering must be encoded explicitly. This is usually done using
the rdf:Seq container. Using this container construct, a
set of values of a property can be grouped together and placed in order.
There are two other container types in RDF, the rdf:Bag and
the rdf:Alt. Their usage is pretty straightforward:

- rdf:Seq is used for a set of values that are ordered (more important first, for example)
- rdf:Bag is used for a set of values that belong together, but with no inherent order (such as a group of authors)
- rdf:Alt is used for a set of values that represent alternative, interchangeable values of the same property.

RDF allows literal strings to carry a language tag. The LangString
LOM construct is encoded in RDF using this feature and the rdf:Alt
container. Thus, a title with translations is given as

| Subject | Predicate | Object |
|---|---|---|
| <http://www.myresource.com/> | <dc:title> | _:XXX |
| _:XXX | <rdf:type> | <rdf:Alt> |
| _:XXX | <rdf:_1> | "My resource"-en |
| _:XXX | <rdf:_2> | "Min resurs"-sv |
Several LOM elements correspond to Dublin Core elements, but carry
more information than the Dublin Core element. For example, when using
9. Classification, LOM allows each taxon to have both an id (9.2.2.1 Id)
and a textual entry (9.2.2.2 Entry). In the RDF binding, these are
modeled as additional properties of the object of the dc:subject
property. Dublin Core does not specify them, but RDF allows them, so
this approach works seamlessly.
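Returning to the LangString encoding shown in the title example: a consumer might pick a translation from the rdf:Alt container roughly as in the Python sketch below. The tuple layout and the function name pick_translation are assumptions made for illustration; the fallback to rdf:_1 follows the RDF convention that the first member of an rdf:Alt is the default choice.

```python
# rdf:Alt members as (membership property, text, language tag) tuples,
# taken from the title example.
TITLE_ALT = [
    ("rdf:_1", "My resource", "en"),
    ("rdf:_2", "Min resurs", "sv"),
]


def pick_translation(members, preferred):
    """Return the text in the preferred language, else the default alternative."""
    for _, text, lang in members:
        if lang == preferred:
            return text
    # No match: fall back to the lowest-numbered member (rdf:_1 is the default).
    ordered = sorted(members, key=lambda m: int(m[0].split("_")[1]))
    return ordered[0][1] if ordered else None


print(pick_translation(TITLE_ALT, "sv"))
print(pick_translation(TITLE_ALT, "fr"))
```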
lom_edu:SemanticDensity,
that can be used as a type for the vocabulary terms of the
corresponding LOM Element. Defining new vocabulary for this LOM element
then boils down to simply defining new instances of this class. In this
way, all elements that can be extended have a well-defined, semantically
correct method for extension.

| No. | Prio | Issue | Problem | Status | Suggestion |
|---|---|---|---|---|---|
| 5.1.1 | [1] | Contribute element | In LOM, the Contribute element models a contribution, not a contributor. Mapping this to DC is difficult. | Currently, the LOM RDF binding maps Contribute to subproperties of dc:creator etc. | Do not use DC compatible modeling - it just does not work! |
| 5.1.2 | [2] | Learning Resource Type | The Learning Resource Type element uses rdf:type. This LOM element is ordered, but rdf:type can not be ordered. | rdf:type is used, and order is not preserved! | rdf:type is the right mapping for this. This should probably not be changed. Or should we use rdf:type only for the first type? Probably not... |
| 5.1.3 | [2] | Identifier element | The identifier element uses URIs whenever possible, but what about the case when there is no URI binding, but only a Catalog/Entry Pair? | Currently uses an rdf:value solution (making it a DC Element Encoding, essentially), that works quite well. | Should we perhaps look at IMS Reusable competency definitions? Their XML binding has nice text on this. In any case, we need to sync with the XML folk. |
| 5.1.4 | [1] | Educational category | The Educational category in LOM 1.0 has multiplicity 1..*, which it did not have in earlier draft versions of LOM. | We use Educational as a namespace only. | The Educational category probably needs to be remodeled as a structure. |
| 5.1.5 | [3] | Educational.Context | Does Educational.Context map to dc:audiencelevel? | Currently it does not. | In light of 5.1.4, this question might be moot. |
| 5.1.6 | [2] | Taxonomies | The taxonomy model uses lom_cls:taxon for two different purposes: to specify the root taxons in a hierarchy, and to specify sub-taxons in the hierarchy. This leads to improper conclusions if classification inference is used. | Right now we cannot distinguish the two cases. | Need to introduce a new property. |
| 5.1.7 | [2] | Need to double- and triple-check LOM compatibility | The model might contain places where it is incompatible with LOM. | No known cases except as noted in this list. | Need to go through the whole model thoroughly. Independent person would be good. |
| No. | Prio | Issue | Problem | Status | Suggestion |
|---|---|---|---|---|---|
| 5.2.1 | [1] | Semantics | Some of the semantic modeling that has been done will need approval from LOM. | None of it has been cleared | ? |
| 5.2.2 | [1] | DC/DCQ mapping | Need to specify the details of the full DCQ mapping, and have it approved in LOM. | To be done. | ? |
| 5.2.3 | [2] | Need namespaces | Need to specify the exact namespaces to use. | Boyd is on this? | ? |
| No. | Prio | Issue | Problem | Status | Suggestion |
|---|---|---|---|---|---|
| 5.3.1 | [1] | Use RDF datatyping? | The new RDF specs contain the notion of typed literals. This fits well into many elements. | Currently not used. | Delay until LOM/RDF version 2. |
| 5.3.2 | [3] | Very weak constraints - would need both Ontology & Rules | Many of the constraints and inference rules are not represented in a machine-processable manner. We would need to use both Ontology support and RDF Rules to fix that. | Only RDF Schema is used. | Wait until Ontology/Rules specs from W3C stabilize. |
| 5.3.3 | [3] | Ordering is sometimes ugly | Forcing ordering is in some places very ugly | Currently ordering is not required, but possible. | Leave as is |
| No. | Prio | Issue | Problem | Status | Suggestion |
|---|---|---|---|---|---|
| 5.4.1 | [2] | Examples - XML/Graphs/N-TRIPLE? | How should we present the examples? As XML, as Graphs or using a triple notation such as N-TRIPLE? | Only XML fragment. | Use more, at least. XML is not enough. |
| 5.4.2 | [2] | RDF Schema extracts? | DCQ/RDF includes Schema fragments in each example. Should we? | No schemas in examples. | Possibly include. |
| 5.4.3 | [3] | Check references | Are all references correct and up-to-date? | - | Check |
| 5.4.4 | [2] | Document format | Do we have a good doc format? More information? Does the table work? | - | - |