Transforming QTI v2 into (X)HTML 5

At a recent IMS meeting I again mentioned that transforming QTI v2 into HTML shouldn't be too difficult and that I'd already made a start on this project many years ago. I even mentioned it in an earlier blog post: Semantic Markup in HTML. To my shame, I got a comment on that post which called my bluff and I haven't posted my work - until now! I won't bore you with the excuses, day job, etc. I should also warn you that these files are in no way complete. However, they do solve most of the hard problems in my opinion and could be built out to cover the rest of the interaction types fairly easily.

If you want to sing along with this blog post you should look at the files in the following directory of the QTI migration source repository: https://code.google.com/p/qtimigration/source/browse/#svn%2Ftrunk%2Fqti2html. In there you'll find a collection of XSL files.


The goal of this project was to see how easy it would be to transform QTI version 2 files into HTML 5 in such a way that the HTML 5 was an alternative representation of the complete QTI information model. The goal was not to create a rendering which would work in an assessment delivery engine but to create a rendering that would store all the information about an item and render it in a sensible way, perhaps in a way suitable for a reviewer to view. I was partly inspired by a comment from Dick Bacon, of SToMP fame. He said it would be nice to see everything from the question all in one go, including feedback that is initially hidden and so on. It sort of gave me the idea to do the XSL this way.

Let's see what the first stylesheet, qti2html.xsl, does to the most basic QTI v2 example; here's the command I ran on my Mac:

xsltproc qti2html.xsl choice.xml > choice.xhtml

And here's how the resulting file looks in Firefox:

The first thing you'll notice is that there is no HTML form in sight. You can't interact with this page; it is static text. But remember, the goal of this stylesheet is to represent the QTI information completely. Let's look at the generated HTML source:

<html xmlns="http://www.w3.org/1999/xhtml" xmlns:qti="http://www.imsglobal.org/xsd/imsqti_v2p1">
        <title>Unattended Luggage</title>
        <meta name="qti.identifier" content="choice"/>
        <meta name="qti.adaptive" content="false"/>
        <meta name="qti.timeDependent" content="false"/>
        <style type="text/css">
        <!-- removed for brevity -->

To start with, the head element contains some useful meta tags with names starting "qti."; these capture basic information that would normally appear as attributes on the root element of the item.

        <h2>Unattended Luggage</h2>
        <div class="qti-itemBody">
            <p>Look at the text in the picture.</p>
                <img src="images/sign.png" alt="NEVER LEAVE LUGGAGE UNATTENDED"/>
            <div class="qti-choiceInteraction" id="RESPONSE" data-baseType="identifier"
                data-cardinality="single" data-shuffle="false" data-maxChoices="1">
                <p class="qti-prompt">What does it say?</p>
                <ul class="qti-choiceInteraction">
                    <li class="qti-simpleChoice" data-identifier="ChoiceA" data-correct="true">You
                        must stay with your luggage at all times.</li>
                    <li class="qti-simpleChoice" data-identifier="ChoiceB">Do not let someone else
                        look after your luggage.</li>
                    <li class="qti-simpleChoice" data-identifier="ChoiceC">Remember your luggage
                        when you leave.</li>

I've abbreviated the body here but you'll see that the item body maps on to a div with an appropriate class name (this time prefixed with qti- to make styling easier). The HTML copies across pretty much unchanged, but the interesting part is the div with a class of "qti-choiceInteraction". Here we've mapped the choiceInteraction in the original XML into a div and used HTML5-style data- attributes to add information about the behaviour: cardinality, shuffling and so on. In essence, this div performs the role of both interaction and response variable declaration.

I chose to map the choices themselves on to an unordered list in HTML, again, using HTML5 data- attributes to provide the additional information required by QTI.
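To see how much of the QTI information model survives the trip into HTML, here's a minimal Python sketch (my own illustration, not part of the migration tool) that reads the response declaration and the choices back out of a fragment like the one above. The fragment is trimmed and the namespace omitted for brevity:

```python
import xml.etree.ElementTree as ET

# A trimmed version of the generated markup shown above.
FRAGMENT = """
<div class="qti-choiceInteraction" id="RESPONSE"
     data-baseType="identifier" data-cardinality="single"
     data-shuffle="false" data-maxChoices="1">
  <p class="qti-prompt">What does it say?</p>
  <ul class="qti-choiceInteraction">
    <li class="qti-simpleChoice" data-identifier="ChoiceA"
        data-correct="true">You must stay with your luggage at all times.</li>
    <li class="qti-simpleChoice" data-identifier="ChoiceB">Do not let
        someone else look after your luggage.</li>
  </ul>
</div>"""

interaction = ET.fromstring(FRAGMENT)

# The div doubles as the response variable declaration...
response = {
    'identifier': interaction.get('id'),
    'baseType': interaction.get('data-baseType'),
    'cardinality': interaction.get('data-cardinality'),
}

# ...and each li carries its QTI choice attributes; data-correct is
# simply absent on the wrong answers.
choices = [(li.get('data-identifier'), li.get('data-correct') == 'true')
           for li in interaction.iter('li')]

print(response['identifier'])   # RESPONSE
print(choices)
```

Everything the html2qti.xsl stylesheet needs to rebuild the original item is sitting right there in the attributes.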


So can this format be converted back to QTI v2? Yes it can. The purpose of the second stylesheet, html2qti.xsl, is to take the XHTML5 representation created by the first and turn it back into valid QTI v2. This gives us the power to work in either representation. If we want to edit the HTML we can!

xsltproc html2qti.xsl choice.xhtml > choice-new.xml

The output file is not identical to the input file: there are some changes to comments, white space and the order in which attributes are expressed, but it is valid QTI v2 and is the same in every important respect.


Once we have an XHTML5 representation of the item it seems like it should be easy to make something more interactive. A pure HTML/JS delivery engine for QTIv2 is an attractive proposition, especially if you can use an XSL transform to get there from the original QTI file. One of the main objections to QTI I used to hear from the SCORM crowd back in the early days was that there was no player for QTI files that were put in SCORM packages. But given that most modern browsers can do XSL this objection could just disappear, at least for formative quizzes where you don't mind handing out the scoring rules embedded in the HTML. You could even tie up such an engine with calls to the SCORM API to report back scores.
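Scoring a simple choice item on the client needs very little logic. Here's a sketch, in Python for brevity, of the match_correct pattern of QTI response processing for a single-cardinality choice; the function name and signature are mine, not anything from the XSL or the QTI specification:

```python
def score_choice(selected, correct_identifiers, max_score=1):
    """Sketch of QTI match_correct response processing for a
    single-cardinality choiceInteraction: full marks if the selected
    identifier is a correct response, zero otherwise."""
    return max_score if selected in correct_identifiers else 0

# For the unattended luggage item, ChoiceA carries data-correct="true":
assert score_choice('ChoiceA', {'ChoiceA'}) == 1
assert score_choice('ChoiceB', {'ChoiceA'}) == 0
```

The same few lines in JavaScript, driven by the data- attributes already in the page, are all a formative-quiz player really needs for this interaction type.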

Given this background, the qti.xsl file takes a first step to transforming the XHTML5 file produced with the first XSL into something that is more like an interactive question.

xsltproc qti.xsl choice.xhtml > choice_tryout.xhtml

This is what the output looks like in Firefox:

This page is interactive: if you change your choice and click OK it re-scores the question. Note that I currently have a score of 1 as I have the correct choice selected.

Future Work

These XSL files will handle some of the other variants of multi-choice, including multi-response (yes, even when interacting) but are in no way a complete method of using XSL to transform QTI. For me, the only compelling argument for using XSL is if you want to embed it in a package to force the client to do the transformation, or if you are using one of those Cocoon-like tools that use XSL pipelines at runtime. Realistically, writing XSL is a pig and so much time has elapsed since I wrote it that it would take an hour or two to familiarise myself with it again if I wanted to make changes.

But the key idea is in the first two transforms, which present an alternative binding for the QTI information model.


In historic vote, New Zealand bans software patents | Ars Technica


In my personal opinion, this is great news. I have had to sift through patent applications from software vendors that have clearly been created by simply sending a bunch of interface files to a lawyer to be translated into patentese. You know the sort of thing: an API call like Student.GetName(id) becomes a 'claim' in which a string representation of a student's name is obtained from a system with a stored representation of a student's registration information, etc, etc. If we carry on as we are, someone will have to write an Eclipse plug-in that generates the patent application every time you build your software.

So this news is a ray of hope. It isn't a blanket "IP law is bad" bill but a measured way of enshrining the basic principle that software 'as such' is not an invention. There will always be the odd counterexample where something is no longer patentable that we might feel should be (and vice versa!) but ever since I've been following this debate it seems to me that the system has stayed the same not for lack of suggestions of how to make it better but because of FUD around change. Thank you NZ for being bolder.



What's in a name? Tin Can, Experience and the Tea Party

On Friday I was at e-Assessment Scotland, getting ready for Fiona Leteney's closing keynote "Anyone need a Can Opener?" when I tweeted:

Getting ready for the #easc13 session on the Experience API adlnet.gov/tla/experience… - anyone got an experience opener?

I tweeted this because I wanted to ensure that the twitter-savvy audience at #easc13 understood that there is an issue with the name of the API and that, in my opinion, it is important. Fiona followed up on twitter and I tried to put my response into 140 characters, but it just wouldn't fit. Before I go on though, a couple of disclaimers...

This blog post represents my personal views, not the views of my employer or of any other organization with which I may be affiliated. In the interests of full disclosure, I work for Questionmark, who have presented work related to this API at previous conferences, including at DevLearn. Also, I'm not a lawyer! I am interested in the world of open source and in some of the legal issues that new technology throws up, particularly when they relate to intellectual property law.

So What's in a name?

The title of @fionaleteney's talk clearly makes reference to the "Tin Can API" but she did touch on the naming issue in the presentation and introduced the newer "Experience API" name with the observation that "Tin Can" is likely to remain the recognisable brand.

Many people in the community probably think the name doesn't matter and that arguing over a name is an unimportant distraction. Indeed, it all feels a bit like the Monty Python film The Life of Brian, in which followers argue over whether the shoe or the gourd is the correct symbol to represent their community. I believe it is important, in short, because I don't think an API like this can succeed if it gets this type of thing wrong. The argument goes like this:

  1. The purpose of the API is to make it easier for activity statements to flow between the tools that generate or initially record activities and tools that aggregate this information.
  2. Therefore, the API is essentially an interoperability specification that will be more successful if a wide range of tools adopt it.
  3. To get a wide range of tools you need a wide range of tool suppliers to invest in adding support.
  4. To get a wide range of suppliers to invest you need a level playing field and trust within the supplier community.
  5. To get trust, you need good stewardship of the API.

I've homed in on one particular prerequisite for success here. There are plenty of other challenges that an API like this faces, including other issues related to IP law. After the session one delegate expressed concern about the ownership of information communicated using the API, and I have heard privacy issues voiced too. These are important things to get right, even more reason not to distract the suppliers' legal teams with branding issues when they should be advising their clients on how to use the API to improve the learner's experience from the right side of the law.

You'll have had your tea

One aspect of good stewardship for an API is the branding. In practice, that means control of the trademarks associated with an API. To people in the technology world this often translates directly into domain names, but that is only one way in which trademarks are used. What about Google AdWords? Google has a policy for that. And don't forget logos.

Of course, the big guns of the IT world get this type of thing sorted out before they even embark on a joint project but legal issues are often the last thing a small learning technology community thinks about. We're all well meaning, what could possibly go wrong? Why doesn't everyone trust me? What do you mean, I'm not allowed to team up with a small group of my competitors to gain an advantage over the rest of the marketplace? And so on... I have to thank Ed Walker for drawing my attention to the importance of getting this type of thing right when he was in his role as CEO of the IMS Global Learning consortium. He was particularly clear on the last point.

Thinking of Ed brings me (via Boston) to an interesting article about a similar problem in a very different domain. In Trademarking the Tea Party, an article from 2010, Beth Hutchens touches on an issue which resonates with the problem experienced by the Experience API community. In this case, during a dispute over the identity of a political movement one group seeks trademark protection for the Tea Party name and all the other groups cry foul!

The problem is that the Tea Party is more of a conglomerate of different groups of people based loosely around a set of common goals, and not a collective. This could be problematic in gaining trademark protection.

This could well describe the sort of informal grouping that tends to build up around an API. The author goes on to say...

It makes sense to have at least some sort of structure to keep squabbles like this from coming up in the future.

The suggested solution is to set yourselves up as a collective movement with a distinctive name so that you can register a collective mark. As with anything law related, it makes sense to put a little work in up-front to ensure you don't have to spend a lot of time later figuring out who (if anyone) has the right to say how the name can and can't be used. The rest of the article is worth a read by the way, if only for its use of the word antidisestablishmentarianism in a justifiable context. But more seriously, it is a great introduction for normal people on what a collective trademark is.

Kicking the Can

To understand why the naming thing has become an issue for this API it is best to read the discussion We Call It Tin Can. Two issues are being debated there, so it is a bit confusing.

  1. Which is the better name: Experience API or Tin Can API? - this blog post isn't about this question, for what it's worth I voted for Experience API but I could live with Tin Can if we got the second issue straight.
  2. What is the appropriate legal way for the community to manage and control use of the API's name?

It is clear that the community recognises the problem. The team that originally led the development of the API registered a trademark with a view to handing it over to the project sponsor, the ADL. ADL is part of the US government, and handing a trademark to a government department seems to have turned out to be trickier than expected. The US Government does have a succinct page on the issue of government-owned IP though: Copyright and Other Rights Pertaining to U.S. Government Works. This page makes it clear that you can't use a government-owned trademark without permission (no big surprise there) and clears up the confusion over which bits of government IP are in the public domain and which aren't. (An issue which has fascinating implications for the makers of model aircraft, but we digress.)

In the above thread, Mike Rustici goes on to say:

Since we're on the topic of trademarks, another significant issue to consider in this debate is the fact that "experience api" is not trademarked. If ADL is unable or unwilling to secure one, that is very problematic for the future of this spec.
Anybody could claim to support or own "experience api" rendering the spec (or at least the label) meaningless.

So who should own the mark? The ADL seems to be struggling to take ownership and, even if it did, how would it determine the rules under which permission should be granted to members of the API community? If one member abused the mark, would the US government pursue them on behalf of the other members? It isn't clear that the ADL has the desire to fill this role on their behalf.

Let's take the very particular issue of domain names, though Google AdWords raise similar concerns. Policies like ICANN's Uniform Domain-Name Dispute-Resolution Policy make it clear that trademarks are an important part of determining whether complaints will be upheld. So if the API's name was owned by a neutral party, would that group invoke the policy to ensure that names like experienceapi.com or tincanapi.com were used in a way that ensures there will be no confusion? You'd hope so. Right now, both tincanapi.com and experienceapi.com point to basically the same site, controlled by just one member of the community, and are used to promote their services over and above those of other community members. As far as this blog post is concerned, both names are problematic now.

With the benefit of Experience

It's no surprise that people have had this type of problem before and that there are legal patterns a community can follow to help ensure that their IP, including any trademarks, has the sort of stewardship which helps attract members and build the community. Creating a new membership organization just for this API would seem onerous, but the problem with choosing an existing one is that it will already have an established way of dealing with IP, and that is unlikely to be a perfect match. Still, this seems to be the solution being explored by ADL; quoting again from the main thread: "Plans are already being made to transfer ownership of the spec to an open standards body" - this can't come too soon.

One final word of caution here. One of the grim duties that falls on the owner of a trademark is to ensure that it is enforced and does not simply become a generic term for a bunch of similar products from a variety of suppliers (hoover, biro, etc.). I recall the magazine Private Eye getting letters from lawyers when they used the word Portakabin in one of their articles. If confusion takes over around this API then none of the marks will be enforceable and we'll have to start all over again.


RSS Readers: in the dog house

So farewell Google Reader, I will miss you.

This week's announcement of the demise of Google Reader as part of the Second Spring of Cleaning seems to be an important milestone for the internet.

There are a lot of new blog articles lamenting its demise (to some extent, this is one of them) but we shouldn't be too shocked. The original concept behind RSS has been under threat for some time; in fact, if you Google "War on RSS" you'll see an established idea that companies with a powerful influence on the way we use the internet have been deprecating RSS for a while.

Perhaps the most interesting of these contributions comes from @vambenepe, who wrote The war on RSS in February last year. It's a good overview of the way RSS reading features are going missing in systems we use to access the internet and contains this worrying quote:

Google has done a lot for RSS, but as a result it has put itself in position to kill it, either accidentally or on purpose. [...snip...] [... If] Google closed Reader, would RSS survive? Doubtful.

This particular commentator is interesting because since writing this he has moved on to become "Product Manager on Google Cloud Platform". Don't expect a follow up article but he did tweet yesterday:

"1 year ago, I asked: "If Google closed Reader, would RSS survive?" http://stage.vambenepe.com/archives/1932 We'll now find out but I won't be able to comment."

One of the takeaways here is that we're not just talking about RSS specifically. When we say RSS we can include Atom, and readers of this blog will know that I'm a fan of Atom and the emerging OData standard that is based upon it. But let's not get carried away. This war is not on the protocol but on the use of RSS as a way for end users to discover content on the internet. The use of OData (based on the Atom Publishing Protocol, not the read-only RSS) as a protocol that sits between the web app and the data source is likely to grow even stronger.

Even HTTP has changed. This blog post uses HTTP in an old fashioned way. I'm writing an article, inserting anchors that form hypertext links to other resources on the internet. I'm banking on the idea that these resources won't go away and that this article will join a persistent web of information. If you're reading this you're probably thinking, duh, that's what the internet is. In the early days this was true but the internet is no longer like this for the majority of users. HTTP sits as a protocol behind the web apps we use to check Twitter, Facebook and iTunes but the concept behind the way most people consume information on the internet bears no relation to the classic hypertext visions we used to cite when we were all researchers working in universities in the early 90s.

Go back and read the seminal As we may think or review the goals of Ted Nelson's Xanadu Project and you won't recognise the origins of iTunes, on-demand TV, micro-blogging or ad-supported social networks. From a UK point of view, we didn't even have commercial broadcast television until 1955 (when ITV was launched) which is 10 years after As we may think was published. The existence of these modern uses of the internet do not preclude the research use envisaged by these information scientists, it just relegates it to a niche.

The problem for people like you and me, who occupy this niche, is that the divergence of consumer internet technology from the original research oriented web is eventually going to make it more expensive. There's no law that says that Google has to provide an RSS reading tool for free (or a blogging service for that matter). In fact, the withdrawal of this service may actually provide a shot in the arm for the makers of RSS readers who have been starved by people like me who use the freebie Google Reader instead of their more tailored offerings. Yes, I would be prepared to pay to have something like Google Reader that stays in sync across my tablet, phone and laptop.

Ad, ad, ad...

While I'm on the subject of money, I do want to draw your attention to Xanadu's rule 9:

Every document can contain a royalty mechanism at any desired degree of granularity to ensure payment on any portion accessed, including virtual copies ("transclusions") of all or part of the document.

I really think it is time that technology providers started to look again at this goal. In the early days of the internet this was considered unrealistic. In fact, I remember sitting through meetings in which people responsible for creating the infrastructure that made the internet possible were highly doubtful that traffic accounting would ever be possible. The growth in internet traffic would always outpace the ability of switching gear and routers to count bits and report on usage. That prediction turned out to be wrong. I think they underestimated the strength of the business case behind bit-counting, which is routine on mobile platforms. My cheap router counts my own internet usage and I know my service provider has realtime stats too, if only to enforce their acceptable usage policy.

There have been a lot of haters for charging based on consumption of bits and this, in my opinion, has distorted the business models available to service providers towards ad-based services and away from the Xanadu-like micro payments.

Most of the rhetoric about the demise of Google Reader is taken from the point of view of the consumer, not the information publisher. Of course I want to consume content for free using free technology over an unlimited internet connection. But none of these things are really free. We've all heard the adage that if something is free then you're the product. As an RSS consumer, my costs just outstripped my marketable value to Google. I'm not a cash cow anymore, I'm a dog.

From Reader to Blogger

But as I type, I'm not just consuming the content I used to research this post; I'm also publishing content of my own. At the moment, for free. I don't want to enable ads on this blog, but the technology doesn't yet make it easy for me, or anyone between me and you, to collect revenue and experiment with pricing. It's more complicated than you might think.

Rule 8 of Xanadu reads "Permission to link to a document is explicitly granted by the act of publication." Early internet sites seriously considered violating this principle. Content providers considered themselves so valuable that someone creating a site that aggregated links to their gems was somehow cheating the system. This has been turned completely on its head now: these days information providers are hungry for links, and when those links result in product sales they are prepared to pay real money to the aggregator. This is the basis on which all the market comparison sites are run.

If content publishers got revenue from people viewing their materials (Xanadu style) then linking to someone's content becomes a valuable lead. How would payments trickle back to the owner of the <a> tag?

We know that the ad-model works. YouTube generates huge revenues for people like PSY. But for people outside the mainstream who occupy this niche, typified by users of Google Reader, we need another way to solve the money problem. Perhaps the new technology that emerges to take the place of Reader will come up with a creative way to address this issue. Especially if they start getting paid by their users.


OData: Open for Comments at OASIS

Browsing around this morning I noticed that on Friday (25th Jan) there was a test posting to a new mailing list set up by the OASIS technical committee that is taking forward the OData specification.

To recap, OData is a specification that extends the popular Atom Publishing Protocol (APP) with conventions that make it easy to expose data sources (think relational databases) in a standard way. OData has been driven by Microsoft and is now at version 3, but it is making the transition to a work item at OASIS, where it seems likely that a more open specification process will be observed.

I've written about OData before but the best way to play with it is to look at some sample feeds, the Netflix database is one I tend to use for my examples because the data is real and something that is widely understood.

With the work now at a more formal standards body I hope that some of the rough edges of the existing specification can be knocked off. This type of thing is important if OData is to make the transition from a specification which works well if you have client and server libraries from the same vendor to one which can be truly interoperable.

For example, the current specification makes a mess of defining the simple concept of a string-literal parsed from a URL. As a result, it is impossible to make a conforming URI which will get you information about an actor like Peter O'Toole. Here's a URL that a naive user might construct:


Notice that the single-quote character in O'Toole terminates the literal and, sure enough, Netflix returns an error.

Syntax error at position 22.

In fact, there is an undocumented way to get around the problem, using the SQL-convention of doubling the quote character:


I've posted a comment to highlight this issue to the new OData comment list, let's see what happens! It's a public forum so anyone can join though the work of the technical committee itself is behind closed doors (OASIS is a subscription-based membership organization).
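To make the workaround concrete, here's a sketch of how a client might build a safe filter expression. The helper name and the base URL layout are mine for illustration (the real Netflix catalog URL may differ); the point is the SQL-style doubling of the quote character before percent-encoding:

```python
from urllib.parse import quote

def odata_string_literal(value):
    """Build an OData string literal for a URL query using the
    (undocumented) SQL-style convention: double any embedded single
    quotes, wrap in single quotes, then percent-encode the rest."""
    return quote("'" + value.replace("'", "''") + "'", safe="'()")

# A hypothetical filter on a People feed:
base = "http://odata.netflix.com/Catalog/People"
url = base + "?$filter=Name%20eq%20" + odata_string_literal("Peter O'Toole")
print(url)
```

A naive client that skips the replace() step produces exactly the "Syntax error at position 22" failure shown above, which is why this convention really ought to be in the specification rather than left as folklore.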

I'm a fan of what the basic OData specification is trying to do so getting things like this fixed is important. Just looking at the XML file you get back from the above URI immediately opens up the wonderful world of linked data, giving me relative links like People(69540)/TitlesActedIn from which you can see details of all the films Peter O'Toole has acted in. Don't like XML? Just add ?$format=json to the URL and you can consume the list directly into your web-page.
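Because OData entries carry ordinary relative links, following them needs nothing more exotic than standard URL resolution. A small sketch (the base URL is the Netflix catalog endpoint as I've used it here; People(69540) is the entry discussed above):

```python
from urllib.parse import urljoin

base = "http://odata.netflix.com/Catalog/"

# Follow the relative link from the entry for Peter O'Toole...
titles_url = urljoin(base, "People(69540)/TitlesActedIn")

# ...and ask for JSON instead of the default Atom XML:
json_url = titles_url + "?$format=json"
print(json_url)
```

That's the whole appeal of building on APP: a generic HTTP client and a URL resolver get you most of the way to a working consumer.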

Last year I gave a lightning talk at a CETIS event in which I encouraged people who were creating REST-based protocols as part of their technical standards development process to have a really close look at OData. Building new specifications using existing protocols can dramatically save time when drafting and make it much easier for people to implement afterwards. And even if OData is not for you, if your application is a good fit for a REST-based approach why not just use APP as it is? Forgot the additional complications of things like WADL, you don't need them. What's more, if you use APP then you can take advantage of existing implementations in web browsers to provide basic and easy to consume views of your data.