2013-03-15

RSS Readers: in the dog house

So farewell Google Reader, I will miss you.

This week's announcement of the demise of Google Reader as part of the Second Spring of Cleaning seems to be an important milestone for the internet.

There are a lot of new blog articles lamenting its demise (to some extent, this is one of them) but we shouldn't be too shocked. The original concept behind RSS has been under threat for some time. In fact, if you Google "War on RSS" you'll see an established idea that companies with a powerful influence on the way we use the internet have been deprecating RSS for a while.

Perhaps the most interesting of these contributions comes from @vambenepe, who wrote The war on RSS in February last year. It's a good overview of the way RSS reading features are going missing in systems we use to access the internet and contains this worrying quote:

Google has done a lot for RSS, but as a result it has put itself in position to kill it, either accidentally or on purpose. [...snip...] [... If] Google closed Reader, would RSS survive? Doubtful.

This particular commentator is interesting because since writing this he has moved on to become "Product Manager on Google Cloud Platform". Don't expect a follow up article but he did tweet yesterday:

"1 year ago, I asked: "If Google closed Reader, would RSS survive?" http://stage.vambenepe.com/archives/1932 We'll now find out but I won't be able to comment."

One of the takeaways here is that we're not just talking about RSS specifically. When we say RSS we can include Atom, and readers of this blog will know that I'm a fan of Atom and the emerging OData standard that is based upon it. But let's not get carried away. This war is not on the protocol but on the use of RSS as a way for end users to discover content on the internet. OData (based on the Atom Publishing Protocol, not the read-only RSS) is emerging as a protocol that sits between the web app and the data source, and that role is likely to get even stronger.

Even HTTP has changed. This blog post uses HTTP in an old-fashioned way. I'm writing an article, inserting anchors that form hypertext links to other resources on the internet. I'm banking on the idea that these resources won't go away and that this article will join a persistent web of information. If you're reading this you're probably thinking, duh, that's what the internet is. In the early days this was true but the internet is no longer like this for the majority of users. HTTP sits as a protocol behind the web apps we use to check Twitter, Facebook and iTunes but the concept behind the way most people consume information on the internet bears no relation to the classic hypertext visions we used to cite when we were all researchers working in universities in the early 90s.

Go back and read the seminal As we may think or review the goals of Ted Nelson's Xanadu Project and you won't recognise the origins of iTunes, on-demand TV, micro-blogging or ad-supported social networks. From a UK point of view, we didn't even have commercial broadcast television until 1955 (when ITV was launched) which is 10 years after As we may think was published. The existence of these modern uses of the internet does not preclude the research use envisaged by these information scientists; it just relegates it to a niche.

The problem for people like you and me, who occupy this niche, is that the divergence of consumer internet technology from the original research oriented web is eventually going to make it more expensive. There's no law that says that Google has to provide an RSS reading tool for free (or a blogging service for that matter). In fact, the withdrawal of this service may actually provide a shot in the arm for the makers of RSS readers who have been starved by people like me who use the freebie Google Reader instead of their more tailored offerings. Yes, I would be prepared to pay to have something like Google Reader that stays in sync across my tablet, phone and laptop.

Ad, ad, ad...

While I'm on the subject of money, I do want to draw your attention to Xanadu's rule 9:

Every document can contain a royalty mechanism at any desired degree of granularity to ensure payment on any portion accessed, including virtual copies ("transclusions") of all or part of the document.

I really think it is time that technology providers started to look again at this goal. In the early days of the internet this was considered unrealistic. In fact, I remember sitting through meetings in which people responsible for creating the infrastructure that made the internet possible were highly doubtful that traffic accounting would ever be possible. The growth in internet traffic would always outpace the ability of switching gear and routers to count bits and report on usage. That prediction turned out to be wrong. I think they underestimated the strength of the business case behind bit-counting, which is routine on mobile platforms. My cheap router counts my own internet usage and I know my service provider has realtime stats too, if only to enforce their acceptable usage policy.

Charging based on consumption of bits has attracted a lot of hostility and this, in my opinion, has distorted the business models available to service providers towards ad-based services and away from the Xanadu-like micro payments.

Most of the rhetoric about the demise of Google Reader is taken from the point of view of the consumer, not the information publisher. Of course I want to consume content for free using free technology over an unlimited internet connection. But none of these things are really free. We've all heard the adage that if something is free then you're the product. As an RSS consumer, my costs just outstripped my marketable value to Google. I'm not a cash cow anymore, I'm a dog.

From Reader to Blogger

But as I type, I'm not just consuming the content I used to research it. I'm also publishing content of my own. At the moment for free. I don't want to enable ads on this blog but the technology doesn't yet make it easy for me, or anyone between me and you, to collect revenue and experiment with pricing. It's more complicated than you might think.

Rule 8 of Xanadu reads "Permission to link to a document is explicitly granted by the act of publication." Early internet sites seriously considered violating this principle. Content providers considered themselves to be so valuable that someone creating a site that aggregated links to their gems was somehow cheating the system. This has been turned completely on its head now. These days information providers are hungry for links and when those links result in product sales they are prepared to pay real money to the aggregator. This is the basis on which all the market comparison sites are run.

If content publishers got revenue from people viewing their materials (Xanadu style) then linking to someone's content becomes a valuable lead. How would payments trickle back to the owner of the <a> tag?

We know that the ad-model works. YouTube generates huge revenues for people like PSY. But for people outside the mainstream who occupy this niche, typified by users of Google Reader, we need another way to solve the money problem. Perhaps the new technology that emerges to take the place of Reader will come up with a creative way to address this issue. Especially if they start getting paid by their users.

2013-01-30

OData: Open for Comments at OASIS

Browsing around this morning I noticed that on Friday (25th Jan) there was a test posting to a new mailing list set up by the OASIS technical committee that is taking forward the OData specification.

To recap, OData is a specification that extends the popular Atom Publishing Protocol (APP) with conventions that make it easy to expose data sources (think relational databases) in a standard way. OData has been driven by Microsoft and is now at version 3, but it appears to be making the transition to a work item at OASIS, where a more open specification process is likely to be observed.

I've written about OData before but the best way to play with it is to look at some sample feeds. The Netflix database is one I tend to use for my examples because the data is real and something that is widely understood.

With the work now at a more formal standards body I hope that some of the rough edges of the existing specification can be knocked off. This type of thing is important if OData is to make the transition from a specification which works well if you have client and server libraries from the same vendor to one which can be truly interoperable.

For example, the current specification makes a mess of defining the simple concept of a string-literal parsed from a URL. As a result, it is impossible to make a conforming URI which will get you information about an actor like Peter O'Toole. Here's a URL that a naive user might construct:

http://odata.netflix.com/catalog/People?$filter=Name%20eq%20'Peter%20O'Toole'

Notice that the single-quote character in O'Toole terminates the literal and, sure enough, Netflix returns an error.

Syntax error at position 22.

In fact, there is an undocumented way to get around the problem, using the SQL-convention of doubling the quote character:

http://odata.netflix.com/catalog/People?$filter=Name%20eq%20'Peter%20O''Toole'
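As a rough illustration of the workaround, here is a minimal Python sketch that builds the second URL from a plain name. The helper names (odata_string_literal, people_filter_url) are made up for this example and are not part of any OData library. The quote-doubling rule is simply the undocumented convention described above, not something I can point to in the current specification.

```python
from urllib.parse import quote


def odata_string_literal(value):
    """Wrap value in single quotes, doubling any quote inside it.

    Hypothetical helper: this follows the SQL-style doubling workaround,
    which the specification discussed here does not document.
    """
    return "'" + value.replace("'", "''") + "'"


def people_filter_url(name, base="http://odata.netflix.com/catalog/People"):
    """Build a $filter URL that matches a person by exact name."""
    expression = "Name eq " + odata_string_literal(name)
    # Percent-encode spaces but leave the single quotes readable,
    # so the result has the same shape as the URLs shown above.
    return base + "?$filter=" + quote(expression, safe="'")


print(people_filter_url("Peter O'Toole"))
```

The point of keeping the literal-building step separate is that the doubling has to happen before percent-encoding; encoding the quote as %27 on its own would not stop it terminating the literal once the server decodes the URL.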

I've posted a comment to highlight this issue to the new OData comment list, let's see what happens! It's a public forum so anyone can join though the work of the technical committee itself is behind closed doors (OASIS is a subscription-based membership organization).

I'm a fan of what the basic OData specification is trying to do so getting things like this fixed is important. Just looking at the XML file you get back from the above URI immediately opens up the wonderful world of linked data, giving me relative links like People(69540)/TitlesActedIn from which you can see details of all the films Peter O'Toole has acted in. Don't like XML? Just add ?$format=json to the URL and you can consume the list directly into your web-page.
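To make the link-following idea concrete, here is a small sketch of resolving one of those relative links and asking for JSON. It assumes the links resolve against a service root of http://odata.netflix.com/catalog/ (in practice the feed's own base URL decides this), and the helper names are invented for the example. Note that the option is joined with & rather than ? when the URL already carries a query string.

```python
from urllib.parse import urljoin

# Assumed service root; the base declared in the feed is what really counts.
SERVICE_ROOT = "http://odata.netflix.com/catalog/"


def resolve(relative_link, base=SERVICE_ROOT):
    """Turn a relative link from the feed into an absolute URL."""
    return urljoin(base, relative_link)


def with_query_option(url, option):
    """Append a query option, using ? or & depending on what is there."""
    separator = "&" if "?" in url else "?"
    return url + separator + option


titles = resolve("People(69540)/TitlesActedIn")
print(with_query_option(titles, "$format=json"))
```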

Last year I gave a lightning talk at a CETIS event in which I encouraged people who were creating REST-based protocols as part of their technical standards development process to have a really close look at OData. Building new specifications using existing protocols can dramatically save time when drafting and make it much easier for people to implement afterwards. And even if OData is not for you, if your application is a good fit for a REST-based approach why not just use APP as it is? Forget the additional complications of things like WADL, you don't need them. What's more, if you use APP then you can take advantage of existing implementations in web browsers to provide basic and easy to consume views of your data.

2012-12-05

Writing a stream to a zipfile in Python, harder than you think!

So here's the problem, you have a stream (a file-like object) in Python and you want to spool the contents of it into a zip archive. Sounds like a common requirement? It turns out to be very hard. I propose a solution here with hooks.

There are two methods for writing data to a zip file in the Python zipfile module.

ZipFile.write(filename[, arcname[, compress_type]])

and

ZipFile.writestr(zinfo_or_arcname, bytes[, compress_type])

The first takes the name of a file, opens it and spools the contents into the archive in 8K chunks. Sounds like a good fit for what I want except that I have a file-like object, not a file name, and ZipFile.write won't accept that. I could create a temporary file on disk and write my data to that, then pass the name of the file instead but that supposes (a) that I have access to the file system for writing and (b) I don't mind spooling the data twice, once to the disk and once back out again for storage in the archive.

Before you protest, the ZipFile object only requires a file-like object with support for seek and tell, it doesn't actually have to be a file in the file system so (a) is still a valid scenario. We will have to ditch any clever ideas of spooling a zip file directly over network connections though. A closer look at the implementation shows us that once the data has been compressed and written out to the archive the stream is wound back to the archive entry's header to update information about the compressed and uncompressed sizes. Still, even if you are buffering the output at least you are dealing with the smaller compressed data and not the original uncompressed source.

So if ZipFile.write doesn't work for streams what about using ZipFile.writestr instead? This takes the data as a string of bytes (in memory). For larger files this is unlikely to be practicable. I did wonder about tricking this method with a string-like object but even if I could do this the method will still attempt to create an ordinary string with the entire compressed data which won't work for large streams.
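Both points can be seen in a few lines. This is only a sketch, written for Python 3 (an assumption; the rest of this post uses Python 2), with a made-up entry name and payload. The archive lives in an in-memory buffer rather than on disk, and writestr has to be handed the complete payload in one go.

```python
import io
import zipfile

# The archive is written to a seekable in-memory buffer, not the file system.
archive_buffer = io.BytesIO()

with zipfile.ZipFile(archive_buffer, "w", zipfile.ZIP_DEFLATED) as zf:
    # writestr needs the whole entry in memory as a single bytes object,
    # which is exactly what makes it impractical for large streams.
    zf.writestr("hello.txt", b"hello world")

# Rewind and read the archive back to check what was stored.
archive_buffer.seek(0)
with zipfile.ZipFile(archive_buffer) as zf:
    names = zf.namelist()
    data = zf.read("hello.txt")

print(names, data)
```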

Solution 1

The first solution is taken from a suggestion on StackOverflow. The idea is to wrap the ZipFile object and write a new method. Clearly that would be something good for the module maintainers to consider but it requires considerable copying of code. If I'm going to be so dependent on the internals of the ZipFile object implementation I might as well look to see if there is a better way.

Solution 2

Looking at the ZipFile implementation the write method is clearly very close to what I want to do. If only it would accept a file-like object! A closer look reveals that it only does two things with the passed filename. It calls os.stat and then, shortly afterwards, calls open to get a file-like object.

This got me thinking whether or not I could trick the write method into accepting something other than the name of a file. I created an object (which I called a VirtualFilePath) and gave it a stat and open method. The implementation is not important, but this object essentially wraps my file-like object simulating these two operating system functions.

Unfortunately, I can't pass a VirtualFilePath to the operating system open function. I'll get an error that it wasn't expecting an instance. The same goes for os.stat. However, I can write hooks to intercept these calls and redirect the calls to my methods if the argument is a VirtualFilePath. This is basically what my solution looks like:

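# Python 2 code: in Python 3 the __builtin__ module is named builtins
import os, __builtin__

# Keep references to the real functions so the hooks can fall through
stat_pass = os.stat
open_pass = __builtin__.open

def stat_hook(path):
    # Redirect os.stat to the object itself if it is a VirtualFilePath
    if isinstance(path, VirtualFilePath):
        return path.stat()
    else:
        return stat_pass(path)

def open_hook(path, *params):
    # Same trick for the built-in open function
    if isinstance(path, VirtualFilePath):
        return path.open(*params)
    else:
        return open_pass(path, *params)

class ZipHooks(object):
    # Number of live ZipHooks instances; hooks stay installed while non-zero
    hookCount = 0

    def __init__(self):
        if not ZipHooks.hookCount:
            os.stat = stat_hook
            __builtin__.open = open_hook
        ZipHooks.hookCount += 1

    def __enter__(self):
        return self

    def __exit__(self, type, value, traceback):
        self.Unhook()

    def Unhook(self):
        ZipHooks.hookCount -= 1
        if not ZipHooks.hookCount:
            os.stat = stat_pass
            __builtin__.open = open_pass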

This code adds hooks which detect my VirtualFilePath object when it is passed to open or stat and redirects those calls. To make it easier to manage the hooks we create a ZipHooks object with __enter__ and __exit__ methods allowing it to be used in a 'with' statement like this:

with ZipHooks() as zh:
    # add stuff to an archive using VirtualFilePath here
    pass

There's one final detail to clear up. stat is supposed to return the size of the file but what if I don't know it because I'm reading data from a stream? In fact, closer inspection of the ZipFile.write method's implementation shows that it doesn't really rely on the size returned by stat as it monitors both compressed and uncompressed sizes and re-stuffs the header when it back-tracks.

The only other bits of stat that ZipFile.write is interested in are the modification date of the file and the mode (which it uses to determine if the file is really a directory). So if your file-like object isn't very file-like at all it won't matter too much because you only have to fake these fields in the stat result.
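For illustration only, here is a minimal sketch of what an object like VirtualFilePath might look like. This is not the implementation used with the hooks above, which isn't shown in this post. The FakeStat tuple, the constructor arguments and the choice to report a size of zero are all assumptions made for the example.

```python
import io
import stat
import time
from collections import namedtuple

# Hypothetical stand-in for a stat result, carrying only the three
# fields discussed above: mode, size and modification time.
FakeStat = namedtuple("FakeStat", ["st_mode", "st_size", "st_mtime"])


class VirtualFilePath(object):
    """Illustrative wrapper that lets a stream answer stat() and open()."""

    def __init__(self, stream, mtime=None):
        self.stream = stream
        # Assumption: default to "now" when no modification time is known
        self.mtime = time.time() if mtime is None else mtime

    def stat(self):
        # Report a regular file (not a directory); the size is unknown
        # for a stream, so a placeholder of 0 is used here.
        return FakeStat(stat.S_IFREG | 0o644, 0, self.mtime)

    def open(self, *params):
        # The mode argument is ignored; the wrapped stream is handed back
        return self.stream


vfp = VirtualFilePath(io.BytesIO(b"some streamed data"))
info = vfp.stat()
print(stat.S_ISDIR(info.st_mode), vfp.open("rb").read())
```

Whether a placeholder size is good enough depends on the zipfile implementation you are running against, so treat that part with particular suspicion.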

2012-08-14

Thomson Routers from plusnet: problems using Gmail over wifi?

Summary

If you came to this page because of a Google search for this problem then here is my advice in a nutshell:

  1. Turn off the "Web Browsing Interception" feature in your router by setting it to "disabled".
  2. Set your wifi to b/g operation only to reduce data speeds - you probably don't need more than 54Mb/s if you are on ADSL
  3. Power the router off, count to 30 and power it back on.
  4. If the problem returns, repeat step 3.
  5. If the problem comes back frequently consider using a cheap timer plug to power cycle the router automatically in the middle of the night.

Oh, while you're here, could you just look at my PC?

Everyone who works in any job that is vaguely computer related will know that sometimes, when visiting a friend or relation's house the conversation will come around to some little problem they are having with their PC or home network. I had just such an experience at the weekend. I would normally have attempted to post the technical workaround to the plusnet forum where the particular issue I was debugging is being discussed but you have to be a plusnet customer to use the forums and I'm just a visiting techie so I'm posting what I know here.

The Symptoms

The scenario is a simple home network with a Thomson TG585 v8 ADSL router supplied by plusnet and an ordinary Dell laptop. I actually got a report of this problem earlier in the year. The main symptom was that email with attachments couldn't be sent. As the laptop owner spoke to me on a cordless phone I could hear some interference on the line each time the laptop tried to send the email.

The First Diagnosis

This network is in a fairly isolated spot with only one other house within wifi range so interference from other wifi networks seemed unlikely. However, the house does have a cordless phone with a signal booster to reach to outbuildings. As is often the case with wifi issues you have to suspect interference: some older cordless phones are simply unusable alongside 2.4GHz routers. The obvious test is to instruct the user to plug their laptop directly into one of the ethernet ports on the back of the router. Sure enough, the problem went away.

So is wifi really incompatible with this make of phone? The phone is of a modern design and is supposed to be compatible with wifi but perhaps that booster is the problem? To make the cable-based workaround easier the router was moved nearer the laptop. This took it further away from the phone and the booster - I had hoped that this might improve wifi connectivity enough for attachments to work but the problem remained. Needless to say, next time I visited the house the subject came up!

The problem gets worse

I must confess, I had forgotten about the problem completely until I arrived at the house and attempted to send an email via their wifi network from my own laptop. I was able to receive all my email no problem, web browsing was fast but I could not send messages through either my work email or through my Gmail account. Could this be related to the problem with sending attachments? Here is someone asking a similar question: Gmail, Windows Live Mail and Plusnet problems?

The problem appears so black-and-white I began to suspect that it must be a block at the ISP until I read this thread on plusnet's community forum: Mail cannot send attachments. The thread ends with an answer from a plusnet employee but the original problem from someone using Apple Mail (as I do) goes undiagnosed.

Eventually one of my smaller email messages managed to sneak out of my laptop at the Nth time of trying. Intermittent problems may be the worst to diagnose but at least this showed me that it is just a really bad connectivity problem.

I'm not alone

At this point I went back to my first diagnosis. I tried every channel, I powered off every other wireless device I could find (including the phones) and I still had the same trouble. This is beginning to look like a bug in the router. I checked the firmware version (8.2.7.8) and found a new thread that matched my symptoms: Thomson TG585V8 poor upload speed over wifi and then the more general: Wireless connectivity on the Thomson V8. The second of these has 13 pages of discussion and brings the problem right up to date: but with no resolution!

The solution

To recap, we have a router that is perfectly capable of sending (downloading) data over wifi at high speed but is very bad at receiving it during uploads. I started suspecting some type of buffer overflow or error in one of the protocol stacks but restricting it to b or b/g operation didn't fix the issue either. I then started looking at other tools and settings in the router and came upon "Web Browsing Interception". I have no idea what this means but it was set to automatic. I found one post on a DSL forum which was enough to raise suspicion: Web Browsing Interception?. I disabled it and applied the settings. Almost immediately my email started going through as normal.

The same feature is cited in this thread relating to different Thomson hardware (which I found later): "Solved: Thomson Speedtouch Modem, disconnects often and slows down browsing".

So did this solve the problem with attachments? No. At least not directly. Turning this setting off stopped the router interfering with mail clients making outbound connections to Gmail and Microsoft mail products. But I was still getting larger attachments stuck at about the 100Kb mark. It may be that all I have done is free up enough resource (RAM or CPU) to send small messages by turning off a feature of the router software. In which case, it seems likely that a resource leak of some kind is to blame in these routers. To test the theory I powered the router on and off, waited ages for it to restart (it really is slow) and then tried again. Everything worked fine - all attachments sending without error, even large ones.

So is this Web Browsing Interception feature to blame? Possibly, the router has been powered on and off before without fixing the problem so it seems likely that this feature is either causing or exacerbating a resource leak in the unit.

2012-07-08

The demise of iGoogle: is this the beginning of the end for widgets?

What's happening to iGoogle? - Web Search Help:

So I woke up this morning and found a warning on my home page.  iGoogle is going away it calmly announced.  November 2013, but earlier than that if you are using a 'mobile' device.

So that means my home page is going away.  I can finally give up on the idea that Google will come up with a decent iGoogle experience on the iPad, or even on my Android phone.  I may have just upgraded to a slightly larger 15" MacBook but it is still portable - it must be, I've just carried it from the UK to New Orleans ready for OSCELOT and Blackboard's annual 'world' conference.  In fact, thanks to United Airlines' upgrade programme I was even able to plug it in and use it for most of the flight.


So what am I going to miss about my home page?  I'll miss the "Lego men" theme but I can probably live without Expedia reminders which count me down to my next trip.  In fact, if they would only just say something more appropriate than "Enjoy your holiday" I might miss that gadget a bit more.  I have quick access to Google Reader but I increasingly find myself reading blogs on my phone these days - it just seems like the right thing to do on the tube (London's underground railway).  I'm not sure about Google bookmarks - I assume they're staying so I might have to make them my home page instead.  Finally, I'm not sure I've ever actually chatted to anyone through iGoogle.

So I'm over iGoogle, not as easily as Bookmark lists but nothing I can't handle.

Is this the end of widgets?


iGoogle was one of the more interesting widget platforms when it launched.  You can write your own widgets, get them published to their gadget list so that other users can download them and install them on their iGoogle page.  They're small, simpler than full-blown computer applications on the desktop, simpler and smaller than complete websites.  iGoogle is a platform which reduces the barrier to entry for all sorts of cool little apps.  It is particularly good for apps that allow you to access information stored on the web, especially if it is accessible by JSON or similar web services.

You may notice a strong resemblance to mobile apps.  Google certainly have and this is the main reason why iGoogle is going away.  It is no longer the right platform.  People with 15" screens organizing lots of widgets on their browser-based home page are an anachronism.  These days people organize their apps on 'home screens', flipping between them with gestures.  They don't need another platform.  Apple have already seen this coming, in fact, they are having a significant influence on the future.  There is already convergence in newer versions of Mac OS.

There's a lot of engineering involved in persuading browsers to act as a software platform.  The browser doesn't do it all for you (and plugins do not seem like the solution because they are browser specific).  There are a number of widget toolkits available for would-be portal engineers but the most popular portals tend to have their own widget implementations (just look at your LMS or Sharepoint).

For many years I've been watching the various widget specifications emerge from the W3C, lots of clever engineers are involved (those I know I hold in high regard) but I'm just beginning to get that sickening feeling that it is all going to have been a learning experience.  At the end of the process we may have a technically elegant, well-specified white elephant.

As someone who has spent many years developing software in the e-Learning sector I've always found it hard to draw the line between applied research which is solely for the purposes of education and applied research which is more generally applicable but is being driven by the e-Learning sector.  As a community, we often stretch the existing platforms and end up building out new frameworks only to have to throw stuff away as the market changes underneath us.  The web replaced HyperCard and Toolbook in just this way - some of the content got migrated but the elaborate courseware management systems (as we used to call them) all had to be thrown away.

2012-07-02

QTI Pre-Conference Workshop: next week!

Sadly I won't be able to make this event next week but I thought I'd pass on a link to the flyer in case there is anyone still making travel plans.

http://caaconference.co.uk/wp-content/uploads/CAA-2012-Pre-Conference-Workshop.pdf

I'm still making the odd change to the QTI migration tool - and the integration with the PyAssess library is going well.  This will bring various benefits like the ability to populate the correct response for most item types when converting from v1.  So if you have v1 to v2 migration questions coming out of the workshop please feel free to get in touch or post them here.

2012-06-19

What is ipsative assessment and why would I use it? | Getting Results -- The Questionmark Blog

What is ipsative assessment and why would I use it? | Getting Results -- The Questionmark Blog:

Having recently registered for the eAssessment Scotland conference I was reminded that last year I learnt a new word there: ipsative assessment.

This is a nice summary on the subject from John Kleeman.
