The Guardian and the Observer have just launched a Digital Archive. At present, they have the Guardian from 1821 and the Observer from 1900, both to 1975. Eventually they’ll have the Observer back to 1791, and both up to 2003. Unlike the current online editions, this is a full archive, adverts and all.
The archive was created by scanning archived copies and using a technology developed by Olive Software in Israel called “componentisation”, which uses mathematical algorithms to work out where one article stops and the next begins. Other systems typically use (bored) people.
While the current Guardian Unlimited (back to 1999) is free, they’re charging for access to the archive. Given that most news archives charge for access to their web sites, that’s really very good.