The ECM Revolution

12/10/2004 14:06:36

In the early 90s it was ERP (Enterprise Resource Planning), and in the late 90s CRM (Customer Relationship Management).

The term "ERP" is no longer fashionable. None of the vendors use it any more - it seems so 20th century. But the product never went away, and most organisations now use some sort of integrated enterprise-wide applications software.

The term "CRM" has also gone out of fashion, but for a different reason. It didn't work. At least, it was very difficult to make it work, and stories of failed implementations abound. When last heard of, CRM was becoming a subset of ERP, and both the user and vendor communities were trying to work out what went wrong.

Now it's ECM's turn in the spotlight. ECM vendors have evolved from a number of different areas, which are now converging. They include the old document management and imaging players, like FileNet and Documentum. They include Web development and content management companies, like Interwoven and Vignette, and portal vendors like Hummingbird and Plumtree.

Established ERP vendors like SAP and Oracle are moving into the space. So are the business intelligence (BI) vendors, like Business Objects, Cognos and SAS. IBM, too, made its intentions clear with its acquisition last month of a small ECM company called Venetica. ECM is hot.

But what is it? It is software designed to gather, store, manage and present all of an organisation's digital content. That includes structured content, typically transactional data held in relational databases, but also unstructured content such as that found in e-mails, word processing documents, image libraries, Web pages and the like.

Estimates vary, but there is a rough consensus that structured data comprises less than a quarter of all the information an organisation possesses. Over the 50-year history of the commercial computer we have become very good at storing and managing structured data. After Ted Codd proposed the relational model in 1969, and relational DBMS products followed, it became much easier to store transactional data, and the ERP revolution of the 90s cemented structured data at the core of most organisations' information systems.

But while all this was happening, great masses of unstructured data were also being digitised and stored on PC hard disks and departmental servers. Word processor, spreadsheet and e-mail documents have proliferated at a massive rate. In the 80s and 90s banks, governments and insurance companies digitised their paper documents.

Then the Internet hit, and organisations large and small built Web sites for promotional and other purposes. The word "content" came to be used for the great mass of information that found its way onto the Web.

Now the separate disciplines of storage management, data management and content management are coalescing into ECM. A good indication of the trend is leading storage company EMC (with a confusingly similar abbreviation), which acquired document management company Documentum last year.

EMC is now promoting the concept of ILM (Information Lifecycle Management), a system where different types of data (or information - the terms are often used interchangeably) are parked in different types of storage, depending on their currency and value. Archival data and backups are stored on tape (though disks are getting so cheap that this is increasingly unnecessary), transactional data on "off-line" disk, current data on "near-line" disk, and active data in memory.
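The tiering idea can be sketched in a few lines of Python. This is only an illustration of the principle, not EMC's software: the `Item` fields, the `choose_tier` function and the one-day and 90-day thresholds are all assumptions made for the example. Only the four tier names come from the description above.

```python
from dataclasses import dataclass
from enum import Enum

# Hypothetical thresholds, chosen only for illustration.
ACTIVE_DAYS = 1     # touched within the last day: "active" data
CURRENT_DAYS = 90   # touched within the last quarter: "current" data


class Tier(Enum):
    MEMORY = "memory"
    NEAR_LINE_DISK = "near-line disk"
    OFF_LINE_DISK = "off-line disk"
    TAPE = "tape"


@dataclass
class Item:
    name: str
    days_since_access: int    # a crude measure of currency
    high_value: bool = False  # a crude measure of value
    is_backup: bool = False   # archival copies and backups


def choose_tier(item: Item) -> Tier:
    """Pick a storage tier from an item's currency and value."""
    if item.is_backup:
        return Tier.TAPE
    if item.days_since_access <= ACTIVE_DAYS:
        return Tier.MEMORY
    # Recent data, or older data still judged valuable, stays near-line.
    if item.days_since_access <= CURRENT_DAYS or item.high_value:
        return Tier.NEAR_LINE_DISK
    return Tier.OFF_LINE_DISK


if __name__ == "__main__":
    for item in (
        Item("open order", 0),
        Item("last month's invoices", 30),
        Item("2001 ledger", 1000),
        Item("signed contract", 1000, high_value=True),
        Item("nightly backup", 0, is_backup=True),
    ):
        print(f"{item.name}: {choose_tier(item).value}")
```

A real product would weigh many more factors, such as regulatory retention rules and access patterns, but the shape is the same: a policy that maps each item to the cheapest tier that still suits its currency and value.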

EMC, and most other storage vendors, have increasingly become software companies in recent years. The disk drives themselves are now commodity items, incredibly cheap and reliable, and the real trick to their efficient use is in the storage management software that controls them. EMC's purchase of Documentum, and of a slew of other companies such as Legato and even Data General a few years ago, indicates where that vendor believes the storage industry is headed.

The ECM vendors have a range of interesting techniques and technologies for handling unstructured data, and for integrating it with structured data. Analyst company Butler Group has coined the term "Content-Aware Applications" to describe the tendency towards such integration. New standards, such as XML and its many extensions and Web services, are emerging to handle this integration.

Last month W3C (the World Wide Web Consortium), headed by Web pioneer Sir Timothy Berners-Lee, announced SSML (Speech Synthesis Markup Language), an XML-based markup language that will bring high-quality synthesised speech to Web applications. SSML gives Web applications a standard way to control how text is rendered as speech, a step towards the Web handling voice-based content as it now handles text. It is an aspect of the emerging "Semantic Web", proposed by Sir Timothy in his seminal 1999 book, "Weaving the Web". The Semantic Web will enable data contained in Web pages to be coded with an extra dimension of information that will enable computers to make sense of it.
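To give a feel for what SSML looks like, here is a minimal sketch in Python that builds a small SSML document using only the standard library. The `build_ssml` function and the sample text are assumptions made for illustration. The element names (`speak`, `p`, `s`, `emphasis`, `break`) and the namespace are those defined in the W3C's SSML 1.0 specification.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_ssml(headline: str, body: str) -> str:
    """Return a small SSML 1.0 document as a string (illustrative helper)."""
    speak = ET.Element("speak", {
        "version": "1.0",
        "xmlns": SSML_NS,
        f"{{{XML_NS}}}lang": "en-AU",  # serialised as xml:lang
    })
    para = ET.SubElement(speak, "p")

    # First sentence: the headline, spoken with emphasis.
    first = ET.SubElement(para, "s")
    emphasis = ET.SubElement(first, "emphasis")
    emphasis.text = headline

    # A half-second pause before the body text.
    ET.SubElement(para, "break", {"time": "500ms"})

    second = ET.SubElement(para, "s")
    second.text = body
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("ECM is hot.", "But what is it?"))
```

The markup carries instructions about emphasis and pauses alongside the words themselves, which is the sense in which it adds a layer of meaning that a speech synthesiser can act on.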

We are part of the way there, with XML and emerging Web services protocols, but the Semantic Web will contain much more meaning. It will enable intelligent software agents to perform many of the searches and conduct many of the transactions that can currently only be undertaken by humans. Extend that capability to voice, and the possibilities are endless.

Consider also that voice traffic over the telephone is now largely digital, and therefore capable of being stored on disk. Most voice traffic is still lost in the ether once the words are uttered, but there are increasing demands that it be stored, for both security and commercial purposes. Storage is now so cheap that we have both the technology and the affordability to do this.

Already clever technologies exist to mine unstructured voice data, just as we currently mine structured transactional data for patterns of interest to marketers or governments. This stuff is still in its infancy, but it will be big business in the years to come.

We are at an interesting inflection point in IT. Up until now, we have focused on the "T" - the technology. The real battle in this millennium will be over how we use the "I" - the information. The future belongs to unstructured data. ECM is just the first step.

graeme@philipson.info

