Josh Payne on content analytics, enterprise content and information management

Archive for the ‘Content Assessment’ Category

Throwing disk at the problem isn’t the long term solution


Last week I gave 8 talks on content analytics over the course of 2 regional marketing events in Washington DC and Atlanta. Giving that many talks on related topics in such a short period, I found myself locking in on a few key statistics and facts, and I was reminded of them as I read Craig Rhinehart’s most recent missive on his blog. In my talks last week I similarly made the point that the “save everything” ethos described by Craig is losing steam. Why? The cost of storage isn’t dropping as quickly as information is being generated. Organizations are coming to the realization that it’s simply not cost effective to ‘throw storage’ at the problem. The statistic I found myself using repeatedly last week was cited in a recent Forrester blog posting:


It’s no surprise that Forrester clients report their storage capacity requirements are growing 20% to 40% each year. Storage costs have grown to 17% of the IT hardware budget, up from 10% in 2007.

That jump from 10% to 17% is what I found myself repeating last week. Cost per GB is going down every year. But organizations keep on spending more and more of their budget on keeping stuff. Throwing more storage at the problem (and avoiding the cause) has simply led to increased costs across the board. Not the hallmark of an effective, long-term solution.
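A quick back-of-the-envelope sketch shows why falling prices don't rescue the budget. The 20% to 40% capacity growth range comes from the Forrester quote above; the annual price-decline rates are purely my own illustrative assumptions, not figures from Forrester, and the function name is made up for this example.

```python
def relative_spend(capacity_growth, price_decline, years):
    """Storage spend after `years`, relative to a starting spend of 1.0.

    capacity_growth: yearly growth in stored capacity (0.20 means 20%)
    price_decline:   yearly drop in cost per GB (an assumed value)
    """
    yearly_factor = (1 + capacity_growth) * (1 - price_decline)
    return yearly_factor ** years


if __name__ == "__main__":
    # Capacity growth range quoted by Forrester.
    for growth in (0.20, 0.40):
        # Hypothetical per-GB price declines, for illustration only.
        for decline in (0.10, 0.20, 0.30):
            spend = relative_spend(growth, decline, years=3)
            print(f"growth {growth:.0%}, price drop {decline:.0%}: "
                  f"spend after 3 years = {spend:.2f}x")
```

Under these assumptions, capacity growing 40% a year against a 20% yearly price drop still pushes spend up 12% a year (1.40 × 0.80 = 1.12). Spend only shrinks when the price decline outpaces the growth in what you keep.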


Written by Josh Payne

May 10, 2010 at 1:21 pm

Posted in Content Assessment

You Need Content Analytics to Determine the Value of Content


I went on vacation last week.* (Side note: though I’ve embraced Twitter, Foursquare and other modern public media platforms, I’ve yet to embrace the idea of broadcasting to the world at large the fact that my house was completely empty and I was 1000 miles away – call me old fashioned if you must.)

I mention it not to gloat about how much fun I had with my kids, but to bring up what I did the day before I departed. Again, call me old fashioned, but I typically get my books not from Amazon, a bookstore or via an iPad, but from a more cost-effective source: the public library. Quaint, I know.

When I go to the library, I can’t go without a plan. I can’t simply browse the stacks to find a good book. Yes, the library is well organized (good classifications!). And each book has good information on the cover describing the contents (standard metadata!) like author and title. But that information exterior to the contents just is not effective in helping me quickly determine the value of a book relative to my needs. I prepare in advance by reading reviews by others – people who’ve read the books and analyzed their value. Otherwise finding a couple of good books for my vacation is an overwhelming and frustrating task.

The same idea – expending effort to analyze the long-form text inside content – applies to the content inside your organization. In previous postings I’ve discussed the value of content assessment to your organization. And to execute content assessment you need content analytics. Historic approaches to the content assessment problem have focused on metadata exterior to a document – the title, the author, the dates. This is much like trying to find a library book just by browsing the stacks. Determining what content is necessary to your organization – what content is valuable, requires governance, or is legally relevant – is virtually impossible simply by examining data exterior to your content.

Content analytics provides your organization the ability to determine the value of your content by interrogating the interior of those documents. Metadata on the outside of a document is only part of the story. What concepts are covered in the document? Does this document concern itself with a customer? A business partner? Does this document concern itself with a particular business activity?
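To make the exterior versus interior distinction concrete, here is a deliberately tiny sketch. It is not how any content analytics product works; real analytics goes far beyond keyword matching. The concept names and term lists are hypothetical, invented only for this illustration.

```python
import re

# Hypothetical concept vocabulary, for illustration only.
CONCEPT_TERMS = {
    "customer": {"customer", "client", "account"},
    "business partner": {"partner", "reseller", "supplier"},
    "contract activity": {"contract", "agreement", "renewal"},
}


def concepts_in(text):
    """Return the sorted names of concepts whose terms appear in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(name for name, terms in CONCEPT_TERMS.items() if words & terms)


if __name__ == "__main__":
    # Exterior metadata: tells us almost nothing about value.
    metadata = {"title": "Document1.doc", "author": "jsmith"}
    # Interior text: reveals what the document is actually about.
    body = "Renewal terms for the Acme account agreement"
    print(metadata["title"], "->", concepts_in(body))
```

The title and author say nothing useful about this file, while even a crude look at the body suggests it concerns a customer and a contract. That is the gap between browsing the stacks and reading the reviews.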

All of these questions are difficult to answer without examining the text in a document. But given the volume of information in your organization, it’s difficult to make these assessments on a large scale. In my next posting I’ll cover how content analytics can help to answer these valuation questions.

Written by Josh Payne

May 7, 2010 at 7:48 am

The Value of Content Assessment


In my previous post, the first in my series on content assessment, I described the information landscape with respect to content. Organizations are facing ever increasing volume, velocity and variety of information. Understanding growing piles of uncontrolled content through content analytics has clear benefits to organizations of every size. Each organization – and the range of stakeholders in those organizations – will benefit from engaging in content assessment. How? In three main ways:

1) There is value to all stakeholders in simply understanding content better through analytics. Dynamically analyzing silos of unmanaged, uncontrolled content via content analytics provides insight about this information that stakeholders previously did not have. Before, stakeholders simply knew the ‘speeds and feeds’ of a content repository: the number of documents, the size of those documents, etc. Content analytics now delivers insight about the content itself, and that insight leads to better, more informed decision making. Which areas represent the most risk? Where should we start our governance efforts? Where should our priorities lie? What is the projected ROI of better information lifecycle governance?

Today, organizations make these kinds of decisions about their unstructured content repositories with limited data. More likely, they avoid making decisions because they lack this kind of insight. No longer. Improved understanding and insight about your unstructured information leads to better decisions about how to take action.

2) One such action is to decommission content, the systems that support that content and the systems that rely upon that content. Decommissioning is primarily an IT concern, since IT manages the costs of the information infrastructure. By default, most organizations have been doing nothing with their content. And as such their infrastructure costs have continued to rise. With an understanding of the content, you can take on these once-avoided decisions with more confidence. By understanding the content in a particular system, you can take action to shut that system down and save costs.

3) There is a flip side to decommissioning old content and the systems that support content. It is that by understanding content, you will be empowered to preserve the necessary content. Preserving the necessary content enables the decommissioning you want to execute.

Content assessment provides you the ability to identify content that is valuable. This makes general line of business users happy, as they are resistant to decommissioning because they don’t want you to throw away ‘something they’ll need’ in the future.

Content assessment provides you the tools to identify content that requires lifecycle governance. The compliance officers and records managers will be happy because your organization’s obligations will be met in a documented process. You will be taking steps to enforce your content policies on disposition of content while still working to control your costs.

Content assessment provides you the tools to identify content that is legally relevant. The lawyers will be happy because they can use it to find the information relevant to legal cases where it resides in uncontrolled environments – and exert the kind of control the eDiscovery process demands.

Three main ways content assessment delivers value to your organization: via understanding of your content on its own; via decommissioning and the consequent reduction of IT cost; via preservation and governance that fulfill the needs of line-of-business stakeholders and compliance-minded stakeholders alike.

Next in the content assessment series . . . what content is ‘necessary’ to your organization and how does content analytics help to make this determination?

Written by Josh Payne

April 21, 2010 at 9:30 am

My College Laundry Habits and Your Organization’s Content Habits


First in a series of posts on content assessment.

[Image caption: Not to scale . . . my college laundry piles were *much* bigger]

It has been quiet around this here blog. One reason was that the month of March saw two “once in 50 year” rain storms in the Boston area. I got to learn some valuable skills in flood prevention as a result – unfortunately, those lessons came at the cost of activities like blogging and tweeting . . . but I’m back and ready to roll with a series of posts on a topic I’ve been thinking about and working on over the past 3 months – content assessment.

I introduced this topic after our original announcement for our content assessment offering. And I’ve spent the last few months talking to IBM customers, analysts and other enterprise content professionals inside IBM. It’s an exciting application of content analytics technology to solve a class of problems that our customers have traditionally ignored . . . and hoped would go away – kind of like my laundry in college. Back then I kept on wearing my clean clothes day after day, hoping my laundry would magically wash itself. Not surprisingly, the dirty clothes kept piling up. Finally, a random Sunday afternoon would arrive; I’d wake up, bite the bullet and wash my clothes. Ah . . . to be 19 again . . . I digress.

Much as I continuously generated dirty clothes, organizations continue to generate content. And similar to the haphazard piles of laundry in my dorm room, these chaotic, uncontrolled piles of content aren’t cleaning themselves up. And these piles of content are growing at a much faster pace.

In college, I’d wait until I couldn’t stand it anymore. And then I’d take action to take control of my clothing situation. With the velocity, volume and variety of content growth, organizations are hitting a similar stage. They can’t maintain the same ‘do nothing, save everything’ practices for their content. The day has arrived to tackle those piles.

For IT, costs continue to rise (storage alone now accounts for 17% of IT hardware budgets, up from 10% just a few years ago). Records managers increasingly realize they can’t rely on users to identify and control business records. Legal needs to find the documents they need for eDiscovery proceedings – and fast. Line of business users need better access to and control of trusted content to better execute their business activities.

These information stakeholders need better control over the information necessary to their business. But to take action to exert that control they need a better understanding of their content landscape. They see the mounds of content, as far as their virtual eye can see. Years of bad content habits have created an intimidating problem that leaves them paralyzed as to how to solve it.

Content assessment solutions – powered by innovations in content analytics – are now ready to meet this challenge. Content assessment solutions deliver the kind of understanding organizations need to make decisions about their content. Empowered with insight about their content via content analytics, organizations can now take action. They can take action by decommissioning the content they no longer need. They can take action by decommissioning the systems and infrastructure that support their unnecessary content. And they will be willing to take these cost cutting actions because they’ve identified and preserved the content that is necessary to their organization.

In the coming days and weeks, I’ll post more in this series on content assessment – covering in more detail who benefits from content assessment, what those benefits are, and the key elements of a content assessment solution. It’s an exciting new solution area.

You can’t avoid grappling with the piles of content . . . just as I couldn’t avoid doing laundry. If your content governance practices are analogous to my college laundry habits, content assessment is an idea you need to learn more about.

Written by Josh Payne

April 15, 2010 at 3:57 pm