As part of a burgeoning information governance (IG) project, one of the most helpful things any organization can undertake is an audit of its unstructured data (its content), and if it has serious IG ambition it really needs to acquire the technology and skills to run and interpret its own audits. In short, when you truly understand what you’ve got, a whole range of possibilities opens up, and experience suggests that the most obvious is to get rid of some junk. It’s a natural human reaction to feel that way, and it makes good sense to do it too.
In my line of work, I utter the above statement endlessly in many different forms and, despite much practice and a fast-maturing IG market, I still find myself having to work hard to convince some organizations that it’s true. I guess the real problem is that the bean counters want (need?) to see some numbers that make ROI crystal clear (see my other article), and unfortunately people find those numbers hard to produce. The issue often lies in a lack of clarity around not only the cost of storage but also how storage is charged, and around the intangibles that make up the resulting project (uncertainty of outcome, staff effort, stakeholder management, disposal/retention policies, managing legal hold etc). Let’s address these points head on:
With those points addressed, the question becomes one of cost-benefit and ROI. First and foremost, you need to understand the cost recovery model relevant to the organization and apply the right one. In short, if you have a pile of sunk capital costs in storage infrastructure, seeking operating cost reduction through clean up can lead to disappointment. Also, if your service arrangement does not reduce charges as volumes reduce, you’ll get no return there either. In such cases, you’re looking for a return at milestone points by avoiding part of the contracted or capital commitment you would otherwise need to make (avoiding the purchase of 20TB of enterprise storage might save more than USD 100k on its own). For operating cost reduction, the variables are complex and varied, so I’ll make some simplifications and assume a cost saving is possible based upon a volumetric charge.
It’s surprising how hard per-TB operational costs and charges can be to pin down, but let’s be clear that the true figure is MUCH MUCH more than the cost of storage hardware and that it must account for blended costs across storage tiers, including backup, cloud and similar. I see wide-ranging estimates, but across a global customer base and many different industries it all seems to boil down to a range of USD 5k to 10k (GBP 3.5k to 6.5k) per TB annually. Regardless of the real number, the point I want to make is that the saving from the clean up provides a budget that allows organizations to acquire and embed the necessary technology as a key capability for much broader, softer and more valuable IG benefits. Doing the maths suggests that modest content volumes (say 20TB) achieving a 30% clean up rate would provide a USD 35k/GBP 22k year 1 budget for the procurement. As volumes progress beyond 100TB, the budget becomes much more significant and can support a compelling standalone business case.
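To make the maths above easy to rerun with your own numbers, here is a minimal sketch in Python. It assumes a purely volumetric annual charge and uses the USD 5k to 10k per TB range quoted above. The function name `cleanup_budget` and its parameters are hypothetical, chosen only for illustration.

```python
def cleanup_budget(volume_tb, cleanup_pct, cost_per_tb_low, cost_per_tb_high):
    """Return the (low, high) annual saving from cleaning up a share of content.

    Assumption: charges fall in direct proportion to the volume removed.
    """
    reclaimed_tb = volume_tb * cleanup_pct / 100
    return reclaimed_tb * cost_per_tb_low, reclaimed_tb * cost_per_tb_high


# The article's example: 20TB of content with a 30% clean up rate,
# costed at USD 5k to 10k per TB per year.
low, high = cleanup_budget(20, 30, 5_000, 10_000)
print(f"20TB: USD {low:,.0f} to {high:,.0f} per year")

# The same assumptions applied to a 100TB estate.
low_100, high_100 = cleanup_budget(100, 30, 5_000, 10_000)
print(f"100TB: USD {low_100:,.0f} to {high_100:,.0f} per year")
```

Under these assumptions the 20TB case reclaims 6TB, and the USD 35k figure quoted above sits towards the cautious end of the resulting range. Swap in your own per-TB charge and clean up rate to see how sensitive the budget is to each.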
So, my point is that content clean up is best considered part of a broader approach to IG. In many ways it is a key first step because it can actually be connected to real savings (be they in-year operational savings or longer-term avoidance) and can therefore provide a budget to acquire the technologies, processes and skills that will be needed to implement many other parts of an IG programme or strategy. Gartner, Forrester and the Information Governance Initiative do those other parts justice in a number of their papers. Suffice to say that risk reduction, migration cost avoidance, operational efficiency etc all figure heavily, and getting started with content clean up can often make good sense.