
Don MacVittie




The Problem with Storage Growth is That No One is Minding the Store

Continued Growth of Storage Space amidst the Financial Crisis

In late 2008, IDC predicted an annual growth rate of more than 61% for unstructured data in traditional data centers through 2012. The numbers appear to hold up thus far, and may even have been conservative. This was one of the first reports to include growth from cloud storage providers in its numbers, and that group was showing a much higher rate of growth – understandable, since they have to turn up the storage they're going to resell. The update to this document, titled World Wide Enterprise Systems Storage Forecast and published in April of this year, shows that even in light of the recent financial troubles, storage space is continuing to grow.


I found both of these statistics interesting. People are taking storage tiering seriously these days, and the cloud is being considered for some storage uses – according to Greg Ness, archival storage is seeing solid pickup, which is not a bad use for it if you can adequately protect your archive files. But we're still chatting about NAS vs. SAN, so I suspect there's a decent amount of disjunction in what we're considering for alternate tiers (continuing what I started with "The Right Tier for the Job," I'll call cloud storage the Cloud Tier). And frankly, very few organizations truly understand what they are storing, file-wise. Automated tiering does a good job of slicing and dicing by file-system metadata, but do you know what's in those files? You're going to need to if you make use of the Cloud Tier. In fact, you need to know what types of unstructured data you have, how they relate to corporate goals, how sensitive those files are, and what aging policies apply to them. It's not enough to key off the extension, because one .doc or .docx file is the list of local hotels for interviewees and another is a summary of this quarter's financials. Those are definitely not equal data that should be handled by a single .docx policy that treats them exactly the same – particularly not if the policy points to the Cloud Tier and sends files off-site.
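To illustrate why keying off the extension is not enough, here is a toy sketch of content-aware tier assignment. The keyword list and tier names are entirely hypothetical – a real classifier would draw on data-classification tooling, not a four-word regex – but it shows the shape of the decision: same extension, different destinations.

```python
import re

# Hypothetical sensitivity keywords; a real classifier would be far richer.
SENSITIVE = re.compile(r"financial|revenue|salary|confidential", re.IGNORECASE)

def pick_tier(filename, text):
    """Assign a storage tier: extension alone is not enough, so peek at content."""
    if not filename.endswith((".doc", ".docx")):
        return "tier-2"   # default for non-documents in this sketch
    if SENSITIVE.search(text):
        return "tier-1"   # keep sensitive documents on-site
    return "cloud"        # routine documents can go to the Cloud Tier

# Two .docx files, very different data:
print(pick_tier("hotels.docx", "List of local hotels for interviewees"))  # cloud
print(pick_tier("q3.docx", "Summary of this quarter's financials"))       # tier-1
```

The point is not the regex – it's that the policy engine needs content or classification metadata as an input, not just the file-system metadata automated tiering already uses.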

So as you ramp up to maximize use of your storage, consider this: throwing hardware at the problem is a temporary fix until the new hardware fills up too. In-place deduplication can help, but functionally it is a cheaper version of throwing more hardware at the problem. Data growth will not stop any time soon; users will continue to make documents, presentations, spreadsheets, and a whole array of other unstructured data. Your best solution is multi-faceted. First, implement tiering to keep the cost of storage down. Second, if you trust your vendor, implement in-place dedupe, which will reduce storage space by some amount depending upon the nature of your data (as you can imagine, all those instances of the corporate logo placed inside documents will dedupe really well). Third, learn what you have on hand. It's a tall order, but truly understanding the sensitivity level of a given type of document from a given group can help you decide if it can be stored on the Cloud Tier, which, because it is pay-as-you-go, is attractive for storage of less frequently used documents.
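The corporate-logo point is easy to demonstrate with a minimal sketch of fixed-block deduplication (the simplest dedupe scheme; real arrays typically use variable-length chunking). The file contents here are made up, but two documents sharing the same embedded logo bytes show why repeated content dedupes well.

```python
import hashlib

def dedupe_ratio(files, block_size=4096):
    """Estimate savings from fixed-block dedup: unique blocks / total blocks.

    Lower is better; 1.0 means no two blocks were identical.
    """
    total, unique = 0, set()
    for data in files.values():
        for i in range(0, len(data), block_size):
            total += 1
            unique.add(hashlib.sha256(data[i:i + block_size]).hexdigest())
    return len(unique) / total if total else 1.0

# Two documents embedding the same "corporate logo" bytes dedupe well:
logo = b"\x89PNG-corporate-logo" * 500
files = {"a.docx": logo + b"hotel list", "b.docx": logo + b"financial summary"}
print(f"{dedupe_ratio(files):.2f}")  # shared logo blocks are stored only once
```

In this toy run, the two files share their first blocks (the logo) and differ only in the tail, so the array would store four unique blocks instead of six.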


But you have to know what's in those documents, or accept the risk associated with storing them on someone else's hardware. And most organizations are pretty conservative when it comes to storing their most critical information anywhere. Once the cat is out of the bag, it cannot be put back in, so erring on the side of protecting data is probably a solid plan. So to make the most of the Cloud Tier, you'll have to develop an understanding of your unstructured data that goes beyond "that LUN is assigned to accounting, so those documents need protection," because accounting has non-critical documents too. More than you might think. But you won't know unless you check.

The potential of cloud storage is, in my humble opinion (did you just abbreviate that while reading it?), the ability to charge departments back with a clean, no-questions-asked bill for storage of non-critical data. Part of the unstructured data explosion is that there is no – and I do mean NO – motivation to clean up. IT just keeps expanding storage capacity, and if anyone ever runs out of disk space, it's an emergency of epic proportions that IT should have prevented. If you store in the Cloud Tier, though, you can readily present each department with a monthly bill for their share of the burgeoning data storage problem. Oddly enough, when it comes out of their budget, managers are much more likely to espouse a desire for their employees to keep their storage neat and tidy. Funny how that works.
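The chargeback itself is almost trivially simple once usage is tracked per department – which is exactly why pay-as-you-go makes it practical. A minimal sketch, with made-up departments, usage figures, and a hypothetical rate:

```python
RATE_PER_GB = 0.10  # hypothetical $ per GB-month; use your provider's actual rate

# Made-up per-department usage, in GB stored this month.
usage_gb = {"accounting": 820, "engineering": 2400, "marketing": 310}

def monthly_bill(usage):
    """Turn GB stored per department into a clean, no-questions-asked bill."""
    return {dept: round(gb * RATE_PER_GB, 2) for dept, gb in usage.items()}

for dept, cost in monthly_bill(usage_gb).items():
    print(f"{dept}: ${cost:.2f}")
```

The hard part isn't the arithmetic; it's attributing storage to departments in the first place, which the per-account billing of cloud providers gives you essentially for free.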



Okay then. Now that you're sold on the benefit of the Cloud Tier, let's extend that concept a little. You've implemented tiering, and you may or may not be utilizing the Cloud Tier. It's time for you to take charge of data growth. Pick a time – eighteen months to two years is good – and if a file hasn't been accessed in that long, archive it and delete the original. Seriously. No one is cleaning up the data; that's why storage needs are continually growing. Imagine if you had kept every plate, cup, and glass you have ever owned since birth, just in case you need it. That's how we're 'managing' storage.
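The "archive if untouched for eighteen months" rule is straightforward to sketch. The version below keys off last-access time, moving stale files into an archive directory (paths are hypothetical; a real job would archive to a tiered or off-site target, verify the copy before removing the original, and be aware that many volumes are mounted with access-time updates disabled, in which case modification time is the fallback).

```python
import os
import shutil
import time

CUTOFF_SECONDS = 18 * 30 * 24 * 3600  # roughly eighteen months

def archive_stale(root, archive_dir, now=None):
    """Move files whose last access is older than the cutoff into archive_dir.

    Returns the list of original paths that were archived.
    """
    now = time.time() if now is None else now
    os.makedirs(archive_dir, exist_ok=True)
    moved = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if now - os.stat(path).st_atime > CUTOFF_SECONDS:
                shutil.move(path, os.path.join(archive_dir, name))
                moved.append(path)
    return moved
```

Run regularly – a monthly cron job, say – this is the "perpetual spring cleaning" described below: the archive keeps the rare file someone actually needs back, while primary storage stops hoarding every plate and cup.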

With no motivation for business unit owners to drive file asceticism, and no willingness in IT to enforce it, you are at least partially responsible for the never-ending cycle of ever more storage. Sure, there are files that are still relevant after two years of being untouched… but we all know they are the exception, far from the rule. It's time to do some perpetual spring cleaning: out with the old, so you can bring in the new without paying for ever more space. But just like with public cloud storage, you must know what's on your storage or you can't make intelligent decisions.

Of course, you might be one of those rare exceptions – the CIO blog linked to in the "related articles" section talks about storage growth for a health care group. Since most of their storage is diagnostic images, there are laws regulating how long they must keep that data, even if the patient passes away. So they've got to find some other solution, but tiering and a private storage cloud helped them at least.

Or keep going the way you have been. NetApp just completed its best quarter ever, so if you stay on your current course, I’d recommend buying stock in them and EMC, because it’s easy to predict who’s paying and who’s earning on the current system of never-ending storage growth.


More Stories By Don MacVittie

Don MacVittie is founder of Ingrained Technology, a technical advocacy and software development consultancy. He has experience in application development, architecture, infrastructure, technical writing, DevOps, and IT management. MacVittie holds a B.S. in Computer Science from Northern Michigan University and an M.S. in Computer Science from Nova Southeastern University.