Archive for the ‘Information Governance & ILM2.0’ Category

Storage isn’t free. Never will be. Management costs (opex) overwhelm capital expenditures (capex). I continue to measure the cost of managing storage (CMS) in my research. The scary thing is that I find it is growing again: depending on organization size and complexity, it ranges from $10k to $35k/TB/yr.

Now a new metric. While I first published this metric in 1992, and continued publishing primary research on it through 1997, let me suggest that we have to look at it differently today. Here’s the math. The acquisition cost of most classes of disk arrays is between $1k and $4k per TB. That means the annual ratio of CMS to storage cost is still 7x to 10x (same as in 1992 – another scary thought). But guess what? That is the wrong way to look at it.

The top problem in storage in the datacenter, from when I first published on it in 1994 through today, is storage expansion. No, it is not hard to add disk drives. The problem is that expansion forces all storage practices, management, and services to expand as well to accommodate the new storage. It has a ripple effect. Now add the cost of managing storage. Adding 1TB adds ~$25k of incremental cost per year for large organizations. It is the per-year thing that gets you now. Storage doesn’t just go away. It has a life. A better way to look at the CMS is over at least a three-year life. Even retirement doesn’t mean capacity reduction; rather, it means replacement, so the cost is ongoing… But we have to pick a threshold, otherwise this gets ridiculous. At three years, the real CMS is $30k-$100k/TB and the ratio of Opex to Capex is really 20-30x. Wow!
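The three-year arithmetic can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and pairing the low CMS figure with the low acquisition cost (and high with high) is an assumption made here, not something the research states.

```python
def lifetime_cms(annual_cms_per_tb, years=3):
    """Cumulative cost of managing one TB over its service life."""
    return annual_cms_per_tb * years


def opex_to_capex(annual_cms_per_tb, acquisition_per_tb, years=3):
    """Ratio of lifetime management cost to one-time acquisition cost."""
    return lifetime_cms(annual_cms_per_tb, years) / acquisition_per_tb


# Ranges quoted in the post: CMS $10k-$35k/TB/yr, acquisition $1k-$4k/TB.
for annual_cms, acquisition in [(10_000, 1_000), (35_000, 4_000)]:
    total = lifetime_cms(annual_cms)
    ratio = opex_to_capex(annual_cms, acquisition)
    print(f"CMS ${annual_cms}/TB/yr -> ${total}/TB over 3 years, {ratio:.1f}x capex")
```

Under that pairing the three-year totals come to $30k and $105k per TB, in line with the rounded $30k-$100k range above. Other pairings of the two ranges give very different ratios, so treat the 20-30x figure as an order-of-magnitude statement.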

The point is that if we recognize the real cost of adding storage to the datacenter, we will be more judicious in its use. If you just stop and recognize that every TB of primary storage you add will add ~$50k of cost, what would you do? Buy less? Not necessarily. What you definitely would do is cost-reduce your practices by doing things like deletion, deduplication, and tiering. You can be more efficient in your use of storage, but that is a one-time deal. A change in efficiency does not change the shape of the consumption curve. It just resets the baseline. You still need to cost-reduce your practices.
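The "resets the baseline" point can be illustrated with a simple compound-growth model. The model itself, the 40% annual growth rate, and the 30% one-time efficiency gain are all assumptions chosen for illustration; none of them come from the research.

```python
import math


def capacity(base_tb, growth_rate, years):
    """Capacity after `years` of compound growth from `base_tb`."""
    return base_tb * (1 + growth_rate) ** years


def years_bought(efficiency_gain, growth_rate):
    """Years until the reduced footprint grows back to the original size."""
    return math.log(1 / (1 - efficiency_gain)) / math.log(1 + growth_rate)


growth = 0.40       # assumed annual capacity growth
gain = 0.30         # assumed one-time saving from dedup, deletion, etc.
baseline_tb = 100   # assumed starting capacity

for year in range(4):
    before = capacity(baseline_tb, growth, year)
    after = capacity(baseline_tb * (1 - gain), growth, year)
    # The ratio after/before stays fixed: same curve, lower starting point.
    print(year, round(before, 1), round(after, 1), round(after / before, 2))

print("years bought:", round(years_bought(gain, growth), 2))
```

In this model the efficiency gain lowers every point on the curve by the same fraction but leaves the growth rate untouched, so the saving buys a fixed delay rather than a change in trajectory.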

To summarize, I think that this is the best thing to happen to the datacenter in a long time. Due to budget constraints we are having to pay attention to practices and fix an IT system that is broken and does not scale due to ever growing cost.

I’d like to explain the many ways that information and data will be lost in a typical datacenter. Note that I say “will be lost.” Data loss is inevitable. The more information is handled, copied, moved, replicated, and migrated, and the more it ages, the more of it is lost.

The point is that loss happens. Let me say it plainly: you cannot stop data loss. The key questions are “How much will be lost?”, “Do you care?”, and “What can we do to reduce it?” Here is what I mean by lost.

There are four principal classes of loss.

The first class I call poor storage practices. By this I mean several things. In a relatively large file system with millions to trillions of files distributed across multiple sites, servers, desktops, test databases, DR sites, and remote web servers or service providers, trust me, lots of files will be misplaced and effectively lost by users and the system. Loss occurs if you can’t find it, read it, or interpret it. It doesn’t matter how it was caused. All these are valid forms of loss.

Additional storage problems come from poor document control practices, such as losing track of versions or ‘official records’, and are compounded if you are using external services. What happens when files are sent offsite to a web host or storage service and those services are down, corrupted, or go out of business, and you can’t get your files back? Loss happens. As we move into focusing on cloud storage, we’ll hear more of this problem surfacing. Remember, you risk fines or other penalties during litigation if information cannot be discovered and produced. This is a cost of loss.

The second class of loss is through poor security practices. The most obvious is when a hacker or employee gets through your firewalls and takes information, views confidential or private information, or changes or damages information. We have all heard countless stories now of lost notebooks or tapes containing millions of records with personally identifiable information. Those all count as forms of loss. One of the worst nightmares in litigation evidence control is when an ex-employee shows up with historical files and emails that you don’t have, since you followed your retention and deletion protocols and permanently deleted them on schedule. They took them while they were employees and now potentially have an advantage. Perhaps the only perspective to have on losing information is one of damage control and recovery. If you think otherwise, consider the next class of challenges.

The third class of loss is through human or operational errors. Human error is the number one cause of damage or loss and we are not likely to change that fact. It manifests in many ways, but the pertinent issue is whether or not your recovery systems work. Here’s the test. Your systems are faithfully backed up. But how often and how thoroughly have you tested recovery? Backup works great when it is write once, read never. But you might be surprised how often recovery is compromised. The alternative is to rebuild information from scratch. Cost estimates for doing this vary, ranging from $5k to $50k per megabyte. Factor that thinking into your recovery strategies.
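One simple way to test recovery is to restore into a scratch location and compare checksums against the originals. The sketch below is a minimal illustration, not a description of any particular backup product: the function names are hypothetical, and it assumes the source files have not changed since the backup was taken.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir, restored_dir):
    """Return {relative_path: problem} for files missing or altered after restore."""
    source_dir, restored_dir = Path(source_dir), Path(restored_dir)
    problems = {}
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        restored = restored_dir / rel
        if not restored.is_file():
            problems[rel.as_posix()] = "missing"
        elif sha256_of(src) != sha256_of(restored):
            problems[rel.as_posix()] = "checksum mismatch"
    return problems
```

An empty result means every source file came back bit-for-bit identical. On live systems the comparison would more realistically run against a checksum manifest recorded at backup time, since the source keeps changing.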

The fourth class of loss is caused by process or practice errors. First – inappropriate deletion processes. Deletion is good. You must delete expired and disposable information when you can; otherwise all you are doing is driving up operating costs, storage costs, and risk.

But, do it wrong and you may cause ‘spoliation’. Make sure your processes are correct and cleared with legal and then audit them.

Next, mistakes occur and here are two examples:

1st – during litigation evidence processing, if you lose authenticity or damage the chain of custody, the evidence is as good as lost. You may not be able to present it.

2nd – during migration events, many similar things can happen. It is safe to say that migration causes damage. After two migrations, most IT people will openly admit they have lost some portion of the information. Migration data loss is significant. That is why all digital information is at risk long-term. We just don’t have good physical and logical migration practices in place as an industry.

For long-term retention and preservation I strongly urge you to get expert help. Talk to me!

In August of 2003, at the advent of the marketing boom of Information Lifecycle Management (ILM), we formed the Data Management Forum (DMF) within the SNIA in a merger with the ‘Enhanced Backup Solutions Initiative’ that I was managing. In September, we created the ILM Initiative. At the time, ILM was just talk. No one had any products or a vision of what it could really become. Fortunately, I was able to do several customer research projects and work with the ILM Initiative to fill that gap. By January 2004, the ILM Initiative had produced a vision statement, definitions, and a roadmap for ILM that positioned it as a set of IT processes, not a product. ILM was defined as the policies and practices of managing information and data based on their value to the organization in the most cost-effective manner over their lifecycle. This concept required alignment of many practices, services, and tools across the datacenter, including new tools and new standards. Ironically, the vendor community was so embroiled in the hype cycle that they were branding anything and everything ILM. Archive, email archive, db-archive, and tiering became the mainstay synonymous product classes. Five startups were launched in 2003/2004 aiming at building the next ILM management platform. It was an energized time that left the customers confused and ultimately irritated at the muddle of ILM stories and incomplete practices.

Paradoxically, once the concept of ILM was born in its full glory in early 2004, implementation efforts began only to be frustrated by two main problems. The tools and metadata around which to instrument and automate ILM practices did not exist, and the question of how to get started kept running into dead ends. In June of 2006, DMF published a paper co-authored with ARMA aimed squarely at the ‘how do you get started’ problem. (Committees are hard places to produce and write new content ideas, so I ended up drafting much of the architectural content and John Webster was retained to arbitrate and tie all the text together. It took a while, but we finally succeeded.) The paper, “Collaboration: The New Standard of Excellence” (see my publications page for a copy), launched a quiet revolution and has had a profound impact on the adoption of ILM-based practices worldwide.

For the ARMA eDiscovery Conference in June 2007, I developed and presented a workflow expression to describe the entire process of getting started and beginning an ILM-based practice as a closed-loop holistic practice (presented below). This process paralleled existing service management processes such as ITIL (Information Technology Infrastructure Library) and matched existing RIM practices nicely. So, we adopted it in the ILM Initiative. Now what to call the entire process became a problem. So, we came up with the idea of framing the entire practice in service management terms (to break away from the entanglement of ILM) and using ILM as the implementation and infrastructure service management piece. The result was the term “Information-Centric Management”. There is an important distinction I’d like you to hang onto as we work through this term. Terry Yoshii of Intel asked the operative question at one point. He said, “What are we managing?” The answer is “we are operating services to meet the policies and requirements of the information and data in the datacenter over their lifecycle.” We are managing services, policies, service catalogs, people, systems, and the information and data across time. Here is an extract from a discussion recently held in the ILM Initiative.

++++++++++++++++++++++++++++++++++++++++++++++++++++++

DISCUSSION AROUND THE PAPER: “a terminology for ICM”

From: terry yoshii

Subject: RE: [ilmi] – Terminology of Information-Centric Management

Date: November 7, 2008 12:40:58 PM PST

Hi Michael,

My concern with “ICM” is that it could make things a bit more confusing … ECM, ICM, ILM, DM, SM, etc.  And I’m confused enough already.

The term information-centric management implies management services driven by information and/or information metadata.  That’s fine but with this interpretation, what is not clear to me is what you’re trying to manage.

ICM also seems to be much broader than ILM and I think I need a picture (including the relationship to ITIL/ITSM).

Some suggestions and comments on this process …

INFORMATION-CENTRIC MANAGEMENT PROCESS

Collaborate: with C-level sponsorship, bring the key departments together, agree on terminology,  and conduct a business impact assessment

And also define horizontal and vertical domains of authority/responsibility?

Identify: the organization’s information and data assets, compliance requirements, legal and security risks

Inventory? of business processes/applications, information, data, storage infrastructure?

Inventory? of business, regulatory, legal and security requirements and policies?

Classify: organize the information and data assets, including all information as it is created, into a small number of categories with common attributes and business requirements for management purposes

Classify business processes/applications, information, data, storage infrastructure?

Requirements: establish the business requirements and their corresponding policy classes for each class of information and data, set service-level objectives, and review the requirements with IT

Design: (it seems that most everyone skips this very important piece for some reason — Comment: MPeterson – I had assumed design and automation as part of “implementation”.  I do like it and propose we incorporate it. )

Establish ILM processes or modify existing IM processes to support ILM business requirements?

Define/update Information States to support ILM processes?

Define/update ILM specific policies

Define/update metadata attributes required to support ILM processes?

Define ILM Service operations, processes, and support structure

Define the storage infrastructure requirements

Plus design of all of the management stuff (i.e., ITIL interface, Metadata Repository & Mgmt., Policy Repository & Mgmt, Service Catalog, Service Agreements, Monitoring, Reporting, etc.)

Identify all gaps and requirements

Automate: Integrate and automate as many of the ILM processes as possible to maximize efficiency

Purchase and/or develop ILM automation tools

Purchase infrastructure components required to support ILM design

Implement: align the IT service catalog to the requirements and policies using ILM-based practices

Align business/application services required to IM/DM/SM capabilities through the IT storage service catalog?

Measure: audit and quantify the results, reporting to the managing committee

Improve: work together, measure the results, focus on improvement and set goals

BTW – I meant to mention this in other discussions but I think the Storage Service Catalog may need to map to the service domains and each requirement (i.e., Retention Period, Deletion, Protection, etc.) would then map to one or more specific capabilities within one or more service domains. Now, I’ll bet you want a picture. :-)

Terry

========

From Michael Peterson

Subject: Re: Terminology of Information-Centric Management DO WE CHANGE to ICM OR STAY WITH ILM?

Date: November 7, 2008 3:50:53 PM PST

Terry,

Thanks for the thoughts. Let me try to explain my perspective.  –and this is a good conversation to take to the community site, not bury in DMF email. I’d love to hear from your user community on a subset of this discussion.

The goal is to both step away from ILM a little (as it has been bastardized for so long)  and connect more tightly with service management practices as they deal with the operations and governance issues from a practice standpoint. We’ve found ICM describes what we’re espousing quite nicely at the front end of the process and it allows ILM-based practices to be the IT and infrastructure practice underneath ICM. So, I see ICM more like a module of ITIL that calls upon ILM-based practices for the implementation steps.  We fear we can’t lead with ILM anymore because of the fud around the term. We also know we have to frame the entire scope of the process otherwise it gets pigeonholed and confused again like ILM is currently. We know we need to make it look normal – thus the connection to service mgmt. And, we know we need to call it something, otherwise we can’t talk about it…  Nothing is easy here this late in the game.

So the perspective I have is this:

ICM is the most neutral and useful term we’ve found so far that defines a service mgmt practice that incorporates ILM-based practices just like a set of IT services. Said the other way around, IT uses ILM-based practices to implement the requirements and policies defined in the initial steps of ICM. Remember the workflow is:  Collaborate, Identify, Classify, Requirements, Implement, Measure, Improve. The whole process we are calling ICM. The implementation step is based on ILM.

We are certainly open to suggestions so we minimize the confusion going forward… And, for what it is worth, I have found some good supporting material for this change in the Information Management community and will discuss it separately.

You also said: “The term information-centric management implies management services driven by information and/or information metadata.  That’s fine but with this interpretation, what is not clear to me is what you’re trying to manage.”

Maybe you hit a key thought – Yes, that is right, we’re trying to manage services that support information and data in accordance with their value to the organization over their lifecycle. We’re not trying to manage the infrastructure or the information itself in the sense of work on or change the information’s content. Rather, we’re trying to operate services to meet the policies and requirements of the information and data over their lifecycle with efficiency.

We’ve invited members of the IEEE’s Mass Storage Systems and Technologies workgroup on digital preservation to join with SNIA members in reviewing the requirements for long-term retention and preservation. If you would like to participate in this discussion, please go to the DMF Community’s site and register to access it.

http://community.snia-dmf.org

This is an important conversation as we need to update these requirements and then extend them further as we consider the implications of bringing technologies and architectures to market to solve the two ‘holy grail’ problems of preservation – logical and physical migration. Please participate.

Three years ago we started the work on long-term digital information preservation in the Data Management Forum’s Long-Term Archive and Compliant Storage initiative, LTACSI. One of the first activities we held was a panel discussion at the SNIA’s June 2005 Symposium in Boston. Among the panelists were an archivist, MacKenzie Smith, Associate Director for Technology, MIT Libraries, and a datacenter practitioner, Jim Riggs, PERMS Program Manager, US Army, who has a huge long-term retention challenge. Now the room was full of about 70 storage ‘geeks’ – the types that frequent symposia such as this. But it was also attended by a few RIM/IT types and CTOs from the handful of emerging archive systems companies like Permabit and Archivas, some email archiving companies, as well as a contingent of the CAS group from EMC. MacKenzie surprised us all when she told us in clear terms how difficult her work was with today’s storage systems and that the way we looked at ‘archive’ was wrong.

Point 1:

  • Based on feedback we got there, from our engagements with RIM and IT practitioners from ARMA and other groups including the SNIA End-User Council, and then from the important “Long-Term Digital Information Retention Requirements Study” I conducted for SNIA and published in January of 2007, we were continually admonished to stop using the “archive” word as it was too confused.
  • Here is a poignant quote from the survey: “Records retention is different than depositing something in an archive. Archiving is a very problematic word and I would suggest not using it. It suggests dumping records into some bottomless pit where they can be forgotten. (Instead) Ingest (them) into a record keeping environment where they can be permanently preserved for long-term records retention seems better.”

Point 2:

  • Engagements with ARMA’s RIM community and work on regulatory compliance brought out the importance of retention-periods, the setting of retention requirements, and proper disposition (meaning permanent deletion) of expired information to reduce the volume of information being stored long-term.

  • Paradoxically, our requirements survey as well as many informal audience surveys at conferences tell us that approximately 80% of the IT community still don’t know the requirements for the information they manage. A gauge of this disconnect can be seen in the many retention-requirements documents produced by RIMs that contain 2000 to 4000 specific record types and retention schedules. IT and IT systems can’t handle that type of granularity. (Thankfully, this thinking is dying out as people start talking and working together – we see classification catching on using just a few buckets.)
  • This gap is very important as it led us to begin the work with ARMA in stating that “Collaboration” is the starting point to “information-centric management” just as setting requirements for that information based on its value to the organization is the starting point for Information Lifecycle Management, ILM, based practices. (See the white paper we co-authored: “Collaboration: the New Standard of Excellence” linked on my publications page.)
  • Think about it now. Retention requirements are the focal issue for legal and RIM. Storing information off into a silo is the focus of IT because they don’t have the authority to delete anything. No wonder we have a disconnect around what archive means. Here are some definitions from their 2007 glossaries that illustrate the difference in thinking:

o ARMA – RIM: (context retention) 1. Used for electronic records, it is the procedure for transferring information from an active file to an inactive file, storage medium, or facility. 2. Act of creating a backup copy of computer files. See also BACKUP

o Society of American Archivists – Archivists: (context computing) – To store data offline.

o SNIA – IT: (context ILM) – (verb) To copy or move data for purposes of retention; to create an archive.

  • OK, I have to say something here about using backup for an ‘archive’. Don’t. Completely wrong thinking.  We’re trying to kill that message everywhere we can.

Point 3:

  • More information is being held long-term by more companies than any of us expected. In the requirements survey, 83% of the 110 companies responding to this question reported that they have to keep some information over 50 years.
  • What is long-term? Isn’t it relative? Yes, but we still need a number. Read my discussion on the definition of “long-term” in the requirements study for the details on how this was derived, but for now let me just make the statement. In the LTACSI, we’ve adopted the definition that long-term is the period of time beyond which you start losing data. Today, that number is 10-15 years.

Now you have the background for what I want to say. The point is that we have to shift our thinking to using retention and preservation as the key terms, not archive. Let’s redefine archive similarly to what the digital archivist and library communities did in OAIS as an “electronic archive” defining a type of repository for long-term preservation, not as a verb which the storage community uses to connote “moving data into an electronic archive.” Throw the verb out! It is wrong thinking anyway as the notion of moving information around as it ages just adds cost and complexity. (aha, another discussion thread…)

The beauty of this switch is that it also changes our frame of reference and helps move the organization down the path towards information-centric management. Now, you don’t just say the words and it’s over. There is important work to do:

  • First, IT, RIM, legal, security, and the business groups have to get together and collaborate to identify their information assets, classify them into a manageable number of buckets, and then set the retention requirements. (And, while at it, set the other requirements too, please.) The mantra I teach for this process is “collaborate, identify, classify, requirements, implement, measure, improve”.
  • Second, we need the storage industry to recognize that information services such as ILM, retention, preservation, deletion, etc., require the capabilities of managing information – not just the data. (See the discussion on the difference between digital information and data to fully appreciate this thought.)
  • Finally, we need a new storage architecture for long-term retention in the datacenter – not just a ‘preservation data store’ or another proprietary silo. And that is the point of this note. With it “archive” and backup go away and are replaced with retention and preservation.
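The classify and requirements steps in the first bullet can be sketched as a small data structure: many detailed record types collapse into a handful of buckets, and each bucket carries one retention requirement. Everything specific here is hypothetical and for illustration only: the bucket names, the retention periods, the record types, and the function names. Legal holds and the other requirements (security, protection, and so on) are deliberately not modeled.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class RetentionClass:
    name: str
    retention_years: Optional[int]  # None means preserve permanently


# Hypothetical buckets: a few classes instead of thousands of schedules.
CLASSES = {
    "transient": RetentionClass("transient", 1),
    "business": RetentionClass("business", 7),
    "permanent": RetentionClass("permanent", None),
}

# Hypothetical mapping from detailed record types to buckets.
RECORD_TYPE_TO_CLASS = {
    "meeting-notes": "transient",
    "invoice": "business",
    "purchase-order": "business",
    "engineering-drawing": "permanent",
}


def add_years(start, years):
    """Shift a date by whole years, mapping Feb 29 to Feb 28 when needed."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        return start.replace(year=start.year + years, day=28)


def eligible_for_disposition(record_type, created, today):
    """True once the record's class retention period has fully elapsed."""
    cls = CLASSES[RECORD_TYPE_TO_CLASS[record_type]]
    if cls.retention_years is None:
        return False
    return today >= add_years(created, cls.retention_years)
```

The design choice worth noting is that policy attaches to the bucket, not to the individual record type, so IT systems only need to understand a few classes while RIM keeps ownership of the mapping.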

I’ll discuss this architecture in another post titled “Virtualizing the secondary storage tier.”