Data Retention Policy: What to Keep and What to Delete

Discover Which Data Is Most Important to Your Organization—And How You Can Get Rid of What’s Not

Nov 6, 2023 | All, Featured, Technology

Data is probably one of the most critical assets an enterprise can have today. As organizations progress in their digital transformation journeys, information quickly becomes the key to unlocking valuable insights that can change everything. With the right technology and analytics tools, organizations can use data from all of their systems and applications to improve operations, streamline processes, deliver more personalized customer experiences, help employees work more efficiently, identify and overcome production bottlenecks and much more.

Because digital systems consume large amounts of data and generate even more, the volume of information that must be stored, managed, organized and analyzed is growing almost out of control for some organizations. Modern high-speed, high-capacity storage is helping resolve this challenge, but over time massive stores of data can get expensive. As collections of data grow ever larger, the question becomes: Should an organization keep it all?

The quick and easy answer is no. Some experts estimate that up to 30% of enterprise data is considered redundant, obsolete or trivial. And research from data platform vendor Splunk discovered that 60% of organizations report that “half or more of their data is dark—which means its value is unknown.”

The really tough part of that question is which data should be kept and which should be deleted. There is no widely agreed-upon answer. It differs for each organization, depending on what data it has and what its business is.

There are, however, some best practices and general guidelines that can help when it comes to policies for business and enterprise data retention.

 

 

Types of data to generally keep

It’s probably safe to say that most of an organization’s data can be useful. Obviously, that would include information that feeds the day-to-day systems and applications needed to operate the business. But it goes beyond that as well. Some of the types of data that are important to retain include:

 

      • Customer demographics and purchasing habits
      • Industry data, market trends and information about competitors
      • Historical data such as past sales and trends to help track performance over time
      • Internal administrative data such as financial statements, business plans, etc.
      • Marketing data and advertising metrics
      • Product performance data
      • Sales and transaction data, including bookings, meetings, etc.
      • Social media data from customers and prospects to see how they are talking about the organization
      • Internal and external communications records
      • Accounting and financial data
      • Technology data, such as which technologies and tools employees use and prefer
      • HR analytics and employee records

Regulatory compliance is key

Most industries are subject to some (or many) governmental and industry regulations covering which data can be gathered and how it should be stored, used, shared and retained. For instance, healthcare organizations in the United States must comply with HIPAA (the Health Insurance Portability and Accountability Act), which governs how sensitive patient data is used and accessed. There are many conditions around privacy, including which types of data organizations can collect from patients and how long they can retain it.

 

Types of data to delete

As data storage costs decrease, it can be tempting for some organizations to think they should just keep all of their data. If it’s not useful now, it might be sometime in the future. But that sort of thinking can lead to big problems down the line.

Why? The biggest issue is that the potential amount of data is just too massive. In some industries, especially those involved in intensive research like astronomy and particle physics, organizations could be generating terabytes of data every day. The bigger the data store, the more time and effort it takes for IT to manage and control it. Simply backing up huge stores of data will essentially double or triple capacity needs, and costs along with them.
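To make the "double or triple" point concrete, here is a minimal arithmetic sketch. It assumes every backup is a full, uncompressed copy with no deduplication, which is a simplification; the function name and the 500 TB figure are hypothetical and chosen only for illustration.

```python
def total_capacity_tb(primary_tb: float, backup_copies: int) -> float:
    """Raw capacity needed for primary data plus its backup copies.

    Assumption: each backup is a full, uncompressed, non-deduplicated copy.
    """
    if primary_tb < 0 or backup_copies < 0:
        raise ValueError("inputs must be non-negative")
    return primary_tb * (1 + backup_copies)


# Hypothetical example: 500 TB of primary data.
one_copy = total_capacity_tb(500, 1)    # one backup copy doubles the footprint
two_copies = total_capacity_tb(500, 2)  # two backup copies triple it
```

Real backup systems often use compression, deduplication or incremental backups, so actual overhead may be lower. The point is only that every byte retained is paid for more than once.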

Hanging on to obsolete data can take a toll on productivity, too. It can bloat file servers and directories. A 2021 study by Wakefield Research discovered that “54% of U.S. office professionals agreed that they spend more time searching for documents and files than responding to emails and messages.”

No one denies that having the right data can be a game changer, and it will undoubtedly be a lot of data. But if some of that information is duplicated, outdated, irrelevant or simply unimportant, there’s no reason to retain it.
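Duplicated files are usually the easiest of these categories to find automatically. The following is a minimal sketch, not a production tool: it groups files with byte-identical content by hashing them with SHA-256. The function names are hypothetical, and the sketch assumes the files are readable and small enough in number to hash in one pass.

```python
from __future__ import annotations

import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root whose contents are byte-for-byte identical."""
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    # Only digests shared by two or more files indicate duplicates.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Calling find_duplicates(Path("/some/share")) returns a list of groups, each containing the paths that share the same content. The sketch only reports candidates; deciding which copy to keep remains a policy decision.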

In an article for CIO.com, Veritas Senior Director Jasmit Sagoo urged organizations to get rid of all that redundant, obsolete or trivial data because it could pose a security risk in the future. He said:

“This is data that holds little or no business value and should be proactively deleted, especially when considering the data’s exposure and level of risk. For example, ex-employee and ex-customer data is very high risk. It can contain personally identifiable information so it’s only worth keeping this data for legal reasons. Financial records are particularly vulnerable to hackers and another example of sensitive data that needs to be managed carefully.”

Other types of data that could open organizations to increased risk include:

      • User passwords stored in plain text or unencrypted files
      • Data associated with outdated, obsolete production systems or websites
      • Old copies of customer databases in the form of extracted XLS or CSV files
      • Any records or personally identifiable information that is past its mandated retention period
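Some of the items in the list above, such as old CSV or XLS extracts, can be surfaced with a simple scan. Here is a minimal sketch under stated assumptions: the set of file suffixes and the function name are hypothetical, and a file's last-modified time is used as a rough stand-in for its age, which will not hold if files have been copied or touched since they were extracted.

```python
from __future__ import annotations

import time
from pathlib import Path
from typing import Optional

# Hypothetical set of "extract" file types, for illustration only.
EXTRACT_SUFFIXES = {".csv", ".xls", ".xlsx"}


def stale_extracts(
    root: Path, max_age_days: int, now: Optional[float] = None
) -> list[Path]:
    """List CSV/spreadsheet files under root not modified in max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    return sorted(
        p
        for p in root.rglob("*")
        if p.is_file()
        and p.suffix.lower() in EXTRACT_SUFFIXES
        and p.stat().st_mtime < cutoff
    )
```

For example, stale_extracts(Path("/some/share"), 365) would list extracts untouched for a year. The output is a review list, not a deletion list: whether a given file is past its mandated retention period depends on the applicable rules.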

So beyond these few examples, how can organizations determine which data to keep or delete? Sagoo told CIO.com, “As a starting point, businesses need to be able to identify specific details within data, pinpoint the areas of risk and its potential value. It’s also important to understand what is stored, who is accessing it and how often.”

That information will form the basis for the all-important data retention policy.

 

 

Data retention policy: What it is and best practices

A data retention policy is a set of guidelines created by an organization that sets out how, and for how long, it keeps information, especially where operational or regulatory compliance is involved.

As mentioned above, it takes a clear view into an organization’s stores of data to create a solid and successful data retention policy. The policy is a critical part of the organization’s overall data management strategy.

General recommendations for developing a data retention policy include the following:

      • Decide who’ll be responsible for creating the policy. This is typically a team of stakeholders because it requires a wide range of expertise from across the entire organization. Part of this step is also determining who will oversee the data retention policy after it’s created and make sure it’s being followed correctly.
      • Gain a thorough understanding of what types of data you have, how it’s used, where it’s stored, its value to the company and the risk it could pose if stolen by hackers. This likely includes data mapping, which can be time-consuming. But you can’t protect, monitor or govern what you don’t know you have.
      • Gain a clear understanding of your legal requirements, including all the regulations from industry associations and governments that dictate data storage, management, usage and retention requirements.
      • Determine your business requirements for the data. This includes deciding which types of data to retain for business purposes and how, and listing the business reasons each should be retained, plus next steps such as the process of moving the data into archives and eventually deleting it.
      • Decide who will be responsible for doing internal audits to ensure policy compliance.
      • Decide how often the policy should be reviewed and updated.
      • Determine how data retention decisions will be implemented, monitored and overseen at the application level.
      • Write the official data retention policy and get approval from all key stakeholders.

Some best practices for data retention policies:

      • Avoid creating a single all-inclusive data retention policy that applies to every type of data. There are many reasons to hold on to some types of information longer than others. The policy should clearly differentiate each type of data and its unique retention requirements.
      • The policy should make a clear distinction between backup data and archived data. Backups are used to restore current data if it’s lost or stolen, and archives store data that is no longer being used but that still needs to be retained for compliance purposes.
      • When creating the policy, be sure to decide how to organize the information within it so it’s easy for users to search and access later—or to delete anything that is no longer applicable.
      • Some organizations use a premade template for creating their data retention policy. Many templates are available online and some are even free of charge.
      • For long-term data stored in archives, be sure everyone understands the time and fees involved in accessing that data.
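The first best practice above, separate retention requirements for each type of data, can be expressed as a small lookup table. The following is a minimal sketch only: the data type names and every retention period are hypothetical placeholders, not recommendations, and real values must come from an organization's legal, regulatory and business requirements.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RetentionRule:
    archive_after_days: int  # move out of active storage at this age
    delete_after_days: int   # delete once the record reaches this age


# Hypothetical schedule for illustration only.
SCHEDULE = {
    "financial_record": RetentionRule(archive_after_days=365, delete_after_days=7 * 365),
    "marketing_metric": RetentionRule(archive_after_days=180, delete_after_days=2 * 365),
    "ex_customer_pii": RetentionRule(archive_after_days=0, delete_after_days=365),
}


def retention_action(data_type: str, created: date, today: date) -> str:
    """Return 'keep', 'archive', 'delete' or 'review' for one record."""
    rule = SCHEDULE.get(data_type)
    if rule is None:
        # No rule means no decision has been made yet; flag for review.
        return "review"
    age_days = (today - created).days
    if age_days >= rule.delete_after_days:
        return "delete"
    if age_days >= rule.archive_after_days:
        return "archive"
    return "keep"
```

Keeping the schedule as data rather than scattering it through application logic makes it easier to review and update on whatever cadence the policy sets. Unknown data types return "review" rather than defaulting to keep or delete, so gaps in the policy are surfaced instead of hidden.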

 

 

Benefits of a good data retention policy

With a solid data retention policy, organizations can rest assured they’re not holding on to data they no longer need or that is no longer relevant or current. Other benefits include:

      • Decreased likelihood of penalties for noncompliance with data regulations
      • Reduced storage costs
      • Improved productivity by eliminating the need to manage and administer outdated information
      • Reduced attack surfaces by eliminating old, unused data that could become a vulnerability in the system
      • Assurance that organizations are retaining the right amount of data for their needs and no more

 

Optimize use of saved data with NAND flash SSDs from Phison

As organizations develop their data retention policies and work to eliminate redundant, obsolete or trivial data, they will need reliable data storage for their most business-critical datasets and workloads. As a world leader in NAND flash controllers and SSDs, Phison can help.

Phison offers NAND flash SSDs and other storage solutions that deliver the speed, capacity, endurance and reliability that enterprises need to efficiently and easily store, manage, access, share and analyze their valuable data.

The Foundation that Accelerates Innovation™
