Dark Data: Managing the Data You Can’t See

In today’s era of seemingly infinite data volumes and complexity, many enterprises inadvertently ignore an entire category of data that is essential to their data protection and management. On average, more than 50% of a company’s data is “dark”: information held in data repositories that has no assigned or known value. Dark data not only costs companies an average of $26 million per year in storage, but also poses significant risks to an enterprise’s security and compliance efforts, making it more important than ever to address the fundamental issues that cause it.

Dark data threatens protection

Most companies are not clear about the data they need to protect. Because dark data is often out of sight and out of mind for many enterprises, dark data reservoirs — which contain sensitive and valuable data — are becoming an attractive target for cybercriminals and ransomware attacks.

In addition, nearly half of senior IT decision makers cannot confidently and accurately state how many cloud services their company currently uses, even as enterprises adopt multicloud approaches that combine on-premises and public cloud resources in their data infrastructure. If an organization doesn’t shine a light on dark data, especially dark data stored in the cloud, a multicloud approach can open the door wider to cyberattacks and cannot guarantee recovery at scale.

To survive any kind of ransomware attack, you need to know what your data is, where it is, and what it’s worth. The more organizations know about the data they hold, the better they understand how to protect it from risk and recover from an attack.

Dark data threatens compliance

Unencrypted and unstructured data also make it harder to keep up with constantly evolving regulations. For example, the California Consumer Privacy Act (CCPA), which is currently limited in scope but will come into full effect by January 2023, will require companies, including data brokers, to send consumers notices explaining their privacy practices.

While we don’t have a federal data compliance law yet, states are following California’s lead. As data privacy laws extend to Virginia, Colorado, Massachusetts and New York, companies that identify and catalog their most critical information, remove information that has no value, and ensure compliance with all local regulations are best positioned to proactively manage information risks and gaps in data management.

Tactically, enterprises can implement data capture, archiving, and monitoring capabilities to meet data compliance requirements. Better management of dark data helps companies comply with strict regulations and implement retention policies for their entire database.

Dark data and sustainability

In addition, dark data plays an important role in a company’s environmental compliance, another growing body of regulation. As companies develop sustainability programs to meet carbon reduction standards, the environmental cost of dark data must be a priority. Dark data storage was estimated to release 6.4 million tons of carbon dioxide into the atmosphere in 2020. And the outlook for the future is even worse: Analysts predict a 91 ZB increase in dark data by 2025 (more than four times the volume in 2020). This means that dark data will continue to emit carbon into the atmosphere at alarming rates.

To protect the planet from the waste of dark data, companies must rethink their data management strategies, identify valuable data and rid their data centers and clouds of unnecessary data. By properly managing dark data, there is a significant opportunity for companies to reduce their carbon footprint, comply with industry environmental regulations and meet sustainability goals that are increasingly important to a wide range of stakeholders.

Manage and protect dark data

It is clear that dark data poses a threat to the security and compliance of an enterprise. So how can data managers better identify, manage and protect dark data within their business?

First, data officers must adopt a proactive data governance mindset, one that enables organizations to gain visibility into their data, gain control over data-related risks, and make informed decisions about what data to keep or delete before a critical security event occurs.

Two tactics that data managers should implement to build this proactive mindset are data mapping, used to discover the sources and locations of collected and stored data, and data minimization, used to reduce the amount of data stored and to confirm that retained data is directly related to the purpose for which it was collected.
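To make data mapping and data minimization concrete, here is a minimal Python sketch. It assumes the data lives on a local file system and uses last-modified time as a rough proxy for staleness; the names `FileRecord`, `map_data` and `minimization_candidates`, and the 365-day window, are hypothetical choices for illustration, not a recommended policy or any vendor's method.

```python
import os
import time
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FileRecord:
    path: str
    size_bytes: int
    days_since_modified: float


def map_data(root: str, now: Optional[float] = None) -> List[FileRecord]:
    """Data mapping: walk a directory tree and record where data lives."""
    now = time.time() if now is None else now
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                stat = os.stat(path)
            except OSError:
                continue  # skip files that vanish or cannot be read
            age_days = (now - stat.st_mtime) / 86400  # seconds per day
            records.append(FileRecord(path, stat.st_size, age_days))
    return records


def minimization_candidates(
    records: List[FileRecord], max_age_days: float = 365.0
) -> List[FileRecord]:
    """Data minimization: list files older than an assumed retention window.

    The result is a review list only; nothing is deleted here.
    """
    return [r for r in records if r.days_since_modified > max_age_days]
```

A real inventory would also cover databases, cloud buckets and SaaS stores, and would check each candidate against the purpose for which the data was collected before anything is removed.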

Second, companies must also use technological advances to their advantage. Artificial intelligence (AI) and machine learning (ML) offer significant opportunities to effectively identify, manage and protect large pools of unencrypted, unstructured data and play a critical role in data management processes.

The ultimate goal is to manage the information, not just the data, at its source (the edge) by quickly scanning, tagging and classifying it, so that sensitive or risky data is well managed and protected no matter where it resides. As such, transparent AI and ML policies help companies gain full visibility into their data by detecting vulnerabilities and mitigating risks. That’s the next frontier.
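As a rough illustration of the scan, tag and classify flow, the Python sketch below uses two simple regular-expression rules as a stand-in for the AI and ML models the article has in mind. The patterns, the tag names and the two-level `classify` scheme are assumptions made for the example; production classifiers need far broader coverage and validation.

```python
import re

# Hypothetical detection rules, for illustration only.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def tag_text(text):
    """Scan text and return the sorted names of every rule that matches."""
    return sorted(tag for tag, pattern in PATTERNS.items() if pattern.search(text))


def classify(tags):
    """Map tags to a coarse label: any match marks the item as sensitive."""
    return "sensitive" if tags else "unclassified"


tags = tag_text("Contact jane@example.com, SSN 123-45-6789")
print(tags, classify(tags))  # ['email', 'us_ssn'] sensitive
```

Keeping the tags alongside the label means a reviewer can see why an item was flagged, which is one small way to keep such a policy transparent.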

Well-managed dark data provides organizations with a more secure and compliant future, lowers costs and enables actions through previously untapped intelligence, opening opportunities for organizational optimization and innovation within every business.

Ajay Bhatia is Vice President and General Manager, Data Compliance and Governance at Veritas Technologies
