Guess how much stored data is ever used or accessed

Interview: NetApp’s chief technology evangelist, Matt Watts, is concerned about sustainability and data waste, even as his employer withdraws third-party storage support from its BlueXP classification tool.

In 2023, just before the IT world became obsessed with AI, Watts wrote the foreword to a report [PDF] that highlighted how bad the data waste situation was becoming. That report noted that 41 percent of the data currently stored by UK organizations is unused and unwanted.

But “40 percent is low, very low,” Watts tells The Reg. “What we’ve seen come back is that in some cases it can be between 70 and 80 percent.”

That’s a lot of redundant data on active servers.

Until recently, NetApp’s BlueXP classification tool was heterogeneous. “It doesn’t just look at NetApp storage,” says Watts, “it looks at all types of storage. It even looks at the cloud providers, S3 buckets and all that stuff.

“It then gives us back a whole bunch of metadata to tell us who owns it, what the file is, the permissions, and it also tells us the last access times.”

Watts describes it as “a deeper dive into actually bringing out what the reality of the situation is, not the gut feeling, but what it actually looks like.”
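To make that kind of metadata sweep concrete, here is a minimal Python sketch of the general idea: walk a directory tree, record each file's owner, permissions, size, and last access time, then work out what share of the stored bytes has sat untouched. It is not how BlueXP classification works internally; the names `FileRecord`, `scan`, and `stale_fraction`, and the idle threshold, are hypothetical and chosen only for illustration.

```python
import os
import stat
import time
from dataclasses import dataclass


@dataclass
class FileRecord:
    """Metadata for one file (hypothetical structure, for illustration)."""
    path: str
    owner_uid: int      # numeric owner ID as reported by the filesystem
    mode: str           # permission string such as "-rw-r--r--"
    size_bytes: int
    last_access: float  # seconds since the epoch


def scan(root):
    """Walk `root` and collect metadata for every regular file."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks, sockets, devices
            records.append(FileRecord(
                path=path,
                owner_uid=st.st_uid,
                mode=stat.filemode(st.st_mode),
                size_bytes=st.st_size,
                last_access=st.st_atime,
            ))
    return records


def stale_fraction(records, max_idle_days, now=None):
    """Return the share of bytes (0.0 to 1.0) not accessed within
    `max_idle_days`. Assumes access times are trustworthy, which may
    not hold on filesystems mounted with noatime or relatime."""
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * 86400
    total = sum(r.size_bytes for r in records)
    if total == 0:
        return 0.0
    stale = sum(r.size_bytes for r in records if r.last_access < cutoff)
    return stale / total


# Example usage (the path and threshold are placeholders):
# records = scan("/srv/shared")
# print(f"{stale_fraction(records, max_idle_days=365):.0%} of bytes idle for a year")
```

Weighting by bytes rather than by file count is a design choice here: a handful of large, forgotten files can matter more for storage footprint than thousands of small ones.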

Watts estimates that storage accounts for 15 to 20 percent of data center energy consumption, with fluctuations depending on what the hard drives are doing. According to the UK’s National Grid, the country’s 400 to 600 commercial data centers account for around 2.5 percent of the UK’s electricity consumption. This figure is expected to rise to around six percent by 2030.

“The biggest challenge in managing data more effectively,” says Watts, “is ownership. And we’ve had this problem since I entered the IT industry over 30 years ago: who actually owns the data.

“You can give people as much knowledge as possible about what the data is. But you go into an organization and say to a group, ‘Well, it’s your data because you created it,’ and they’ll say, ‘No, no, no – it’s IT. They own the data.’

“And then you look at IT… and IT will say, ‘No, no, no, no – we’re just the custodians of the data.’”

It’s a challenge that many administrators will recognize, and one that makes NetApp’s decision to focus BlueXP classification (formerly Cloud Data Sense) on its own storage systems all the more difficult for environments that don’t run on the company’s technology.

“Initially the product was heterogeneous,” Watts explains. “So we added the ability to support S3, it scans every NFS mount point, SMB and all that stuff. And it was a paid piece of software.

“What we saw was a lot of partners trying to build services around it… So we made the decision, as part of the launch in May, that we would use it as a way to better differentiate ourselves.”

That differentiation has meant dropping support for the likes of Google Cloud Storage, Amazon S3, and OneDrive. BlueXP classification is now available at no additional cost as a core capability within BlueXP, but without many of its earlier features.

Given growing concerns about data waste and dark data clogging up data centers, could heterogeneous functionality make a comeback? Watts is non-committal: “I don’t think we’re ruling anything out. We continue to offer our partners the opportunity to use it heterogeneously.

“But going forward, the initial plan is for it to be no-cost and included for all customers. Longer term, we could change that. But right now we think this creates maximum value for people thinking about using NetApp or who are currently using NetApp.” ®
