Scaring horror fans since the days of black and white, zombie movies have captivated audiences for years and we’re constantly bombarded with new movies, shows and video games featuring them. Whether you are a fan of this genre or not, the talk of the living dead has now spilled over into the tech world.
Your organization might have disaster recovery and a secure technical environment that’s protected against a zombie apocalypse, but is the threat already lurking hidden in your system? We don’t wish to alarm you but zombie data is real and chances are it’s taking up valuable resources and posing risks in your environment even as you read this post…
In the tech world, zombies can take many forms and the term is used in a number of ways. It can refer to a computer connected to the Internet that has been compromised by a hacker, computer virus or trojan horse and can be used to perform malicious tasks of one sort or another under remote direction. Most owners of zombie computers are completely unaware that their system is being used in this way, hence these computers are metaphorically compared to zombies. Botnets of zombie computers are often used to spread e-mail spam and launch denial-of-service (DoS) attacks.
In Unix operating systems, a zombie is a ‘child’ process that has finished running but still occupies an entry in the process table because its ‘parent’ process has not yet collected its exit status.
Zombie data, on the other hand, represents a threat within most organizations. The term refers to enormous collections of data that lack purpose and insight. This information, which has usually originated from former employees, has no business value and no valid reason to be retained, yet it is still being preserved, backed up, and maintained on corporate networks. It’s zombie in that the user no longer exists and the data is inactive. (Note that zombie data is different from data under legal hold, even though legal holds often outlast the employees who originated the data in the first place.)
Most zombie data comes from files and file shares which IT organizations routinely dump off of devices when employees leave companies. And in most cases, it’s fairly straightforward to identify where the data came from, who it used to belong to, and ultimately what the company should do with it. If an organization has an information management strategy, all data within the company is retained, preserved or deleted using specific policies that enable compliance and good information governance practices.
One area that often causes organizations the biggest headache is PST files, the personal storage files that Outlook uses to keep mail outside the server mailbox.
PST files – A Zombie Zone Nightmare
As soon as a user is no longer maintained in Active Directory, any PST files they were using are technically ‘orphaned’, i.e., they have no current owner. In fact, a PST file can become orphaned as soon as it is detached from an Outlook profile, although in many cases the file can still be attributed to its original owner.
But not so fast. There are plenty of PST files that don’t have an obvious owner. Good examples are the “auto-archive” files which earlier versions of Outlook notoriously created as a means to back up older emails. The PST format may have gone out of fashion, but unless someone specifically deleted those files (or even knew enough to), they are still out there.
The problem gets compounded by backups and the transfer of legacy data onto corporate servers, either as part of routine desktop backups or as dumps of former employees’ data and devices.
The problem occurs because PST files are containers, not individual files, and once their correlation with the original owner is lost, even Outlook can’t restore those associations. These become orphaned PST files. These are your zombies – and chances are you have a lot of them.
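To make the distinction concrete, here is a minimal sketch in Python. It assumes you already have an inventory that maps each PST path to its last known owner account (or to nothing when no owner was ever recorded) and a list of accounts that are still active in your directory. The function name, the bucket names and the inventory format are all hypothetical and chosen for illustration only; how you build the inventory depends entirely on your environment.

```python
from typing import Optional


def classify_psts(
    inventory: dict[str, Optional[str]],
    active_accounts: set[str],
) -> dict[str, list[str]]:
    """Sort PST paths into 'owned', 'orphaned' and 'unattributed' buckets.

    inventory: hypothetical mapping of PST path -> last known owner account,
               or None when no owner could be determined.
    active_accounts: account names that still exist in the directory.
    """
    # Compare account names case-insensitively.
    active = {name.lower() for name in active_accounts}
    result: dict[str, list[str]] = {"owned": [], "orphaned": [], "unattributed": []}

    for path, owner in sorted(inventory.items()):
        if owner is None:
            # No owner on record at all: the hardest files to deal with.
            result["unattributed"].append(path)
        elif owner.lower() in active:
            result["owned"].append(path)
        else:
            # The owner is known but is no longer in the directory.
            result["orphaned"].append(path)
    return result
```

Files in the "unattributed" bucket correspond to the PSTs whose link to the original owner has been lost, which is where the real zombie problem lives.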
Why Do These Zombie Files Respawn?
As soon as IT personnel begin looking into orphaned PST files, it might look as if they self-replicate, but the situation starts very innocently: a former employee used PST files as a convenient filing system in Outlook, and he therefore backed them up onto a separate place on his hard drive. Corporate IT did a nightly desktop backup and thus replicated both copies of this employee’s PST files. When the employee left the company, IT dumped an image of his hard drive onto the corporate servers and proceeded to include this in backups as well. His replacement was given a copy of the former employee’s mailbox data, including the PSTs, so she could come up to speed on some older projects on which he had worked. Suddenly, the company had at least four copies of each PST that it was backing up.
Because the employee was no longer in the directory, these PSTs were unassigned, so they were reassigned to the replacement. But the backup copies never got reassigned, and this particular company continued to back up and maintain these zombie PST files for years until a new IT organization undertook a comprehensive look at its data retention policies.
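Replicated copies like these can often be spotted by comparing file contents rather than file names. The following is a rough sketch in Python, assuming the copies are byte-for-byte identical and that the servers or backup images are reachable as an ordinary directory tree. The function names are hypothetical, and this is one simple approach, not a description of how any particular product works.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so a large PST never has to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_psts(root: Path) -> dict[str, list[Path]]:
    """Group .pst files under root by content hash.

    Only groups containing two or more files are returned.
    """
    by_hash: dict[str, list[Path]] = defaultdict(list)
    for path in root.rglob("*"):
        # Match the extension case-insensitively (.pst, .PST, ...).
        if path.is_file() and path.suffix.lower() == ".pst":
            by_hash[sha256_of(path)].append(path)
    return {h: sorted(paths) for h, paths in by_hash.items() if len(paths) > 1}
```

Calling `find_duplicate_psts(Path("/srv/backups"))` (the path is only an example) returns each group of identical files. A copy that someone has opened and changed since it was made will hash differently, so this finds exact replicas only.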
Declare War on Zombie Data
Zombie data poses two distinct problems. The obvious issue is storage. Even though companies may look at storage as “cheap”, it really isn’t when it comes to zombie files: they are uncompressed, have no business value, and unlike a couple of emails, are measured in megabytes or more. They add up quickly, especially if they’re replicated across backups which are also saved on disk. Plus there’s overhead associated with any type of storage (even cloud). They really can suck the life out of your storage resources.
The second issue is less obvious, but a bigger problem. Zombie PST files may factor in eDiscovery. Legal queries typically stipulate the “who” and the timeframe and a few key terms; companies generally over-preserve to ensure they don’t accidentally delete something that’s relevant.
The IT organization will be tasked with finding any and all files relevant to the users being put on hold, and those often include former employees. So now all these zombie files need to be queried (not an easy process, since it’s not like reading a directory), searched for the appropriate employee or user, and, when found, put on hold, i.e. secured. The process is tedious and may turn up significantly more information that the company then needs to review and potentially produce, which is expensive. You can, however, bring these files back to life just long enough to extract any information needed for eDiscovery or eDisclosure purposes before you eliminate them completely.
Survival Means No More Living With The Dead
Believe it or not, getting rid of zombie data is fairly straightforward, if you a) admit that you probably have it and b) deploy the necessary technology tools to find and eliminate it. Many companies find that it makes sense to take care of this zombie data as part of a larger PST effort. PST files are superfluous once companies adopt modern archiving solutions; their contents become archives and are much more easily managed. It makes perfect sense to locate, migrate and eliminate these troublesome files prior to strategic IT projects such as cloud or Exchange migration, desktop refresh, VDI or BYOD initiatives.
Products like Barracuda’s PST Enterprise were designed to tackle all of a company’s PST challenges, including those zombie files. PST Enterprise deploys several routines to accurately identify the owners of orphaned or zombie PST files and allow the IT team to migrate, manage, and eliminate them. Often, the data is so far beyond the company’s retention scheme that it makes sense to simply delete that data, but only a product like PST Enterprise can query these files on a broad, automated basis, and provide the content details necessary to make retention decisions.
If companies are moving away from PST use, then a single pass at identifying and removing zombie data will ensure that it never returns.
For companies that continue to use PST files, for whatever purpose, tools like PST Enterprise need to become a routine part of the IT regimen, periodically inspecting servers and file shares for PSTs which are orphaned or fallow and taking the appropriate action against the energy-sucking living dead before you have a full-blown invasion on your hands.
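For a feel of what a periodic inspection involves at its very simplest, here is a small Python sketch that walks a file share and lists PST files that have not been modified for a long time, largest first. It only looks at file system metadata and does not open the PSTs themselves. The names `PstReport` and `find_stale_psts` and the one-year threshold are assumptions made for illustration; a dedicated tool does far more than this.

```python
import time
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class PstReport:
    path: Path
    size_bytes: int
    days_since_modified: float


def find_stale_psts(
    root: Path,
    max_age_days: float = 365.0,
    now: Optional[float] = None,
) -> list[PstReport]:
    """List .pst files under root not modified for at least max_age_days."""
    now = time.time() if now is None else now
    reports = []
    for path in root.rglob("*"):
        if not (path.is_file() and path.suffix.lower() == ".pst"):
            continue
        info = path.stat()
        age_days = (now - info.st_mtime) / 86400  # seconds per day
        if age_days >= max_age_days:
            reports.append(PstReport(path, info.st_size, age_days))
    # Largest files first, since they cost the most to keep around.
    return sorted(reports, key=lambda r: r.size_bytes, reverse=True)
```

A last-modified date is only a hint that a file is inactive. The resulting list is a starting point for review against your retention policy and any legal holds, not a list of files that are safe to delete.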
Rich is the Product Marketing Manager, Information Management. He's been with Barracuda since the acquisition of C2C Systems in 2014. Rich specializes in cloud-deployed solutions, information management, and archiving systems. His experience includes extensive work on OEM opportunities and the legal community.
You can email Rich at firstname.lastname@example.org.