A Short Guide About ZFS: The Last Word In File Systems

Being called the last word in file systems carries a lot of responsibility. ZFS was at one point said to stand for Zettabyte File System; the truth of the matter is that the name no longer stands for anything. This storage system combines a file system with a logical volume manager (LVM), which means it handles both the physical side of data storage (such as hard disk drives) and the logical side (the arrangement of that storage into volumes, the files stored within those volumes, and their status and condition).

Functionality

Though it’s commonly referred to as a copy-on-write file system, Oracle describes it more precisely as a redirect-on-write file system. The distinction between the two lies in how each handles the original blocks of data when those blocks are modified. A copy-on-write file system first copies the original data blocks to a new location and then overwrites the originals in place. A redirect-on-write file system such as ZFS instead writes the new data to a new location and updates the metadata so that it points (redirects) to the newly written blocks, leaving the original blocks untouched.
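The difference between the two update strategies can be sketched with a toy model. This is only an illustration under simplifying assumptions: the "disk" is a Python dict mapping block addresses to bytes, a file's metadata is a list of addresses, and every function and variable name is hypothetical rather than anything taken from ZFS itself.

```python
def copy_on_write_update(disk, snapshot_blocks, file_blocks, index, new_data, free_addr):
    """Copy-on-write: preserve the old data elsewhere, then overwrite in place."""
    addr = file_blocks[index]
    disk[free_addr] = disk[addr]        # 1. copy the original block to a new location
    snapshot_blocks[index] = free_addr  # 2. the snapshot now points at the preserved copy
    disk[addr] = new_data               # 3. overwrite the original block in place


def redirect_on_write_update(disk, file_blocks, index, new_data, free_addr):
    """Redirect-on-write: write new data elsewhere, then repoint the metadata."""
    disk[free_addr] = new_data          # 1. write the new data to a free location
    file_blocks[index] = free_addr      # 2. redirect the live file's pointer to it
    # The old block is untouched, so a snapshot still pointing at it stays valid.


# Copy-on-write: two data writes (copy + overwrite) per update.
cow_disk = {0: b"old"}
cow_file, cow_snap = [0], [0]
copy_on_write_update(cow_disk, cow_snap, cow_file, 0, b"new", free_addr=1)

# Redirect-on-write: one data write plus a pointer update.
row_disk = {0: b"old"}
row_file, row_snap = [0], [0]
redirect_on_write_update(row_disk, row_file, 0, b"new", free_addr=1)
```

In both cases the old and new contents end up on disk; what differs is which copy moved. Under redirect-on-write the live file's pointer changes and the snapshot's pointer does not, which is why no extra copy of the old data is needed.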

By design, ZFS is capable of managing hundreds or even thousands of drives, pooling all available storage as though it were one huge drive. For this reason, ZFS file systems are highly scalable and support very large maximum file sizes; if the system requires additional capacity, users only need to add more drives to the storage pool.

Features

ZFS Send/Receive. ZFS and OpenZFS users can take a “snapshot” of the file system and send it as a stream to other server nodes, where it is received and replicated. This makes it possible to copy data to separate systems for backup or for migration to the cloud.

ZFS and OpenZFS file systems can produce point-in-time copies of file systems quickly and efficiently because existing data blocks are never overwritten in place, so a copy only needs to reference blocks that are already on disk. Snapshots are immutable, read-only copies, while clones are writable copies created from a snapshot. With ZFS on Solaris, snapshots and clones are integrated into boot environments, allowing users to roll back to a previous snapshot if something goes wrong while updating or patching a system.
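To see why such copies are cheap, here is a minimal sketch of snapshots, clones, and rollback on top of an append-only block store. The `Pool` and `Dataset` classes and the helper functions are hypothetical names invented for this illustration; they model the idea only and are not how ZFS is implemented.

```python
class Pool:
    """Toy pool: append-only block storage shared by all datasets."""

    def __init__(self):
        self.blocks = []

    def write(self, data):
        self.blocks.append(data)
        return len(self.blocks) - 1  # address of the newly written block


class Dataset:
    """Toy file system: a table of pointers into the pool."""

    def __init__(self, pool, pointers=()):
        self.pool = pool
        self.pointers = list(pointers)

    def append(self, data):
        self.pointers.append(self.pool.write(data))

    def overwrite(self, index, data):
        # New data goes to a new block; the old block is left untouched.
        self.pointers[index] = self.pool.write(data)

    def read(self):
        return [self.pool.blocks[p] for p in self.pointers]

    def snapshot(self):
        # A snapshot is a frozen copy of the pointer table; no data is copied.
        return tuple(self.pointers)

    def rollback(self, snap):
        self.pointers = list(snap)


def clone(pool, snap):
    # A clone is a writable dataset that starts out sharing the snapshot's blocks.
    return Dataset(pool, snap)


def read_snapshot(pool, snap):
    return [pool.blocks[p] for p in snap]


pool = Pool()
fs = Dataset(pool)
fs.append(b"config v1")
snap = fs.snapshot()                         # point-in-time copy before patching
fs.overwrite(0, b"config v2 (broken)")       # the update goes wrong
dev = clone(pool, snap)                      # writable copy based on the snapshot
dev.overwrite(0, b"config v1 + experiment")
state_before_rollback = fs.read()
fs.rollback(snap)                            # return to the known-good state
```

Taking the snapshot copies only pointers, and the clone diverges from it one block at a time, so neither operation depends on how much data the file system holds.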

Lastly, ZFS snapshots and clones are beneficial as a recovery tool in the event of a ransomware attack.

Inline Data Compression. ZFS and OpenZFS file systems offer inline data compression as a built-in feature that reduces the number of bits required to store data. Both support several compression algorithms, and users can enable or disable inline compression as needed.
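The idea of compressing each block on its way to disk can be sketched as follows. This is a simplified illustration, not ZFS code: Python's `zlib` stands in for whichever algorithm is configured, the `compression` flag stands in for the on/off setting, and the rule of keeping the compressed form only when it is smaller is an assumption made for the example.

```python
import zlib


def store_block(data: bytes, compression: bool = True):
    """Return (stored_bytes, is_compressed) for one block."""
    if compression:
        packed = zlib.compress(data)
        if len(packed) < len(data):  # keep the compressed form only if it helps
            return packed, True
    return data, False


def load_block(stored: bytes, is_compressed: bool) -> bytes:
    """Reverse store_block so the reader always sees the original bytes."""
    return zlib.decompress(stored) if is_compressed else stored


text_block = b"A" * 4096                 # highly repetitive, compresses well
stored, packed = store_block(text_block)
tiny_block = b"xyz"                      # too small to benefit from compression
tiny_stored, tiny_packed = store_block(tiny_block)
```

Because compression happens inline, the application reads and writes the original bytes and never needs to know which form ended up on disk.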

Delegated Permissions and Tighter Access Control. Since security is an ever-growing concern for organizations today, one can appreciate how both ZFS and OpenZFS file systems feature delegated permissions and finer-grained access control lists. Other valuable security features offered by ZFS and OpenZFS file systems include the option to set them to read-only and data encryption, which is supported by Oracle on Solaris.

Inline Data Deduplication. This built-in feature improves storage efficiency by eliminating redundant copies of data. Both ZFS and OpenZFS compare checksums (small, fixed-size values computed from the contents of each block of data) in search of duplicates. Because a checksum is determined by what its block contains, blocks with identical contents produce identical checksums. As with the inline data compression feature, ZFS and OpenZFS allow users to enable or disable inline data deduplication.
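A minimal sketch of checksum-based deduplication is shown below. It assumes SHA-256 as the checksum and keeps a simple in-memory table with reference counts; the `DedupStore` class and its methods are hypothetical names for this illustration and do not reflect the on-disk structures ZFS actually uses.

```python
import hashlib


class DedupStore:
    """Toy block store that keeps one physical copy per distinct block."""

    def __init__(self):
        self.table = {}  # checksum -> [block data, reference count]

    def write(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        if key in self.table:
            self.table[key][1] += 1      # duplicate: just bump the reference count
        else:
            self.table[key] = [data, 1]  # new content: store it once
        return key

    def read(self, key: str) -> bytes:
        return self.table[key][0]

    def physical_bytes(self) -> int:
        return sum(len(data) for data, _ in self.table.values())


store = DedupStore()
refs = [store.write(block) for block in (b"alpha", b"beta", b"alpha", b"alpha")]
```

Four logical blocks were written, but only two distinct blocks occupy space; the three writes of the same content share one stored copy.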

RAID-Z. ZFS and OpenZFS file systems provide a data/parity distribution scheme similar to RAID 5, storing parity information alongside the data to improve fault tolerance. This means one can reconstruct data from lost drives using the data and parity on the remaining drives. ZFS also dynamically stripes data across all top-level virtual devices, increasing total data throughput when clients request data faster than a single storage device can provide it. RAID-Z uses a variable stripe width, so each block becomes its own RAID stripe, which means each RAID-Z write is a full-stripe write. Combine that with the copy-on-write behavior of ZFS and OpenZFS and one eliminates parity inconsistency due to system crashes.
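Reconstruction from parity can be illustrated with plain XOR, the same principle used by single-parity RAID. The sketch below is an assumption-laden simplification: blocks are equal-sized byte strings, one parity block is appended to each stripe, and the function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks followed by one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def reconstruct(stripe, lost_index):
    """Rebuild the block at lost_index from every other block in the stripe."""
    survivors = [blk for i, blk in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


stripe = make_stripe([b"AAAA", b"BBBB", b"CCCC"])  # three data disks + one parity disk
recovered = reconstruct(stripe, lost_index=1)      # pretend the second disk failed
```

XOR-ing the surviving blocks yields the missing one, whether the lost block held data or parity. Tolerating two or three simultaneous failures requires additional, differently computed parity blocks, which this sketch does not cover.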

Another benefit of RAID-Z is self-healing: it detects and corrects silent data corruption. When reading, ZFS and OpenZFS file systems compare RAID-Z blocks against their checksums. If a data storage device returns bad or damaged data, the file system uses the parity to determine which disk returned it; the data is then repaired and the correct data is sent to the user.
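The self-healing read described above can be sketched by combining a checksum with single XOR parity. This is a toy model under stated assumptions: the checksum is SHA-256 over the concatenated data columns and is stored separately from the data, there is exactly one parity column, and `self_healing_read` is a hypothetical name. On a mismatch, the sketch assumes each data disk in turn is the bad one, rebuilds it from the others, and accepts the first result that matches the checksum.

```python
import hashlib
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def checksum(columns):
    return hashlib.sha256(b"".join(columns)).digest()


def self_healing_read(disks, expected):
    """disks: data columns followed by one parity column (repaired in place).

    Returns (data, index_of_repaired_disk or None).
    """
    data = disks[:-1]
    if checksum(data) == expected:
        return b"".join(data), None
    # Assume each data disk in turn is the bad one and rebuild it from the rest.
    for suspect in range(len(data)):
        others = [blk for i, blk in enumerate(disks) if i != suspect]
        candidate = data[:suspect] + [xor_blocks(others)] + data[suspect + 1:]
        if checksum(candidate) == expected:
            disks[suspect] = candidate[suspect]  # write the good data back
            return b"".join(candidate), suspect
    raise IOError("unrecoverable: more damage than single parity can repair")


columns = [b"good", b"data", b"here"]
disks = columns + [xor_blocks(columns)]
expected = checksum(columns)
disks[1] = b"dXta"                               # silent corruption on one disk
payload, repaired = self_healing_read(disks, expected)
```

The checksum is what makes the repair trustworthy: parity alone can rebuild a block, but only the checksum can confirm which rebuilt version is the correct one.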

ZFS pools can be laid out in five ways, three of which are true RAID-Z levels:

  1. Striped pool (sometimes informally called RAID-Z0), which is similar to RAID 0 in that it offers no redundancy.
  2. RAID-Z1, which is similar to RAID 5 in that it allows one disk to fail.
  3. RAID-Z2, which is similar to RAID 6 in that it allows two disks to fail.
  4. RAID-Z3, which uses triple parity (sometimes informally called RAID 7) and allows three disks to fail.
  5. Mirror, which is similar to RAID 1 in that it allows all but one disk in the group to fail.

Advantages

The advantages of using ZFS include:

  • ZFS is built into Oracle Solaris and offers an ample feature set and data services free of cost.
  • OpenZFS is a free open source file system that can be expanded by adding hard drives to the data storage pool.
  • ZFS integrates the file system and volume manager, so it’s not necessary for users to learn separate sets of commands and tools.
  • ZFS file systems don’t require disk partitions to be resized in order to increase capacity.

Limitations

A few of the limitations include:

  • The rich feature sets can make the ZFS filesystem complicated to use and manage.
  • ZFS is limited to running on a single server.
  • ZFS checksum algorithms require processing power and have been known to affect performance.

 

RAID Inc. leverages free open source software with Lustre 2.12 and ZFS on Linux 0.7, unleashing the performance and scalability of the Lustre parallel file system for HPC workloads. ZFS is a robust, scalable file system with features not found in other file systems available today. Backed by core Lustre developers at Whamcloud®, RAID Inc. offers high-performance, cost-effective Lustre over ZFS solutions with enterprise-level support. To learn more about RAID Inc. + Lustre on ZFS Solutions, contact one of our knowledgeable experts today.