What is GPFS? Part 3, Cluster Configs

GPFS supports a variety of cluster configurations, independent of which file system features you use. The configuration options fall into three basic categories:

Shared Disk

A shared disk cluster is the most basic environment: the storage is directly attached to every machine in the cluster, so each shared block device is available concurrently to all of the nodes in the GPFS cluster. Direct access means that the storage is reachable through SCSI or another block-level protocol over a SAN, InfiniBand, iSCSI, a virtual I/O interface, or another block-level I/O connection technology. This configuration works well for providing network file service to client systems using clustered NFS, high-speed data access for digital media applications, or a grid infrastructure for data analytics.

Network Block I/O

In environments where not every node in the cluster is attached to a single SAN, GPFS uses an integrated network block device capability over TCP/IP networks, called the Network Shared Disk (NSD) protocol. The choice between SAN attachment and network block I/O comes down to performance and cost.

GPFS clusters can use the NSD protocol to provide high speed data access to applications running on LAN-attached nodes. Data is served to these client nodes from one or more NSD servers. In this configuration, disks are attached only to the NSD servers. Each NSD server is attached to all or a portion of the disk collection.
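The mapping of disks to NSD servers can be written down as a small sketch. The Python below renders text in the `%nsd:` stanza style that the GPFS `mmcrnsd -F` command reads. The device paths, NSD names, and server hostnames are hypothetical, and the helper `render_nsd_stanza` is an illustration, not part of GPFS.

```python
def render_nsd_stanza(device, nsd_name, servers,
                      usage="dataAndMetadata", failure_group=1):
    """Render one %nsd stanza as text.

    `servers` lists the NSD servers the disk is attached to; LAN-attached
    clients reach the disk through these servers.
    """
    if not servers:
        raise ValueError("a network-served disk needs at least one NSD server")
    lines = [
        f"%nsd: device={device}",
        f"  nsd={nsd_name}",
        f"  servers={','.join(servers)}",
        f"  usage={usage}",
        f"  failureGroup={failure_group}",
    ]
    return "\n".join(lines)


# Hypothetical layout: two disks, each attached to a pair of NSD servers.
disks = [
    ("/dev/sdb", "nsd1", ["nsdserver1", "nsdserver2"]),
    ("/dev/sdc", "nsd2", ["nsdserver2", "nsdserver1"]),
]
stanza_file = "\n\n".join(render_nsd_stanza(d, n, s) for d, n, s in disks)
print(stanza_file)
```

Listing more than one server per disk reflects the point above that each NSD server may be attached to all or only part of the disk collection.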

GPFS can define a preferred network subnet topology, for example by designating separate IP subnets for intra-cluster communication and the public network. Because the same disk can be accessed from multiple subnets, the NSD clients do not all have to be on a single physical network. This can reduce networking hardware costs and simplify the topology, which in turn lowers support costs and improves scalability and overall performance.
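The idea of a preferred subnet can be modeled in a few lines. In GPFS itself the preference is set through the `subnets` configuration attribute; the sketch below is not GPFS code, only an illustration of the selection logic under the assumption that a node with several addresses should use the one on the first preferred subnet and otherwise fall back to its primary address. The function name and the addresses are hypothetical.

```python
import ipaddress


def pick_address(node_addresses, preferred_subnets):
    """Return the node address to use for cluster traffic.

    Preferred subnets are checked in order; if the node has no address on
    any of them, its first (primary) address is used instead.
    """
    nets = [ipaddress.ip_network(s) for s in preferred_subnets]
    addrs = [ipaddress.ip_address(a) for a in node_addresses]
    for net in nets:
        for addr in addrs:
            if addr in net:
                return str(addr)
    return str(addrs[0])


# Hypothetical node with a public address and a private cluster address.
node = ["10.0.0.5", "192.168.2.5"]
print(pick_address(node, ["192.168.2.0/24"]))
```

A node without an address on the private subnet still gets a usable address, which matches the point that clients do not all need to sit on one physical network.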

Sharing Across Clusters

A multi-cluster configuration connects separate GPFS clusters to one another. It is well suited to sharing data across clusters belonging to different organizations for collaborative computing, grouping sets of clients for administrative purposes, or implementing a global namespace across separate locations.

By using this configuration, you can allow other clusters to access one or more of your file systems, and you can mount file systems that belong to other GPFS clusters for which you have been authorized. The administrator grants access per file system, so another GPFS cluster can reach only the specific file systems it has been permitted to use.
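The per-file-system authorization described here can be pictured with a toy model. GPFS administers this with its own commands (such as `mmauth` and `mmremotecluster`); the Python below does not call them and is only a sketch of the access rule, with hypothetical class, cluster, and file system names.

```python
class OwningCluster:
    """Toy model of a cluster that owns file systems and grants access."""

    def __init__(self, name, file_systems):
        self.name = name
        self.file_systems = set(file_systems)
        self._grants = {}  # remote cluster name -> set of granted file systems

    def grant(self, remote_cluster, file_system):
        """Authorize one remote cluster for one specific file system."""
        if file_system not in self.file_systems:
            raise ValueError(f"{self.name} does not own {file_system}")
        self._grants.setdefault(remote_cluster, set()).add(file_system)

    def can_mount(self, remote_cluster, file_system):
        """A remote cluster may mount only what it was explicitly granted."""
        return file_system in self._grants.get(remote_cluster, set())


# Hypothetical setup: cluster1 owns two file systems and shares only one.
owner = OwningCluster("cluster1", ["fs1", "fs2"])
owner.grant("cluster2", "fs1")
print(owner.can_mount("cluster2", "fs1"))
print(owner.can_mount("cluster2", "fs2"))
```

The default is no access: a remote cluster that has not been granted a file system cannot mount it, even if it is authorized for another file system in the same owning cluster.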

Multi-cluster capability is useful for sharing across multiple clusters within a physical location or across locations. Clusters are most often connected over a LAN, but the connection between clusters can also include a SAN.

Contact us to learn more about GPFS.