GlusterFS – introduction

February 1, 2015

GlusterFS is a software-only product that you can use to build a distributed file system across multiple storage server nodes. It provides a highly scalable storage solution whose data typically resides on local XFS file systems on each server. GlusterFS exports the file system through a native FUSE-based client, but it can also be configured to export the storage over NFS and CIFS to client hosts that cannot use the native client. GlusterFS can also provide redundancy if you use a Replicated volume, or a type that combines replication with another type, such as Distributed-Replicated, which requires more server nodes.

 

Key features

 

  • Elasticity: Storage volumes are abstracted from underlying hardware and can be grown, shrunk, or migrated across physical systems as necessary. Storage servers can be added or removed from the system dynamically, with data being rebalanced across the trusted pool of servers. Data is always online, with no downtime.

  • No metadata: Unlike other distributed file system projects (for example, Lustre), GlusterFS does not create or use a separate index of metadata in any way. Hence there is no single point of failure or bottleneck. All servers in the trusted pool of storage servers have intelligence built in to locate any piece of data without a need for using indexes.

  • Scalability: As mentioned earlier, GlusterFS can be expanded (or shrunk) dynamically without any downtime. It can scale up or down, as required, in both performance and capacity.

  • High availability: Synchronous n-way file replication ensures high data availability and recovery. Asynchronous geo-replication is also available and supported in private cloud, public cloud, datacenter, and hybrid environments.

  • Flexibility: GlusterFS runs in userspace, so there is no need for kernel patches, custom modules, and so on. You can also reconfigure storage performance characteristics to meet your needs.

  • Geo-replication: GlusterFS enables you to replicate the whole storage system between different datacenters or geographic locations.
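
The "No metadata" point above can be illustrated with a small sketch: if a file's location is computed from its path, every client and server reaches the same answer without consulting an index. This is only an illustration of the idea. GlusterFS's own placement algorithm differs in detail; the SHA-256 hash, the equal-range split, the pick_brick function, and the storage2 and storage3 host names are all assumptions made for this example.

```python
import hashlib


def pick_brick(path: str, bricks: list[str]) -> str:
    """Return the brick that owns `path`, derived from the path alone."""
    # Hash the file path to a 32-bit number (a stand-in hash, not GlusterFS's own).
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    value = int.from_bytes(digest[:4], "big")
    # Split the 32-bit hash space into equal contiguous ranges, one per brick.
    range_size = 2**32 // len(bricks)
    # Clamp the index, because the last range absorbs any rounding remainder.
    index = min(value // range_size, len(bricks) - 1)
    return bricks[index]


# Hypothetical trusted pool of three bricks.
bricks = [
    "storage1.example.com:/mnt/disk1",
    "storage2.example.com:/mnt/disk1",
    "storage3.example.com:/mnt/disk1",
]

# Anyone who knows the brick list computes the same location, with no lookup service.
for name in ("/docs/report.pdf", "/images/logo.png"):
    print(name, "->", pick_brick(name, bricks))
```

Because the result depends only on the path and the brick list, there is no central index to fail or to become a bottleneck, which is the property the bullet describes.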

 

Storage concepts

  • A brick is a basic unit of storage. In most cases, brick refers to a storage pool server and its exported local mount point, in the format: ServerName:/localMountPoint. For example, if a server storage1.example.com had a local storage mounted under /mnt/disk1, a brick would be referred to as storage1.example.com:/mnt/disk1

  • A Distributed File System is a file system that allows multiple clients to concurrently access data across multiple servers or bricks in a trusted storage pool.

  • FUSE (Filesystem in Userspace) is a loadable kernel module for UNIX-like operating systems. It enables non-privileged users to create their own file systems without editing kernel code. This is achieved by running file system code in user space, while the FUSE module itself provides a bridge to the kernel.

  • Geo-replication provides an asynchronous, continuous and incremental replication service from one site to another over LAN, WAN, and the Internet.

  • A volume is a logical collection of bricks. Most management operations in GlusterFS happen on the volume level.
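
As a rough model of the brick and volume concepts above, the sketch below parses the ServerName:/localMountPoint format and groups bricks into a volume. The Brick and Volume classes, the volume name webdata, and the second host name are hypothetical and exist only for illustration; they are not part of any GlusterFS API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Brick:
    server: str
    path: str

    @classmethod
    def parse(cls, spec: str) -> "Brick":
        # Split on the first colon: "ServerName:/localMountPoint".
        server, sep, path = spec.partition(":")
        if not sep or not server or not path.startswith("/"):
            raise ValueError(f"not a valid brick: {spec!r}")
        return cls(server, path)

    def __str__(self) -> str:
        return f"{self.server}:{self.path}"


@dataclass
class Volume:
    # A volume is a named, logical collection of bricks.
    name: str
    bricks: list[Brick]


brick = Brick.parse("storage1.example.com:/mnt/disk1")
volume = Volume("webdata", [brick, Brick.parse("storage2.example.com:/mnt/disk1")])

print(brick.server)  # storage1.example.com
print(brick.path)    # /mnt/disk1
print(len(volume.bricks), "bricks in", volume.name)  # 2 bricks in webdata
```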

 

Volume types

 

  • Striped: Data is striped across bricks, similar to RAID-0 striping. Striped volumes are best suited to highly concurrent environments that access very large files.

    WARNING: A disk or server failure in a striped volume can result in serious data loss, because each file's data is spread across the bricks.

  • Distributed: Whole files are spread across the bricks in the volume. This type of GlusterFS volume is useful when you need to scale storage and redundancy is either not important or is provided by other means.

    WARNING: A disk or server failure in a distributed volume can result in serious data loss, because directory contents are spread randomly across the bricks.

  • Replicated: This type of volume replicates files across multiple bricks. It is the best choice for environments where high availability and high reliability are critical, and when you want to self-mount the volume on every node, such as with a web server document root (the GlusterFS nodes are their own clients).

    Files are copied to every brick in the volume, similar to RAID-1. However, you are not limited to two bricks: you can use three or more, including an odd number. Usable space is the size of one brick, and every file written to one brick is replicated to all the others. This type works well if you are going to self-mount the GlusterFS volume, for example as the web server document root (/var/www) or similar, where all files must reside on that node. The value passed to replica equals the number of bricks in the volume.

  • Striped-Replicated: Striped-replicated volumes stripe the data across replicated bricks in the trusted storage pool. This type gives the best results in highly concurrent environments where parallel access to very large files and performance are of critical importance.

  • Distributed-Replicated: In this type of volume, files are distributed across replicated bricks in the volume. You can use this type of volume in environments where the requirement is to scale storage and have high availability. Volumes of this type also offer improved read performance in most environments, and are the most common type of volume used when clients are external to the GlusterFS nodes themselves.

    Somewhat like RAID-10, the number of bricks must be a multiple of the replica value; usable space is the combined size of the bricks divided by the replica value. For example, with 4 bricks of 20 GB and replica 2, files are distributed across 2 replica sets (40 GB usable), each holding 2 copies. With 6 bricks of 20 GB and replica 3, files are distributed across 2 replica sets (40 GB usable), each holding 3 copies; with replica 2, they are distributed across 3 replica pairs (60 GB usable). Use this type when your clients are external to the cluster, not local self-mounts.

  • Distributed-Striped-Replicated: This type of GlusterFS volume distributes striped data across replicated bricks. It gives the best results in environments with highly concurrent parallel access to very large files, where high performance and high availability are critical.
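
The capacity arithmetic for replicated and distributed-replicated volumes can be checked with a short sketch. It counts one brick of usable space per replica set, and it assumes equally sized bricks and that consecutive bricks in the list form a replica set. The function names and the node1 to node4 host names are hypothetical.

```python
def replica_sets(bricks: list[str], replica: int) -> list[list[str]]:
    """Group bricks into replica sets of `replica` consecutive bricks."""
    if replica < 1 or len(bricks) % replica != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    return [bricks[i:i + replica] for i in range(0, len(bricks), replica)]


def usable_gb(brick_count: int, brick_gb: int, replica: int) -> int:
    """Usable capacity in GB: one brick's worth of space per replica set."""
    if replica < 1 or brick_count % replica != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    return (brick_count // replica) * brick_gb


print(usable_gb(4, 20, 2))  # 40: two replica pairs
print(usable_gb(6, 20, 3))  # 40: two replica sets of three copies
print(usable_gb(6, 20, 2))  # 60: three replica pairs
print(usable_gb(3, 20, 3))  # 20: a pure Replicated volume, one brick of space

# Hypothetical four-node pool: consecutive bricks are paired into replica sets.
nodes = [f"node{n}:/mnt/disk1" for n in range(1, 5)]
for group in replica_sets(nodes, 2):
    print(group)
```

A higher replica value trades usable space for extra copies: the same six bricks yield 60 GB with two copies of each file, but only 40 GB with three.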

 

Graphical representations of the four major volume types you might find useful


Distributed volume

 

 

Replicated volume

 

Striped volume

 

 

Distributed – Replicated volume

 

Summary

As shown in the preceding section, GlusterFS offers a whole set of different deployment scenarios, based on your requirements.

Note: GlusterFS is not well suited to sharing many small files between your web servers (for example, HTML templates or PHP scripts), especially if the application makes many API calls that access these small files thousands of times. Some GlusterFS performance options can be tuned, but tuning might not be enough for this scenario.
