Hammerspace Storageless Data Architecture

Introduction to Hammerspace Storageless Data

Hammerspace provides storageless data software. But what does that really mean in a world of often confusing acronyms and buzzwords? Storageless data may sound like a contradiction in terms, but it is not. It is certainly no more contradictory than serverless compute, which by now is a well-known technology and an industry-accepted term. Does serverless mean that there is no server? Of course not! Serverless orchestrates computing power so that less time has to be spent managing the underlying server infrastructure. Serverless, therefore, is a time-saving function that brings compute closer to its end-users and, consequently, accelerates time to completion.

Storageless data follows exactly the same paradigm from a data management perspective. Data obviously has to be stored somewhere, just as compute has to be delivered by a server of some sort. Storageless data software from Hammerspace orchestrates data so that less time has to be spent managing the underlying storage infrastructure. Storageless, therefore, is a time-saving function that brings data closer to its end-users and, consequently, accelerates time to completion.

Hammerspace takes the Storageless concept a significant step further by enabling data to be stored on virtually any underlying storage infrastructure, including Hammerspace itself. Let’s take a closer look at the various deployment methods, the rich options for leveraging virtually any underlying storage infrastructure, and how Hammerspace can turn disparate islands of storage into one cohesive unit. Data orchestration gives you the ability to organize, manage, and deliver all storage as a single logical unit. It also allows you to organize and deliver all unstructured storage services under a global file system. We will take a closer look at the Hammerspace Global File System in an upcoming blog. For now, let’s focus on how to architect a Storageless Data solution and its building blocks.

 

Hammerspace Deployment Reference Architecture

From a high-level perspective, we can divide a Hammerspace deployment into three logical layers, each with a distinct function: presentation, orchestration, and infrastructure. The presentation layer is made up of the front-end protocol interfaces, or the “customer end” of the architecture. It consists of NFS, SMB, and a Kubernetes CSI. The NFS protocol interface can be further divided into NFSv3 and NFSv4.2. The latter offers, among other things, parallel data access for higher performance. Our SMB implementation, of course, includes SMBv3.

The orchestration layer consists of two software components. The first one, called Anvil, is the “brain” of the architecture. It manages all of the metadata and provides all of the higher-level control functions. Anvil also provides a GUI as well as a CLI and REST API for managing the entire deployment. The second, DSX, which stands for Data Services eXtensions, provides the front-end protocols and data movement, along with other key servo functions. It is “the arms and legs” of a Hammerspace deployment. Front-end protocols include SMB versions 2 and 3, and NFS versions 3 and 4.2. NFSv4.2 allows higher performance through parallelization. In addition, there is a Container Storage Interface (CSI) for Kubernetes workloads. The Hammerspace CSI supplies persistent volumes for block, file, and NFS shared storage. Anvil and DSX together provide Data Orchestration.

The infrastructure layer can be stitched together in many different ways from virtually any storage technology from any vendor. Hammerspace has done specific integrations with NetApp, Isilon, and Cloudian, but any NAS will work, along with SAN, object storage, or cloud. It is also possible to leverage Hammerspace itself for software-defined storage, which is covered in more detail below.

 

Deployment Methods

Anvil and DSX can be installed in three different ways: on bare-metal commodity hardware, virtualized on a hypervisor, or through cloud providers. These two components are inextricably interwoven. Together, they deliver the front-end protocols by leveraging the underlying infrastructure. By treating underlying infrastructure as a service layer, Hammerspace is able to consolidate disparate and siloed storage into a single, cohesive, logical resource. Thus, Hammerspace simplifies and unifies the management and delivery of data services to applications and end-users. This architecture brings together the often mutually exclusive goals of delivering simplicity and supervisory control. This means you can automate to eliminate redundant tasks where it makes sense while retaining the option to exercise highly granular controls when required by policy, compliance, or otherwise.

Cloud Deployment

Deploying Anvil and DSX in AWS EC2, Google Cloud Platform, or Microsoft Azure is quick and easy. This type of installation takes advantage of deployment tools such as CloudFormation, which further simplify deployment. You can automate the deployment of Anvil and DSX either in a high-availability configuration or as a standalone solution. Furthermore, multiple DSX nodes can be installed to take advantage of parallel performance through NFSv4.2.

 

Virtualized Deployment

Virtualizing Anvil and DSX on a hypervisor, such as VMware ESX, Microsoft Hyper-V, KVM, or XenServer, is another simple and fast way to deploy a Hammerspace solution. An important thing to keep in mind is that this is not an either-or choice. It is entirely possible to deploy Hammerspace on a virtual platform on-premises as well as in cloud, and then “glue” them together with the Hammerspace Global File System, which is one of the jewels contained within Hammerspace software and a topic that will be covered in greater depth in an upcoming blog.

 

Bare-Metal Hardware Deployment

It is also possible to install Hammerspace software directly on a bare-metal platform, such as SuperMicro or any other reputable business-grade hardware brand. This involves slightly more work than a virtualized or in-cloud deployment, as hardware has to be scoped and purchased in accordance with our hardware compatibility list and installed per system requirements. But once the right hardware has been obtained, the Hammerspace installation is identical to the previous ones, apart from the automation tools offered on cloud and virtualization platforms.

 

Deployment Automation

Anvil and DSX can also be deployed and configured using Ansible playbooks, as the declarative nature of Ansible aligns perfectly with Hammerspace. Ansible playbooks offer a repeatable, reusable framework for configuration management and multi-machine deployment, making them well suited for rolling out large-scale deployments. Instead of repeating tedious manual tasks, just write a playbook and put it under source control. Then you can use the playbook to install more Hammerspace nodes, push out a new configuration, or confirm the configuration of remote systems. The ansible-examples GitHub repository is a great resource for additional information about Ansible playbooks.
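To make the declarative idea a little more concrete, here is a minimal Python sketch of the "describe the desired state, then work out what has to change" pattern that tools like Ansible follow. It is not an Ansible playbook and does not use any Hammerspace or Ansible API. The node names, the role strings, and the plan_changes function are all assumptions made up for illustration.

```python
# Hypothetical illustration only: node names, roles, and plan_changes()
# are invented for this sketch and are not part of Ansible or Hammerspace.

def plan_changes(desired, actual):
    """Return the actions needed to move `actual` toward `desired`.

    Both arguments map a node name to the role it runs.
    """
    actions = []
    for node, role in sorted(desired.items()):
        if node not in actual:
            actions.append(("install", node, role))
        elif actual[node] != role:
            actions.append(("reconfigure", node, role))
    return actions


# Desired state, as it might be kept under source control.
desired = {"anvil-1": "anvil", "dsx-1": "dsx", "dsx-2": "dsx"}
# State observed on the running systems.
actual = {"anvil-1": "anvil", "dsx-1": "dsx"}

plan = plan_changes(desired, actual)
print(plan)

# A deployment that already matches the desired state needs no work,
# which is what makes repeated runs safe.
print(plan_changes(desired, desired))
```

The point of the sketch is that the description of the end state is the only thing you maintain. Adding a DSX node means adding one entry to the desired state and running the same procedure again.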

 

Infrastructure Components

Hammerspace decouples storage from data. That is the essence of what constitutes storageless data. There is, after all, storage involved just as there is a server involved in serverless compute. But what makes it less is that there is a decoupling and abstraction that pools disparate resources together and makes them work in unison to better serve applications and end-users.

DAS

Hammerspace DSX is, as previously mentioned, a software-defined node that can be deployed on bare-metal hardware, virtualized on a hypervisor, or in cloud. But what may not be immediately apparent is that DSX can also contain its own Direct Attached Storage (DAS). This can be spinning disk, SSD, NVMe, or a combination to suit a particular workload. This, of course, also allows you to use simple, declarative statements to tier data to meet performance or cost requirements, or any other criteria you choose. You can spin up as many DSX nodes as you need to meet capacity and/or performance requirements.
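As a rough illustration of what declarative tiering means, the Python sketch below expresses a placement policy as plain data and applies it to a couple of files. This is not Hammerspace syntax or a Hammerspace API. The rule format, the tier names, the day thresholds, and the file paths are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    path: str
    days_since_access: int


# Hypothetical policy, stated as data rather than as procedural steps:
# (maximum days since last access, tier). The first matching rule wins.
TIER_RULES = [(7, "nvme"), (90, "ssd")]
DEFAULT_TIER = "hdd"


def choose_tier(info, rules=TIER_RULES, default=DEFAULT_TIER):
    """Pick the tier for a file according to the declared rules."""
    for max_days, tier in rules:
        if info.days_since_access <= max_days:
            return tier
    return default


files = [
    FileInfo("/projects/render/frame_0001.exr", 1),
    FileInfo("/projects/archive/2019_report.pdf", 400),
]

placement = {f.path: choose_tier(f) for f in files}
for path, tier in placement.items():
    print(path, "->", tier)
```

Changing the policy means editing the rule list, not the code that moves data, which is the appeal of stating intent declaratively.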

NAS

The previously mentioned decoupling of storage infrastructure allows Hammerspace to utilize any flavor of underlying Network Attached Storage (NAS). Although we have done specific integration work with NetApp and Dell EMC storage, any Windows or Unix/Linux file server can be integrated into a Hammerspace solution. The next item truly sets Hammerspace apart as a unique solution in the data management world: you can add existing storage with data on it. Hammerspace intelligently assimilates metadata, consolidating existing storage into a single, cohesive, logical unit that can be managed, utilized, and presented as one. Hammerspace, consequently, brings siloed information together under a global namespace on a global file system.

Shown in the image above, adding storage to a Hammerspace solution is a simple 3-step configuration:

  1. Add a storage system.
  2. Select the type of storage from a drop-down menu and fill in the appropriate credentials.
  3. Add the storage system. Any existing data will be assimilated into the global file system. No, not the kind of assimilation The Borg in Star Trek does. We are the good guys, here to solve data sprawl and storage silos.
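Conceptually, assimilation can be pictured as building one index of metadata that points back to where each file already lives. The Python sketch below is a simplified model of that idea, not how Hammerspace implements it. The system names, the mount_at field, and the file listings are hypothetical.

```python
import posixpath


def assimilate(systems):
    """Merge per-system file listings into one global path index.

    Only metadata is recorded; each entry points back to the system
    that already holds the data.
    """
    namespace = {}
    for name, system in systems.items():
        for rel_path, size in system["files"].items():
            global_path = posixpath.join(system["mount_at"], rel_path)
            namespace[global_path] = {"system": name, "size_bytes": size}
    return namespace


# Hypothetical existing storage systems that already contain data.
systems = {
    "netapp-01": {"mount_at": "/eng", "files": {"design/spec.docx": 48_000}},
    "winfs-02": {"mount_at": "/finance", "files": {"q1/ledger.xlsx": 120_000}},
}

namespace = assimilate(systems)
for path in sorted(namespace):
    print(path, "->", namespace[path]["system"])
```

In this model, files from two separate silos show up under a single tree, while the data itself stays on the original systems until a policy decides otherwise.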

SAN

SAN storage is also supported. The simplest way to incorporate a SAN is to provision it during installation of Hammerspace. The only caveat is that we do not present a block interface such as Fibre Channel, iSCSI, or FCoE on the front-end. However, you can certainly leverage and consolidate SAN storage as part of a Hammerspace deployment.

Object Storage and Cloud

Hammerspace can also leverage underlying infrastructure in the form of cloud and object storage. Through our partnerships, we have done specific integrations with AWS, Azure, Cloudian, Dell EMC ECS, Google, and Wasabi. But any S3-capable object storage, such as NetApp StorageGRID, will also work equally well.

   

 

Conclusion

Hammerspace solves traditional storage infrastructure limitations by consolidating data resources under a global namespace and global file system. This is done simply by decoupling data from its underlying infrastructure, freeing data from limitations and empowering it with declarative intent through Hammerspace Objectives. Storage becomes less important while data becomes more important, which is a desired state. Data has inherent value whereas infrastructure is only valuable as long as it makes data available, durable, and secure. Infrastructure comes and goes, sometimes many times, while data persists. This is particularly obvious when we look at the explosive growth of data. To that end, Hammerspace can be easily and rapidly deployed on premises and in cloud, on bare metal or virtualized. Hammerspace also supports Kubernetes persistent storage through a highly sophisticated Container Storage Interface that provides block, file, and shared storage volumes with file- and container-granular declarative control. Data is king in the modern world! It is data that allows us to transact business, run analytics, increase revenue and profitability, provide records, and solve the great challenges of the modern world. Infrastructure is there to support it. The term Storageless Data accurately describes the latest innovation in data management. Going Storageless with Hammerspace empowers your data to work harder for you.

 

Johan Ballin

Director of Technical Marketing