
Planet Series Project

Project Daphnis  (Saturn, Generation 2)

Objective:

Network Attached Storage :: To provide a NAS device which caters for the most common data storage needs (distributed/scalable, replicated/available and striped/performant storage models), within a native storage architecture that will enable multi-box, multi-rack, multi-site configurations.

Project output includes example hardware component information with assembly instructions, Linux firmware, and operating documentation.

Description:

Project Saturn was started in 2006 to construct a robust Network Attached Storage capability: a NAS device that caters for the most common data storage needs (distributed/scalable, replicated/available and striped/performant storage models), within a native storage architecture that enables multi-box, multi-rack, multi-site configurations. Generation one ("Pan") was built from GlusterFS on x86/x64; the most recent incarnation ("Daphnis") has been built from MooseFS on the Odroid HC2.

Specifically, Saturn aims to be/have;
  • Distributed
    • Must be capable of concatenating multiple drives on either a single node or multiple nodes into a single contiguous storage capacity
    • Must be expansible such that adding another drive to an existing storage capacity must not require recreation of that storage capacity (destroying, creating, formatting)
  • Highly Available
    • Must be capable of replicating data between drives on either a single node or multiple nodes into a single highly available storage capacity
    • Must be capable of multiple replications to support the "two copies in the same rack, one copy in another" principle, and to address the Bathtub curve penalty seen with bulk drive purchases i.e. multi-drive/node failures
    • Should support high-latency asynchronous replication to enable multi-site deployments
  • No dependence on strict RAID implementations
    • Must not employ proprietary on-disk storage - Hardware RAID tends to employ proprietary on-disk storage and thus the volume or, indeed, any individual disk becomes unrecoverable when the RAID set is broken
    • Must enable good throughput on commodity equipment - Software RAID tends to bring about poor performance without accelerated hardware features
  • Recovery of replicated data should not be location specific
    • Should be able to recover a failed node at one site either at its home site or at any other site with that data (i.e. being able to transport an empty or out-of-sync node to another location for high speed data recovery is highly desirable)
  • No need for manual maintenance
    • All maintenance tasks should be exceptional and ideally limited to Moves, Adds and Changes
    • All sub-components that require routine maintenance should be automated wherever practicable
  • No lost data
    • Data degradation (aka Bit Rot) will be automatically detected and corrected
    • When configuring the NAS it should not be possible to lose or destroy data without asserting a conscious administrative decision to do so
  • Highly accessible
    • Locally (LAN) presented storage capacity should be accessible via native client interfaces (like SMB or NFS), though client-side agents are acceptable
    • Remotely (WAN) presented storage capacity should be accessible via API-friendly interfaces (like REST based object storage)
  • Confidential
    • All stored data (including metadata) will be encrypted at rest, to mitigate physical compromise (theft)
    • All data in transit should be encrypted, to mitigate local network compromise (MitM)
    • Private mount points from the common storage pool must be authenticated
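To make the accessibility requirement concrete: a MooseFS volume is mounted through its native FUSE client (a client-side agent, as permitted above) and can then be re-exported to LAN clients over SMB or NFS. The host name, mount point and share name below are illustrative placeholders, not values from the build guide.

```shell
# Sketch only -- host name, mount point and share name are assumptions.
# Mount the MooseFS volume via the native FUSE client:
mkdir -p /mnt/mfs
mfsmount /mnt/mfs -H mfsmaster.example.lan

# The mounted path can then be re-exported via Samba, e.g. a minimal
# share definition in /etc/samba/smb.conf:
#   [storage]
#     path = /mnt/mfs
#     read only = no
```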
The build guide (Saturn Installation and Operations Manual, included below) describes a highly available, scale-out storage fabric built on the Odroid HC2, with 16TB drives (Seagate ST16000VE000), full-disk encryption (covering both the MooseFS chunks and the MooseFS metadata), physical security token (YubiKey) authentication, and entirely passive cooling. In addition to a step-by-step build process for both the hardware and software stages, the document includes guidance for operations (including details on security, performance and thermal management). For those interested, it also includes a complete Bill of Materials and cost benchmarking against the market, for solution context as at the time of the original release (June/July 2020).
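As a sketch of how such full-disk encryption is typically arranged on Linux, the steps below use LUKS via cryptsetup; the device path and mapper name are my assumptions for illustration, not necessarily the guide's exact procedure.

```shell
# Sketch only: preparing an encrypted data drive with LUKS.
# /dev/sda and the mapper name "chunkdata" are assumptions.
cryptsetup luksFormat /dev/sda           # initialise LUKS (destroys existing data!)
cryptsetup open /dev/sda chunkdata       # unlock as /dev/mapper/chunkdata
mkfs.ext4 /dev/mapper/chunkdata          # filesystem for the MooseFS chunks
mount /dev/mapper/chunkdata /mnt/chunks  # directory served by mfschunkserver
```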

To achieve the project's availability goal, the build guide takes the MooseFS Storage Classes Manual example of "Scenario 1: Two server rooms (A and B)" and adapts it to a deployment across two shelves, to assure predictable availability within a site. By applying labels to the MooseFS chunk servers and defining a new storage class, the logical availability grouping within MooseFS matches the physical availability grouping within the Data Center. As data is replicated to at least one "A" node on one shelf and one "B" node on another, and each shelf has its own power and network infrastructure, an entire shelf can be lost or disabled without losing any content. This architecture can be scaled to an additional rack with the creation of "C" nodes and a slight variation of the storage class.
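The shelf-aware replication described above can be sketched with MooseFS's labelling mechanism. The label values, class name and path below follow the Storage Classes Manual's pattern but are illustrative, not copied from the build guide.

```shell
# Sketch only: label each chunk server by shelf in its mfschunkserver.cfg:
#   shelf "A" nodes:  LABELS = A
#   shelf "B" nodes:  LABELS = B
# Define a storage class keeping one copy on each shelf (-C creation
# labels, -K keep labels; "two_shelves" is an assumed class name):
mfsscadmin create -C A,B -K A,B two_shelves
# Apply the class to a directory so files stored there inherit it:
mfssetsclass two_shelves /mnt/mfs/data
```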

Also included in the documentation (below) is the new Digital Asset Management Guide - A Taxonomy For Digital Repositories. Given that this is a storage project, the DAM Guide has been authored to provide open guidance on the management of unstructured data. Once a business is on to its third terabyte of unstructured data (or an individual starts working with more than a couple of USB drives' worth of files), the question inevitably arises: "how do I manage all of these files?" The most intuitive computer interface for unstructured data is the file-system, and filing content away and retrieving it quickly requires neither a search engine nor a database, so long as a logical taxonomy is articulately defined and easily understood, and naming conventions are clear and consistently followed.

The Digital Asset Management Guide starts from first principles;
  • Create useful containers -
    • It is common to direct a certain class of infrastructure and/or applications to a certain type of file; for example, digital signage will want a directory full of picture, flash, or video files. The taxonomy and naming conventions should support this type of grouping, such that applications can be associated with one or two directories and don’t need to be given the entire file system to find their content of interest.
  • Store handles for, and not classification of, content -
    • The file-system (directory and file names) should only contain sufficient information to successfully file and retrieve content in a useful way. Applications that consume certain file types (such as iTunes, for audio files) will acquire and store their own metadata, providing additional views of classification. Those views (such as Author, Genre, etc) should not be embedded into the on-disk structure.
    • The absolute minimum data required at the file-system layer would therefore be the data required to "look-up" the metadata (aka, the handle). This could mean storing documents named as their invoice number for example. However, the practical minimum data required at the file-system layer needs to include consideration of how users are to interact with the files – i.e. a single list of invoice numbered files may not be intuitive, but an issue date and invoice numbered file inside an organisational directory might be.
  • Remove ambiguity -
    • The resultant taxonomy should not create opportunities for confusion (i.e. "should I file this content here or there?").
    • Minimising/removing metadata from the taxonomy and naming conventions will aid usability. For example, filing a Lecture in a directory of Lectures is straight-forward. Filing a Lecture in a structure of institutions that is parallel to a structure of lecture topics increases filing complexity and makes access less reliable.
  • Consider performance -
    • Although the performance of the storage sub-system is not in the scope of the content management guide, poor taxonomic structure can impact storage performance.
    • Broadly speaking, file-systems are designed as hash tables. If the application only ever adds/moves/deletes specific files then the directory structure has very little impact on the performance of the application. If the application provides some sort of browsing or scanning functionality then having millions of files in a single directory will increase load on the storage sub-system and decrease application performance, negatively impacting user experience.
  • Leverage existing standards -
    • Where common structures and/or naming conventions exist, these will be leveraged.
Domains covered in the taxonomy so far include Apps, Human Resources (or Family), Devices / Platforms, Finance, Legal, Multimedia, Personal / Private and Projects. So, whether you’re using Box / Dropbox, Google Drive, iCloud, OneDrive, S3, a NAS from your Data Centre or a local file server, this initial taxonomy has been developed to provide you with guidance to intuitively manage your digital assets.
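By way of illustration only (the directory and file names here are invented, not taken from the Guide), a minimal tree applying these principles - useful containers, handle-based naming, no embedded classification - might look like:

```shell
# Illustrative sketch: a handle-oriented layout under a temporary base.
base="$(mktemp -d)"
mkdir -p "$base/Multimedia/Music" \
         "$base/Finance/Invoices/2020" \
         "$base/Projects/Saturn"
# File an invoice by issue date plus invoice number (the look-up handle),
# rather than embedding classification metadata in the path:
touch "$base/Finance/Invoices/2020/2020-06-30_INV-1042.pdf"
ls "$base/Finance/Invoices/2020"
```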

Screen Shots:

The following screen shots show the software or hardware developed for this project, in action;
Odroid HC2 - 8 CPU Cores, 2GB RAM, SATA3, 1Gb NIC
Odroid HC2 - Daphnis Node
The Odroid HC2 as shipped, mounted in an open-faced aluminium enclosure
Odroid HC2 - Enclosure
The Odroid HC2's aluminium enclosure is a heat-sink
Odroid HC2 - Upside-down
An Odroid HC2 depicted with SanDisk Ultra 64GB microSDXC UHS-1, to boot Raspbian and MooseFS
Odroid HC2 with SanDisk 64GB
An Odroid HC2 depicted with 3.5inch Seagate Surveillance Drive (8TB)
Odroid HC2 with Seagate 8TB
Ikea TJUSIG as a standardised Saturn Generation 2 Racking Unit
Ikea TJUSIG - Daphnis Rack
Ikea TJUSIG, meet the Odroid HC2
Ikea TJUSIG, meet the Odroid HC2
Ikea TJUSIG provides ample airflow
Ikea TJUSIG with ample airflow
Node cables zip tied to the front rail
Node cables zip tied to the front rail
Cable trunk on the top shelf, with closed braided wire wrap
Braided cable trunk on the top shelf
Right-hand power-rail (6-way power board)
Right-hand power-rail
Right-hand access switch and cabling
Right-hand access switch and cabling
Left-hand power, access switch and cabling
Left-hand power, switch and cabling
A ten (10) node rack of the Second Generation Saturn - Daphnis - scale-out storage with Odroid HC2 and MooseFS
A ten (10) node rack of Daphnis
A ten (10) node rack of Daphnis, up and running - scale-out storage with Odroid HC2 and MooseFS
A ten (10) node rack up and running
Right-hand access switch powered up
Right-hand access switch powered up
Left-hand access switch powered up
Left-hand access switch powered up
Physical Security Token - The YubiKey Neo
Physical Security Token - YubiKey

Papers:

The following documents (papers, guides, manuals, etc) have been developed for this project;

Code:

The following code (source, binaries, patches, etc) has been developed or mirrored for this project;

Links:

The following links have been identified as relevant to this project;

Activity:

This project was initiated on Saturday, the 23rd of November 2019. Its last recorded activity stamp is Sunday, the 19th of July 2020.