High Performance Storage System

Incremental Scalability
Based on storage needs and deployment schedules, HPSS scales incrementally by adding computer, network and storage resources. A single HPSS namespace can scale from petabytes of data to exabytes of data, from millions of files to billions of files, and from a few file-creates per second to thousands of file-creates per second.
HPSS for Spectrum Scale (formerly GPFS)
Overview
  • Not only is HPSS a highly scalable standalone file repository; a single HPSS may also provide disaster recovery and space management services for one or more Spectrum Scale file systems.
  • Spectrum Scale customers may now store petabytes of data on a file system backed by only terabytes of high-performance disk.
  • HPSS may also be used to back up your Spectrum Scale file system, and when a catastrophic failure occurs, HPSS may be used to restore your cluster and file systems.

HPSS for Spectrum Scale

HPSS for Spectrum Scale key terms

  • Spectrum Scale: A proven, scalable, high-performance data and file management solution (based upon IBM General Parallel File System or GPFS technology). Spectrum Scale is a true distributed, clustered file system. Multiple nodes (servers) are used to manage the data and metadata of a single file system. Individual files are broken into multiple blocks and striped across multiple disks, and multiple nodes, which eliminates bottlenecks.
  • Spectrum Scale ILM policies: Spectrum Scale Information Lifecycle Management (ILM) policies are SQL-like rules used to identify and sort filenames for processing (a simplified sketch of such a selection follows this list).
  • Disaster recovery services: Software to capture a point-in-time backup of Spectrum Scale to HPSS media on a periodic basis, manage the backups in HPSS, and restore Spectrum Scale using a point-in-time backup.
  • Space management services: Software to automatically move data between high-cost, low-latency Spectrum Scale storage media and low-cost, high-latency HPSS storage media, while managing the amount of free space available on Spectrum Scale. The bulk of the data are stored in HPSS, allowing Spectrum Scale to maintain a range of free space for new and frequently accessed data. Data are automatically made available to the user when accessed. Beyond the latency of recalling data from tape, the Spectrum Scale space management activities (migrate, purge and recall) are transparent to the end user.
  • Online storage: Storage from which data are typically available for immediate access (e.g. solid state or spinning disk).
  • Near-line storage: Storage from which data are typically available within minutes of being requested (e.g. robotic tape that is local or remote).
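
A policy scan is easiest to picture as a query over file metadata. The following is a minimal Python sketch of what such a rule accomplishes - select files by a predicate (here, last access time) and order the hit list for processing. The /gpfs/fs1 path, the 30-day cutoff, and the largest-first ordering are assumptions for illustration, not Spectrum Scale policy syntax or HPSS defaults.

    # Illustrative sketch only: a simplified stand-in for what an ILM policy scan
    # accomplishes -- selecting and ordering candidate files for migration.
    import os
    import time

    def candidate_files(root, min_age_days=30):
        """Return paths not accessed for min_age_days, largest first."""
        cutoff = time.time() - min_age_days * 86400
        candidates = []
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    st = os.stat(path)
                except OSError:
                    continue                      # skip files that vanish mid-scan
                if st.st_atime < cutoff:          # the "WHERE" clause of the rule
                    candidates.append((st.st_size, path))
        # Order the hit list largest-first so migration frees the most space soonest.
        return [path for _size, path in sorted(candidates, reverse=True)]

    if __name__ == "__main__":
        for path in candidate_files("/gpfs/fs1"):   # assumed mount point
            print(path)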

Many-to-one advantage of HPSS for Spectrum Scale

  • A single HPSS may be used to manage one or more Spectrum Scale clusters.
  • ALL of the HPSS storage may be shared by all Spectrum Scale file systems.
  • One Spectrum Scale file system may leverage all HPSS storage if required.
  • The Max Planck Computing and Data Facility (MPCDF, formerly known as RZG) space-manages and captures point-in-time backups for six (6) Spectrum Scale file systems with ONE HPSS.
  • Many Spectrum Scale to one HPSS

What HPSS software is installed on the Spectrum Scale cluster?

  • HPSS Session software directs the space management and disaster recovery services.

  • HPSS I/O Manager (IOM) software manages data movement between HPSS and Spectrum Scale.

  • HPSS Software
  • HPSS Session software is configured on all Spectrum Scale Quorum nodes, but is only active on the CCM (Spectrum Scale Cluster Configuration Manager) node.
  • The CCM may fail over to any Quorum node, and the HPSS Session software will follow the CCM (a simplified sketch of this behavior follows this list).

  • Where HPSS software runs in the cluster
  • HPSS IOM software may be configured to run on any Spectrum Scale node with an available Spectrum Scale mount point.
  • The HPSS Session software comprises five processes: HPSS Process Manager, HPSS Mount Daemon, HPSS Configuration Manager, HPSS Schedule Daemon, and HPSS Event Daemon.
  • The HPSS IOM software comprises three processes: HPSS I/O Manager, HPSS I/O Agent, and HPSS ISHTAR.
  • HPSS I/O Agents copy individual files while ISHTAR copies groups of files (ISHTAR is much like UNIX tar, but faster, indexed, and Spectrum Scale specific).
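
The "Session software follows the CCM" behavior can be pictured as a small control loop on every Quorum node: become active when this node holds the cluster manager role, stand down when it does not. The Python sketch below is a hypothetical illustration; current_ccm(), start_session(), and stop_session() are placeholders, not HPSS or Spectrum Scale interfaces, and a real implementation would query the cluster rather than an environment variable.

    # Illustrative sketch only: Session software activates on whichever node is the CCM.
    import os
    import socket
    import time

    def current_ccm() -> str:
        """Placeholder: return the quorum node currently acting as cluster manager."""
        return os.environ.get("DEMO_CCM_NODE", socket.gethostname())

    def start_session():
        print("starting the five HPSS Session processes on", socket.gethostname())

    def stop_session():
        print("stopping the HPSS Session processes on", socket.gethostname())

    def follow_ccm(poll_seconds: float = 30.0, iterations: int = 3) -> None:
        """Run the Session software only while this node is the CCM."""
        active = False
        for _ in range(iterations):              # a real daemon would loop forever
            i_am_ccm = current_ccm() == socket.gethostname()
            if i_am_ccm and not active:
                start_session()                  # the CCM landed here; become active
                active = True
            elif not i_am_ccm and active:
                stop_session()                   # the CCM moved away; stand down
                active = False
            time.sleep(poll_seconds)

    follow_ccm(poll_seconds=0.1)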

Additional Spectrum Scale cluster hardware

  • HPSS IOMs are highly scalable and multiple IOMs may be configured on multiple Spectrum Scale nodes for each file system.
  • Bandwidth requirements help determine the expected node count for a deployment (see the sizing sketch after this list).
  • New Spectrum Scale Quorum nodes for HPSS may need to be added to the cluster.
  • HPSS for Spectrum Scale is typically deployed on a set of dedicated nodes that will become the Quorum nodes.
  • HPSS Session and IOM software are configured on the new Quorum nodes.
  • Dedicated HPSS nodes become Quorum nodes
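
As a rough illustration of how bandwidth drives node count, the sketch below estimates how many IOM nodes are needed to sustain a target aggregate transfer rate. The 2 GB/s per-node figure and the 80% efficiency factor are invented for the example and are not HPSS sizing guidance.

    # Back-of-the-envelope sizing sketch; all numbers are illustrative assumptions.
    import math

    def iom_nodes_needed(target_gb_per_s: float,
                         per_node_gb_per_s: float = 2.0,
                         efficiency: float = 0.8) -> int:
        """Estimate the IOM node count needed to sustain a target aggregate rate."""
        usable_per_node = per_node_gb_per_s * efficiency
        return math.ceil(target_gb_per_s / usable_per_node)

    # Example: sustaining 10 GB/s with nodes assumed to deliver ~1.6 GB/s each -> 7 nodes.
    print(iom_nodes_needed(10.0))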

Space manage Spectrum Scale with HPSS

  • Periodic ILM policy scans are initiated by the HPSS Session software to identify groups of files that must be copied to HPSS.
  • The HPSS Session software distributes the work to the I/O Managers, and Spectrum Scale data are copied to HPSS in parallel (a simplified fan-out sketch follows this list).
  • The HPSS advantage is realized in two areas: (1) high-performance transfers; and (2) how data are organized on tape.
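
The fan-out of migration work can be sketched as a coordinator splitting the policy-scan hit list into batches and handing them to several workers at once. In this hypothetical Python sketch, copy_to_hpss() is a stand-in for the real I/O Manager / ISHTAR data movers, and the batch size and worker count are arbitrary.

    # Illustrative sketch only: parallel distribution of migration work.
    from concurrent.futures import ThreadPoolExecutor

    def copy_to_hpss(batch):
        """Placeholder for an IOM copying one batch of Spectrum Scale files to HPSS."""
        for path in batch:
            pass                      # real code would stream file data to HPSS media
        return len(batch)

    def migrate(candidates, num_ioms=4, batch_size=100):
        """Split the policy-scan results into batches and copy them concurrently."""
        batches = [candidates[i:i + batch_size]
                   for i in range(0, len(candidates), batch_size)]
        with ThreadPoolExecutor(max_workers=num_ioms) as pool:
            copied = sum(pool.map(copy_to_hpss, batches))
        return copied

    # Example: migrate the files selected by a policy scan using four workers.
    print(migrate([f"/gpfs/fs1/file{i}" for i in range(1000)]))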

  • Data flows to HPSS on migration
  • When Spectrum Scale capacity thresholds are reached (the file system is running out of storage capacity), the data of unused files are purged from Spectrum Scale, but the inode and other attributes are left behind (a simplified threshold sketch follows this list).
  • All file names are visible, and the user may easily identify which files are online and which files are near-line.
  • The HPSS Session software will automatically recall any near-line file from HPSS back to Spectrum Scale when accessed.
  • Tools to efficiently recall large numbers of files are also provided.

  • Data flows to Spectrum Scale on recall
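
Threshold-driven purging can be reduced to a simple rule: when usage climbs past a high-water mark, release the data of the least recently used files that already have copies in HPSS until usage falls below a low-water mark, leaving the inode behind. The Python sketch below illustrates that rule; the 90%/80% marks and the data layout are assumptions for the example, not Spectrum Scale or HPSS defaults.

    # Illustrative sketch only: purge least-recently-used, already-migrated files
    # until free space recovers.
    def purge_to_threshold(files, capacity, high=0.90, low=0.80):
        """files: list of (size, last_access, path) already copied to HPSS."""
        used = sum(size for size, _atime, _path in files)
        if used / capacity < high:
            return []                                  # nothing to do yet
        purged = []
        for size, _atime, path in sorted(files, key=lambda f: f[1]):  # oldest access first
            purged.append(path)                        # data released; inode/stub remains
            used -= size
            if used / capacity <= low:
                break
        return purged

    # Example: a 100 TB file system holding 95 TB of migrated files (sizes in GB).
    files = [(5_000, i, f"/gpfs/fs1/f{i}") for i in range(19)]
    print(purge_to_threshold(files, capacity=100_000))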

Backup and disaster recovery with HPSS for Spectrum Scale

  • The backup process captures the following data (a simplified orchestration sketch follows this list):
    • File data - the space management process (discussed above) is used by HPSS to capture the data for each file.
    • Namespace data - the Spectrum Scale namespace is captured using the Spectrum Scale image backup command.
    • Cluster configuration data - the cluster configuration is saved to HPSS so it can be restored after a failure.
  • The HPSS disaster recovery processing minimizes data movement and is ideal for high performance computing (HPC) environments where the goal is to bring the namespace back online quickly.
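
Put together, a point-in-time backup amounts to capturing the three items above under a common label. The sketch below is a hypothetical outline only; each helper is an empty placeholder rather than an HPSS or Spectrum Scale command.

    # Illustrative sketch only: the three pieces of a point-in-time backup.
    import datetime

    def backup_file_data():
        """Placeholder: ensure space management has copied file data to HPSS."""

    def backup_namespace(snapshot_label):
        """Placeholder: capture the Spectrum Scale namespace image to HPSS."""

    def backup_cluster_config(snapshot_label):
        """Placeholder: save the cluster configuration to HPSS."""

    def point_in_time_backup():
        label = datetime.datetime.now().strftime("backup-%Y%m%d-%H%M%S")
        backup_file_data()              # 1. file data (via space management migration)
        backup_namespace(label)         # 2. namespace image
        backup_cluster_config(label)    # 3. cluster configuration
        return label

    print(point_in_time_backup())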


Come meet with us!
HPSS @ MSST 2019
The 35th International Conference on Massive Storage Systems and Technology will be in Santa Clara, California in May of 2019 - Learn More. Please contact us if you would like to meet with the IBM business and technical leaders of HPSS at Santa Clara University.

HPSS @ ISC19
The 2019 international conference for high performance computing, networking, and storage will be in Frankfurt, Germany from June 16th through 20th, 2019 - Learn More. Come visit the HPSS folks at the IBM booth and contact us if you would like to meet with the IBM business and technical leaders of HPSS in Frankfurt.

2019 HUF
The 2019 HPSS User Forum details are coming soon. The HUF is typically hosted by HPSS customers in the September/October timeframe. The 2018 HUF was hosted by the UK Met Office in Exeter, United Kingdom from October 15th through October 18th, 2018.

HPSS @ SC19
The 2019 international conference for high performance computing, networking, storage and analysis will be in Denver, Colorado from November 18th through 21st, 2019 - Learn More. Come visit the HPSS folks at the IBM booth and contact us if you would like to meet with the IBM business and technical leaders of HPSS in Denver.

What's New?
HPSS 7.5.3 Release - HPSS 7.5.3 was released in December 2018 and introduces many new and exciting improvements.

IBM TS1160 - On November 20, 2018 IBM announced the new enterprise tape technology supporting 20 TB of native capacity and 400 MB/s of native bandwidth. Learn more.

Best of Breed for Tape - HPSS 7.5.2 and 7.5.3 improvements raise HPSS tape library efficiency to 99% on both IBM and Spectra Logic tape libraries.

Explosive data growth - HPSS Collaboration leadership from Lawrence Berkeley National Laboratory's National Energy Research Scientific Computing Center (NERSC) helped author the "NERSC Storage 2020" report, and NERSC trusts HPSS to meet their immediate and long term data storage challenges.

HPSS Vendor Partnership Grows - HPSS begins Quantum Scalar i6000 tape library testing in 2018. Other HPSS tape vendor partners include IBM, Oracle, and Spectra Logic.

Swift On HPSS - Leverage OpenStack Swift to provide an object interface to data in HPSS. Directories of files and containers of objects can be accessed and shared across ALL interfaces with this OpenStack Swift Object Server implementation - Contact Us for more information, or Download Now.

Capacity Leader - ECMWF (European Centre for Medium-Range Weather Forecasts) has a single HPSS namespace with 385 PB spanning 356 million files.

File-Count Leader - LLNL (Lawrence Livermore National Laboratory) has a single HPSS namespace with 46 PB spanning 1.208 billion files.

RAIT - Oak Ridge National Laboratory cut redundant tape cost estimates by 75% with 4+P HPSS RAIT (a tape stripe with rotating parity that adds one parity tape per four data tapes rather than keeping a full second copy) and enjoys large-file tape transfers beyond 1 GB/s.
Copyright 2018, HPSS Collaboration. All Rights Reserved.