High Performance Storage System

Incremental Scalability
Based on storage needs and deployment schedules, HPSS scales incrementally by adding computer, network and storage resources. A single HPSS namespace can scale from petabytes of data to exabytes of data, from millions of files to billions of files, and from a few file-creates per second to thousands of file-creates per second.
HPSS for Spectrum Scale (formerly GPFS)
  • Not only is HPSS a highly scalable standalone file repository, a single HPSS may also provide disaster recovery and space management services for one or more Spectrum Scale file systems.
  • Spectrum Scale customers may now store petabytes of data on a file system with terabytes of high performance disks.
  • HPSS may also be used to back up your Spectrum Scale file system, and when a catastrophic failure occurs, HPSS may be used to restore your cluster and file systems.

HPSS for Spectrum Scale key terms

  • Spectrum Scale: A proven, scalable, high-performance data and file management solution (based on IBM General Parallel File System, or GPFS, technology). Spectrum Scale is a true distributed, clustered file system. Multiple nodes (servers) manage the data and metadata of a single file system. Individual files are broken into multiple blocks and striped across multiple disks and multiple nodes, which eliminates bottlenecks.
  • Spectrum Scale ILM policies: Spectrum Scale Information Lifecycle Management (ILM) policies are SQL-like rules used to identify and sort filenames for processing.
  • Disaster recovery services: Software to capture a point-in-time backup of Spectrum Scale to HPSS media on a periodic basis, manage the backups in HPSS, and restore Spectrum Scale using a point-in-time backup.
  • Space management services: Software to automatically move data between high-cost low-latency Spectrum Scale storage media and low-cost high-latency HPSS storage media, while managing the amount of free space available on Spectrum Scale. The bulk of data are stored in HPSS, allowing Spectrum Scale to maintain a range of free space for new and frequently accessed data. Data are automatically made available to the user when accessed. Beyond the latency of recalling data from tape, the Spectrum Scale space management activities (migrate, purge and recall) are transparent to the end user.
  • Online storage: Data that are typically available for immediate access (e.g. solid state or spinning disk).
  • Near-line storage: Data that are typically available within minutes of being requested (e.g. robotic tape that is local or remote).

Many-to-one advantage of HPSS for Spectrum Scale

  • A single HPSS may be used to manage one or more Spectrum Scale clusters.
  • ALL of the HPSS storage may be shared by all Spectrum Scale file systems.
  • One Spectrum Scale file system may leverage all HPSS storage if required.
  • The Max Planck Computing and Data Facility (MPCDF, formerly known as RZG) is space managing and capturing point-in-time backups for seven (7) Spectrum Scale file systems with ONE HPSS.

[Figure: Many Spectrum Scale to one HPSS]

What HPSS software is installed on the Spectrum Scale cluster?

  • HPSS Session software directs the space management and disaster recovery services.

  • HPSS I/O Manager (IOM) software manages data movement between HPSS and Spectrum Scale.

[Figure: HPSS Software]

  • HPSS Session software is configured on all Spectrum Scale Quorum nodes, but is only active on the CCM (Spectrum Scale Cluster Configuration Manager) node.
  • The CCM may fail over to any Quorum node, and the HPSS Session software will follow the CCM.

[Figure: Where HPSS software runs in the cluster]

  • HPSS IOM software may be configured to run on any Spectrum Scale node with an available Spectrum Scale mount point.
  • Five processes make up the HPSS Session software: HPSS Process Manager, HPSS Mount Daemon, HPSS Configuration Manager, HPSS Schedule Daemon, and HPSS Event Daemon.
  • Three processes make up the HPSS IOM software: HPSS I/O Manager, HPSS I/O Agent, and HPSS ISHTAR.
  • HPSS I/O Agents copy individual files while ISHTAR copies groups of files (ISHTAR is much like UNIX tar, but faster, indexed, and Spectrum Scale specific).

Additional Spectrum Scale cluster hardware

  • HPSS IOMs are highly scalable and multiple IOMs may be configured on multiple Spectrum Scale nodes for each file system.
  • Bandwidth requirements help determine the expected node count for a deployment.
  • New Spectrum Scale Quorum nodes for HPSS may need to be added to the cluster.
  • HPSS for Spectrum Scale is typically deployed on a set of dedicated nodes that will become the Quorum nodes.
  • HPSS Session and IOM software are configured on the new Quorum nodes.

[Figure: Dedicated HPSS nodes become Quorum nodes]
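As a rough illustration of how bandwidth requirements can translate into a node count, the sketch below divides a target aggregate transfer rate by an assumed per-node rate and rounds up. The function name, the per-node throughput figure, and the optional minimum node count are hypothetical values chosen for illustration; they are not HPSS sizing guidance.

```python
import math


def estimate_iom_nodes(target_mb_per_sec, per_node_mb_per_sec, min_nodes=1):
    """Return a rough count of IOM nodes needed to sustain a target rate.

    All inputs are assumptions supplied by the planner; real deployments
    depend on network, disk, and tape characteristics not modeled here.
    """
    if target_mb_per_sec < 0 or per_node_mb_per_sec <= 0:
        raise ValueError("rates must be non-negative, per-node rate positive")
    needed = math.ceil(target_mb_per_sec / per_node_mb_per_sec)
    # Never go below the floor the planner asked for (e.g. a quorum size).
    return max(needed, min_nodes)


# Hypothetical example: 5,000 MB/sec target, 1,200 MB/sec per node.
print(estimate_iom_nodes(5000, 1200))  # 5
```

Rounding up rather than to the nearest integer keeps the estimate from falling short of the target rate.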

Space manage Spectrum Scale with HPSS

  • Periodic ILM policy scans are initiated by the HPSS Session software to identify groups of files that must be copied to HPSS.
  • The HPSS Session software distributes the work to the I/O Managers. Spectrum Scale data are copied to HPSS in parallel.
  • The HPSS advantage is realized in two areas: (1) high performance transfers; and (2) how data are organized on tape.
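The distribution of migration work across I/O Managers described above can be pictured with a small sketch. The source only states that the Session software distributes work and that data are copied in parallel; the greedy "largest file to least-loaded manager" strategy, the function name, and the sample paths below are assumptions for illustration, not the actual HPSS scheduling logic.

```python
import heapq


def distribute_to_ioms(files, iom_names):
    """Assign (path, size_bytes) pairs to I/O Managers, balancing total bytes.

    Greedy heuristic: take files largest first and hand each one to the
    I/O Manager that currently has the fewest bytes assigned.
    """
    if not iom_names:
        raise ValueError("at least one I/O Manager is required")
    # Min-heap of (assigned_bytes, name); the least-loaded manager pops first.
    heap = [(0, name) for name in iom_names]
    heapq.heapify(heap)
    assignments = {name: [] for name in iom_names}
    for path, size in sorted(files, key=lambda f: f[1], reverse=True):
        load, name = heapq.heappop(heap)
        assignments[name].append(path)
        heapq.heappush(heap, (load + size, name))
    return assignments


# Hypothetical file list, as a policy scan might produce it.
scan_result = [
    ("/gpfs/fs1/a", 400),
    ("/gpfs/fs1/b", 300),
    ("/gpfs/fs1/c", 200),
    ("/gpfs/fs1/d", 100),
]
print(distribute_to_ioms(scan_result, ["iom1", "iom2"]))
```

With these sample sizes each of the two managers ends up with 500 bytes of work, so neither becomes the bottleneck for the parallel copy.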

[Figure: Data flows to HPSS on migration]

  • When Spectrum Scale capacity thresholds are reached (the file system is running out of storage capacity), unused files are purged from Spectrum Scale, but the inode and other attributes are left behind.
  • All file names are visible, and the user may easily identify which files are online and which files are near-line.
  • The HPSS Session software will automatically recall any near-line file from HPSS back to Spectrum Scale when accessed.
  • Tools to efficiently recall large numbers of files are also provided.
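The threshold-driven purge described above can be sketched as follows. The source says only that unused files are purged when capacity thresholds are reached; the high and low percentages, the least-recently-accessed ordering, the rule that only files already copied to HPSS are eligible, and all names in the code are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    path: str
    size_bytes: int
    last_access: float    # seconds since the epoch
    copied_to_hpss: bool  # assumed flag: a copy already exists in HPSS


def select_purge_candidates(files, used_bytes, capacity_bytes,
                            high_pct=90.0, low_pct=70.0):
    """Pick files to purge once usage reaches high_pct, down to low_pct.

    Returns paths in purge order (least recently accessed first). Only
    files with a copy in HPSS are considered, so no data would be lost.
    """
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    if used_bytes * 100 < high_pct * capacity_bytes:
        return []  # below the high threshold: nothing to do
    target_used = capacity_bytes * low_pct / 100
    candidates = sorted((f for f in files if f.copied_to_hpss),
                        key=lambda f: f.last_access)
    chosen = []
    for f in candidates:
        if used_bytes <= target_used:
            break
        chosen.append(f.path)
        used_bytes -= f.size_bytes
    return chosen
```

In this sketch only the file data would be released; as the text notes, the inode and other attributes stay behind so the file name remains visible and the file can be recalled on access.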

[Figure: Data flows to Spectrum Scale on recall]

Backup and disaster recovery with HPSS for Spectrum Scale

  • The backup process captures the following data:
    • File data - the space management process (discussed above) is the process used by HPSS to capture the data for each file.
    • Namespace data - the Spectrum Scale namespace is captured using the Spectrum Scale image backup command.
    • Cluster configuration data - a copy of the cluster configuration is saved to HPSS so that it is protected along with the data.
  • The HPSS disaster recovery processing minimizes data movement and is ideal for high performance computing (HPC) environments where the goal is to bring the namespace back online quickly.

Come meet with us!
The 2020 HPSS User Forum (HUF) will be hosted virtually due to the COVID-19 pandemic. It will be held online for six days spread across three weeks in October 2020 with no admission cost. This will be a great opportunity to hear from HPSS users, collaboration developers, testers, support folks and leadership (from IBM and DOE Labs). Please contact us if you are not a customer but would like to attend.

The 2020 international conference for high performance computing, networking, storage and analysis will be hosted virtually due to the COVID-19 pandemic. SC20 will be on November 16th through 19th, 2020. As we do each year, we are scheduling and meeting with customers via IBM Single Client Briefings. Please contact your local IBM client executive or contact us to schedule a virtual HPSS Single Client Briefing to meet with the IBM business and technical leaders of HPSS.

HPSS @ STS 2021
The 3rd Annual Storage Technology Showcase is in the planning stage, but HPSS expects to support the event in March of 2021. Check out their website for details. We expect an update later in 2020.

The 2021 international conference for high performance computing, networking, and storage will be in Frankfurt, Germany from June 27th through July 1st, 2021. Come visit the HPSS folks at the IBM booth and contact us if you would like to schedule a face-to-face meeting with us in Frankfurt.

What's New?
HUF 2020 - The HPSS User Forum will be hosted virtually at no cost in October 2020.

HPSS 9.1 Release - HPSS 9.1 was released on September 24th, 2020 and introduces new features.

HPSS 8.3 Release - HPSS 8.3 was released on March 31st, 2020 and introduces new features.

HPSS 8.2 Release - HPSS 8.2 was released on December 6th, 2019 and introduces a few new features.

New Globus DSI - Version 2.9 of the HPSS DSI is now available from the GitHub release page. It provides the capability to resume interrupted Globus transfers.

Lots Of Data - In November 2019 IBM/HPSS delivered a system to a customer in Canada and demonstrated a sustained tape ingest rate of 11,574 MB/sec (1 PB/day peak tape ingest) while simultaneously demonstrating a sustained tape recall rate of 8,832 MB/sec (791 TB/day peak tape recall). HPSS pushed four 13-frame IBM TS4500 tape libraries (scheduled to house over 500 PB of tape media) to 2,168 mounts/hour.

HPSS 8.1 Release - HPSS 8.1 was released on October 1st, 2019 and introduces a few new features.

July 2019 - Argonne Team Breaks Record for Globus Data Movement from the Summit supercomputer at Oak Ridge National Laboratory to HPSS tape.

Capacity Leader - ECMWF (European Centre for Medium-Range Weather Forecasts) has a single HPSS namespace with 511 PB spanning 370 million files.

File-Count Leader - LLNL (Lawrence Livermore National Laboratory) has a single HPSS namespace with 59 PB spanning 1.475 billion files.

Explosive data growth - HPSS Collaboration leadership from Lawrence Berkeley National Laboratory's National Energy Research Scientific Computing Center (NERSC) helped author the "NERSC Storage 2020" report, and NERSC trusts HPSS to meet their immediate and long term data storage challenges.

Copyright 1992 - 2020, HPSS Collaboration. All Rights Reserved.