GSAS: A Parallel-Access Infrastructure Approach
Traditional IT infrastructures simply weren't designed to support the requirements of today's compute-intensive and data-intensive workflows. Re-Store’s grid storage and archive (GSAS) infrastructure offers a parallel-access storage 'bus' design that uniquely addresses the four most important issues facing organizations with scale-out storage requirements:
· Massive storage capacity and performance with a game-changing cost structure
· Single global namespace for simplified digital asset management
· Integrated, offsite, disaster protection
· “Active” long-term archiving
Our IBM-based solution relies on IBM’s supercomputer technology and massively parallel file system (GPFS) to unite compute, I/O, storage and archive into a single, contiguous, linearly scalable infrastructure. We move UNIX, Linux and Windows applications into an HPC configuration so that the customer can scale the compute-intensive and storage-intensive components of their workflow (file encryption, video transcode, e-discovery, etc.) as the business grows. We then use the global file system's information lifecycle management (ILM) to migrate the data into the NAS components of the grid for distribution and analysis. As the data ages, we use external GPFS storage pools to migrate it to tape for an integrated archive.
Specifically, the GSAS architecture delivers:
Massive Performance & Capacity – Minimal Cost
Our solution provides highly parallel data access and uses modular, grid-based storage ‘nodes’; during reads and writes, data is striped across all storage nodes in parallel. In this manner, performance scales linearly, and the grid design ensures resiliency, as data is re-routed around any single node outage. Next, we use a mature ILM toolset to automate the migration of data to the cost-correct tier of disk storage. This policy-based file auto-migration reduces the overhead associated with data storage management. Optionally, we can abstract tape (via HSM) to integrate multiple classes of disk and LTO tape, producing a blended cost of storage that’s significantly cheaper than current "disk-only" strategies. Finally, using COTS-based hardware keeps hardware costs low.
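To make the policy-based tiering and "blended cost" ideas concrete, here is a minimal sketch in Python. It is not GPFS policy syntax and not part of the GSAS product: the tier names, age thresholds, and per-terabyte costs are all hypothetical figures chosen only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    name: str
    max_age_days: float  # files older than this belong on a cheaper tier
    cost_per_tb: float   # hypothetical cost figure, for illustration only


# Illustrative tiers, ordered from fastest to cheapest. All numbers are made up.
TIERS = [
    Tier("fast_disk", 30, 100.0),
    Tier("capacity_disk", 180, 40.0),
    Tier("lto_tape", float("inf"), 10.0),
]


def pick_tier(age_days):
    """Return the first tier whose age threshold covers the file."""
    for tier in TIERS:
        if age_days <= tier.max_age_days:
            return tier
    return TIERS[-1]


def blended_cost_per_tb(files):
    """files is a list of (size_tb, age_days) pairs."""
    total_tb = sum(size for size, _ in files)
    total_cost = sum(size * pick_tier(age).cost_per_tb for size, age in files)
    return total_cost / total_tb


# Hypothetical population: 10 TB of new data, 30 TB of warm data, 60 TB of old data.
files = [(10, 5), (30, 90), (60, 400)]
print(blended_cost_per_tb(files))  # 28.0 with the made-up costs above
```

With these assumed figures, keeping everything on the fastest tier would cost 100 per terabyte, while the tiered layout averages 28. The real ratio depends entirely on actual hardware pricing and on how quickly data ages.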
A Single, Global Namespace
GPFS’s global namespace is the key to effective, efficient management of distributed file storage. It’s a logical layer that is inserted between clients (users and applications) and file systems, providing a method of viewing and accessing files that is independent of the physical file locations. Most critically, the file/data path never changes.
This is a powerful concept, as it means an administrator can use a namespace to logically arrange and present data to users, irrespective of where the data is physically located. It also gives administrators the ability to add, change, move, and reconfigure physical file storage without affecting how users view and access it. Finally, it enables an administrator to aggregate file storage across heterogeneous, geographically distributed storage devices and to view and manage it as a single file system.
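The separation between the logical path and the physical location can be sketched as a simple lookup table. This is a conceptual model only, not how GPFS implements its namespace; the class name, method names, device names, and paths below are hypothetical.

```python
class GlobalNamespace:
    """Toy model: a stable logical path maps to a changeable physical location."""

    def __init__(self):
        # logical path -> (device, physical path)
        self._locations = {}

    def place(self, logical_path, device, physical_path):
        """Register a file under its logical path."""
        self._locations[logical_path] = (device, physical_path)

    def move(self, logical_path, new_device, new_physical_path):
        """Relocate the data; the logical path that users see is untouched."""
        if logical_path not in self._locations:
            raise KeyError(logical_path)
        self._locations[logical_path] = (new_device, new_physical_path)

    def resolve(self, logical_path):
        """Return where the data currently lives."""
        return self._locations[logical_path]


ns = GlobalNamespace()
ns.place("/projects/film/reel1.mov", "disk-array-a", "/vol7/0001")
before = ns.resolve("/projects/film/reel1.mov")

# An administrator moves the data to another site; clients keep the same path.
ns.move("/projects/film/reel1.mov", "remote-site-b", "/vol2/9f3c")
after = ns.resolve("/projects/film/reel1.mov")
print(before, after)
```

The point of the design is that only the table changes during a move, so applications and users never need to be told about a new path.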
Integrated Data Protection (Back-up) with Off-Site Data Protection (DR)
As an option, the architecture provides an extensible platform for value-added functions such as data protection and disaster recovery. True data protection is handled by another IBM tool, Tivoli Storage Manager (TSM), which serves a dual purpose by offering both data protection and archive functionality. With this approach, files are pre-migrated upon ingest via TSM’s HSM to the object-oriented tape file system, where they can be replicated to other libraries off-site and/or copied to any number of tapes for local pick-up. Note that when a file is pre-migrated, a copy is archived on tape while the original file still resides on disk (GPFS). Rather than having to "duplicate & replicate" storage arrays to achieve protection, users can now realize true, centralized data protection that is integrated throughout the entire environment.
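The pre-migration behavior described above can be modeled as a small state machine. The sketch below is illustrative only and does not call any TSM or GPFS interface. The three state names follow the resident / premigrated / migrated terminology commonly used for HSM-managed files, and the class, its methods, and the file path are hypothetical.

```python
from enum import Enum


class FileState(Enum):
    RESIDENT = "resident"        # data on disk only
    PREMIGRATED = "premigrated"  # data on disk and on tape
    MIGRATED = "migrated"        # data on tape only


class ManagedFile:
    def __init__(self, name):
        self.name = name
        self.state = FileState.RESIDENT
        self.tape_copies = 0

    def premigrate(self, copies=1):
        """Copy the file to tape on ingest while keeping the disk copy."""
        if copies < 1:
            raise ValueError("at least one tape copy is required")
        self.tape_copies = max(self.tape_copies, copies)
        if self.state is FileState.RESIDENT:
            self.state = FileState.PREMIGRATED

    def release_disk(self):
        """Free the disk copy; only safe once a tape copy exists."""
        if self.state is not FileState.PREMIGRATED:
            raise RuntimeError("file must be premigrated before disk is released")
        self.state = FileState.MIGRATED

    def recall(self):
        """Bring the data back to disk; the tape copy remains."""
        if self.state is FileState.MIGRATED:
            self.state = FileState.PREMIGRATED

    @property
    def on_disk(self):
        return self.state is not FileState.MIGRATED


f = ManagedFile("/gsas/ingest/clip001.mov")
f.premigrate(copies=2)  # e.g. one local tape copy and one off-site copy
print(f.state.value, f.on_disk, f.tape_copies)
```

In this model a file is protected as soon as `premigrate` completes, which is why no separate backup pass or duplicated disk array is needed.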
"Active" Long-Term Archive
As an option, GSAS provides an "active" archive. Traditional data archiving has been viewed as "stale": data is off-line, unavailable and difficult to manage. Now, users can extend the file system over many different storage tiers that appear as a single, logical storage volume, allowing data to reside on the most appropriate storage level while remaining an available asset. What makes this a breakthrough solution is the ability of GPFS and TSM to seamlessly extend a file system to tape. All of the data can be accessed online without dedicating large resources to storage that’s optimized for performance rather than capacity. This approach enables users to meet otherwise unachievable demands for capacity, file sizes, data rates, and number of objects stored, while meeting stakeholders’ specific price/performance/capacity requirements.