Triton


The University of Miami maintains one of the largest centralized academic cyberinfrastructures in the country. Since 2007, the core has grown from no advanced computing cyberinfrastructure to a regional high-performance computing environment that currently supports more than 500 users, 240 TFlops of computational power, and more than 3 petabytes of disk storage.

The University’s latest supercomputer acquisition, TRITON (rated one of the Top 5 academic-institution supercomputers in the U.S. for 2019), is UM’s first GPU-accelerated high-performance computing (HPC) system and represents a completely new approach to computational and data science for all University campuses. Built on IBM Power Systems AC922 servers, the system is designed to maximize data movement between the IBM POWER9 CPU and attached accelerators such as GPUs. The University’s first supercomputer, “Pegasus,” an IBM iDataPlex system, was ranked number 389 on the November 2012 TOP500 Supercomputer Sites list.

 

TRITON Specs

  • IBM POWER9 / NVIDIA Volta compute – 6 racks
  • IBM declustered storage – 2 racks
  • 96 IBM POWER9 servers
  • 30 TB RAM (256 GB/node)
  • 1.2 PFlops double precision
  • 240 TFlops deep learning
  • 64-bit scalar
  • 100 GB/sec storage bandwidth
  • 150 TB shared flash storage
  • 400 TB shared home storage
  • 2 × 1.99 TB local SSD storage

 

AVAILABLE TOOLS & SERVICES

Machine Learning | Deep Learning

  • IBM PowerAI Vision: PowerAI Vision provides intuitive tools to label, train, and deploy deep learning models for computer vision without coding or deep learning expertise.
    • Deep Learning and Machine Learning Frameworks (a minimal usage sketch follows this list)
      • BVLC Caffe and IBM Caffe: Used for projects in vision, speech, and multimedia (2.0 brings a focus to mobile and low-power compute). As its full name, “Convolutional Architecture for Fast Feature Embedding,” indicates, Caffe is fast and well suited to conventional CNN applications.
      • TensorFlow: Designed to provide an end-to-end platform for production and scalability. The optimized version includes the visualization tool TensorBoard and the easy-to-use Keras API.
      • PyTorch: One of the newer deep learning frameworks, developed at Facebook and open sourced in 2017. It is known for simplicity and flexibility with dynamic tensor computations while maintaining efficiency, and it is well suited to rapid prototyping and research for most use cases.
      • Keras (tensorflow-keras)
      • SnapML: Snap Machine Learning is an efficient, scalable machine learning library for training linear models, used in fields such as finance and government. It was developed at IBM Research Zurich specifically to address scaling issues in commercial clouds.
    • Distributed Deep Learning (DDL)
      • Distributes the training of models across many servers and GPUs
  • Large Model Support (LMS)
    • Larger data sets
    • More complex models
  • Watson APIs: Access to Watson APIs focused in Finance, Governance and Healthcare. These APIs enable developers to access the full power of Watson from a local installation.
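
As a rough illustration of what using one of these frameworks looks like, the sketch below trains a tiny Keras (TensorFlow) model on synthetic data. This is a generic framework example, not a Triton-specific recipe: module loading, job submission, and the PowerAI DDL/LMS options are configured separately and are not shown here, and the data and model are placeholders.

```python
# Minimal sketch: training a small Keras model on synthetic data.
# Generic TensorFlow/Keras usage only; cluster-specific environment setup is not shown.
import numpy as np
import tensorflow as tf

# Synthetic data standing in for a real dataset (1024 samples, 32 features, 10 classes).
x_train = np.random.rand(1024, 32).astype("float32")
y_train = np.random.randint(0, 10, size=(1024,))

# A small fully connected classifier.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# TensorFlow uses available GPUs automatically; on Triton these would be the
# NVIDIA Volta accelerators attached to the POWER9 hosts.
model.fit(x_train, y_train, batch_size=64, epochs=2)
```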

 

Data Engineering Services

  • Interactive Analytics (Familiar tools at scale – AutoHPC)
    • Jupyter Notebooks / JupyterLab available
      • Python, R, Julia, and Erlang environments
    • RStudio
    • MATLAB and many more
  • Fully Secure Data Services (Blockchain Attestation Available)
    • IBM Mainframe Hardware Encryption (Exceeds US DoD standards)
    • Extract, transform, and load (ETL) your data onto the supercomputer graphically
    • Example Data Layers supported: Apache Drill, Storm, Spark; MQTT
  • Integrated Data Presentation/Persistence Services
    • Common Data Access Services Supported: HTTP(S), FTP, SSH, bbcp, Aspera, etc. (a minimal transfer sketch follows this list)
    • Common Web Platforms/Stacks Supported: LAMP, MEAN, Django, Node.js, React, etc.
    • Common Data Stores: PostgreSQL (commercial support available), MongoDB, MariaDB/MySQL, Redis
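
To make the data-access services concrete, here is a minimal sketch of moving a file to the cluster over SSH/SFTP using Python’s paramiko library. The hostname, username, and paths are hypothetical placeholders, and the actual transfer endpoints and authentication method should be confirmed with the Advanced Computing team; bbcp or Aspera would typically be preferred for very large transfers.

```python
# Minimal sketch: uploading a data file over SSH/SFTP with paramiko.
# Hostname, username, and file paths below are hypothetical placeholders.
import paramiko

HOST = "triton.example.miami.edu"   # hypothetical login node
USER = "jdoe"                       # hypothetical username

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(HOST, username=USER)  # assumes SSH key-based authentication is set up

# Copy a local dataset into the user's home directory on the cluster.
sftp = client.open_sftp()
sftp.put("experiment_data.csv", "/home/jdoe/experiment_data.csv")
sftp.close()
client.close()
```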

Data Pipeline Systems

  • Faster, cheaper & more reliable than cloud technologies
  • Flexible and secure system architecture
  • User interfaces on two Rockhopper II mainframes
    • Hardware encryption
  • Data stored on IBM Elastic Storage Server (ESS) parallel systems
    • De-clustered RAID
  • AI Training and Inference on Triton
  • Centralized data design
    • Multiple systems, same data
    • No tiering costs
    • No transaction/transfer cost
  • Globally Accessible Network
    • Worldwide Access
    • 100 Gb/sec Internet2 (I2)
    • 10 Gb/sec Internet

 

[Figure: IBM PowerAI overview slide]

DATA STORAGE

IDSC Advanced Computing offers an integrated storage environment for both structured (relational) and unstructured (flat-file) data. These systems are specifically tuned for IDSC’s data types and application requirements, whether access is serial or highly parallelized. Each investigator or group has access to its own storage area and can present its data through a service-oriented architecture (SOA) model. Researchers can share their data via access control lists (ACLs), which ensure data integrity and security while allowing flexibility for collaboration.

IDSC offers structured data services through the most common database platforms, including PostgreSQL, MariaDB, and MongoDB; a minimal connection sketch follows. Investigators and project teams can access their space through an SOA and utilize their resources with the support of an integrated backend infrastructure.
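
As a rough illustration of using one of these structured data services, the sketch below connects to a PostgreSQL database from Python with psycopg2. The host, database name, credentials, and table are hypothetical placeholders; the actual connection details are provisioned per project by IDSC.

```python
# Minimal sketch: querying a project PostgreSQL database with psycopg2.
# Host, database, user, password, and table names are hypothetical placeholders.
import psycopg2

conn = psycopg2.connect(
    host="db.idsc.example.miami.edu",  # hypothetical database host
    dbname="myproject",
    user="jdoe",
    password="********",
)

# Run a simple query and print the results.
with conn.cursor() as cur:
    cur.execute("SELECT sample_id, value FROM measurements LIMIT 10;")
    for sample_id, value in cur.fetchall():
        print(sample_id, value)

conn.close()
```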

 

SERVICES

Big Data and Data Analytics Services offers consultation for big data and bioinformatics analytics projects. This includes consultation on incorporating appropriate computational support, such as software licenses and analytical personnel, into grant proposals; development of prototypes or initial analyses in preparation for proposal submission; and expert support and grant writing for proposal submission.

IDSC Advanced Computing provides the academic and research communities with comprehensive advanced computing resources ranging from hardware infrastructure to expertise in designing and implementing high-performance solutions. Available services include:

    • Advanced Compute Processing: Service includes initial consultation for services and account setup for use of the supercomputing environment. Each advanced compute service unit includes a 200 GB home directory, 2 GB of RAM per CPU hour, account maintenance, and end-user support. End users are to delete data from scratch space within 24 hours of job completion to avoid additional charges. Users requiring assistance with extended temporary storage solutions are asked to consult with the Advanced Computing Team prior to starting any job.
    • High Performance Storage: Service includes initial consultation for services and setup of Windows, Linux, and OS X based storage for connectivity to the supercomputing environment. Each High Performance Storage unit includes hardware, installation, systems administration, storage maintenance, account maintenance, and end-user support.
    • Advanced Computing Consulting: Service includes consultation for the design and implementation of high-performance solutions, scientific programming, parallel code profiling, and code optimization.