Triton


The University of Miami maintains one of the largest centralized academic cyberinfrastructures in the country. Since 2007, the core has grown from no advanced computing cyberinfrastructure to a regional high-performance computing environment that currently supports more than 500 users, 240 TFlops of computational power, and more than 3 petabytes of disk storage.

The University’s latest supercomputer acquisition, TRITON (rated one of the Top 5 academic-institution supercomputers in the U.S. for 2019), is UM’s first GPU-accelerated high-performance computing (HPC) system and represents a completely new approach to computational and data science for all University campuses. Built on IBM Power Systems AC922 servers, the system is designed to maximize data movement between the IBM POWER9 CPU and attached accelerators such as GPUs. The University’s first supercomputer, “Pegasus,” an IBM iDataPlex system, was ranked number 389 on the November 2012 TOP500 Supercomputer Sites list.

 

TRITON Specs

  • IBM POWER9 / NVIDIA Volta compute – 6 racks
  • IBM declustered storage – 2 racks
  • 96 IBM POWER9 servers
  • 30 TB RAM (256 GB/node)
  • 1.2 PFlops double precision
  • 240 TFlops deep learning
  • 64-bit scalar
  • 100 GB/sec storage bandwidth
  • 150 TB shared flash storage
  • 400 TB shared home storage
  • 2 × 1.99 TB local SSD storage

 

AVAILABLE TOOLS & SERVICES

Machine Learning | Deep Learning

  • IBM PowerAI Vision: PowerAI Vision provides intuitive tools to label, train, and deploy deep learning models for computer vision without coding or deep learning expertise.
    • Deep Learning and Machine Learning Frameworks (a minimal usage sketch follows this list)
      • BVLC Caffe and IBM Caffe: Used for projects in vision, speech, and multimedia (2.0 brings a focus to mobile and low-power compute). As its full name, “Convolutional Architecture for Fast Feature Embedding,” indicates, Caffe is fast and well suited to conventional CNN applications.
      • TensorFlow: Designed to provide an end-to-end platform for production and scalability. The optimized version includes the visualization tool TensorBoard and the easy-to-use Keras API.
      • PyTorch: One of the newer deep learning frameworks, developed at Facebook and open sourced in 2017. It is known for simplicity and flexibility with dynamic tensor computations while maintaining efficiency, and it is well suited to rapid prototyping and research for most use cases.
      • Keras (tensorflow-keras)
      • SnapML: Snap Machine Learning is an efficient, scalable machine learning library for training linear models, used in fields such as finance and government. It was developed at IBM Research Zurich specifically to address scaling issues in commercial clouds.
    • Distributed Deep Learning (DDL)
      • Distributes the training of models across many servers and GPUs
  • Large Model Support (LMS)
    • Larger data sets
    • More complex models
  • Watson APIs: Access to Watson APIs focused in Finance, Governance and Healthcare. These APIs enable developers to access the full power of Watson from a local installation.
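
As a rough illustration of what using one of these frameworks looks like, the sketch below trains a tiny Keras (TensorFlow) model on synthetic data. This is a generic framework example, not a Triton-specific recipe: module loading, job submission, and the PowerAI DDL/LMS options are configured separately and are not shown here, and the data and model are placeholders.

```python
# Minimal sketch: training a small Keras model on synthetic data.
# Generic TensorFlow/Keras usage only; cluster-specific environment setup is not shown.
import numpy as np
import tensorflow as tf

# Synthetic data standing in for a real dataset (1024 samples, 32 features, 10 classes).
x_train = np.random.rand(1024, 32).astype("float32")
y_train = np.random.randint(0, 10, size=(1024,))

# A small fully connected classifier.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# TensorFlow uses available GPUs automatically; on Triton these would be the
# NVIDIA Volta accelerators attached to the POWER9 hosts.
model.fit(x_train, y_train, batch_size=64, epochs=2)
```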

 

Data Engineering Services

  • Interactive Analytics (Familiar tools at scale – AutoHPC)
    • Jupyter Notebooks / JupyterLab available
      • Python, R, Julia, and Erlang environments
    • RStudio
    • MATLAB and many more
  • Fully Secure Data Services (Blockchain Attestation Available)
    • IBM Mainframe Hardware Encryption (Exceeds US DoD standards)
    • Extract, transform, and load (ETL) your data onto the supercomputer graphically
    • Example Data Layers supported: Apache Drill, Storm, Spark; MQTT
  • Integrated Data Presentation/Persistence Services
    • Common Data Access Services Supported: HTTP(S), FTP, SSH, bbcp, Aspera, etc. (a minimal transfer sketch follows this list)
    • Common Web Platforms/Stacks Supported: LAMP, MEAN, Django, Node.js, React, etc.
    • Common Data Stores: PostgreSQL (commercial support available), MongoDB, MariaDB/MySQL, Redis
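
To make the data-access services concrete, here is a minimal sketch of moving a file to the cluster over SSH/SFTP using Python’s paramiko library. The hostname, username, and paths are hypothetical placeholders, and the actual transfer endpoints and authentication method should be confirmed with the Advanced Computing team; bbcp or Aspera would typically be preferred for very large transfers.

```python
# Minimal sketch: uploading a data file over SSH/SFTP with paramiko.
# Hostname, username, and file paths below are hypothetical placeholders.
import paramiko

HOST = "triton.example.miami.edu"   # hypothetical login node
USER = "jdoe"                       # hypothetical username

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(HOST, username=USER)  # assumes SSH key-based authentication is set up

# Copy a local dataset into the user's home directory on the cluster.
sftp = client.open_sftp()
sftp.put("experiment_data.csv", "/home/jdoe/experiment_data.csv")
sftp.close()
client.close()
```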

Data Pipeline Systems

  • Faster, cheaper & more reliable than cloud technologies
  • Flexible and secure system architecture
  • User interfaces on two Rockhopper II mainframes
    • Hardware encryption
  • Data stored on IBM Elastic Storage Server (ESS) parallel systems
    • De-clustered RAID
  • AI Training and Inference on Triton
  • Centralized data design
    • Multiple systems, same data
    • No tiering costs
    • No transaction/transfer cost
  • Globally Accessible Network
    • Worldwide Access
    • 100 Gb/sec Internet2 (I2)
    • 10 Gb/sec Internet

 

[Figure: IBM PowerAI overview slide]

DATA STORAGE

IDSC Advanced Computing offers an integrated storage environment for both structured (relational) and unstructured (flat-file) data. These systems are specifically tuned for IDSC’s data types and application requirements, whether access is serial or highly parallelized. Each investigator or group has access to its own storage area and can present its data through a service-oriented architecture (SOA) model. Researchers can share their data via access control lists (ACLs), which ensure data integrity and security while allowing flexibility for collaboration.

IDSC offers structured data services through the most common database platforms, including PostgreSQL, MariaDB, and MongoDB; a minimal connection sketch follows. Investigators and project teams can access their space through an SOA and utilize their resources with the support of an integrated backend infrastructure.
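
As a rough illustration of using one of these structured data services, the sketch below connects to a PostgreSQL database from Python with psycopg2. The host, database name, credentials, and table are hypothetical placeholders; the actual connection details are provisioned per project by IDSC.

```python
# Minimal sketch: querying a project PostgreSQL database with psycopg2.
# Host, database, user, password, and table names are hypothetical placeholders.
import psycopg2

conn = psycopg2.connect(
    host="db.idsc.example.miami.edu",  # hypothetical database host
    dbname="myproject",
    user="jdoe",
    password="********",
)

# Run a simple query and print the results.
with conn.cursor() as cur:
    cur.execute("SELECT sample_id, value FROM measurements LIMIT 10;")
    for sample_id, value in cur.fetchall():
        print(sample_id, value)

conn.close()
```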

 

SERVICES

Big Data and Data Analytics Services offers consultation for big data and bioinformatics analytics projects. This includes consultation on incorporating appropriate computational support, such as software licenses and analytical personnel, into grant proposals; development of prototypes or initial analyses in preparation for proposal submission; and expert support and grant writing for proposal submission.

IDSC Advanced Computing provides the academic and research communities with comprehensive advanced computing resources ranging from hardware infrastructure to expertise in designing and implementing high-performance solutions. Available services include:

    • Advanced Compute Processing: Service includes initial consultation for services and account setup for use of the supercomputing environment. Each advanced compute service unit includes a 200 GB home directory, 2 GB of RAM per CPU hour, account maintenance, and end-user support. End users are to delete data from scratch space within 24 hours of job completion to avoid additional charges. Users requiring assistance with extended temporary storage solutions are asked to consult with the Advanced Computing Team prior to starting any job.
    • High Performance Storage: Service includes initial consultation for services and setup of Windows, Linux, and OS X based storage for connectivity to the supercomputing environment. Each High Performance Storage unit includes hardware, installation, systems administration, storage maintenance, account maintenance, and end-user support.
    • Advanced Computing Consulting: Service includes consultation for the design and implementation of high-performance solutions, scientific programming, parallel code profiling, and code optimization.