Course Overview
The HPE Performance Cluster Manager (HPCM) administration course provides knowledge and practice installing HPCM, managing data networks, provisioning servers, creating and modifying server images, working with software repositories and image version control, automating post installation tasks, configuring services, reviewing security features, and troubleshooting.
Assessment methods:
- Pre-training knowledge-check quiz (if applicable)
- Formative assessments during the training, through hands-on lab exercises completed at the end of each module, multiple-choice quizzes, scenario-based exercises…
- Each participant completes a questionnaire and/or self-assessment questionnaire before and after the training to validate the acquisition of skills
Who should attend
- Attend this class if you need to learn to install, configure and administer clusters managed with the HPE Performance Cluster Manager (HPCM)
- Experienced Linux system administrators
Prerequisites
• H8PE8S: HPE Performance Cluster Management Foundations
• The following Linux system administration skills are prerequisites for this course:
- Edit text with the vi editor
- Recognize regular expression syntax
- Access documentation with man and info file viewers
- Monitor, manage and maintain log files
- Enter common commands at the bash command line; create and interpret basic bash shell scripts
- Install and configure standard software components, services and security features
- Configure basic communication protocols that support networked communications
- Create and modify crontabs
- Monitor resource usage; be familiar with basic monitoring tools
- Install and configure a Linux distribution on a server
- Create, modify, and delete user accounts and group accounts
- Partition disks, manage filesystems and logical volumes
- Use RPM package management
- Install and use virtualized systems
- Understand basic hardware and hardware troubleshooting
Course Objectives
At the conclusion of this course, you should be able to:
- Install HPCM
- Add servers to the cluster
- Manage data networks
- Provision nodes
- Create and modify images and software repositories
- Use image version control
- Automate post installation tasks
- Configure shared filesystem, user accounts, applications and updates
- Troubleshoot cluster services
- Review cluster security features
Course Content
Module 1: Install Cluster
- Describe HPCM features
- Define operating system slots
- Build cluster from ground up
- Provision node with GUI
- Provision node with command line
- Add nodes to the cluster
- Explore auto installation tools
Module 2: Discover
- Discover nodes
- Interpret cluster configuration files
- Review cluster services
Module 3: Data Networks
- Describe technologies
- Describe InfiniBand configuration
- Describe Intel Omni-Path configuration
- Describe software components
- Use diagnostic commands
Module 4: Manage Images
- Manage software repositories
- List software repositories
- Add software repositories
- Remove software repositories
- Create repository groups
- Customize an image by using RPM lists
- Create a compute node image
- Create an ICE-compute node image
- Manage image version control
- Check an image into version control
- Compare differences between two versions of an image
- List the versions of an image
- Deploy a specific version of an image
- Push an ICE-compute image to a rack
- Use parallel tools and inbuilt functionality to check differences between nodes
- Install batch scheduler server on a compute node
- Install batch scheduler client on a compute node and on an ICE compute node
- Configure HPCM connectors to job schedulers
- Capture an image from a node (golden)
- Add RPMs to, remove RPMs from, and version control compute images
- Add and remove RPMs from running compute nodes
- Clone an ICE-compute image
- Add RPMs to ICE compute image
- Compare when and when not to use tmpfs root
- Determine which nodes use tmpfs root
- Configure nodes to use tmpfs root
- List tmpfs quota difference (rack leader quotas do not apply when ICE-compute nodes are in tmpfs)
- Set tmpfs mode
- Set disk mode
- Show which mode a node has booted with
- Show which mode a node is scheduled to boot into
Module 5: Automate Post Installation Tasks
- Review conf.d scripts
- Exclude a conf.d script
- Use pre_reconf.sh
- Use reconfig.sh
- Develop post install and per-host customization scripts
Module 6: Configure Shared Filesystem, User Accounts, Applications, and Updates
- NFS export a filesystem on a compute node
- Mount an NFS filesystem and create a user on an ICE compute node
- Manage user accounts
- Synchronize UIDs and GIDs, LDAP, etc.
- Run an application on compute and ICE compute nodes
- Display BIOS settings
- Upgrade firmware
- Update kernel
- Update distribution
- Update HPCM
Module 7: Troubleshoot Cluster
- Back up cluster configuration
- Back up managed network switch configuration
- Use the central log repository
- Investigate log files
- Gather system information
- Interrogate iLOs, BMCs
- Confirm resources
- Create pdsh groups
- Investigate bond devices
- Inspect VLAN devices
- Capture a node crash dump
- Transfer an image from another slot or another system and confirm that the image can be used
- Inject faults
Teaching methods: