Developing and Using Best Practices for Managing HPC/DA Centers
Time: Monday, July 23, 8:30am - 5pm
Description: We propose a tutorial at PEARC18 on using a suite of best practices for managing research-focused High Performance Computing and Data Analysis (HPC/DA) centers.
To achieve their mission and goals, HPC/DA centers continually strive to improve their resources and services to best serve their constituencies. Collectively, the community has learned a great deal about how to manage and operate HPC centers, provide robust and effective services, develop new communities, and address other important aspects of running a center. Yet these best practices are seldom catalogued in a way that informs the broader HPC community. The tutorial topics will include facility design and management, system evaluation and deployment, full service operations, allocations and accounting, job and resource scheduling, storage management, client support (help desks, consulting, advanced support), training and education, cyber-protection, risk management, performance evaluation and regression protection, special considerations for engagement with industrial users, customer satisfaction assessments and surveys, public relations and communication, budget planning, and vendor relations and contracts.
The Blue Waters project has internally documented sets of best practices, which will be shared with the attendees. We will lead the attendees through a process of identifying and documenting lessons learned and best practices and use the processes collectively to develop an even more comprehensive set of best practices. The tutorial will result in a report documenting the best practices collected from among the attendees, which will be published for broad dissemination.