A Performance Analysis tool for Unix Massively Parallel Computers
Date of Award
Doctor of Philosophy (PhD)
Graduate School of Computer and Information Sciences
Wilker Shane Bruce
S. Rollins Guild
The objective of this study was to provide a software tool capable of monitoring and optimizing a UNIX massively parallel computer while removing the human operator from the process as far as possible. Because of the complexity of the latest UNIX System V release and the number of parameters that can affect system performance, it was highly desirable for the tool to monitor and tune large parallel systems at intervals of one second or less.
The target system was the Infinity series manufactured by Encore Computer. Some models can have up to 64 nodes or subsystems, each with at least four Motorola 88100 or 88110 processors. Users of large UNIX massively parallel computer systems lack the ability to monitor their systems' health and performance accurately and efficiently. The consequences are most apparent when they try to optimize those systems: being unable to determine which kernel parameters to tune, or to identify the troubled node or subsystem, can lead to wasted effort, time, and money and, in some instances, lost contracts for the computer integrator.
The goal was accomplished by creating software agents that interact with one another and with the local kernels. The agents collect and normalize data, enforce formal and heuristic rules, and present the normalized data graphically within a second. They were designed for efficiency and to minimize their own signature load on the system. Additional functionality included trend and predictive analysis modules. Global views of the system could also be displayed on the system console via RPC data agents.
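The collect, normalize, and rule-check cycle described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: every name here (`Sample`, `Rule`, `normalize`, `evaluate`, `linear_forecast`), the metric name, the value ranges, and the thresholds are assumptions made for illustration. The least-squares forecast is just one plausible stand-in for a trend module, since the abstract does not say which technique was used.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass
class Sample:
    """One raw reading collected from a node (hypothetical structure)."""
    node: int
    metric: str
    value: float


@dataclass
class Rule:
    """A heuristic rule: fire when a normalized metric reaches a threshold."""
    metric: str
    threshold: float
    advice: str


def normalize(value: float, lo: float, hi: float) -> float:
    """Map a raw reading onto [0, 1], clamping out-of-range values."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))


def evaluate(samples: Sequence[Sample],
             ranges: Dict[str, Tuple[float, float]],
             rules: Sequence[Rule]) -> List[str]:
    """Normalize each sample and report every rule it triggers."""
    findings = []
    for s in samples:
        lo, hi = ranges[s.metric]
        level = normalize(s.value, lo, hi)
        for r in rules:
            if r.metric == s.metric and level >= r.threshold:
                findings.append(
                    f"node {s.node}: {s.metric} at {level:.2f}; {r.advice}")
    return findings


def linear_forecast(history: Sequence[float], steps_ahead: int) -> float:
    """Extrapolate equally spaced readings with a least-squares line.

    Assumed technique for illustration only.
    """
    n = len(history)
    if n < 2:
        raise ValueError("need at least two readings")
    mean_x = (n - 1) / 2
    mean_y = sum(history) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(history))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + steps_ahead)
```

In this sketch, an agent would call `evaluate` once per sampling interval with the readings gathered from its node, and `linear_forecast` on a short window of recent normalized values to flag a metric before it crosses a threshold. Normalizing first lets one set of rules apply across nodes whose raw ranges differ.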
Andres A. Folleco. 1998. A Performance Analysis tool for Unix Massively Parallel Computers. Doctoral dissertation. Nova Southeastern University. Retrieved from NSUWorks, Graduate School of Computer and Information Sciences. (522)