The Computer Production Group (CPG), in collaboration with the Network Engineering and Telecommunications Section (NETS), has traditionally monitored the status of network-attached hosts throughout NCAR/UCAR. As part of this courtesy service, monitoring is limited to ten hosts per division / program. UCAR General Purpose hosts (e.g., web servers, auth servers, and VPN servers) are not subject to this limit. CPG will contact the appropriate system administrators when our monitoring systems detect a problem.
To manage this process, CPG has adopted the following definitions of responsibilities and severities as the host monitoring policy.
CPG
Upon detecting a problem with a divisional host, CPG will make a reasonable effort to notify the appropriate staff. This reasonable effort will consist of the following steps:
- E-mail the first contact person or telephone their designated contact number (using the urgent option for voice mail).
- If the first contact person has not responded within 15 minutes, the second person on the contact list will be contacted.
- If the second contact person has not responded within 15 minutes, the third and final person on the list will be contacted.
- If the third person does not respond, CPG will make no further attempts.
Division / Program Representative
Each division / program that has hosts monitored by CPG will have a designated division / program representative, who may contact CPG at 303-497-1200. That representative will have the following responsibilities:
- Notify CPG of any planned upgrades or outages to monitored hosts. Delegated division / program representatives may also schedule upgrades and outages.
- Notify CPG of any hosts that should be replaced or added with a work request: https://cislcustomersupport.ucar.edu/evj/ExtraView/evSignon
- Provide CPG with the prioritized list of three divisional contacts with a work request: https://cislcustomersupport.ucar.edu/evj/ExtraView/evSignon
- Notify CPG of any changes to the contact list with a work request: https://cislcustomersupport.ucar.edu/evj/ExtraView/evSignon
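The prioritized contact list feeds the notification sequence CPG follows when a divisional host has a problem. As an illustration only, that sequence could be sketched as below. The `Contact` fields, the `escalate` function, and the `notify_and_wait` callback are hypothetical names introduced for this sketch and do not describe CPG's actual tooling. The policy gives no response window for the third contact, so the sketch assumes the same 15 minutes.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

RESPONSE_WINDOW_MINUTES = 15  # from the policy: wait 15 minutes before moving on
MAX_CONTACTS = 3              # from the policy: a prioritized list of three contacts


@dataclass(frozen=True)
class Contact:
    # Hypothetical fields, for illustration only.
    name: str
    email: str
    phone: str


def escalate(
    contacts: Sequence[Contact],
    notify_and_wait: Callable[[Contact, int], bool],
) -> Optional[Contact]:
    """Contact each person in priority order.

    notify_and_wait(contact, minutes) is a caller-supplied callback that
    e-mails or telephones the contact and returns True if they respond
    within the given number of minutes.

    Returns the first contact who responds, or None if nobody does.
    """
    for contact in contacts[:MAX_CONTACTS]:
        # Assumption: the third contact gets the same window as the first two.
        if notify_and_wait(contact, RESPONSE_WINDOW_MINUTES):
            return contact
    # Nobody on the list responded: no further attempts are made.
    return None
```

A caller would pass the division's contact list in priority order along with a callback that performs the actual e-mail or telephone call. Keeping the notification mechanism behind a callback means the ordering logic can be exercised without sending real messages.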
CPG will make every attempt to balance and distribute our monitoring services fairly. Systems such as the Mass Storage System, production supercomputers, and the FRGP are designated Severity 1, the highest severity.
In the event a Severity 1 system requires attention at the same time as a divisional host, the Severity 1 system will be addressed first. The responsible staff will be contacted regarding the divisional host when time permits.
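The ordering rule above, with Severity 1 first and divisional hosts afterwards, can be illustrated with a small priority queue. This is a minimal sketch under stated assumptions: the `AlertQueue` class, the host names in the usage note, and the numeric level `SEVERITY_DIVISIONAL = 2` are hypothetical. The policy names only Severity 1 and does not number the divisional level.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count

SEVERITY_1 = 1           # highest severity, per the policy
SEVERITY_DIVISIONAL = 2  # assumed level for divisional hosts (not numbered in the policy)


@dataclass(order=True)
class Alert:
    severity: int                    # lower number is handled first
    sequence: int                    # tie-breaker: earlier alerts first
    host: str = field(compare=False)


class AlertQueue:
    """Hands out alerts with Severity 1 systems ahead of divisional hosts."""

    def __init__(self) -> None:
        self._heap: list[Alert] = []
        self._counter = count()

    def push(self, host: str, severity: int) -> None:
        heapq.heappush(self._heap, Alert(severity, next(self._counter), host))

    def pop(self) -> str:
        """Return the host that should be addressed next."""
        return heapq.heappop(self._heap).host

    def __len__(self) -> int:
        return len(self._heap)
```

For example, if a divisional host (say, a hypothetical `division-web-01`) is pushed first and a Severity 1 system is pushed immediately afterwards, `pop()` returns the Severity 1 system first and the divisional host on the next call.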
This policy will be reviewed annually.