Host: mrt-lx1

Operation Guide (last review 2019.11.14)

Audience: Telescope operators should be familiar with this section.

Current Problems

15-03-04 (WB): mrt-lx1 reboot problem

cd /etc
bash -v rc1.local

# or
cd /etc/init.d
./rpcbind status
./nfs-kernel-server status
./isc-dhcp-server status
# if any of these services is not running, use the start option to start it, e.g.:
./nfs-kernel-server start
./tcs start
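
The manual status checks above could also be scripted. The following is only a minimal sketch, assuming that each init script returns exit code 0 from `status` when its service is running (the usual LSB convention, not verified for these scripts on mrt-lx1). The function names and the `runner` parameter are hypothetical and exist only to make the sketch testable without touching real services.

```python
import subprocess

INIT_DIR = "/etc/init.d"
# Services checked in the manual procedure above.
SERVICES = ["rpcbind", "nfs-kernel-server", "isc-dhcp-server"]


def run_init(service, action, runner=subprocess.call):
    """Run /etc/init.d/<service> <action> and return its exit code."""
    return runner([f"{INIT_DIR}/{service}", action])


def ensure_running(services=SERVICES, runner=subprocess.call):
    """Start every service whose status check fails; return those started.

    Assumption: a non-zero exit code from `status` means "not running".
    """
    started = []
    for service in services:
        if run_init(service, "status", runner) != 0:
            run_init(service, "start", runner)
            started.append(service)
    return started
```

Calling `ensure_running()` as root on mrt-lx1 would run the real init scripts; passing a custom `runner` allows a dry run first.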

Old Problems

Operation Notes

Reboot of mrt-lx1

Host mrt-lx1 is the central server for many other computers. Avoid reboots as much as possible. Contact the computer group staff before rebooting.

Before rebooting

The telescope control has to be stopped. Inform the local users that mrt-lx1 will be rebooted.

After rebooting

This list might not be complete.

Hardware Fallback

In case of a hardware failure that prevents mrt-lx1 from running:

Technical Guide

Audience: This section is mainly for the Computer Group Staff.

Overview

mrt-lx1 runs essential programs for the NCS. The programs are

Debian Updates

Debian updates shall only be done during maintenance and should be followed by a reboot.

NCS

Processes started during boot

NCS processes are started:

Icinga checks:

PROCESSES = [("Xvnc4", 1, INFINITE),
             ("wilmad", 2, 2),
             ("vespad", 2, 2),
             ("/usr/sbin/elvind", 1, 1),
             ("syncMessagesToElvin.py", 1, INFINITE),
             ("swCS.py", 1, 1),
             ("superviseReceiver.py", 1, INFINITE),
             ("superviseAntMD.py", 1, INFINITE),
             ("spoolMonitor.py", 1, INFINITE),
             ("scanInfoMessages.py", 1, INFINITE),
             ("rxCS.py", 1, 1),
             ("queueManager.py", 1, INFINITE),
             ("python /ncsServer/mrt/ncs/tools/telescopeStatus", 1, INFINITE),
             ("/ncsServer/mrt/ncs/bin/tauToDB.py", 1, INFINITE),
             ("ncsMonitorMessagesToElvin.py", 1, 1),
             ("ncsMessageMonitor.py", 1, INFINITE),
             ("monitorMessagesToTelescopeStatusDB.py", 1, INFINITE),
             ("messagesToSounds.py", 1, INFINITE),
             ("logMessagesPerScan.py", 1, INFINITE),
             ("linkSwitcher.py", 1, INFINITE),
             ("ftsd", 2, 2),
             ("emirWithMicrosourceSynthesizer.py", 1, INFINITE),
             ("dpCS.py", 1, 1),
             ("dbMessages1.py", 1, INFINITE),
             ("coCS.py", 1, 1),
             ("checkIcinga.py", 1, INFINITE),
             ("mysqld", 1, INFINITE),
             ("antennaWebPage.py", 1, INFINITE),
             ("antennatraced", 1, INFINITE),
             ("antCS.py", 1, 1),
             ("datacollector", 1, 1),
             ]
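
How such a list might be evaluated is sketched below. It assumes that each tuple means (command-line pattern, minimum count, maximum count) and that INFINITE stands for "no upper limit"; this reading is inferred from the list itself, not taken from the actual Icinga plugin. The function name and the sample command lines are hypothetical.

```python
INFINITE = float("inf")  # assumption: INFINITE means "no upper limit"


def check_processes(expected, command_lines):
    """Return (pattern, count, low, high) for every pattern out of range.

    expected:      list of (pattern, low, high) tuples as in PROCESSES
    command_lines: command lines of the running processes, one per process
    """
    problems = []
    for pattern, low, high in expected:
        # Count the processes whose command line contains the pattern.
        count = sum(1 for line in command_lines if pattern in line)
        if not low <= count <= high:
            problems.append((pattern, count, low, high))
    return problems


# Hypothetical snapshot of running processes (paths invented for the example).
expected = [("ftsd", 2, 2), ("rxCS.py", 1, 1), ("mysqld", 1, INFINITE)]
running = [
    "/usr/local/bin/ftsd",
    "python rxCS.py",
    "/usr/sbin/mysqld",
    "/usr/sbin/mysqld",
]
print(check_processes(expected, running))  # [('ftsd', 1, 2, 2)]
```

Here only one ftsd process is found where exactly two are expected, so ftsd is the only entry reported.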

/etc/init.d/tcs.d

To check whether the processes are running, run as root on mrt-lx1:

mrt-lx1:/usr/lib/nagios/plugins# /etc/init.d/tcs status
---------- Status tcs ----------------------------------------
Checking for Elvin daemon: elvind                                    [ OK ]
Checking for forwardMrtState: forwardMrtState                        [ OK ]
Checking for tcsLogServer: tcsLogServer                              [ OK ]
Checking for messagesToElvin: messagesToElvin                        [ OK ]
Checking for ncsMonitorMessagesToElvin: ncsMonitorMessagesToElvin    [ OK ]
/ncsServer/mrt/ncs/var/proc/ncsAlarmMessages.pid
Checking for ncsAlarmMessages: ncsMessageMonitor                     [ OK ]
logFile: /ncsServer/mrt/ncs/var/log/ncsLogHistory.log
Checking for ncsLogHistory: ncsMessageMonitor                        [ OK ]
Checking for wobblerServer: wobblerServer                            [ OK ]
Checking for ncsMonitorMessages: ncsMonitorMessages                  [ OK ]
Checking for tauToDB: tauToDB                                        [ OK ]
Checking for antennaWebPage: antennaWebPage                          [ OK ]
Checking for ncsMessagesToDB: ncsMessagesToDB                        [ OK ]

If a process does not show OK, try to start it:

cd /etc/init.d/tcs.d
# example:
Proc=tauToDB
./$Proc start
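
To spot entries that are not OK in a long status listing, the output can be filtered. This is only a sketch, assuming the line format shown in the listing above; the `FAILED` label in the sample is hypothetical, since the text printed for a failing process is not shown here. Any state other than OK is reported.

```python
import re

# Matches lines such as "Checking for tauToDB: tauToDB      [ OK ]".
STATUS_LINE = re.compile(
    r"^Checking for (?P<name>[^:]+):\s*\S+\s+\[\s*(?P<state>[^\]]*?)\s*\]\s*$"
)


def failed_checks(status_output):
    """Return the names of all 'Checking for ...' entries whose state is not OK."""
    failed = []
    for line in status_output.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match and match.group("state") != "OK":
            failed.append(match.group("name"))
    return failed


# Hypothetical excerpt; lines that are not status lines are ignored.
sample = """\
Checking for wobblerServer: wobblerServer                            [ OK ]
logFile: /ncsServer/mrt/ncs/var/log/ncsLogHistory.log
Checking for tauToDB: tauToDB                                        [ FAILED ]
"""
print(failed_checks(sample))  # ['tauToDB']
```

The reported names are the labels printed by the status command and need not match a script name under tcs.d (for example "Elvin daemon").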

To reset all processes, do:

cd /etc/init.d/
./tcs stop
sleep 10    # wait 10 s before starting again
./tcs start

Then do goOnce and go.

For software documentation, see https://mrt-lx1.iram.es/ncs/doc. For example, emirWithMicrosourceSynthesizer.py is covered at https://mrt-lx1.iram.es/ncs/doc/rxCS

Maintenance

Filesystem checks

Others

During maintenance do not forget to run:

/usr/sbin/logrotate -v /etc/logrotate.conf

PLEASE BE SURE THAT NO OBSERVATION IS RUNNING

Change from eth0 to eth1

We have had problems on mrt-lx1 that might have been related to eth0. eth1 can be used instead. To change from eth0 to eth1, change the link interfaces in /etc/network.

Overview

See also the last snapshot: http://mrt-lx1.iram.es/mrt/ncs/doc/hostInfos/mrt-lx1Info.html

Logbook

Hardware Modifications

Others

System Configuration Modifications

what

last checked

issues

notes

second IP address 192.168.224.59

cron.daily

/etc/cron.daily/logrotate file removed to avoid rsyslog restarts, because they affect the observation

What should be working on this computer (Checklist after an upgrade or reinstallation)

Please be aware that this section has not been checked against a real upgrade. It has to be completed/corrected during the next upgrade of the computer. rmm, 2009/03/27

NFS server and client with automount

NIS master server

apache server

Icinga client

IDL license server and software

mysql server and databases

Project Accounts Management

CUPS normal client

/etc/hosts.allow /etc/hosts.deny

/etc/inetd.conf

/etc/rsyncd.conf

syslog and logrotate configuration

/etc/network/interfaces

crontab

Page Info

mrt-lx1 (last edited 2019-11-14 12:03:33 by mellado)