Friday, May 27, 2011

Wer den Schaden hat spottet jeder Beschreibung

(there is a picture of my gold stone beads at the bottom, if you don't want to read through the rest)

This saying is what came to mind at the end of the day. I don't know whether it is used outside my (German) family. It is a contraction of "Wer den Schaden hat braucht fuer den Spott nicht zu sorgen" (for which I can't think of a translation) and "das spottet jeder Beschreibung" (which amounts to, basically, "truth is stranger than fiction", though a more literal translation would be "this defies description").

The day started with me realizing at about 5:30 that the orange cat was sleeping in the priority mail box that my pewter beads are in. So I was up relatively early, started the coffee and checked my email. There were 223 messages in my Inbox. There had been about 90 when I went to sleep. Turns out, about 85 of them were water level alarms from one of our telescopes (JCMT). We know that the water level meter is oscillating (this is why I already changed the distribution list for those emails to go to only me rather than all the people who normally get them), and looking at the actual level I wasn't surprised: 30% is where it changes from minor alarms to major alarms and I get an email each time it changes from above to below or back. Given that it was still early I got some other stuff done like dishes and some coffee and other seed processing before going to work. I just knew I wasn't going to get out of there early.
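A minimal sketch of why a meter wobbling around a threshold produces so many emails. It assumes the alarm logic is a plain comparison against 30% with one email per change of state; the actual JCMT alarm system isn't described here, so the function names, the readings, and the choice of "below 30% means major" are all made up for illustration.

```python
THRESHOLD = 30.0  # percent; the minor/major boundary mentioned above


def alarm_level(reading, threshold=THRESHOLD):
    """Classify a reading. Assumption: below the threshold is 'major'."""
    return "major" if reading < threshold else "minor"


def count_alarm_emails(readings, threshold=THRESHOLD):
    """Count state changes, assuming one email is sent per change."""
    emails = 0
    previous = None
    for reading in readings:
        level = alarm_level(reading, threshold)
        if previous is not None and level != previous:
            emails += 1
        previous = level
    return emails


# Hypothetical readings oscillating by a fraction of a percent around 30%.
readings = [30.4, 29.8, 30.1, 29.9, 30.2, 29.7]
print(count_alarm_emails(readings))  # 5: every reading crosses the boundary
```

With a comparison like this, a level that sits right at the boundary sends an email on nearly every reading, even though the water itself has barely moved.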

Getting to work I found more water alarm emails, then responded to some faults, and went about my normal Friday chores of checking on disk space at both telescopes and the general state of computers at UKIRT (I don't really know enough about the state things should be in at JCMT, I'm working on it).

In my other email I found a link about olivine found in outer space which I posted to Facebook and won't repeat here ... (I did get through all my email eventually, deleting about another 100 water alarms in the process)

As our (UKIRT) data archiving student helper (who is very good) is going to be away next week I made sure that I know how to do his work while he is away (I just took this over recently) and went about checking the data tape he gave me. Much to my surprise I found that the data that are on that tape were missing their counterparts on disk at the telescope (and then some that hadn't been written to tape yet - they are by now, did I mention he's good?). So, as we have those data in Hilo I started copying them back up to the summit. We have a policy that we want data to be in at least two places at all times, and while for all I know all the data had already been transferred to the archive in Cambridge (UK), I still wanted 2 copies here.
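The tape-versus-disk check boils down to comparing two file listings. Here is a rough sketch, assuming both listings are available as plain lists of file names; the function names and the file names are hypothetical, and this is not the actual procedure we use.

```python
def missing_on_disk(tape_files, disk_files):
    """Return files that are on tape but have no counterpart on disk."""
    return sorted(set(tape_files) - set(disk_files))


def not_yet_on_tape(tape_files, disk_files):
    """Return files that are on disk but have not been written to tape."""
    return sorted(set(disk_files) - set(tape_files))


# Hypothetical listings, for illustration only.
tape = ["obs_001.dat", "obs_002.dat", "obs_003.dat"]
disk = ["obs_003.dat", "obs_004.dat"]

print(missing_on_disk(tape, disk))   # ['obs_001.dat', 'obs_002.dat']
print(not_yet_on_tape(tape, disk))   # ['obs_004.dat']
```

Anything in the first list needs to be copied back to restore the second copy; anything in the second list still needs to go to tape.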

As I had to reboot one of our WFCAM machines during the day I took some test data in between as well and everything appeared to be working. Went about some more work.

Then, as our telescope system specialist came in, ran up the system and started taking data, something failed. He showed it to me and explained what he had already done. I told him to let me have a go at it, and I found the machine in more of a state than I can remember seeing before. There were processes I couldn't kill, positively couldn't (kill -9 as root). I logged into the console and tried to shut down and reboot the machine. It wouldn't go down. We have (for good reason - these computers are 30 miles away) networked power switches that all these computers are plugged into, and so I power cycled it. Linux (or any Unix) computers don't like to be power cycled, so it insisted on checking one of its file systems (luckily a small one) on the way back up, rebooted itself again, and mostly ran up, but then something failed. I logged into it again (directly via network) and found the data disk missing. At that point I asked the telescope guy to call our Linux (and more general) system administrator, who got the disk back, but only one of the two mirrors (we have those disks mirrored for added data safety). It was only a bit over 9 hours from when I got there until I got back out, but if I didn't have my boss backing me up, with previous administrators I would have been in trouble for getting there as late as I did. On paper our flex hours for getting there end at 9 a.m. I got there at 10:30 yesterday and at 9:45 today. I know why.

So much for that. I'm sorta toast and haven't checked my email again since I got home. I bet there are more water alarms. I know there is a fault I have to respond to (about the computer problem tonight, and I'm not going to let that sit until Tuesday).
