Top Level Shift Instructions
I- When you start your shift
1- Enter the names of the shift crew into the elog.
See here for more details on using the elog.
2- Ask the previous shift crew about the current status and the programme
for your shift.
3- Almost all DAQ control should be done from the
~caliceon/online/
directory on calice00. Find out which terminal window is in use for this.
4- Follow the instructions below as and when they are needed. If the system
is already running and the beam is stable, you will not need to do most of
these.
II- Logging on
1- If you need to log onto the DAQ computers, use the caliceon account.
The main computer to use is flccalice00:
Login : caliceon
Pwd : consult your emails!
calice00> cd online
You will also need to log onto flccalice01, flchcaldaq03 and
flchcalana01 with the same username and password.
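As a reminder of which machines are involved, the logins can be sketched as below. This assumes the machines are reached with ssh, which this guide does not state; the function name daq_login_commands is made up for illustration. It only prints the commands and does not run them.

```shell
#!/bin/sh
# Host names and account are taken from this guide; using ssh is an assumption.
DAQ_HOSTS="flccalice00 flccalice01 flchcaldaq03 flchcalana01"
DAQ_USER="caliceon"

# Print one login command per DAQ machine (nothing is executed).
daq_login_commands() {
    for host in $DAQ_HOSTS; do
        echo "ssh ${DAQ_USER}@${host}"
    done
}
```

Calling daq_login_commands prints four lines, one per machine, which can be pasted into separate terminal windows.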
III- Slow controls
1- Beam controls
2- ECAL power supply controls
3- ECAL movable stage controls
4- AHCAL slow controls
IV- Starting the DAQ program
1- The main DAQ background program is called runner and this executes on
flccalice00. In addition, other programs called emcSktReadout and
ahcSktReadout, connected via sockets, run on
flccalice01 and flchcaldaq03, respectively.
To start these, use
calice00> startUp
on flccalice00 and
calice01> startSkt
on flccalice01 and flchcaldaq03.
See here for more details on starting the DAQ programs.
2- Make an entry in the elog giving the log file names reported by
startUp and startSkt. These log files can be found using
calice00> ls -l data/log/
3- Check the processes have really started using
calice00> ps
and/or
calice00> tail data/log/[Logfilename]
Note that the DAQ is never idle: at startUp it defaults to a slwMonitor run
(which reads slow data only).
This run should be ongoing if everything started correctly.
4- If you need to stop the DAQ programs, then use
calice00> shutDown
The socket programs should stop automatically.
Check they all closed down properly using ps and tail. Report any errors
in the elog.
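The ps and tail checks above can be wrapped in two small helpers. This is only a sketch: the function names are made up, pgrep is assumed to be installed on the DAQ machines, and the process names runner, emcSktReadout and ahcSktReadout are the ones quoted in this guide.

```shell
#!/bin/sh
# Print the most recently modified file in a log directory
# (default: data/log, the directory named in this guide).
latest_log() {
    dir="${1:-data/log}"
    newest=$(ls -t "$dir" 2>/dev/null | head -n 1)
    [ -n "$newest" ] && echo "$dir/$newest"
}

# Succeed if a process with exactly this name is running.
# pgrep is an assumption; reading the output of ps by eye works as well.
is_running() {
    pgrep -x "$1" >/dev/null 2>&1
}
```

For example, on flccalice00 from the online directory: tail "$(latest_log)" shows the end of the newest log file, and is_running runner || echo "runner is not running" flags a failed start. After shutDown the same check should report that runner is no longer running.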
V- Starting a run
1- To start a new run, use
calice00> runStart [options]
See here for more details on starting runs.
2- Check the run has started correctly using the monitoring tools (below).
3- Some runs end themselves after a predefined number of events or a
fixed time. Others will run forever. In all cases, to end a run, use
calice00> runEnd
The DAQ program will revert to a slwMonitor run.
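For runs that do not end themselves, the start/end sequence above can be sketched as a timed wrapper. The wrapper timed_run and the variables RUN_START and RUN_END are hypothetical; only runStart and runEnd come from this guide, and any options passed through must be ones that runStart really accepts.

```shell
#!/bin/sh
# Start a run, wait a fixed number of seconds, then end it.
# RUN_START and RUN_END default to the commands from this guide and can be
# overridden (e.g. with echo) for a dry run. The wrapper itself is hypothetical.
RUN_START="${RUN_START:-runStart}"
RUN_END="${RUN_END:-runEnd}"

timed_run() {
    seconds="$1"
    shift
    # Remaining arguments are passed on to the start command unchanged.
    $RUN_START "$@" || return 1
    sleep "$seconds"
    $RUN_END
}
```

A dry run with RUN_START="echo runStart" and RUN_END="echo runEnd" shows what would be executed without touching the DAQ. Check with the previous shift or the experts before leaving a real run unattended in this way.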
VI- Monitoring a run
1- To print the type of an ongoing run to the screen, use
calice00> currentRun
See here for more details on this.
2- To print the numbers of records processed and event rates to the screen
use
calice00> runMonitor [options]
See here for more details on this.
This printout is refreshed every few seconds indefinitely; to stop
the program, press Ctrl-C in the terminal window.
3- To display the online (immediate) histograms, use
calice00> hstDisplay [options] [HstDisplayName]
See here for more details on this.
Histograms appear in ROOT windows and are updated periodically until the
program is stopped.
To stop the program, press Ctrl-C in the terminal window or in the ROOT window.
4- To display the offline monitoring histograms made from raw data bin files
(including the ongoing run if desired) then use flchcalana01 and do
flchcalana01> ???
5- To run the local LCIO converter on a raw data bin file
and subsequently an analysis job on the output,
then use flchcalana01 and do
flchcalana01> ???
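Since the monitoring programs above run until interrupted, a quick look for a fixed time can be sketched with timeout. This assumes GNU coreutils timeout is available on the DAQ machines; the helper name monitor_for is made up. Note that timeout stops the program with SIGTERM rather than the Ctrl-C (SIGINT) described above, and how the monitoring programs react to that has not been checked.

```shell
#!/bin/sh
# Run a command for at most the given number of seconds.
# Relies on timeout from GNU coreutils (an assumption about the host).
monitor_for() {
    seconds="$1"
    shift
    status=0
    timeout "$seconds" "$@" || status=$?
    # timeout exits with 124 when it had to stop the command itself;
    # for a monitor that never ends on its own this is the normal case.
    if [ "$status" -eq 124 ]; then
        return 0
    fi
    return "$status"
}
```

For example, monitor_for 30 runMonitor would print the record counts and rates for about 30 seconds and then return to the prompt.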
VII- Troubleshooting
1- If something doesn't work, first, don't panic (and try not to get
annoyed!). Second, document as many of the symptoms and diagnostics
as you can in the elog.
2- Try to redo whatever failed, as temporary glitches can occur in networks
and hardware; e.g. shutDown and startUp the DAQ program again, or restart
the run. If this clears the problem, then great. If not, then the problem
is at least reproducible.
3- If things were going smoothly up until now, then think about what might
have changed and tackle that first. Have you just restarted the DAQ program?
If so, did it start correctly? Did you just start a run, and is it the same
type as before or a different one? If different, do you know if the version
you used is what you intended? Is there still beam?
4- If you can, restart a failing run with a higher diagnostic printout
level
calice00> runStart -p 9 [other options]
or even -p 12 to get a lot of printout. This diagnostic printout goes
to the log files. This will help the experts see what is happening.
5- Call the experts; their names should be written on the whiteboard.
Do not start changing the hardware or hacking the software as this
will make solving the problem more difficult.
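When documenting symptoms in the elog, it can help to pull the suspicious lines out of a log file first. The sketch below is read-only and changes nothing in the DAQ. The function name log_problems is made up, and the keywords are a guess rather than the DAQ's actual message format, so an empty result does not prove the log is clean.

```shell
#!/bin/sh
# Print lines (with line numbers) from a log file that mention an error,
# a warning or a failure, ignoring case.
# The keywords are a guess, not the DAQ's actual message format.
log_problems() {
    grep -n -i -E 'error|warn|fail' "$1"
}
```

For example, log_problems data/log/[Logfilename] lists candidate lines to copy into the elog entry, together with their line numbers for the experts.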
VIII- Hardware and software changes
1- On a standard shift, you should rarely need to make hardware or software
changes.
IX- Other documentation
- Anne-Marie's original shift guide (html, doc, pdf)
- Erika's shift guide (ps, pdf)