Using the processes facility, you can test for the existence of processes, signal (kill) processes and optionally restart them. Cfengine opens a pipe from the system ps command and uses regular expressions to search the lines of its output. The regular expression does not have to match the whole process line, only a substring of it. The form of a process command is:
processes:
"quoted regular expression"
restart "shell command"
useshell=true/false/dumb
owner=restart-uid
group=restart-gid
chroot=directory
chdir=directory
umask=mask
signal=signal name
matches=number
define=classlist
action=signal/do/warn/bymatch
include=literal
exclude=literal
syslog=true/on/false/off
inform=true/on/false/off
SetOptionString "quoted option string"
Use the SetOptionString command to redefine
the option string passed to ps. Cfengine assumes only that the first
identifiable number on each line is the process identifier, so you must not
choose options for ps which change this basic requirement (this is not a
problem in practice). Cfengine normally reads the output of the ps command
only once and searches through it in memory. The process table is
re-consulted only if SetOptionString is called. The options have the
following meanings:
signal=signal name
hup     1   hang-up
int     2   interrupt
quit    3   quit
ill     4   illegal instruction
trap    5   trace trap
iot     6   iot instruction
emt     7   emt instruction
fpe     8   floating point exception
kill    9   kill signal
bus     10  bus error
segv    11  segmentation fault
sys     12  bad argument to system call
pipe    13  write to non existent pipe
alrm    14  alarm clock
term    15  software termination signal
urg     16  urgent condition on I/O channel
stop    17  stop signal (not from tty)
tstp    18  stop from tty
cont    19  continue
chld    20  to parent on child exit/stop
gttin   21  to readers pgrp upon background tty read
gttou   22  like TTIN for output if (tp->t_local&LTOSTOP)
io      23  input/output possible signal
xcpu    24  exceeded CPU time limit
xfsz    25  exceeded file size limit
vtalrm  26  virtual time alarm
prof    27  profiling time alarm
winch   28  window changed
lost    29  resource lost (eg, record-lock lost)
usr1    30  user defined signal 1
usr2    31  user defined signal 2
Note that cfengine will not attempt to signal or restart processes 0 to 3
on any system since such an attempt could bring down the system. The only
exception is that the hangup (hup) signal may be sent to process 1
(init) which normally forces init to reread its terminal configuration
files.
restart "shell command"
owner=, group=
chroot
chdir
useshell=true/false/dumb
Some programs (like cron) do not handle I/O properly when they fork
their daemon parts; this causes a zombie process and normally
hangs cfengine. If you choose the value `dumb' for this option, cfengine
ignores all output from the program and does not use a startup shell.
This prevents programs like cron from hanging cfengine.
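As an illustration of the dumb setting, a restart rule for cron might be sketched as follows. The path /usr/sbin/cron is an assumption and varies between systems.

```
processes:
   # Hypothetical path; adjust to where cron lives on your system
   "cron" restart "/usr/sbin/cron" useshell=dumb
```

Since no startup shell is used in this mode, it is probably safest not to rely on shell features such as redirection or pipes in the command string.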
matches=number
matches=<6 # warn if the number of matches is greater than or equal to 6
matches=1 # warn if there is not exactly 1 matching process
matches=>2 # warn if there are 2 or fewer matching processes
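For instance, a rule that only checks the process count, without defining any signal, might look like this. The process name sendmail is purely illustrative.

```
processes:
   # Warn unless exactly one line of the ps output matches
   "sendmail" matches=1
```

Because the regular expression only has to match a substring of the process line, any other line containing the text sendmail is counted as well, so the expression may need to be made more specific.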
include=literal
exclude=literal
define=classlist
action=signal/do/warn/bymatch
By default, cfengine sends a signal (if one is defined with the signal
option) to matching processes. This is equivalent to setting the value
of this parameter to signal or do. If you set this option
to warn, cfengine sends no signal, but prints a message
detailing the processes which match the regular expression.
If the option is set to bymatch, then signals are sent
to the processes only if the matches criteria fail.
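A cautious way to try out a new rule, sketched here under the assumption that a daemon called ftpd runs on the host, is to run it first with action=warn:

```
processes:
   # Report the matching processes only; no signal is sent
   "ftpd" signal=kill action=warn
```

Once the reported list of processes looks right, the option can be changed to action=signal so that the kill signal is actually delivered.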
Here is an example script which sends the hang-up signal to cron, forcing it to reread its crontab files:
processes:
"cron" signal=hup
Here is a second example, which may be used to restart the name service on a Solaris system:
processes:
solaris::
"named" signal=kill restart "/usr/sbin/in.named"
A more complex match could be used to look for processes belonging to a particular user. Here is a script which kills ftp related processes belonging to a particular user who is known to spend the whole day FTP-ing files:
control:
actionsequence = ( processes )
#
# Set a kill signal here for convenience
#
sig = ( kill )
#
# Better not find that dumpster here!
#
matches = ( 1 )
processes:
#
# Look for Johnny Mnemonic trying to dump his head, user = jmnemon
#
".*jmnemon.*ftp.*" signal=$(sig) matches=<$(matches) action=signal
# No mercy!
The regular expression .* matches any number of characters, so this command searches for a line containing both the username and something to do with ftp and sends these processes the kill signal. Further examples may be found in the FAQ section (see FAQS and Tips).
You can arrange for signals to be sent only if the number of matches
fails the test. The action=bymatch option is used for this.
For instance, to kill process `XXX' only if the number
of matches is 20 or more, one would write:
processes:
"XXX" matches=<20 action=bymatch signal=kill
See also filters, for more complex searches.
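As a final sketch, the SetOptionString command described at the start of this section might be used as follows to request a different ps listing before matching. The option string "aux" is an assumption; use whatever your system's ps accepts, keeping the process identifier as the first number on each line.

```
processes:
   # Hypothetical option string for a BSD-style ps
   SetOptionString "aux"
   "cron" signal=hup
```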