[snips-users] "ERROR: snips_rrd() empty devicename '' or variable ''" fixed (long)
Hello,

Another SNIPS user, Todd Edmands, and I have found the cause of the
"ERROR: snips_rrd() empty devicename '' or variable ''" message. We
implemented a fix 6 days ago and the problem has not returned. Before the
fix, the error usually came back on my system within about 5 hours. Todd's
system got the error less frequently, approximately weekly or less, so he
is still testing.

The problem occurs when a monitored variable is expired for not having
been updated for longer than the EXPIRE_AGE time and is then updated
again later (because a host returned, or because a temporary variable
such as a MailQDest with a particular destination returned).

The expiration code actually alters the "event" for a monitored variable,
overwriting the record in hostmon-output with a null (empty) event. But
the index for that variable is kept. If the variable later returns, the
same record is used again for this event, but the nulled information is
not rewritten. Hence the empty devicename (and other missing information).

In the code that handles an old data event (one that is not yet old
enough to have expired), age is handled by setting the n_OLDDATA flag in
the event record. This flag is cleared later if the event is updated
before expiration.

It turns out that there is an unused state flag named n_NODISPLAY. This
flag is never set in any of the C or Perl code, though it is checked at
one point in the snipstv code and used to ignore an event. Since this is
close to the behavior we want, we adopted this state flag for expiration.
With this flag, there is no need to overwrite the event with null values.

Changes to hostmon:
-------------------
The new expiration code is copied from the old data code that follows it
and modified to use the n_NODISPLAY flag. The hash %oldage is reused to
indicate an expired event by setting its value to 2. The code which
unsets the n_OLDDATA flag has an additional test added so that it also
unsets the n_NODISPLAY flag.
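The flag handling described above for hostmon can be sketched on its own,
outside SNIPS. This is only an illustration under stated assumptions: the
flag values, the two age thresholds, the marker value 1 for "old data",
and the helper name age_state() are all hypothetical and are not taken
from the SNIPS source. Only the marker value 2 for "expired" and the
set/clear logic follow the description in this message.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical bit values; the real ones are defined by SNIPS.
my $n_OLDDATA   = 0x01;
my $n_NODISPLAY = 0x02;

# Assumed thresholds in seconds, for illustration only.
my ($OLD_AGE, $EXPIRE_AGE) = (300, 3600);

# Given the current state bits, the %oldage-style marker for an item
# (undef, 1 = old, 2 = expired) and the age of its last update, return
# the new state bits and the new marker.
sub age_state {
    my ($state, $marker, $age) = @_;

    if ($age >= $EXPIRE_AGE) {
        # Expired: flag the record instead of blanking it.
        return ($state | $n_NODISPLAY, 2);
    }
    if ($age >= $OLD_AGE) {
        # Old but not expired: mark as old data (marker 1 is an assumption).
        return ($state | $n_OLDDATA, defined $marker ? $marker : 1);
    }
    if (defined $marker) {
        # Fresh update: clear the old-data flag, and the no-display
        # flag too if the item had been expired.
        $state &= ~$n_OLDDATA;
        $state &= ~$n_NODISPLAY if $marker == 2;
    }
    return ($state, undef);
}

# Walk one item through old -> expired -> updated again.
my ($state, $marker) = (0, undef);
for my $age (400, 4000, 10) {
    ($state, $marker) = age_state($state, $marker, $age);
    printf "age=%d state=%d marker=%s\n",
        $age, $state, defined $marker ? $marker : 'undef';
}
```

The point of the design is that the record keeps its device name and
variable name while it is expired, so nothing has to be restored when the
variable comes back; only the two flag bits change.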

Changes to genweb.cgi:
----------------------
The test for expired row data originally checked for an empty device name
and device address. We added a test for the n_NODISPLAY flag, leaving the
original test in place in case there are other causes of empty device
names and addresses.

Changes to snipstv:
-------------------
None were necessary. The existing test for the value of the n_NODISPLAY
flag in snipstv does exactly what we wanted: it skips displaying an
expired event.

Context diffs
-------------

=============Cut here============================================
*** hostmon	2001-09-24 09:32:22.000000000 -0600
--- hostmon.new	2003-12-11 12:54:42.000000000 -0700
***************
*** 393,399 ****
      $timestamp{$item} = 0 if (! defined($timestamp{$item}) );
      my $age = $curtime - $timestamp{$item};
      # print STDERR "Age for $item is $age secs\n";
!     if ($age >= $EXPIRE_AGE) { rewrite_event($datafd, $nullevent); next; }
      if ($age >= $OLD_AGE) {
        if (! defined ($oldage{$item})) {
          my %event = unpack_event($event);
--- 393,414 ----
      $timestamp{$item} = 0 if (! defined($timestamp{$item}) );
      my $age = $curtime - $timestamp{$item};
      # print STDERR "Age for $item is $age secs\n";
!
!     # Previous code used alter_event to blank fields in the record for
!     # this event.  This code uses the NODISPLAY flag in the state field
!     # of the event instead.
!     if ($age >= $EXPIRE_AGE) {
!       if (! defined ($oldage{$item}) || $oldage{$item} < 2) {
!         my %event = unpack_event($event);
!         $event{state} = $event{state} | $n_NODISPLAY;
!         $event = pack_event(%event);
!         $oldage{$item} = 2;
!       }
!       my ($status,$value,$thres,$maxseverity) = split(/\t/, $curstat{$item});
!       update_event($event, 0, $value, $thres, $maxseverity);# escalate severity
!       rewrite_event($datafd, $event);
!       next;
!     }  # age > $EXPIRE_AGE
      if ($age >= $OLD_AGE) {
        if (! defined ($oldage{$item})) {
          my %event = unpack_event($event);
***************
*** 409,414 ****
--- 424,432 ----
        if (defined $oldage{$item}) {
          my %event = unpack_event($event);
          $event{state} = $event{state} & (~$n_OLDDATA);
+         if ($oldage{$item} == 2) {
+           $event{state} = $event{state} & (~$n_NODISPLAY);
+         }
          $event = pack_event(%event);
          undef $oldage{$item};
        }
=============Cut here============================================
*** genweb.cgi	2002-01-29 22:42:45.000000000 -0700
--- genweb.cgi.new	2003-12-12 15:16:08.000000000 -0700
***************
*** 542,548 ****
      while ( ($event = read_event($datafd)) )
      {
        my %ev = unpack_event($event);

!       next if ($ev{device_name} eq "" && $ev{device_addr} eq "");

        $ev{file}=$file;	# store the filename also
--- 542,549 ----
      while ( ($event = read_event($datafd)) )
      {
        my %ev = unpack_event($event);

!       next if ($ev{device_name} eq "" && $ev{device_addr} eq "")
!         || ($ev{state} & $n_NODISPLAY);

        $ev{file}=$file;	# store the filename also
=============Cut here============================================

--
Anthony Vealé
National Snow and Ice Data Center
E-Mail: veale at nsidc org
Phone: (303)735-5069