[snips-users] hostmon falling over / snipstv/eventselect segmentation fault

After moving to a new server, and a bit of a battle to get snips
working on the new 64-bit platform, all the monitors seem to work except
hostmon, which falls over after a few successful polls and eventually dies.

strace output looks like:

lseek(4, -400, SEEK_CUR)                = 8000
write(4, "hostmon\0\0\0\0\0\0\0\0\0localhost\0\0\0\0\0\0\0"..., 400) = 400
read(4, "hostmon\0\0\0\0\0\0\0\0\0localhost\0\0\0\0\0\0\0"..., 400) = 400
stat("/usr/local/snips/rrddata/l/localhost/_home+DFspace_%used.rrd", {st_mode=S_IFREG|0664, st_size=23808, ...}) = 0
open("/usr/local/snips/rrddata/l/localhost/_home+DFspace_%used.rrd", O_RDWR) = 5
fstat(5, {st_mode=S_IFREG|0664, st_size=23808, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3471da76c000
read(5, "RRD\0000003\0\0\0\0\0\0\0\0/%\300\307C+\37[\2\0\0\0\0\0"..., 4096) = 4096
lseek(5, 0, SEEK_CUR)                   = 4096
fstat(5, {st_mode=S_IFREG|0664, st_size=23808, ...}) = 0
lseek(5, 20480, SEEK_SET)               = 20480
read(5, "\0\0\0\0\0\0\370\377\0\0\0\0\0\0\370\377\0\0\0\0\0\0\370"..., 4096) = 3328
lseek(5, 0, SEEK_SET)                   = 0
read(5, "RRD\0000003\0\0\0\0\0\0\0\0/%\300\307C+\37[\2\0\0\0\0\0"..., 4096) = 4096
fcntl(5, F_SETLK, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}) = 0
mmap(NULL, 23808, PROT_READ|PROT_WRITE, MAP_SHARED, 5, 0) = 0x3471db0e5000
munmap(0x3471db0e5000, 23808)           = 0
lseek(5, 4096, SEEK_SET)                = 4096
lseek(5, -3248, SEEK_CUR)               = 848
write(5, "\270O\251D\0\0\0\0\0\0\0\0\0\0\0\0UNKN\0\0\0\0\0\0\0\0"..., 912) = 912
close(5)                                = 0
munmap(0x3471da76c000, 4096)            = 0
lseek(4, -400, SEEK_CUR)                = 8400
write(4, "hostmon\0\0\0\0\0\0\0\0\0localhost\0\0\0\0\0\0\0"..., 400) = 400
read(4, "hostmon\0\0\0\0\0\0\0\0\0localhost\0\0\0\0\0\0\0"..., 400) = 400
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
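
For anyone reading the trace: the repeating lseek(-400) / write(400) /
read(400) sequence on fd 4 looks like an in-place update of a file of
fixed-size 400-byte records. Each record is read, stepped back over,
rewritten, and then the next one is read. Below is a minimal sketch of that
access pattern only. The record layout (two 16-byte NUL-padded names plus
padding) and the names RECORD and rewrite_records are assumptions for
illustration, not the actual snips structure or code.

```python
import os
import struct
import tempfile

RECORD_SIZE = 400  # matches the 400-byte reads/writes on fd 4 in the trace

# Hypothetical layout: two NUL-padded 16-byte names followed by padding.
# The real snips record has other fields; this is only for illustration.
RECORD = struct.Struct("=16s16s368x")
assert RECORD.size == RECORD_SIZE


def rewrite_records(fd, update):
    """Walk a file of fixed-size records, rewriting each one in place.

    `update` takes (sender, host) as bytes and returns the new pair.
    Returns the number of records rewritten.
    """
    count = 0
    while True:
        buf = os.read(fd, RECORD_SIZE)
        if len(buf) < RECORD_SIZE:  # EOF, or a truncated trailing record
            break
        sender, host = RECORD.unpack(buf)
        new_buf = RECORD.pack(*update(sender, host))
        os.lseek(fd, -RECORD_SIZE, os.SEEK_CUR)  # step back over the record
        os.write(fd, new_buf)                    # overwrite it in place
        count += 1
    return count


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    try:
        # Three identical records, as a stand-in for a monitor's data file.
        os.write(fd, RECORD.pack(b"hostmon", b"localhost") * 3)
        os.lseek(fd, 0, os.SEEK_SET)
        print(rewrite_records(fd, lambda sender, host: (sender, host)))
    finally:
        os.close(fd)
        os.unlink(path)
```

The pattern only works if the record size the program was compiled with
matches the records already on disk, which is worth checking after a move
from 32-bit to 64-bit, since any `long` member grows from 4 to 8 bytes on
x86_64 Linux. The trace alone does not show whether that is the problem here.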


Anyone seen similar before?

We have modified some things slightly to get it to compile on the new server
(Linux 2.6.14 kernel, x86_64 AMD Opteron(tm) Processor 246)

If anybody wants to see the code, then I can make it available.

Everything else seems to run okay, except it didn't like some of the rrd
files from the old system, so I just deleted them.

snipstv and eventselect compile and run, but crash with this segmentation fault:

open("/etc/localtime", O_RDONLY)        = 4
fstat(4, {st_mode=S_IFREG|0644, st_size=1323, ...}) = 0
fstat(4, {st_mode=S_IFREG|0644, st_size=1323, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x31babfa16000
read(4, "TZif\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\7\0\0\0\7\0"..., 4096) = 1323
close(4)                                = 0
munmap(0x31babfa16000, 4096)            = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++


I know that long term we would be better off moving away from snips:
although it is pretty much fully developed and does what I need, it is a
dead product.

Last time I looked, there wasn't really much out there with similar
functionality that is as lightweight and simple. (E.g., I suppose we
could use nagios, but that's overkill for what I want, and it seemed messy
getting it to work across multiple boxes, though things may have improved
since then.)

Is anybody still using snips? Am I the only one? :)

Rob


-- 
Robert Lister   -   London Internet Exchange    -  http://www.linx.net/
robl at linx net   -   tel: +44 (0)20 7645 3510    -  RL786-RIPE
