

     free software is always perfect

then Rick Robino's all..
> Ok - Now I feel prompted to throw my $0.02's worth in
> here, and I feel like I represent _alot_ of other people (users/
> developers). I can certainly say I represent a few tens of 
> thousands of users who were protected from inconvenience or
> disaster because of NOCOL, just from my own experience.
> Here is what I have wanted to say since Velocet first posted his
> antagonistic message: This sound you hear on this mailing list is
> the sound of a many years old software package which many people
> are happy with. Honestly, what more could one want besides NOCOL?

Well, I think I already indicated what I wanted from nocol. I'm not saying
it's a PoS. It's been useful for us, but we're looking for a few more
features, if you can imagine that.

Today, for example, I got paged at 6:23 am (a few hours before I
*LIKE* to wake) regarding PageOut being out of control on our box.  Really
what happened was some jr developer ran a python script in cron that
swapped out the entire box. Not a huge deal, the machine recovered fine,
services on it weren't affected, etc. However, it went into crit because the
threshold we set was too low, since this has never happened before (I don't
know what's really crit on these boxes, they have new fast disks and nice
phat CPUs). It went back down to error about 20 min later, after I had
already tried to get back to sleep, and paged me again. :(

I should have been able to indicate I checked the problem out, and thereby
cancel the followup "state ok now" page. External packages may be able
to do that, but shouldn't nocol have that internally?
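
A minimal sketch of that acknowledgement idea, in Python rather than
anything nocol ships. The AckTracker class, its method names, and the
info < warning < error < critical ordering are all assumptions made for
illustration. Once an event is acknowledged, pages for the same
host/variable are dropped while the severity is only improving; a
worsening still pages.

```python
# Assumed severity ordering, lowest to highest; not read from nocol itself.
SEVERITY = {"info": 0, "warning": 1, "error": 2, "critical": 3}


class AckTracker:
    """Hypothetical helper: remembers which (host, variable) events a human acked."""

    def __init__(self):
        self._acked = set()

    def ack(self, host, var):
        """Record that someone has looked at this event."""
        self._acked.add((host, var))

    def should_page(self, host, var, old_sev, new_sev):
        """Return False for follow-up pages on an acked event that is improving."""
        key = (host, var)
        improving = SEVERITY[new_sev] < SEVERITY[old_sev]
        suppress = key in self._acked and improving
        if new_sev == "info":
            # Back to normal: forget the ack so the next incident pages again.
            self._acked.discard(key)
        return not suppress
```

In the 6:23 am case above, an ack after the first page would have
swallowed the crit-to-error page 20 minutes later.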

If this were not one of our core service boxes, I woulda been more annoyed.
The filters I was thinking of installing would still have let this through
and paged me at that hour, because it had our core box's name on it. But there is
no mechanism for holding back other pages until 8:00am, which is when we'd
like to react to a box of lesser importance.
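
Roughly, the filter described here could look like the sketch below.
The host name, the 23:00 start of the quiet window, and the function
names are made up for illustration; only the 8:00am end comes from the
text above. Core boxes always page, everything else waits for the
quiet window to end.

```python
from datetime import datetime, time

CORE_HOSTS = {"core1"}      # hypothetical host name
QUIET_START = time(23, 0)   # assumed start of the quiet window
QUIET_END = time(8, 0)      # 8:00am, as in the text


def in_quiet_hours(t):
    """True if clock time t falls in the window, which wraps past midnight."""
    return t >= QUIET_START or t < QUIET_END


def page_now(host, when):
    """Core hosts always page; lesser hosts page only outside quiet hours."""
    return host in CORE_HOSTS or not in_quiet_hours(when.time())
```

A real filter would also need to queue the held pages and deliver them
at 8:00am; that part is left out here.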

Also, I really don't care if paging climbs to "alot" (whatever that is) for a
whole 2 or even 20 minutes. What I want to see is it SIT THERE for a while;
then that's a real problem. I s'pose, really, LOAD would have climbed if it
was a real problem, but it wasn't. Ya, it's not good to leave the disk being
hammered like that if it was constant, but in this case it was very
transient. (And the coder has been notified of his runaway python. Next
time it's a core and swap limit for him! :)

There's no mechanism to indicate that a state has to be at a certain level
for a long TIME before it's really critical. I'd like that.
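
One way to picture the hold-time idea is the sketch below. HoldTimer
and its fields are hypothetical names, and timestamps are plain seconds
passed in by the caller; none of this is nocol's own interface. A value
only counts as page-worthy once it has stayed at or above the threshold
for hold_secs without dipping below it.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class HoldTimer:
    """Hypothetical filter: page only after a sustained threshold breach."""
    threshold: float
    hold_secs: float
    # (host, variable) -> time the current breach started
    _since: Dict[Tuple[str, str], float] = field(default_factory=dict)

    def should_page(self, host: str, var: str, value: float, now: float) -> bool:
        key = (host, var)
        if value < self.threshold:
            # Dropped below the threshold: reset the clock for this variable.
            self._since.pop(key, None)
            return False
        start = self._since.setdefault(key, now)
        return now - start >= self.hold_secs
```

With hold_secs set to an hour, a 20 minute PageOut spike like this
morning's would never have reached the pager.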

I originally set PageOut's threshold at a level that I think would be a
concern, but more so if I saw it sitting there for many hours or a day.

> I've been using it for 4 years and it has been very satisfactory.
> So much so that as long as I've used it and made good use of it
> I've only sent one other post to this list. I don't remember when
> I signed up for the list, but I remember smiling happily when
> at my first ISP NOCOL was displaying its status for all of our
> servers on our little Sun IPC, and paging me when things went
> wrong.  Worked just as well when I started working for large
> ISP's - better than the very expensive commercial flavor of
> the corporate day.

Curious: how many machines are you routing? Do you manage a BGP feed?
Are you multihomed? Do you run a news box? Do you monitor customer
connections all over the continent?

We do. Now this might be small to others, but for us, the 63 boxes/
routers we monitor, and the 1000s of data points that this generates
are quite a load for nocol. It can handle it, but it leaves too much parsing
for humans to do without a very nice hierarchical alert system in place.
Nocol DOES fail here. When we had only 10 boxes to watch, it was fine,
but now at 63, things are a bit too much. I'm starting to turn my pager
off at night on my non-on-call nights (I'm always on secondary call, I own
the place :). This is a BAD THING, but I'm tired of being woken up
on a night when some box is flopping around in a busy but not critical
state. Yes, I should write a filter, but I haven't had time, and I'm also
paralysed trying to find the time to think about an intelligent design
that will be useful in 2 years without having to go buy NETCOOL.
That should turn your crank: let's make nocol useful for monitoring
100s of boxes, not just a dozen.

So, I agree it's quite possible nocol is VERY useful. We've used
it for 3 years and yet my first post was this week. However, I want to
IMPROVE it if it's possible. I don't necessarily want to run to
Netsaint right away or anything, tho I will take a look at it of course.
I want to improve nocol, because I don't think netsaint is going
to satisfy all my requirements either.

> When I need/want to, I modify it - in C, perl, python, whatever - it
> works fine. I like low traffic lists, and even more, I like software

Did you submit your patches back to the effort? Are they clean enough
to distribute? Was there a mechanism to include your patches?

> that has worked out its bugs and is stable. Seems as though there is
> an entire consumer culture nowadays that just can't believe something
> can be just fine _without_ being updated just for the hell of it.

This isn't for the hell of it. When the one DSL line to our office goes down,
I see 15 services go down. With 3 people on call, that makes 45 pages. Even if I
went and made some sort of pager filter that collects pages and sends
them out in batches, I would still have to write hierarchical logic
to do that filtering entirely outside of nocol. If I write a hack
in a dozen hours, it probably won't be what we need in 4-6 months.
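
The hierarchical logic in question might start out like this sketch. It
assumes a hand-maintained map from each monitored item to the thing it
depends on, and that the DSL line itself is a monitored item; the
function and item names are invented for illustration. Anything that is
down behind something else that is down gets collapsed into the
upstream failure.

```python
def has_down_ancestor(node, down, parent):
    """True if anything node depends on, directly or indirectly, is down."""
    seen = {node}
    p = parent.get(node)
    while p is not None and p not in seen:  # 'seen' guards against loops in the map
        if p in down:
            return True
        seen.add(p)
        p = parent.get(p)
    return False


def root_causes(down, parent):
    """Return the down items that are not explained by an upstream failure."""
    return sorted(n for n in down if not has_down_ancestor(n, down, parent))
```

With 15 services all mapped to one parent such as "office-dsl", a line
outage yields a single root cause, so 3 people on call get 3 pages
instead of 45.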

Maybe it's a good thing that it's not part of nocol, cuz really the paging
filter is a processor (tho it depends on noclogd being up, whereas the cgi
doesn't, from what I gather - one less failure point). A generic log
processor that can take action on log state in a hierarchical fashion could
almost be a package in itself. I wouldn't be surprised to see that someone had
hacked something up like this already, without even knowing about nocol.

So maybe it's good that it's not included, maybe it's bad. Not included
means more flexibility, but more configuration to get it going.

> Programmers need to get paid, and you'd expect the respectable

This is untrue. They need to get something for their work, but that
can be anything the programmer thinks is worthwhile. For many it's
money, but for others who do things as a hobby, it's for personal
satisfaction, or a way of putting in time to amass minimal credentials to
become an embarrassing OSS pundit. :) Whatever it may be, yes, people
do things for reasons. I don't think it's always money.

> users out there to appreciate good FREE work when they run across
> it. Saves alot of time and energy towards a particular end.

Yes, the problem is that it has to be pretty good. "Free" only goes so far
in some situations, which is why a lot of software still sells, even in
the Unix world. I'm not saying nocol isn't good; we've used it for 3 years
as I mentioned. I like it. It has a few small problems tho, and those problems
need to be addressed nicely. We've kludged around a few others, but the
weight of the kludges is getting to be large. This may be our own fault -
for not talking to the list and saying "I want X!", or worse for not
saying "I want X and I have the time to code it, Vikas, just point to where I
insert my stuff."

> Velocet - go buy HPOV or NetCool (remember when they posted here taking
> hints?), or Sun Net Manager, or whatever other marketechture suits your
> "job". I wouldn't be surprised if you bought Big Brother if it cost
> alot of $$. Those systems and NOCOL each have their place in the

I don't think you noticed that I said $100K for netcool was too much.  We run
a small shop here and OSS has been our saviour. We do what we can to give
back to the community - every time we use a more obscure OSS package, we try
to submit something back to it. We've contributed at least 5 bug fixes to
the FreeBSD effort, and one driver. Some of those bug hunts took a lot of
time and effort on our part, but it's of course worth it.

However, they were bugs. And we were willing to fix them.

Nocol doesn't have bugs that I really notice, it just has shortcomings that
I and others have said we have a bit of time, energy and willingness
to fix. All we need is guidance, and an open source tree.

> Silence is indeed golden.
> Thank you, Vikas.

I think you missed something. "Silence" REALLY IS NOT what OSS is all about.

I hardly needed to discuss this with you here, I fear, since there will
surely be 10 other people telling you to go read ESR's material (an ok
intro for the most part). I discovered what OSS was about when I wrote
a tradewars BBS game door log parser in 1986, and people submitted lots
of suggestions, kudos and bugfixes back to me. Some people were angry
because it wrecked the game by making players 10-20 times more efficient.
(I went and won a bunch of tradewars tournaments with my own software under
alpha test, then withdrew from the final rounds, explaining what I had created.
I released it the next day, after the tourney.)

The whole experience was great. Someone wrote some routines for me to do
some neat stuff, and we cobbled it all together and sent it out into the
world. I don't think I would have worked on it NEARLY as long had I merely
GUESSED that people were using it, but no one told me, and no one submitted
fixes or extra functionality for it back to me. They kept me interested
probably about 2 years longer than I would otherwise have cared.

I'm glad they cared, wrote, suggested, complained in some cases and all
the rest. I'm sure Vikas understands this too, and doesn't agree with you.
For, if he receives NO feedback from this project, WHY is he doing it?
("Cuz his workplace needs it."? Probably not the major reason if at all.)

Yes-men posts and empty kudos probably don't give him the quality of experience
he was looking for in his OSS spiritual adventure. ;)


> --Rick
> > On Thu, Oct 14, 1999 at 09:25:58AM +0100, Peter Galbavy wrote:
> > On Thu, Oct 14, 1999 at 02:42:03AM -0400, Vikas Aggarwal wrote:
> > > It has been hard to dedicate time to development of nocol, and
> > > contributions have been pretty few. I can work on enhancing this package
> > > but I do need some dedicated time and support from others in the list
> > > also.
> > 
> > This is understandable, as someone who hasn't got enough time to do
> > any of 20 different outstanding patches to various bits of software, I
> > understand.
> > 
> > > The current release has a lot of contributions from users in the field,
> > > and that has helped NOCOL grow to its current stage (web interface, perl
> > > interface, paging, etc.). If there is support from others on this list,
> > > then I can coordinate enhancing the package and adding new features.
> > 
> > This is the first "problem". Let me expand later.
> > 
> > > If there are folks willing to take up some of the above responsibilities
> > > and dedicate about a month's effort, I think we can roll out a new release 
> > > with most of the above features by November end.
> > > 
> > > If you think you can take up a piece of this list above, then please send
> > > email and I can _actively_ start working with you all on the next
> > > release... but I _need_ your support to do this.
> > 
> > I think that the problem with NOCOL, while trying very hard not to
> > bring personalities into it, is that while open source, NOCOL is
> > "closed". The cathedral and the bazaar all over again. The cathedral
> > model (which I personally prefer for stable, solid code) only works if
> > the architect is full time, like Apache, *BSD's etc. NOCOL, as you
> > say, has not been getting the attention from you that it may get
> > elsewhere, but while the development cycle is closed people will
> > continue to make local hacks or move on to other projects.
> > 
> > NOCOL was great in its day, but other packages, in my personal view
> > NetSaint, have jumped over NOCOL in leaps and bounds. I do not believe
> > that my company's resources (yes, we actually have decided to put some
> > real time into this) would be best served by trying to kick start
> > NOCOL development back up, but rather by supporting a "live" project like
> > NetSaint.
> > 
> > Sorry for being so direct, I hope I have not offended.
> > 
> > Regards,
> > -- 
> > Peter Galbavy
> > Knowledge Matters Ltd
> > http://www.knowledge.com/
> -- 
>                                 ^^^^^^^^^^^^^^
> Rick Robino                                  mailto:rrobino@wavedivision.com

Ken Chase, Director Operations                  Velocet Communications Inc.
math@velocet.ca                                              Toronto CANADA
"Sometimes two [harmless] words, when put together, strike fear in the
  hearts of men -- Microsoft Wallet."                           - Dave Gilbert