Re: Load balancing with hoststated fails

From: Aaron Glenn (aaron.glenngmail.com)
Date: Fri Jan 18 2008 - 19:52:14 CST


A uname -a and the relevant snippets of hoststated.conf would go a
long way in assisting you...

On 1/18/08, Rami Sik <rsikipapplications.com> wrote:
> I have been using PF on OpenBSD as a firewall without any problems. I
> have two boxes in a redundant configuration with CARP. Later, I
> needed load balancing for both HTTP and HTTPS, so I set up hoststated.
> However, load balancing does not seem to be stable. It works for
> about a week, then it starts seeing the primary web server as down and
> tries to use the backup web server. Sometimes it fails over to the
> backup server; sometimes it sees the backup as down as well. Although
> both of my web servers are up and running, it never sees them as up
> again, and load balancing just stays in the down state. I tried
> reloading hoststated, but it made no difference. I also tried to stop
> hoststated, but it failed to stop. I also tried disabling and
> re-enabling PF, which made no difference either. Once hoststated is
> down, the only way to recover is to reboot the boxes. Then the cycle
> starts again: it runs well for a week, and then the same thing happens.
>
>
>
> While troubleshooting, the only thing I noticed is that the total
> memory usage reported by "top" keeps climbing. I have 2 G of physical
> memory on the boxes. My observation is that once the total reaches
> around the 260M level, it may fail at any time.
>
>
>
> Here is the top from a newly rebooted box:
>
>
>
> load averages: 0.11, 0.09, 0.08    11:16:54
> 36 processes: 35 idle, 1 on processor
> CPU0 states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
> CPU1 states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
> Memory: Real: 12M/97M act/tot Free: 3865M Swap: 0K/3585M used/tot
>
>   PID USERNAME PRI NICE  SIZE   RES STATE   WAIT    TIME   CPU COMMAND
>     7 root      10    0    0K   84M sleep/0 pftm    0:00 0.00% pfpurge
>     9 root     -18    0    0K   84M idle    reaper  0:00 0.00% reaper
>    11 root      18    0    0K   84M sleep/0 syncer  0:00 0.00% update
>     0 root     -18    0    0K   84M sleep/0 schedul 0:00 0.00% swapper
>    13 root      14    0    0K   84M idle    crypto_ 0:00 0.00% crypto
>    10 root     -13    0    0K   84M idle    cleaner 0:00 0.00% cleaner
>     4 root      10    0    0K   84M idle    usbevt  0:00 0.00% usb0
>     6 root      10    0    0K   84M idle    usbevt  0:00 0.00% usb1
>    12 root     -18    0    0K   84M idle    aiodone 0:00 0.00% aiodoned
>     8 root     -18    0    0K   84M idle    pgdaemo 0:00 0.00% pagedaemon
>     5 root      10    0    0K   84M idle    usbtsk  0:00 0.00% usbtask
>     3 root      10    0    0K   84M idle    bored   0:00 0.00% syswq
>     2 root     -18    0    0K   84M idle    kmalloc 0:00 0.00% kmthread
>   640 root       2    0 2252K 4480K sleep/0 select  0:00 0.00% snmpd
> 24451 _hoststa   2    0 1356K 2456K idle    kqread  0:12 0.00% hoststated
>
>
>
>
>
> The following is from a box rebooted yesterday (note the increased
> memory usage of the system processes under the RES column):
>
>
>
> load averages: 0.16, 0.14, 0.09    11:18:01
> 37 processes: 1 running, 35 idle, 1 on processor
> CPU0 states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
> CPU1 states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
> Memory: Real: 13M/206M act/tot Free: 1802M Swap: 0K/3584M used/tot
>
>   PID USERNAME PRI NICE  SIZE   RES STATE   WAIT    TIME   CPU COMMAND
>     8 root      10    0    0K  193M sleep/0 pftm    0:00 0.00% pfpurge
>     4 root      10    0    0K  193M sleep/0 ipmi_po 0:00 0.00% ipmi0
>    10 root     -18    0    0K  193M idle    reaper  0:00 0.00% reaper
>    12 root      18    0    0K  193M sleep/0 syncer  0:00 0.00% update
>     0 root     -18    0    0K  193M sleep/0 schedul 0:00 0.00% swapper
>    14 root      14    0    0K  193M idle    crypto_ 0:00 0.00% crypto
>    11 root     -13    0    0K  193M idle    cleaner 0:00 0.00% cleaner
>     5 root      10    0    0K  193M sleep/0 usbevt  0:00 0.00% usb0
>     7 root      10    0    0K  193M sleep/0 usbevt  0:00 0.00% usb1
>    13 root     -18    0    0K  193M idle    aiodone 0:00 0.00% aiodoned
>     9 root     -18    0    0K  193M idle    pgdaemo 0:00 0.00% pagedaemon
>     6 root      10    0    0K  193M idle    usbtsk  0:00 0.00% usbtask
>     3 root      10    0    0K  193M idle    bored   0:00 0.00% syswq
>     2 root     -18    0    0K  193M idle    kmalloc 0:00 0.00% kmthread
> 29491 root       2    0 2308K 4536K sleep/0 select  0:03 0.00% snmpd
> 22579 _hoststa   2    0 2280K 2832K idle    kqread  3:11 0.00% hoststated
>
>
>
>
>
> Rami
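
To make the memory trend in the two quoted "top" snapshots easier to compare, here is a minimal Python sketch that parses the "Memory:" summary line and reports the growth in the "tot" figure. The function names, the regular expression, and the 260 MB warning constant are illustrative assumptions (the 260M figure is only the poster's own observation, not a documented limit), and the parser is written against the line format shown in this message rather than any guaranteed top(1) output format.

```python
import re

# Pattern for the "Memory:" summary line as it appears in the quoted output.
# Assumption: sizes are integers followed by a K, M, or G suffix.
MEMORY_RE = re.compile(
    r"Memory:\s+Real:\s+(\d+)([KMG])/(\d+)([KMG])\s+act/tot"
    r"\s+Free:\s+(\d+)([KMG])"
)

UNIT_TO_MB = {"K": 1 / 1024, "M": 1, "G": 1024}

# Hypothetical threshold taken from the poster's observation, not from any spec.
WARN_TOTAL_MB = 260


def parse_memory_line(line):
    """Return (active_mb, total_mb, free_mb), or None if the line does not match."""
    match = MEMORY_RE.search(line)
    if match is None:
        return None
    act, act_u, tot, tot_u, free, free_u = match.groups()
    return (
        int(act) * UNIT_TO_MB[act_u],
        int(tot) * UNIT_TO_MB[tot_u],
        int(free) * UNIT_TO_MB[free_u],
    )


def near_failure_level(total_mb, warn_mb=WARN_TOTAL_MB):
    """True once the 'tot' figure has reached the observed trouble level."""
    return total_mb >= warn_mb


# The two Memory lines quoted in the original message.
fresh = parse_memory_line(
    "Memory: Real: 12M/97M act/tot Free: 3865M Swap: 0K/3585M used/tot"
)
day_old = parse_memory_line(
    "Memory: Real: 13M/206M act/tot Free: 1802M Swap: 0K/3584M used/tot"
)

growth_mb = day_old[1] - fresh[1]
print("total grew by", growth_mb, "MB between the two snapshots")
print("at warning level:", near_failure_level(day_old[1]))
```

Feeding it one Memory line per sample (for example, captured periodically into a log) would show how quickly the total approaches the level at which the failures were observed.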