Re: Watchdog timeout reset in 5.1 on intel nic:s

From: Per-Olov Sjöholm (posincedo.org)
Date: Sat May 19 2012 - 09:40:08 CDT


On 19 maj 2012, at 08:11, Garry Dolley <gdolleyarpnetworks.com> wrote:

> On Sat, May 19, 2012 at 01:54:54AM +0200, Per-Olov Sjöholm wrote:
>> On 17 maj 2012, at 12:53, Garry Dolley wrote:
>>
>>> On Thu, May 17, 2012 at 03:19:07AM -0700, Garry Dolley wrote:
>>>> On Fri, May 11, 2012 at 09:13:30AM -0400, Simon Perreault wrote:
>>>>> On 2012-05-11 04:15, Garry Dolley wrote:
>>>>>> I now have an amd64 test VM set up, where I installed stock 5.0.
>>>>>>
>>>>>> I ran a lot of traffic over em0 without any timeouts.
>>>>>
>>>>> That's expected. 5.0 has been running without issue for me for a long
>>>>> time.
>>>>>
>>>>>> I also have been trying several -current kernels.
>>>>>>
>>>>>> As of:
>>>>>>
>>>>>> OpenBSD 5.1-current (GENERIC) #205: Wed Mar 28 21:40:45 MDT 2012
>>>>>>
>>>>>> I don't see any em0 timeouts.
>>>>>>
>>>>>> I will continue to try newer ones and report back here...
>>>>>
>>>>> Why not just test 5.1? Problems have been reported against 5.1, not
>>>>> -current.
>>>>
>>>> I now have a stock 5.1 test VM set up.
>>>>
>>>> OpenBSD 5.1 (GENERIC) #181: Sun Feb 12 09:35:53 MST 2012
>>>> deraadt@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
>>>>
>>>> I don't see any timeouts. I grabbed the ports tree via curl several
>>>> times and have been slaving away at it over SSH. I don't notice
>>>> anything wrong.
>>>>
>>>> So, perhaps this issue does not appear in stock 5.1, but in a newer
>>>> kernel. I'll try something newer soon...
>>>
>>> I have tried the following newer kernels:
>>>
>>> bsd.20120330
>>> bsd.20120419
>>> bsd.20120427
>>> bsd.20120516
>>>
>>> I still can't reproduce the problem.
>>>
>>> I have disabled mpbios on all these kernels, forgot to mention that.
>>>
>>> I will leave this be for now; will pick it up again if any new
>>> information should arise.
>>>
>>> --
>>> Garry Dolley
>>> ARP Networks, Inc. | http://www.arpnetworks.com | (818) 206-0181
>>> Data center, VPS, and IP Transit solutions
>>> Member Los Angeles County REACT, Unit 336 | WQGK336
>>> Blog http://scie.nti.st
>>>
>>
>>
>> I have a running 4.9 release + patches (i.e. 4.9 stable) that works perfectly.
>> When updated to 5.1 release + patches, I have real problems with watchdog
>> timeout resets on my Intel NICs. Same hardware, just a different OpenBSD
>> version.
>>
>> I have tried a bunch of kernels from Stuart Henderson (broken after 4.9...).
>> I have also recompiled the 5.1 stable kernel with most versions of the
>> if_em.c driver. I have compiled and tried the following
>> (note that the userland was 5.1 stable for all kernel tests):
>>
>> bsd-5.1-stable
>> bsd-5.1-stable_plus_if_em.c-1.249
>> bsd-5.1-stable_plus_if_em.c-1.250
>> bsd-5.1-stable_plus_if_em.c-1.251
>> bsd-5.1-stable_plus_if_em.c-1.252
>> bsd-5.1-stable_plus_if_em.c-1.253
>> bsd-5.1-stable_plus_if_em.c-1.254
>> bsd-5.1-stable_plus_if_em.c-1.263
>>
>> Watchdog timeout resets on all versions.....
>>
>> NOTE that the watchdog timeout reset appears with version 1.249 of if_em.c
>> as well, and that version is the default in 4.9 stable, which works
>> fantastically. So unless I have done something totally wrong, it must be
>> related to something else in the kernel. My NIC hardware and the KVM BIOS
>> are the same, and I have tried an if_em.c version that works in 4.9.
>>
>>
>> I can see above that you got rid of the problem by testing the same version
>> as me. But you use amd64 and I use i386.
>> Also, I have a firewall with two NICs. Often ONE NIC works but the other
>> gives watchdog timeout resets and won't work.
>>
>> Any clues?
>
> I don't have any clues. I wasn't able to reproduce the problem,
> even though one customer I have who also upgraded experienced this
> behavior. They did not do a fresh install (that I'm aware of), but
> upgraded (similar to you). I'm not sure what the previous version
> was. They have one NIC and I believe run amd64.
>
> The only difference that I can see is that on a fresh 5.1 install,
> there is no issue. But if you upgrade from a previous release, then
> the issue *might* appear.
>
> --
> Garry Dolley
> ARP Networks, Inc. | http://www.arpnetworks.com | (818) 206-0181
> Data center, VPS, and IP Transit solutions
> Member Los Angeles County REACT, Unit 336 | WQGK336
> Blog http://scie.nti.st
>

I have a fresh 5.1 release plus stable patches. No upgrade...

Per-Olov