Re: non-blocking connect and EAGAIN

From: Chad MILLER (cmillermysql.com)
Date: Wed Sep 19 2007 - 09:04:53 CDT


Hi, Dmitriy, Vladimir!

On 19 Sep 2007, at 07:40, Vladimir Shebordaev wrote:

> Hi, Dmitriy,
>
> would you please specify when you get those reconnects?
>
> The Linux connect() system call on non-blocking AF_UNIX sockets
> should return immediately with EAGAIN when the peer's backlog queue
> is full.

Vladimir's right here. The Linux kernel doesn't normally return errno
EINPROGRESS for AF_UNIX connects, but it does return EAGAIN for this case:

         if (skb_queue_len(&other->sk_receive_queue) >
             other->sk_max_ack_backlog) {
                 err = -EAGAIN;
                 if (!timeo)
                         goto out_unlock;

                 timeo = unix_wait_for_peer(other, timeo);

                 err = sock_intr_errno(timeo);
                 if (signal_pending(current))
                         goto out;
                 sock_put(other);
                 goto restart;
         }

Notably, the BSDs don't return EAGAIN here, as far as I can tell.
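To see the check above in action, here is a minimal sketch that assumes a Linux host. It listens on a throwaway AF_UNIX socket, never calls accept(), and issues non-blocking connects until one fails. The helper name first_connect_errno, the MAX_CLIENTS limit, and the /tmp path are made up for illustration; none of this is MySQL or kernel code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

enum { MAX_CLIENTS = 64 };  /* illustrative upper bound on attempts */

/* Hypothetical helper: returns the errno of the first failed non-blocking
 * connect(), 0 if none of max_tries failed, or -1 on a setup error. */
int first_connect_errno(int backlog, int max_tries)
{
    assert(max_tries > 0 && max_tries <= MAX_CLIENTS);

    /* Private directory so the socket path cannot collide with anything. */
    char dir[] = "/tmp/eagain-demo-XXXXXX";
    if (mkdtemp(dir) == NULL)
        return -1;

    struct sockaddr_un addr;
    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    snprintf(addr.sun_path, sizeof addr.sun_path, "%s/sock", dir);

    int clients[MAX_CLIENTS];
    int nclients = 0;
    int result = -1;

    /* The listener never accepts, so pending connections pile up. */
    int srv = socket(AF_UNIX, SOCK_STREAM, 0);
    if (srv >= 0 &&
        bind(srv, (struct sockaddr *)&addr, sizeof addr) == 0 &&
        listen(srv, backlog) == 0) {
        result = 0;
        for (int i = 0; i < max_tries && result == 0; i++) {
            int c = socket(AF_UNIX, SOCK_STREAM, 0);
            if (c < 0) {
                result = -1;
                break;
            }
            clients[nclients++] = c;

            /* Same idea as the fcntl64(..., O_NONBLOCK) call in the trace. */
            int flags = fcntl(c, F_GETFL, 0);
            if (flags < 0 || fcntl(c, F_SETFL, flags | O_NONBLOCK) != 0) {
                result = -1;
                break;
            }
            if (connect(c, (struct sockaddr *)&addr, sizeof addr) != 0)
                result = errno;  /* errno values are positive, so the loop ends */
        }
    }

    for (int i = 0; i < nclients; i++)
        close(clients[i]);
    if (srv >= 0)
        close(srv);
    unlink(addr.sun_path);
    rmdir(dir);
    return result;
}
```

Calling first_connect_errno(0, 8) on Linux should come back with EAGAIN once the listener's queue exceeds its backlog. Other systems may report a different error, which matches the note about the BSDs.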

> Otherwise connect() will block until there is some room available
> on the receiving end. The MySQL client's intention is to follow
> that system call literally when no timeout option is explicitly
> specified (see the comments in my_connect() right above the lines
> you've cited). So what you get looks like intended behavior on
> both the kernel and the MySQL side.

Agreed, for the most part. (I don't know that the kernel sends
EAGAIN /only/ for no-timeout/non-blocking connect()ion attempts. I
didn't dig wider than the above.)

The Linux kernel truly couldn't accept the connect() syscall, and
this is a valid problem. The library code behaves correctly because
the library /should/ pass errors from the kernel up to the client.
This specific case isn't one I think we considered, but client code
should handle all errors the OS could generate; the library shouldn't
insulate the client from the kernel, though it should insulate the
client from the server.
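As a rough illustration of client code handling this error itself, the sketch below retries a local-socket connect a few times when the kernel reports EAGAIN and passes every other error straight up. The names is_retryable_connect_errno and connect_unix_with_retry, and the retry policy itself, are assumptions for the example; they are not part of the MySQL client library.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical policy: only a full listener backlog (EAGAIN) is retried. */
int is_retryable_connect_errno(int e)
{
    return e == EAGAIN;
}

/* Hypothetical helper: non-blocking connect to an AF_UNIX path, retrying on
 * EAGAIN with a fixed delay. Returns a connected (still non-blocking) fd, or
 * -1 with errno set to the last error. A fresh socket is used per attempt.
 * Systems that answer EINPROGRESS are not handled by this sketch. */
int connect_unix_with_retry(const char *path, int max_attempts, long delay_ms)
{
    assert(path != NULL && max_attempts > 0 && delay_ms >= 0);

    struct sockaddr_un addr;
    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    if (strlen(path) >= sizeof addr.sun_path) {
        errno = ENAMETOOLONG;
        return -1;
    }
    strcpy(addr.sun_path, path);

    int saved = EAGAIN;
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        if (attempt > 0) {
            /* Give the server a moment to drain its accept queue. */
            struct timespec ts = { delay_ms / 1000,
                                   (delay_ms % 1000) * 1000000L };
            nanosleep(&ts, NULL);
        }

        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
        if (fd < 0)
            return -1;  /* errno already set by socket() */

        int flags = fcntl(fd, F_GETFL, 0);
        if (flags >= 0 &&
            fcntl(fd, F_SETFL, flags | O_NONBLOCK) == 0 &&
            connect(fd, (struct sockaddr *)&addr, sizeof addr) == 0)
            return fd;

        saved = errno;  /* keep the real error across close() */
        close(fd);
        if (!is_retryable_connect_errno(saved))
            break;      /* e.g. ENOENT or ECONNREFUSED: report immediately */
    }
    errno = saved;
    return -1;
}
```

A caller might use connect_unix_with_retry("/var/lib/mysql/mysql.sock", 5, 50) and still report errno 11 if every attempt finds the backlog full. The attempt count and delay are arbitrary here and would need tuning for a real workload.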

> Please check out the MySQL 5.0 troubleshooting page at
> <http://dev.mysql.com/doc/refman/5.0/en/can-not-connect-to-server.html>.
> You've probably got your server crashed or stalled due to some real
> bug. If so, you should try to reproduce it and file a bug report.
> But please upgrade to a decent MySQL version first of all.

It could be a crashed server that's causing the problem, I suppose.
If it's not, which seems more likely, please keep us in the loop about
any other connection bottleneck you find.

- chad

> Dmitriy MiksIr wrote:
>> Hello!
>> I get a lot of MySQL errors: "Can't connect to local MySQL server
>> through socket '/var/lib/mysql/mysql.sock' (11)".
>> I traced one of these errors and saw that the non-blocking connect
>> returns EAGAIN. See:
>> fcntl64(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
>> connect(3, {sa_family=AF_FILE, path="/var/lib/mysql/mysql.sock"},
>> 110) = -1 EAGAIN (Resource temporarily unavailable)
>> MySQL's connect code does not handle this error:
>> if ((res != 0) && (s_err != EINPROGRESS))
>> {
>> errno= s_err; /* Restore it */
>> return(-1);
>> }
>> Is this a kernel bug (Linux 2.6.16-std26-smp-alt1) that returns
>> EAGAIN instead of EINPROGRESS, or can some other trouble force
>> EAGAIN on a unix socket connect?

--
Chad Miller, Software Developer chadmysql.com
MySQL Inc., www.mysql.com
Orlando, Florida, USA 13-20z, UTC-0400
Office: +1 408 213 6740 sip:6740sip.mysql.com
