Re: Add timeout_add_abstv(9)

From: Philip Guenther (guenthergmail.com)
Date: Sun Aug 22 2010 - 01:25:38 CDT


On Fri, 20 Aug 2010, Bret S. Lambert wrote:
> Replied too quickly just now and missed the following...
>
> > From an abstraction point of view, I don't think it makes sense for so
> > much code to needlessly know about "ticks".
>
> Yes, thus the conversions to actual time values. And, again, a mention
> that a discussion needs to be had concerning issues raised by those who
> have need of more specific behaviors than the current timeout API
> provides (for the purposes of POSIX-specified thread behavior, IIRC, was
> what guenther needed).

Well, what I mentioned to you wasn't really thread related, but it is
touched on by POSIX. It's mostly about general correctness.

A number of POSIX interfaces that have timeouts specify them as absolute
timeouts: you pass in the time (as a timespec) at which the operation
should time out. This is compared to other interfaces where you pass in a
relative timeout, i.e., an interval. The list of interfaces that take an
absolute timeout is actually pretty short:

 - clock_nanosleep() w/TIMER_ABSTIME
 - pthread_cond_timedwait()
 - pthread_mutex_timedlock()
 - pthread_rwlock_timedrdlock()
 - pthread_rwlock_timedwrlock()
 - timer_settime() w/TIMER_ABSTIME
 - sem_timedwait()
 - mq_timedreceive()
 - mq_timedsend()

Of those, we currently only support the four pthread_* interfaces.
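
To make the absolute/relative distinction concrete: a caller of one of
these interfaces typically reads the clock once and adds its interval to
get the deadline it passes in. A minimal userland sketch of that step
follows; deadline_after() is a name made up for this example, not an
existing function.

```c
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/*
 * Hypothetical helper: return now + interval as an absolute deadline,
 * normalised so that 0 <= tv_nsec < NSEC_PER_SEC.
 */
struct timespec
deadline_after(struct timespec now, struct timespec interval)
{
	struct timespec d;

	assert(now.tv_nsec >= 0 && now.tv_nsec < NSEC_PER_SEC);
	assert(interval.tv_nsec >= 0 && interval.tv_nsec < NSEC_PER_SEC);

	d.tv_sec = now.tv_sec + interval.tv_sec;
	d.tv_nsec = now.tv_nsec + interval.tv_nsec;
	if (d.tv_nsec >= NSEC_PER_SEC) {
		/* carry the overflowing nanoseconds into the seconds field */
		d.tv_sec++;
		d.tv_nsec -= NSEC_PER_SEC;
	}
	return d;
}
```

A caller would fill in "now" with clock_gettime() against the clock the
interface compares with, then hand the result to, say,
pthread_cond_timedwait().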

Now, there are two interesting things about absolute timeouts:
1) you have to specify which clock they are compared against, and
2) some clocks can be changed, causing them to jump

Putting those together: if the clock jumps after you start waiting, the
eventual duration of your wait may need to be adjusted. Right now, we
don't do that. All our timeouts are converted to absolute timeouts
against a clock that never jumps, so they just stay where they were even
when the clock they were specified against is changed.
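
A small userland sketch of why a jump matters: the time left until an
absolute deadline depends on what the clock reads now, so a wait length
computed once at the start goes stale when the clock is stepped. The
helper names here are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

/* Flatten a normalised timespec into nanoseconds. */
static int64_t
ts_to_ns(struct timespec ts)
{
	assert(ts.tv_nsec >= 0 && ts.tv_nsec < NSEC_PER_SEC);
	return (int64_t)ts.tv_sec * NSEC_PER_SEC + ts.tv_nsec;
}

/*
 * Nanoseconds left until 'deadline' as seen by a clock reading 'now';
 * 0 if the deadline has already passed.
 */
int64_t
remaining_ns(struct timespec deadline, struct timespec now)
{
	int64_t left = ts_to_ns(deadline) - ts_to_ns(now);

	return left > 0 ? left : 0;
}
```

With a deadline at t=100 and the clock at t=90 there are 10 seconds
left. If the clock is then stepped forward to t=94, only 6 seconds
remain, but a wait that was frozen at "10 seconds" would fire 4 seconds
late.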

So, the diff at the bottom of this message does the following:
1) adds timeout_add_abs_ts(), for specifying an absolute timeout
   against a specific clock via timespec + clockid_t
2) makes settime() adjust the timeout of all absolute timeouts that
   were specified as being against CLOCK_REALTIME
3) provides tsleep_abs(): tsleep() with an absolute timeout via
   timespec + clockid_t
4) adds kernel support for clock_nanosleep(), using tsleep_abs()
5) updates thrsleep() to use tsleep_abs() (thus making the rthreads
   implementation of the mutex and rwlock functions handle clock changes)
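
Items 1 and 2 boil down to two small pieces of arithmetic: turning the
time left until the deadline into ticks, and shifting a pending timeout
when the real-time clock is stepped. Here is a rough userland model of
both. The function names are invented for this sketch, hz is passed as
a parameter instead of being a kernel global, and the model works on a
"ticks remaining" count where the diff works on to_time - ticks.

```c
#include <assert.h>
#include <limits.h>
#include <time.h>

/*
 * Convert a non-negative interval to ticks, clamped to INT_MAX.
 * 'hz' is ticks per second; 'tick' is microseconds per tick.
 */
int
ts_to_ticks(struct timespec ts, int hz)
{
	long long to_ticks;
	long tick;

	assert(hz > 0 && hz <= 1000000);
	assert(ts.tv_sec >= 0);
	assert(ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L);

	/* extra guard (not in the diff): avoid overflowing the multiply */
	if (ts.tv_sec > INT_MAX / hz)
		return INT_MAX;

	tick = 1000000L / hz;
	to_ticks = (long long)hz * ts.tv_sec + ts.tv_nsec / (tick * 1000);
	if (to_ticks > INT_MAX)
		to_ticks = INT_MAX;
	return (int)to_ticks;
}

/*
 * Shift a pending timeout after the clock was stepped by 'adj' ticks.
 * adj > 0: clock moved forward, so the deadline is nearer.
 * adj < 0: clock moved backwards, so the deadline is further away.
 */
int
adjust_remaining(int remaining, int adj)
{
	assert(remaining >= 0);

	if (adj > 0)
		return remaining < adj ? 0 : remaining - adj;

	/* beware wrapping when the wait gets longer */
	if (remaining > INT_MAX + adj)
		return INT_MAX;
	return remaining - adj;
}
```

At hz=100, a deadline 2.5 seconds away is 250 ticks; stepping the clock
forward by one second (100 ticks) leaves 150, and stepping it back by
one second leaves 350.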

That diff hasn't been fully tested: I tested the time jump bits when I
first wrote them months ago, but I ran out of tuits before I finished the
clock_nanosleep() stuff; note the missing libc stubs.

Now, my thoughts about Matthew's diff.

The constipation of the timeout_* args in the manpage is obviously correct
and should be done in its own commit, without any of the abstv bits. As
the loser who made the change without fixing the manpage, that part has my
ok.

As for timeout_add_abstv(), I think my blathering above should indicate
that I don't like this particular change. I see the benefit of a "timeout
at an absolute time" function as deriving from its ability to let callers
handle time jumps correctly. Note that if you use timeout_add_abs_ts()
w/CLOCK_MONOTONIC, then there will be no jump adjustments because that
clock never jumps, so this subsumes the functionality of
timeout_add_abstv(). Yeah, it uses timespec instead of timeval. Given
that the user-space interfaces all use timespecs, it makes more sense to
me to do that and not a timeval version...and most definitely *not* 15
versions with the units du jour.
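
On the timespec-vs-timeval point: a caller that happens to hold a
timeval only needs a lossless widening conversion to use a
timespec-based interface, which is part of why a separate timeval entry
point buys so little. A sketch, with tv_to_ts() being a name made up
here (the system headers have a TIMEVAL_TO_TIMESPEC() macro for the
same job):

```c
#include <assert.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical helper: widen a normalised timeval into a timespec. */
struct timespec
tv_to_ts(struct timeval tv)
{
	struct timespec ts;

	assert(tv.tv_usec >= 0 && tv.tv_usec < 1000000);

	ts.tv_sec = tv.tv_sec;
	ts.tv_nsec = (long)tv.tv_usec * 1000L;	/* usec -> nsec, no loss */
	return ts;
}
```

Going the other way would throw away precision, so timespec is the
natural common currency.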

A deeper question is whether the timeout_add(foo, hzto(bar)) calls touched
by the proposed diff are simply non-optimal as is. Consider the second such
file, sys/net/pfkeyv2_convert.c, where the tv passed to hzto() is first
initialized with getmicrotime() and then offset from there. AFAICT, the
code would be simpler if written using timeout_add_sec(). The others
look a bit saner, but I haven't dug backwards through the full data flow
that leads to them to see whether I think there's a better way to track
them.
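
The round trip in question can be modelled in userland like this. It is
a deliberately simplified sketch: both functions are invented for the
example, the real hzto() has its own rounding, and the model ignores
the time that passes between the two clock reads.

```c
#include <assert.h>
#include <sys/time.h>

/*
 * Round trip: build an absolute timeval from "now + secs", then turn
 * it back into a tick count by subtracting "now" again.
 */
long long
ticks_via_abs(struct timeval now, int secs, int hz)
{
	struct timeval abs = now;

	assert(secs >= 0 && hz > 0);

	abs.tv_sec += secs;	/* "now + offset" */

	/* what an hzto()-style helper has to undo */
	return (long long)(abs.tv_sec - now.tv_sec) * hz +
	    (long long)(abs.tv_usec - now.tv_usec) * hz / 1000000;
}

/* Direct form: the relative offset is already what we want. */
long long
ticks_direct(int secs, int hz)
{
	assert(secs >= 0 && hz > 0);
	return (long long)secs * hz;
}
```

Both give the same answer for any "now", which is the argument for
passing the offset in seconds straight to a relative interface.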

Philip Guenther

Index: sys/syscall.h
===================================================================
RCS file: /cvs/src/sys/sys/syscall.h,v
retrieving revision 1.114
diff -u -p -r1.114 syscall.h
--- sys/syscall.h 3 Jul 2010 04:44:51 -0000 1.114
+++ sys/syscall.h 22 Aug 2010 05:49:35 -0000
@@ -1,10 +1,10 @@
-/* $OpenBSD: syscall.h,v 1.114 2010/07/03 04:44:51 guenther Exp $ */
+/* $OpenBSD$ */
 
 /*
  * System call numbers.
  *
  * DO NOT EDIT-- this file is automatically generated.
- * created from; OpenBSD: syscalls.master,v 1.101 2010/07/01 23:10:40 tedu Exp
+ * created from; OpenBSD: syscalls.master,v 1.102 2010/07/03 04:44:51 guenther Exp
  */
 
 /* syscall: "syscall" ret: "int" args: "int" "..." */
@@ -542,6 +542,9 @@
 
 /* syscall: "nanosleep" ret: "int" args: "const struct timespec *" "struct timespec *" */
 #define SYS_nanosleep 240
+
+/* syscall: "clock_nanosleep" ret: "int" args: "clockid_t" "int" "const struct timespec *" "struct timespec *" */
+#define SYS_clock_nanosleep 241
 
 /* syscall: "minherit" ret: "int" args: "void *" "size_t" "int" */
 #define SYS_minherit 250
Index: sys/syscallargs.h
===================================================================
RCS file: /cvs/src/sys/sys/syscallargs.h,v
retrieving revision 1.116
diff -u -p -r1.116 syscallargs.h
--- sys/syscallargs.h 3 Jul 2010 04:44:51 -0000 1.116
+++ sys/syscallargs.h 22 Aug 2010 05:49:35 -0000
@@ -1,10 +1,10 @@
-/* $OpenBSD: syscallargs.h,v 1.116 2010/07/03 04:44:51 guenther Exp $ */
+/* $OpenBSD$ */
 
 /*
  * System call argument lists.
  *
  * DO NOT EDIT-- this file is automatically generated.
- * created from; OpenBSD: syscalls.master,v 1.101 2010/07/01 23:10:40 tedu Exp
+ * created from; OpenBSD: syscalls.master,v 1.102 2010/07/03 04:44:51 guenther Exp
  */
 
 #ifdef syscallarg
@@ -975,6 +975,13 @@ struct sys_nanosleep_args {
         syscallarg(struct timespec *) rmtp;
 };
 
+struct sys_clock_nanosleep_args {
+ syscallarg(clockid_t) clock_id;
+ syscallarg(int) flags;
+ syscallarg(const struct timespec *) rqtp;
+ syscallarg(struct timespec *) rmtp;
+};
+
 struct sys_minherit_args {
         syscallarg(void *) addr;
         syscallarg(size_t) len;
@@ -1460,6 +1467,7 @@ int sys_clock_gettime(struct proc *, voi
 int sys_clock_settime(struct proc *, void *, register_t *);
 int sys_clock_getres(struct proc *, void *, register_t *);
 int sys_nanosleep(struct proc *, void *, register_t *);
+int sys_clock_nanosleep(struct proc *, void *, register_t *);
 int sys_minherit(struct proc *, void *, register_t *);
 int sys_rfork(struct proc *, void *, register_t *);
 int sys_poll(struct proc *, void *, register_t *);
Index: sys/systm.h
===================================================================
RCS file: /cvs/src/sys/sys/systm.h,v
retrieving revision 1.82
diff -u -p -r1.82 systm.h
--- sys/systm.h 20 Aug 2010 22:03:22 -0000 1.82
+++ sys/systm.h 22 Aug 2010 05:49:35 -0000
@@ -242,10 +242,13 @@ int sleep_finish_signal(struct sleep_sta
 void sleep_queue_init(void);
 
 struct mutex;
+struct timespec;
 void wakeup_n(const volatile void *, int);
 void wakeup(const volatile void *);
 #define wakeup_one(c) wakeup_n((c), 1)
 int tsleep(const volatile void *, int, const char *, int);
+int tsleep_abs(const volatile void *, int, const char *, clockid_t,
+ const struct timespec *);
 int msleep(const volatile void *, struct mutex *, int, const char*, int);
 void yield(void);
 
Index: sys/timeout.h
===================================================================
RCS file: /cvs/src/sys/sys/timeout.h,v
retrieving revision 1.20
diff -u -p -r1.20 timeout.h
--- sys/timeout.h 26 May 2010 17:50:00 -0000 1.20
+++ sys/timeout.h 22 Aug 2010 05:49:35 -0000
@@ -70,6 +70,7 @@ struct timeout {
 #define TIMEOUT_ONQUEUE 2 /* timeout is on the todo queue */
 #define TIMEOUT_INITIALIZED 4 /* timeout is initialized */
 #define TIMEOUT_TRIGGERED 8 /* timeout is running or ran */
+#define TIMEOUT_ABSOLUTE 16 /* adjust timeout from settime */
 
 #ifdef _KERNEL
 /*
@@ -91,7 +92,9 @@ void timeout_add_sec(struct timeout *, i
 void timeout_add_msec(struct timeout *, int);
 void timeout_add_usec(struct timeout *, int);
 void timeout_add_nsec(struct timeout *, int);
+int timeout_add_abs_ts(struct timeout *, clockid_t, const struct timespec *);
 void timeout_del(struct timeout *);
+void timeout_adjust_abs(int);
 
 void timeout_startup(void);
 
Index: kern/init_sysent.c
===================================================================
RCS file: /cvs/src/sys/kern/init_sysent.c,v
retrieving revision 1.114
diff -u -p -r1.114 init_sysent.c
--- kern/init_sysent.c 3 Jul 2010 04:44:51 -0000 1.114
+++ kern/init_sysent.c 22 Aug 2010 05:49:35 -0000
@@ -1,10 +1,10 @@
-/* $OpenBSD: init_sysent.c,v 1.114 2010/07/03 04:44:51 guenther Exp $ */
+/* $OpenBSD$ */
 
 /*
  * System call switch table.
  *
  * DO NOT EDIT-- this file is automatically generated.
- * created from; OpenBSD: syscalls.master,v 1.101 2010/07/01 23:10:40 tedu Exp
+ * created from; OpenBSD: syscalls.master,v 1.102 2010/07/03 04:44:51 guenther Exp
  */
 
 #include <sys/param.h>
@@ -633,8 +633,8 @@ struct sysent sysent[] = {
             sys_nosys }, /* 239 = unimplemented timer_getoverrun */
         { 2, s(struct sys_nanosleep_args), 0,
             sys_nanosleep }, /* 240 = nanosleep */
- { 0, 0, 0,
- sys_nosys }, /* 241 = unimplemented */
+ { 4, s(struct sys_clock_nanosleep_args), 0,
+ sys_clock_nanosleep }, /* 241 = clock_nanosleep */
         { 0, 0, 0,
             sys_nosys }, /* 242 = unimplemented */
         { 0, 0, 0,
Index: kern/kern_synch.c
===================================================================
RCS file: /cvs/src/sys/kern/kern_synch.c,v
retrieving revision 1.95
diff -u -p -r1.95 kern_synch.c
--- kern/kern_synch.c 29 Jun 2010 00:28:14 -0000 1.95
+++ kern/kern_synch.c 22 Aug 2010 05:49:35 -0000
@@ -136,6 +136,45 @@ tsleep(const volatile void *ident, int p
         return (error);
 }
 
+int
+tsleep_abs(const volatile void *ident, int priority, const char *wmesg,
+ clockid_t clock_id, const struct timespec *ts)
+{
+ struct sleep_state sls;
+ int error, error1;
+ int do_sleep = 1;
+
+ if (cold || panicstr) {
+ int s;
+ /*
+ * After a panic, or during autoconfiguration,
+ * just give interrupts a chance, then just return;
+ * don't run any other procs or panic below,
+ * in case this is the idle process and already asleep.
+ */
+ s = splhigh();
+ splx(safepri);
+ splx(s);
+ return (0);
+ }
+
+ sleep_setup(&sls, ident, priority, wmesg);
+ if (ts)
+ do_sleep = timeout_add_abs_ts(&curproc->p_sleep_to,
+ clock_id, ts);
+ sleep_setup_signal(&sls, priority);
+
+ sleep_finish(&sls, do_sleep);
+ error1 = sleep_finish_timeout(&sls);
+ error = sleep_finish_signal(&sls);
+
+ /* Signal errors are higher priority than timeouts. */
+ if (error == 0 && error1 != 0)
+ error = error1;
+
+ return (error);
+}
+
 /*
  * Same as tsleep, but if we have a mutex provided, then once we've
  * entered the sleep queue we drop the mutex. After sleeping we re-lock.
@@ -414,40 +453,28 @@ sys_thrsleep(struct proc *p, void *v, re
         long ident = (long)SCARG(uap, ident);
         _spinlock_lock_t *lock = SCARG(uap, lock);
         static _spinlock_lock_t unlocked = _SPINLOCK_UNLOCKED;
- long long to_ticks = 0;
+ struct timespec ts, *tp;
         int error;
 
         if (!rthreads_enabled)
                 return (ENOTSUP);
- if (SCARG(uap, tp) != NULL) {
- struct timespec now, ats;
-
- if ((error = copyin(SCARG(uap, tp), &ats, sizeof(ats))) != 0 ||
- (error = clock_gettime(p, SCARG(uap, clock_id), &now)) != 0)
+ if (SCARG(uap, clock_id) != CLOCK_REALTIME &&
+ SCARG(uap, clock_id) != CLOCK_MONOTONIC)
+ return (EINVAL);
+ if (SCARG(uap, tp) == NULL)
+ tp = NULL;
+ else {
+ tp = &ts;
+ if ((error = copyin(SCARG(uap, tp), &ts, sizeof(ts))) != 0)
                         return (error);
-
- if (timespeccmp(&ats, &now, <)) {
- /* already passed: still do the unlock */
- if (lock)
- copyout(&unlocked, lock, sizeof(unlocked));
- return (EWOULDBLOCK);
- }
-
- timespecsub(&ats, &now, &ats);
- to_ticks = (long long)hz * ats.tv_sec +
- ats.tv_nsec / (tick * 1000);
- if (to_ticks > INT_MAX)
- to_ticks = INT_MAX;
- if (to_ticks == 0)
- to_ticks = 1;
         }
 
         p->p_thrslpid = ident;
 
         if (lock)
                 copyout(&unlocked, lock, sizeof(unlocked));
- error = tsleep(&p->p_thrslpid, PUSER | PCATCH, "thrsleep",
- (int)to_ticks);
+ error = tsleep_abs(&p->p_thrslpid, PUSER | PCATCH, "thrsleep",
+ SCARG(uap, clock_id), tp);
 
         if (error == ERESTART)
                 error = EINTR;
Index: kern/kern_time.c
===================================================================
RCS file: /cvs/src/sys/kern/kern_time.c,v
retrieving revision 1.71
diff -u -p -r1.71 kern_time.c
--- kern/kern_time.c 30 Jun 2010 01:47:35 -0000 1.71
+++ kern/kern_time.c 22 Aug 2010 05:49:35 -0000
@@ -59,6 +59,8 @@ int64_t ntp_tick_acc;
 #endif
 
 void itimerround(struct timeval *);
+int clock_nanosleep(struct proc *, clockid_t, int, const struct timespec *,
+ struct timespec *);
 
 /*
  * Time of day and interval timer support.
@@ -76,6 +78,8 @@ int
 settime(struct timespec *ts)
 {
         struct timespec now;
+ struct timeval delta;
+ int ticks_delta;
 
         /*
          * Adjtime in progress is meaningless or harmful after
@@ -108,13 +112,30 @@ settime(struct timespec *ts)
          * setting arbitrary time stamps on files.
          */
         nanotime(&now);
- if (securelevel > 1 && timespeccmp(ts, &now, <)) {
+ timespecsub(ts, &now, &now);
+ if (securelevel > 1 && now.tv_sec < 0) {
                 printf("denied attempt to set clock back %ld seconds\n",
- now.tv_sec - ts->tv_sec);
+ -now.tv_sec);
                 return (EPERM);
         }
 
         tc_setclock(ts);
+
+ /* Deal with pending real-time timeouts */
+ TIMESPEC_TO_TIMEVAL(&delta, &now);
+ if (now.tv_sec >= 0)
+ ticks_delta = tvtohz(&delta);
+ else {
+ /* tvtohz() only handles positive deltas */
+ delta.tv_sec = -delta.tv_sec;
+ if (delta.tv_usec > 0) {
+ delta.tv_sec--;
+ delta.tv_usec = 1000000 - delta.tv_usec;
+ }
+ ticks_delta = -tvtohz(&delta);
+ }
+ timeout_adjust_abs(ticks_delta);
+
         resettodr();
 
         return (0);
@@ -125,6 +146,7 @@ settime(struct timespec *ts)
 {
         struct timeval delta, tvv, *tv;
         int s;
+ int ticks_delta;
 
         /* XXX - Ugh. */
         tv = &tvv;
@@ -160,7 +182,6 @@ settime(struct timespec *ts)
                 return (EPERM);
         }
 
- /* WHAT DO WE DO ABOUT PENDING REAL-TIME TIMEOUTS??? */
         s = splclock();
         timersub(tv, &time, &delta);
         time = *tv;
@@ -173,6 +194,20 @@ settime(struct timespec *ts)
         tickdelta = 0;
         timedelta = 0;
 
+ /* Deal with pending real-time timeouts */
+ if (delta.tv_sec >= 0)
+ ticks_delta = tvtohz(&delta);
+ else {
+ /* tvtohz() only handles positive deltas */
+ delta.tv_sec = -delta.tv_sec;
+ if (delta.tv_usec > 0) {
+ delta.tv_sec--;
+ delta.tv_usec = 1000000 - delta.tv_usec;
+ }
+ ticks_delta = -tvtohz(&delta);
+ }
+ timeout_adjust_abs(ticks_delta);
+
         splx(s);
         resettodr();
 
@@ -281,55 +316,92 @@ sys_clock_getres(struct proc *p, void *v
         return error;
 }
 
-/* ARGSUSED */
 int
-sys_nanosleep(struct proc *p, void *v, register_t *retval)
+clock_nanosleep(struct proc *p, clockid_t clock_id, int flags,
+ const struct timespec *rqtp, struct timespec *rmtp)
 {
         static int nanowait;
- struct sys_nanosleep_args/* {
- syscallarg(const struct timespec *) rqtp;
- syscallarg(struct timespec *) rmtp;
- } */ *uap = v;
- struct timespec rqt, rmt;
+ long long to_ticks;
+ struct timespec rqt;
         struct timespec sts, ets;
- struct timespec *rmtp;
- struct timeval tv;
         int error, error1;
 
- rmtp = SCARG(uap, rmtp);
- error = copyin(SCARG(uap, rqtp), &rqt, sizeof(struct timespec));
+ if (flags & ~TIMER_ABSTIME)
+ return (EINVAL);
+
+ if (clock_id != CLOCK_REALTIME && clock_id != CLOCK_MONOTONIC)
+ return (EOPNOTSUPP);
+
+ error = copyin(rqtp, &rqt, sizeof(struct timespec));
         if (error)
                 return (error);
 
- TIMESPEC_TO_TIMEVAL(&tv, &rqt);
- if (itimerfix(&tv))
+ if (timespecfix(&rqt))
                 return (EINVAL);
 
- if (rmtp)
- getnanouptime(&sts);
-
- error = tsleep(&nanowait, PWAIT | PCATCH, "nanosleep",
- MAX(1, tvtohz(&tv)));
+ if (flags & TIMER_ABSTIME) {
+ error = tsleep_abs(&nanowait, PWAIT | PCATCH, "nanosleep",
+ clock_id, &rqt);
+ } else {
+ if (rmtp != NULL)
+ getnanouptime(&sts);
+
+ to_ticks = (long long)hz * rqt.tv_sec +
+ rqt.tv_nsec / (tick * 1000);
+ if (to_ticks > INT_MAX)
+ to_ticks = INT_MAX;
+ else if (to_ticks == 0)
+ to_ticks = 1;
+ error = tsleep(&nanowait, PWAIT | PCATCH, "nanosleep",
+ (int)to_ticks);
+ }
         if (error == ERESTART)
                 error = EINTR;
         if (error == EWOULDBLOCK)
                 error = 0;
-
- if (rmtp) {
+ else if ((flags & TIMER_ABSTIME) == 0 && rmtp != NULL) {
                 getnanouptime(&ets);
 
                 timespecsub(&ets, &sts, &sts);
- timespecsub(&rqt, &sts, &rmt);
+ timespecsub(&rqt, &sts, &sts);
 
- if (rmt.tv_sec < 0)
- timespecclear(&rmt);
+ if (sts.tv_sec < 0)
+ timespecclear(&sts);
 
- error1 = copyout(&rmt, rmtp, sizeof(rmt));
+ error1 = copyout(&sts, rmtp, sizeof(sts));
                 if (error1 != 0)
                         error = error1;
         }
 
         return error;
+}
+
+/* ARGSUSED */
+int
+sys_nanosleep(struct proc *p, void *v, register_t *retval)
+{
+ struct sys_nanosleep_args/* {
+ syscallarg(const struct timespec *) rqtp;
+ syscallarg(struct timespec *) rmtp;
+ } */ *uap = v;
+
+ return (clock_nanosleep(p, CLOCK_REALTIME, 0, SCARG(uap, rqtp),
+ SCARG(uap,rmtp)));
+}
+
+/* ARGSUSED */
+int
+sys_clock_nanosleep(struct proc *p, void *v, register_t *retval)
+{
+ struct sys_clock_nanosleep_args /* {
+ syscallarg(clockid_t) clock_id;
+ syscallarg(int) flags;
+ syscallarg(const struct timespec *) rqtp;
+ syscallarg(struct timespec *) rmtp;
+ } */ *uap = v;
+
+ return (clock_nanosleep(p, SCARG(uap, clock_id), SCARG(uap, flags),
+ SCARG(uap, rqtp), SCARG(uap,rmtp)));
 }
 
 /* ARGSUSED */
Index: kern/kern_timeout.c
===================================================================
RCS file: /cvs/src/sys/kern/kern_timeout.c,v
retrieving revision 1.32
diff -u -p -r1.32 kern_timeout.c
--- kern/kern_timeout.c 4 Nov 2009 19:14:10 -0000 1.32
+++ kern/kern_timeout.c 22 Aug 2010 05:49:35 -0000
@@ -41,6 +41,8 @@
 #include <ddb/db_output.h>
 #endif
 
+void timeout_add_finish(struct timeout *, int);
+
 /*
  * Timeouts are kept in a hierarchical timing wheel. The to_time is the value
  * of the global variable "ticks" when the timeout should be called. There are
@@ -152,22 +154,12 @@ timeout_set(struct timeout *new, void (*
 
 
 void
-timeout_add(struct timeout *new, int to_ticks)
+timeout_add_finish(struct timeout *new, int to_ticks)
 {
         int old_time;
 
-#ifdef DIAGNOSTIC
- if (!(new->to_flags & TIMEOUT_INITIALIZED))
- panic("timeout_add: not initialized");
- if (to_ticks < 0)
- panic("timeout_add: to_ticks (%d) < 0", to_ticks);
-#endif
-
- mtx_enter(&timeout_mutex);
- /* Initialize the time here, it won't change. */
         old_time = new->to_time;
         new->to_time = to_ticks + ticks;
- new->to_flags &= ~TIMEOUT_TRIGGERED;
 
         /*
          * If this timeout already is scheduled and now is moved
@@ -187,6 +179,67 @@ timeout_add(struct timeout *new, int to_
 }
 
 void
+timeout_add(struct timeout *new, int to_ticks)
+{
+#ifdef DIAGNOSTIC
+ if (!(new->to_flags & TIMEOUT_INITIALIZED))
+ panic("timeout_add: not initialized");
+ if (to_ticks < 0)
+ panic("timeout_add: to_ticks (%d) < 0", to_ticks);
+#endif
+
+ mtx_enter(&timeout_mutex);
+
+ new->to_flags &= ~(TIMEOUT_TRIGGERED | TIMEOUT_ABSOLUTE);
+
+ timeout_add_finish(new, to_ticks);
+}
+
+int
+timeout_add_abs_ts(struct timeout *new, clockid_t clock_id,
+ const struct timespec *ts)
+{
+ struct timespec sts;
+ long long to_ticks;
+
+#ifdef DIAGNOSTIC
+ if (!(new->to_flags & TIMEOUT_INITIALIZED))
+ panic("timeout_add_abs_ts: not initialized");
+ if (clock_id != CLOCK_REALTIME && clock_id != CLOCK_MONOTONIC)
+ panic("timeout_add_abs_ts: bad clockid");
+ if (ts == NULL)
+ panic("timeout_add_abs_ts: NULL ts");
+#endif
+
+ mtx_enter(&timeout_mutex);
+ if (clock_id == CLOCK_REALTIME) {
+ /* only realtime timeouts are absolute */
+ nanotime(&sts);
+ new->to_flags |= TIMEOUT_ABSOLUTE;
+ } else
+ nanouptime(&sts);
+
+ /*
+ * if the target time has already passed then don't set a timeout,
+ * but rather just return an indication to not sleep
+ */
+ if (timespeccmp(ts, &sts, <=))
+ return 0;
+
+ timespecsub(ts, &sts, &sts);
+ to_ticks = (long long)hz * sts.tv_sec + sts.tv_nsec / (tick * 1000);
+ if (to_ticks > INT_MAX)
+ to_ticks = INT_MAX;
+
+ new->to_flags &= ~TIMEOUT_TRIGGERED;
+
+ timeout_add_finish(new, (int)to_ticks);
+
+ /* yep, we set a timeout, so go to sleep */
+ return 1;
+}
+
+void
 timeout_add_tv(struct timeout *to, const struct timeval *tv)
 {
         long long to_ticks;
@@ -339,6 +392,55 @@ softclock(void *arg)
         mtx_leave(&timeout_mutex);
 }
 
+void
+timeout_adjust_abs(int adj)
+{
+#ifndef SMALL_KERNEL
+ db_expr_t offset;
+ char *name;
+ struct timeout *to;
+ struct circq *p;
+ int b, old;
+
+ mtx_enter(&timeout_mutex);
+ for (b = 0; b < nitems(timeout_wheel); b++) {
+ p = CIRCQ_FIRST(&timeout_wheel[b]);
+ while (p != &timeout_wheel[b]) {
+ to = (struct timeout *)p; /* XXX */
+ p = CIRCQ_FIRST(p);
+
+ /* only update absolute timeouts */
+ if ((to->to_flags & TIMEOUT_ABSOLUTE) == 0)
+ continue;
+
+ old = to->to_time;
+ if (adj > 0) {
+ /* clock moved forward */
+ if (to->to_time - ticks < adj)
+ to->to_time = ticks;
+ else
+ to->to_time -= adj;
+ CIRCQ_REMOVE(&to->to_list);
+ CIRCQ_INSERT(&to->to_list, &timeout_todo);
+ } else {
+ /* clock moved backwards: beware wrapping */
+ if (to->to_time - ticks > INT_MAX + adj)
+ to->to_time = ticks + INT_MAX;
+ else
+ to->to_time -= adj;
+ }
+
+ db_find_sym_and_offset((db_addr_t)to->to_func, &name,
+ &offset);
+ name = name ? name : "?";
+ printf("\tadjusted %s(%8x) from %9d to %9d\n", name,
+ to->to_arg, old - ticks, to->to_time - ticks);
+ }
+ }
+ mtx_leave(&timeout_mutex);
+#endif
+}
+
 #ifdef DDB
 void db_show_callout_bucket(struct circq *);
 
@@ -354,9 +456,9 @@ db_show_callout_bucket(struct circq *buc
                 to = (struct timeout *)p; /* XXX */
                 db_find_sym_and_offset((db_addr_t)to->to_func, &name, &offset);
                 name = name ? name : "?";
- db_printf("%9d %2d/%-4d %8x %s\n", to->to_time - ticks,
+ db_printf("%9d %2d/%-4d %5x %8x %s\n", to->to_time - ticks,
                     (bucket - timeout_wheel) / WHEELSIZE,
- bucket - timeout_wheel, to->to_arg, name);
+ bucket - timeout_wheel, to->to_flags, to->to_arg, name);
         }
 }
 
@@ -366,7 +468,7 @@ db_show_callout(db_expr_t addr, int hadd
         int b;
 
         db_printf("ticks now: %d\n", ticks);
- db_printf(" ticks wheel arg func\n");
+ db_printf(" ticks wheel flags arg func\n");
 
         db_show_callout_bucket(&timeout_todo);
         for (b = 0; b < nitems(timeout_wheel); b++)
Index: kern/syscalls.c
===================================================================
RCS file: /cvs/src/sys/kern/syscalls.c,v
retrieving revision 1.115
diff -u -p -r1.115 syscalls.c
--- kern/syscalls.c 3 Jul 2010 04:44:51 -0000 1.115
+++ kern/syscalls.c 22 Aug 2010 05:49:35 -0000
@@ -1,10 +1,10 @@
-/* $OpenBSD: syscalls.c,v 1.115 2010/07/03 04:44:51 guenther Exp $ */
+/* $OpenBSD$ */
 
 /*
  * System call names.
  *
  * DO NOT EDIT-- this file is automatically generated.
- * created from; OpenBSD: syscalls.master,v 1.101 2010/07/01 23:10:40 tedu Exp
+ * created from; OpenBSD: syscalls.master,v 1.102 2010/07/03 04:44:51 guenther Exp
  */
 
 char *syscallnames[] = {
@@ -318,7 +318,7 @@ char *syscallnames[] = {
         "#238 (unimplemented timer_gettime)", /* 238 = unimplemented timer_gettime */
         "#239 (unimplemented timer_getoverrun)", /* 239 = unimplemented timer_getoverrun */
         "nanosleep", /* 240 = nanosleep */
- "#241 (unimplemented)", /* 241 = unimplemented */
+ "clock_nanosleep", /* 241 = clock_nanosleep */
         "#242 (unimplemented)", /* 242 = unimplemented */
         "#243 (unimplemented)", /* 243 = unimplemented */
         "#244 (unimplemented)", /* 244 = unimplemented */
Index: kern/syscalls.master
===================================================================
RCS file: /cvs/src/sys/kern/syscalls.master,v
retrieving revision 1.102
diff -u -p -r1.102 syscalls.master
--- kern/syscalls.master 3 Jul 2010 04:44:51 -0000 1.102
+++ kern/syscalls.master 22 Aug 2010 05:49:35 -0000
@@ -475,7 +475,9 @@
 ;
 240 STD { int sys_nanosleep(const struct timespec *rqtp, \
                             struct timespec *rmtp); }
-241 UNIMPL
+241 STD { int sys_clock_nanosleep(clockid_t clock_id, \
+ int flags, const struct timespec *rqtp, \
+ struct timespec *rmtp); }
 242 UNIMPL
 243 UNIMPL
 244 UNIMPL