[PATCH] DHCP Timeout issues

Denys Vlasenko vda.linux at googlemail.com
Fri May 16 18:11:43 PDT 2008


On Friday 16 May 2008 15:06, Nico Erfurth wrote:
> Hi,
> 
> the current dhcpc is broken a bit. It uses the last timeout-value over 
> and over again  When invalid packets arrive, this sometimes results in 
> it never switching from RENEW to REBIND state.
> 
> We've the issue here with a weird cable modem. Resulting in a lot of 
> "Ignoring XID" and "Unrelated/bogus packet" warnings.
> 
> The attached patch (against current svn-trunk) changes the timeout 
> handling to be based on monotonic_sec, it also rearanges some code and 
> adds the G-trick.

I am first to admit that I am no genius.

From the looks of it, I do not immediately see
the "meat" of the change among the lesser edits.
Reviewing is harder than it needs to be because you
have unrelated edits in your patch.
Can you split it in smaller bits?
For example, conversion to G trick can be in separate patch, etc.

@@ -42,6 +62,11 @@ static void change_listen_mode(int new_m
                close(sockfd);
                sockfd = -1;
        }
+       if (new_mode == LISTEN_KERNEL) {
+               sockfd = listen_socket(CLIENT_PORT, client_config.interface);
+       } else if (new_mode == LISTEN_RAW) {
+               sockfd = raw_socket(client_config.ifindex);
+       }
...
-               if (listen_mode != LISTEN_NONE && sockfd < 0) {
-                       if (listen_mode == LISTEN_KERNEL)
-                               sockfd = listen_socket(/*INADDR_ANY,*/ CLIENT_PORT, client_config.interface);
-                       else
-                               sockfd = raw_socket(client_config.ifindex);
-               }

Is this code movement essential for the correctness of the patch?
If not, can it be in follow-up patch?

-       int tryagain_timeout = 20;
-       int discover_timeout = 3;
+       time_t tryagain_timeout = 20;
+       time_t discover_timeout = 3;

Why?

-               tv.tv_sec = timeout;
+               tv.tv_sec = timeout - monotonic_sec();
+               if (tv.tv_sec < 0)
+                       tv.tv_sec = 0;

I'm puzzled. On the first iteration timeout == 0,
and tv.tv_sec gets totally bogus value, because
"- (current number of seconds since boot +
   arbitrary constant we don't know)" makes no sense.


> Another thing i'm currently not sure about (comments welcome), how long 
> should the timeout on a renew be? Currently its just lease_timeout - 
> time_when_renew_was_sent. So, it will always only send one renew, is 
> this how it should be?
> 
> I would suggest -> timeout = (lease_timeout - NOW)/3
> Maybe with a minimum of 10 secs or so. This would allow a better/faster 
> recovery when network problems occur.

You strayed in the land inhabited by two variables with
nice descriptive names: t1 and t2. [not really]

They are initially set to:

                                        t1 = lease_seconds / 2;
                                        /* little fixed point for n * .875 */
                                        t2 = (lease_seconds * 7) >> 3;
                                        timeout = t1;

And here is the renew code:

                        case BOUND:
                                /* Lease is starting to run out, time to enter renewing state */
                                change_listen_mode(LISTEN_KERNEL);
                                DEBUG("Entering renew state");
                                state = RENEWING;
                                /* fall right through */
                        case RENEWING:
                                /* Either set a new T1, or enter REBINDING state */
                                if ((t2 - t1) > (lease_seconds / (4*60*60) + 1)) {
                                        /* send a request packet */
                                        send_renew(xid, server_addr, requested_ip); /* unicast */
                                        t1 += (t2 - t1) / 2;
                                        timeout = t1 - ((unsigned)monotonic_sec() - timestamp_got_lease);
                                        continue;
                                }
                                /* Timed out, enter rebinding state */
                                DEBUG("Entering rebinding state");
                                state = REBINDING;
                                timeout = (t2 - t1);
                                continue;

"If t1 did not reach t2 yet, send a request packet, move t1 closer to t2,
and wait till new t1"

IOW: timeout already starts as lease_seconds / 2, and is halved after each
renew packet is sent. Although code indeed looks, eh, contrived,
to say the least. (But it was worse).

I propose the following two patches:

9.patch: decrease timeout so that bogus packets do not restart waiting
at the same timeout value

and

a.patch: simplify RENEW / REBIND time calculations. [Die t1 & t2, die!]

Can you test (or even improve) them?
--
vda
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 9.patch
Type: text/x-diff
Size: 2642 bytes
Desc: not available
Url : http://busybox.net/lists/busybox/attachments/20080517/15b66c5d/attachment-0002.patch 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: a.patch
Type: text/x-diff
Size: 5452 bytes
Desc: not available
Url : http://busybox.net/lists/busybox/attachments/20080517/15b66c5d/attachment-0003.patch 


More information about the busybox mailing list