[PATCH 1/2] udhcpd: sanitize invalid hostnames to match rfcs

walter harms wharms at bfs.de
Mon Oct 19 08:52:27 UTC 2015



On 18.10.2015 23:26, Isaac Dunham wrote:
> On Sun, Oct 18, 2015 at 07:55:38PM +0200, walter harms wrote:
>>
>>
>> On 18.10.2015 07:54, Isaac Dunham wrote:
>>> RFC952/RFC1123 limit the characters in a hostname for a node to
>>> [-a-zA-Z0-9], with '-' being legal only in the middle; we were
>>> accepting everything from ' ' to '~'.
>>> (As a byproduct of this, the hostname in dumpleases can now be safely
>>> used from scripts without sanitization.)
>>>
>>> function                                             old     new   delta
>>> add_lease                                            326     363     +37
>>> ------------------------------------------------------------------------------
>>> (add/remove: 0/0 grow/shrink: 1/0 up/down: 37/0)               Total: 37 bytes
>>>    text	   data	    bss	    dec	    hex	filename
>>>  892983	   6844	   7288	 907115	  dd76b	busybox_old
>>>  893020	   6844	   7288	 907152	  dd790	busybox_unstripped
>>> ---
>>>  networking/udhcp/leases.c | 13 ++++++++++---
>>>  1 file changed, 10 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/networking/udhcp/leases.c b/networking/udhcp/leases.c
>>> index 745340a..1f7af87 100644
>>> --- a/networking/udhcp/leases.c
>>> +++ b/networking/udhcp/leases.c
>>> @@ -65,12 +65,19 @@ struct dyn_lease* FAST_FUNC add_lease(
>>>  			if (hostname_len > sizeof(oldest->hostname))
>>>  				hostname_len = sizeof(oldest->hostname);
>>>  			p = safe_strncpy(oldest->hostname, hostname, hostname_len);
>>> -			/* sanitization (s/non-ASCII/^/g) */
>>> +			/* sanitization - per rfcs 952 & 1123 only [-a-zA-Z0-9] are legal
>>> +			 * with '-' being allowed only in the middle
>>> +			 */
>>>  			while (*p) {
>>> -				if (*p < ' ' || *p > 126)
>>> -					*p = '^';
>>> +				if (! (isupper((char)*p) || islower((char)*p) ||
>>> +						isdigit((char)*p) || (char)*p == '-') )
>>> +					*p = '-';
>>>  				p++;
>>>  			}
>>> +			if (p--, *p == '-')
>>> +				*p = 'X';
>>> +			if (p = oldest->hostname, *p == '-')
>>> +				*p = 'X';
>>>  		}
>>>  		if (chaddr)
>>>  			memcpy(oldest->lease_mac, chaddr, 6);
>>
>> Since several tools check hostnames,
>> maybe it would be useful to make this a function?
> 
> What this does is not simply 'check for validity'; it *makes* a hostname
> valid, which is not what most tools need.
> It is also exclusively for leaf node names, rather than an FQDN (i.e.,
> '.' is not valid here).
> 
> It would be possible to design a function that can check or fix the
> hostname depending on how it's called, though I wonder if that's
> doing too much in a single call.
> 
> It would probably have to be something like this:
> 
> #define HOSTCHECK_LEAF	0x1 // leaf hostname: no '.' allowed
> #define HOSTCHECK_FIX	0x2 // fix: replace invalid chars with '-'/'X'
> 
> // return NULL if valid, pointer to first invalid char otherwise
> char *validate_hostname(char *p, int flags);
> 
> This does not handle transforming a URL via punycode, of course.
> 
> Would such an interface be desirable?

Note: I did not take an inventory of whether this is needed by other
      programs, but I can imagine that it would be useful with 'hostname'.

For a bit more flexibility:
        int status = valid_hostname(char *in, char **out, int flags);

In a first step it would be sufficient to move this code into a function
and then look for more uses.
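
A minimal sketch of that first step, assuming the loop from the patch is
lifted out unchanged in behaviour. The names sanitize_leaf_hostname() and
is_hostname_char() are hypothetical, not existing busybox functions. Two
details differ from the patch and are assumptions of this sketch: the
character test uses explicit ASCII ranges instead of the ctype calls, and an
empty string is returned early so the trailing check never steps before the
buffer.

```c
/* Hypothetical helper: 1 if c is in [-a-zA-Z0-9], else 0.
 * Explicit ranges keep the test independent of locale and of char signedness. */
static int is_hostname_char(char c)
{
	return (c >= 'a' && c <= 'z')
	    || (c >= 'A' && c <= 'Z')
	    || (c >= '0' && c <= '9')
	    || c == '-';
}

/* Hypothetical helper: rewrite a leaf hostname in place so that it contains
 * only [-a-zA-Z0-9] and has no '-' at either end (RFC 952 / RFC 1123).
 * '.' is treated as invalid, so this is not suitable for an FQDN. */
static void sanitize_leaf_hostname(char *name)
{
	char *p = name;

	if (*p == '\0')
		return; /* empty name: nothing to fix, and p[-1] below would be out of bounds */

	for (; *p; p++) {
		if (!is_hostname_char(*p))
			*p = '-';
	}
	/* p now points at the terminating NUL; p[-1] is the last character */
	if (p[-1] == '-')
		p[-1] = 'X';
	if (name[0] == '-')
		name[0] = 'X';
}
```

In add_lease() the call would then follow the safe_strncpy(), for example
sanitize_leaf_hostname(oldest->hostname); other callers could be converted
once they are identified.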

re,
 wh

> Thanks,
> Isaac Dunham
> 

