fix large PID overflow their column in top command

Matthew Chae matthewc at axis.com
Fri Mar 1 11:53:41 UTC 2024


On 3/1/24 12:36, Matthew Chae wrote:
> On 3/1/24 12:29, Matthew Chae wrote:
>> On 2/21/24 21:23, Bernhard Reutner-Fischer wrote:
>>> Hi Sukyoung!
>>>
>>> please don't top-post
>>>
>>> On Mon, 19 Feb 2024 10:22:05 +0000
>>> Matthew Chae <Matthew.Chae at axis.com> wrote:
>>>
>>>> Hi Bernhard,
>>>>
>>>> I'm sending new patch and the result of bloatcheck.
>>>> For me, this size is reduced to 135 bytes. What do you think?
>>>
>>> Well, 135b is better than the initial 360b :)
>>>
>>>> Can you take a look at these attachments?
>>>
>>> I'd love to.
>>> But it doesn't apply, unfortunately:
>>> $ git am -s 
>>> 0004-Allocate-PID-PPID-space-dynamically-in-top-command.patch
>>> Applying: Allocate PID/PPID space dynamically in top command
>>> error: patch failed: procps/top.c:637
>>> error: procps/top.c: patch does not apply
>>> Patch failed at 0001 Allocate PID/PPID space dynamically in top command
>>> hint: Use 'git am --show-current-patch=diff' to see the failed patch
>>> When you have resolved this problem, run "git am --continue".
>>> If you prefer to skip this patch, run "git am --skip" instead.
>>> To restore the original branch and stop patching, run "git am --abort".
>>> $ git am --abort
>>> $ patch -p1 -i 
>>> 0004-Allocate-PID-PPID-space-dynamically-in-top-command.patch
>>> patching file procps/top.c
>>> Hunk #1 FAILED at 637.
>>> Hunk #2 FAILED at 696.
>>> Hunk #3 FAILED at 709.
>>> 3 out of 3 hunks FAILED -- saving rejects to file procps/top.c.rej
>>>
>>> So I'd have to manually fiddle the patch to apply, which i honestly
>>> don't have time for ATM, as much as i love code-golf in general.
>>>
>> It’s very strange. Anyway, I'm sorry for causing you inconvenience.
>> At least now it shows successful results like below.
>>
>> git am -s 0005-Allocate-PID-PPID-space-dynamically-in-top-command.patch
>> Applying: Allocate PID/PPID space dynamically in top command
>>
>> Plus, the size is reduced more. It shows 115 bytes. You can check this 
>> in the bloatcheck_result file.
>>
>>> Furthermore (and i'm about to update https://busybox.net/developer
>>> accordingly), for legal reasons, we require a Signed-off-by, as
>>> detailed in
>>> https://www.kernel.org/doc/html/latest/process/submitting-patches.html?highlight=sign%20off#sign-your-work-the-developer-s-certificate-of-origin
>>> so please check you legal department (which should be fine for axis)
>>> and mark your contributions accordingly by 'git commit -s ...' iff this
>>> is in line with your legal situation (again, axis legal will most
>>> likely understand what this is about without much further ado, AFAIK).
>> I already added this in the previous patch. Can you check the 
>> attachment? If I'm missing something, please let me know.
>>
>>>
>>>>
>>>> PS:
>>>> The function of count_digits() is implemented inside of 
>>>> display_process_list().
>>>> To reduce the size, strlen() is not used.
>>>
>>> Did you look if manually outlining count_digits() like you did in the
>>> previous version is beneficial?
>>>
>> I don’t think outlining count_digits() is beneficial now. It appears 
>> that, at least until now, there is no use case within BusyBox for 
>> counting the digits of an integer variable. In terms of size as well, 
>> using strlen() and utoa() as in the attached code appears to be more 
>> advantageous. BTW, using make_human_readable_str() in the approach you 
>> recommended does not seem appropriate for calculating the number of 
>> digits. (pid_len = strlen(make_human_readable_str(pid_max,0,0))
>> If I misunderstood, please let me know.
>>
>>> And, did you check my previous question if it is better to use the
>>> manual buf "painting", perusing in this case pid_len (for the
>>> compile-time constant 6 as it is now) and ppid_len (ditto), and, for
>>> your other patch on top, username_len (for the current compile-time
>>> constant 8)? The loop to determine the max {,p}pid len is not for free
>>> of course, so it's okish if that manifests size-wise.
>>>
>> When large numbers are assigned to PID and PPID, they occupy many 
>> digits, leading to overflow in the PID and PPID columns. Consequently, 
>> it becomes impossible to display the entire data accurately and 
>> neatly. Additionally, a significant portion of the user name may get 
>> truncated, providing very limited information about the user name. By 
>> using this patch, although the loop is not for free, it allows for the 
>> very accurate display of results from the top command. Furthermore, 
>> the size is reduced to an appropriate level(115 bytes), enabling very 
>> efficient results with minimal investment."
>>
>> Below is an example of how it can be improved.
>> Before:
>>    PID  PPID USER
>> 4876586 4394176 busy
>>
>> After:
>>     PID    PPID USER
>> 4876548 4394517 busybox
> 
> This example appears well-organized in Thunderbird as I intended, but it 
> does not appear to be aligned properly in Outlook. If any recipients use 
> Outlook to view this example, please ignore this example.
> 
I apologize for the continued emails. When copying and pasting the 
results from the command prompt into Thunderbird or Outlook, the example 
appears much better organized than the actual result. In reality, the 
'Before' results of the 'top' command are much less organized than they 
appear above.

>>
>>> PS: 135b is better than the initial suggestion of ~300b, but given
>>> architectures tend to end up with very different codegen per arch and
>>> compilers used, i'm always curious which arch and which compiler
>>> (version) was used to obtain the alleged results. Guess you target
>>> chris with gcc-12-ish?
>>> Stating the target arch usually allows us a rough estimate
>>> about overall impact on other arches.
>>
>> I’m giving you the arch and compiler information below.
>> C Compiler: gcc -> gcc-10*
>> ARCH x86_64
>>
>>>
>>> Many thanks in advance and cheers,
>>> Bernhard
>>>
>>>>
>>>> Br-Matthew
>>>>
>>>> ________________________________
>>>> From: Bernhard Reutner-Fischer <rep.dot.nop at gmail.com>
>>>> Sent: Thursday, February 15, 2024 9:46 PM
>>>> To: Matthew Chae <Matthew.Chae at axis.com>
>>>> Cc: rep.dot.nop at gmail.com <rep.dot.nop at gmail.com>; David Laight 
>>>> <David.Laight at ACULAB.COM>; 'Denys Vlasenko' 
>>>> <vda.linux at googlemail.com>; busybox at busybox.net 
>>>> <busybox at busybox.net>; Christopher Wong <Christopher.Wong at axis.com>
>>>> Subject: Re: fix large PID overflow their column in top command
>>>>
>>>> On Wed, 14 Feb 2024 14:05:15 +0000
>>>> Matthew Chae <Matthew.Chae at axis.com> wrote:
>>>>
>>>>> Hi Bernhard,
>>>>>
>>>>> I'm sending new patch and the result of bloatcheck.
>>>>
>>>> Many thanks for the updated patch!
>>>>
>>>> function                                             old     new   
>>>> delta
>>>> display_process_list                                1406    1765    
>>>> +359
>>>> .rodata                                            99721   
>>>> 99724      +3
>>>> ------------------------------------------------------------------------------
>>>> (add/remove: 0/0 grow/shrink: 2/0 up/down: 362/0)             Total: 
>>>> 362 bytes
>>>>     text    data      bss      dec      hex  filename
>>>> 1009548   16507     1840  1027895   faf37  busybox_old
>>>> 1009910   16507     1840  1028257   fb0a1 busybox_unstripped
>>>>
>>>> I think that's too much. For me this gives +293 bytes, still way too 
>>>> much.
>>>> Can you please see if it helps to retain the manual formatting of
>>>> PID PPID USER?
>>>>
>>>> PS:
>>>>
>>>> procps/top.c: In function ‘display_process_list’:
>>>> procps/top.c:664:1: warning: ISO C90 forbids mixed declarations and 
>>>> code [-Wdeclaration-after-statement]
>>>>    664 | typedef struct { unsigned quot, rem; } bb_div_t;
>>>>        | ^~~~~~~
>>>>
>>>> so please move your new #define PID_STR block down to right before
>>>> /* what info of the processes is shown */
>>>>
>>>> In
>>>> +       int lines = (lines_rem - 1);
>>>> please drop the superfluous braces.
>>>>
>>>> It is most likely not smaller to use
>>>> pid_len = strlen(make_human_readable_str(pid_max,0,0))
>>>> than to introduce this private count_digits(), i fear?
>>>> Maybe we could amortize the addition of count_digits by
>>>> reusing it elsewhere, as a follow-up.
>>>>
>>>> thanks
>>>>
>>>>> Can you review these and give me your thoughts?
>>>>> Just let me know if you think that the code size needs to be reduced.
>>>>>
>>>>> Br-Matthew
>>>>> ________________________________
>>>>> From: Bernhard Reutner-Fischer <rep.dot.nop at gmail.com>
>>>>> Sent: Tuesday, February 13, 2024 7:16 PM
>>>>> To: Matthew Chae <Matthew.Chae at axis.com>
>>>>> Cc: rep.dot.nop at gmail.com <rep.dot.nop at gmail.com>; David Laight 
>>>>> <David.Laight at ACULAB.COM>; 'Denys Vlasenko' 
>>>>> <vda.linux at googlemail.com>; busybox at busybox.net 
>>>>> <busybox at busybox.net>; Christopher Wong <Christopher.Wong at axis.com>
>>>>> Subject: Re: fix large PID overflow their column in top command
>>>>>
>>>>> On Mon, 5 Feb 2024 09:56:20 +0000
>>>>> Matthew Chae <Matthew.Chae at axis.com> wrote:
>>>>>
>>>>>> Hi David,
>>>>>>
>>>>>> I'm sending an improved patch based on your comments.
>>>>>>
>>>>>> Not only does it not care about the PID_MAX value,
>>>>>> it searches all the contents to be output to recognize the 
>>>>>> required column width
>>>>>> and dynamically allocates the space for PID and PPID appropriately 
>>>>>> without creating a lot of empty space.
>>>>>>
>>>>>> Additionally, this patch still allows user names to be displayed 
>>>>>> up to 8 characters without truncation.
>>>>>>
>>>>>> Can you look through the attachment?
>>>>>
>>>>> Unfortunately the patch does not apply to current master.
>>>>> How much would your patch add to the size? Can you bring it down to a
>>>>> minimum?
>>>>> See make baseline; apply the patch;make bloatcheck
>>>>>
>>>>> thanks
>>>>>
>>>>>> (0002-Allocate-PID-PPID-space-dynamically-in-top-command.patch)
>>>>>>
>>>>>> BR-Matthew Chae
>>>>>> ________________________________
>>>>>> From: David Laight <David.Laight at ACULAB.COM>
>>>>>> Sent: Thursday, November 23, 2023 6:10 PM
>>>>>> To: 'Denys Vlasenko' <vda.linux at googlemail.com>; Matthew Chae 
>>>>>> <Matthew.Chae at axis.com>
>>>>>> Cc: busybox at busybox.net <busybox at busybox.net>; Christopher Wong 
>>>>>> <Christopher.Wong at axis.com>
>>>>>> Subject: RE: fix large PID overflow their column in top command
>>>>>>
>>>>>> ...
>>>>>>> +       fp = xfopen_for_read("/proc/sys/kernel/pid_max");
>>>>>>> +       if (!fgets(pid_buf, PID_DIGITS_MAX + 1, fp)) {
>>>>>>> ...
>>>>>>> +       if (strncmp(pid_buf, "32768", 5) == 0)
>>>>>>> +               pid_digits_num = 5;
>>>>>>> +       else
>>>>>>> +               pid_digits_num = PID_DIGITS_MAX;
>>>>>>>
>>>>>>> The logic above is not sound. Even if sysctl kernel.pid_max
>>>>>>> is 32768, there can be already running processes with pids > 99999.
>>>>>>
>>>>>> It's also probably wrong for pretty much all other values.
>>>>>>
>>>>>> I'd just base the column width on strlen(pid_buf) with a minimum
>>>>>> value of 5.
>>>>>>
>>>>>> It is unlikely that pid_max has been reduced - so column overflow
>>>>>> it that case probably doesn't really matter.
>>>>>>
>>>>>> The more interesting case is really a system with a very large 
>>>>>> pid_max
>>>>>> that has never run many processes.
>>>>>> You don't really want lots of blank space.
>>>>>>
>>>>>> I can't remember whether top reads everything before doing any 
>>>>>> output?
>>>>>> Since the output is sorted it probably almost always does.
>>>>>> In which case it knows the column width it will need.
>>>>>>
>>>>>> I did post a patch a while back that enabled 'Irix mode'.
>>>>>> (100% cpu is one cpu at 100%, not all cpus at 100%)
>>>>>> Maybe I should dig it out again.
>>>>>>
>>>>>>          David
>>>>>>
>>>>>> -
>>>>>> Registered Address Lakeside, Bramley Road, Mount Farm, Milton 
>>>>>> Keynes, MK1 1PT, UK
>>>>>> Registration No: 1397386 (Wales)
>>>>>
>>>>
>>>
> 



More information about the busybox mailing list