[OT] poll() vs. AIO (was: [PATCH] ash: clear NONBLOCK flag from stdin when in foreground)

Sat Aug 20 14:00:06 UTC 2011

On Sat, Aug 20, 2011 at 01:37:23AM +0200, Bernd Petrovitsch wrote:
> > Multiprocessing can accomplish *some* of the same things, but not all,
> 
> What does "multiprocessing" exactly mean?

Writing a program that consists of more than one process running in
parallel.

> > - A library can't fork new processes without the calling application
> >   being aware of this and tip-toeing around it. You have to make sure
> >   neither consumes the wait/exit status wanted by the other, and there
> >   can be only one (process-wide) SIGCHLD handler (not to mention
> >   signals are evil).
> 
> popen() is one prime example of this.
> And, of course, the library (or popen(), respectively)
> designer/implementer needs to cope with this. And I do not see any
> significant problem with a sane implementation of popen().

A library cannot use popen without documenting this so the caller
knows. This is because a caller that's not expecting to have child
processes except the ones it creates might install a signal handler
for SIGCHLD that just immediately reaps all children if it doesn't
care about their exit status. Then, pclose will call waitpid on a pid
that was already waited-on. In the common case that will just fail,
but if the same pid happened to get used for another process the
caller created, pclose could wait for the *wrong* pid and hang the
program!
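
A minimal sketch of the common failure, assuming a POSIX system. The
names reap_all_children() and pclose_after_reap() are made up for
illustration. To keep the outcome deterministic, the reap loop is
called directly and blocks, instead of running from a real SIGCHLD
handler; the effect on pclose is the same.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>   /* for the assert() check shown after the block */
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: the body of a "reap everything" SIGCHLD handler.
 * It collects every child of the process, including ones created
 * behind the caller's back by popen(). */
void reap_all_children(void)
{
    while (waitpid(-1, NULL, 0) > 0)
        ;
}

/* Returns the errno left by pclose() once the child has already been
 * reaped, 0 if pclose() unexpectedly succeeded, -1 if popen() failed. */
int pclose_after_reap(void)
{
    char buf[64];
    FILE *p = popen("echo hello", "r");
    if (!p)
        return -1;

    /* Drain the pipe so the child can finish. */
    while (fgets(buf, sizeof buf, p))
        ;

    /* The application reaps "its" children; popen's child goes too. */
    reap_all_children();

    /* pclose() now calls waitpid() on a pid that no longer exists. */
    errno = 0;
    return pclose(p) == -1 ? errno : 0;
}
```

On the systems I would expect this to run on,
assert(pclose_after_reap() == ECHILD) passes: pclose fails because the
status is gone. The hang is the rarer variant where the pid has been
reused by another child in the meantime.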

> And if the library is badly coded, just do not use it.

You might call such calling programs badly coded, but it's a common
idiom. The alternative, if you want to make a "detached child process"
(one you don't have to keep track of and later wait for), is to "double
fork" and immediately wait for the child, running the actual code in
the grandchild process. But this costs 3+ syscalls (2 of them, the
forks, extremely expensive) for something that normally takes one
syscall, and increases the peak commit charge by 50%.
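
For reference, a rough sketch of the double fork, assuming POSIX.
spawn_detached() and example_job() are hypothetical names, and error
handling is kept to the bare minimum.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run job() in a grandchild that the caller never
 * has to wait for. Returns 0 on success, -1 on failure. */
int spawn_detached(void (*job)(void))
{
    int status;
    pid_t child;

    assert(job != NULL);

    child = fork();                      /* expensive syscall #1 */
    if (child < 0)
        return -1;

    if (child == 0) {
        pid_t grandchild = fork();       /* expensive syscall #2 */
        if (grandchild == 0) {
            job();                       /* the actual work */
            _exit(0);
        }
        /* The intermediate process exits at once, so the grandchild is
         * reparented (typically to init) and never becomes our zombie. */
        _exit(grandchild < 0 ? 1 : 0);
    }

    /* Syscall #3: reap the short-lived intermediate process. */
    while (waitpid(child, &status, 0) < 0)
        if (errno != EINTR)
            return -1;

    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}

/* Placeholder job for illustration; a real one would do useful work. */
void example_job(void)
{
}
```

After spawn_detached(example_job) returns 0, the caller has no children
left to wait for, so a reap-everything SIGCHLD handler and the detached
job can no longer interfere with each other.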

In short, processes are *not* easy to use. They have ugly corner
cases like this all over the place.

> > Note that just using threads does not commit you to making heavy use
> > of memory sharing. You can still write your code very similar to how
> 
> Ooops, threads share the whole (virtual) memory by design.

That does not mean you have to use it at all. It's completely possible
to write a multi-threaded program where each thread never accesses any
object except its own automatic variables and memory it allocated
itself with malloc. The only cases I can think of where a shared
virtual address space is a liability are:

- you're short on virtual address space
- your threads have different conceptual privilege levels (e.g.
  interact with different users)
- your program is buggy and sometimes crashes from invalid memory
  accesses/corruption and you want to be able to just restart the
  crashing process rather than fixing your extremely dangerous code
- you have a global variable addiction
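
To illustrate the style I mean, here is a small sketch using POSIX
threads. worker() and sum_in_threads() are hypothetical names. Each
thread touches only its own automatic variables and a buffer it
malloc'ed itself; input and result are passed by value through the
thread argument and return value, much like argv and an exit status.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical worker: sums 1..n. It uses no globals and no memory
 * owned by another thread. Returns the sum, or -1 on allocation
 * failure, packed into the pointer-sized return value. */
static void *worker(void *arg)
{
    int n = (int)(intptr_t)arg;          /* input received by value */
    long sum = 0;
    long *scratch = malloc((size_t)n * sizeof *scratch);

    if (!scratch)
        return (void *)(intptr_t)-1;

    for (int i = 0; i < n; i++)
        scratch[i] = i + 1;              /* private heap buffer */
    for (int i = 0; i < n; i++)
        sum += scratch[i];

    free(scratch);
    return (void *)(intptr_t)sum;        /* result returned by value */
}

/* Runs four workers and adds up their results; -1 on any failure. */
long sum_in_threads(int n)
{
    enum { NTHREADS = 4 };
    pthread_t tid[NTHREADS];
    long total = 0;
    int started = 0, failed = 0;

    assert(n > 0);

    for (; started < NTHREADS; started++)
        if (pthread_create(&tid[started], NULL, worker,
                           (void *)(intptr_t)n) != 0)
            break;

    for (int i = 0; i < started; i++) {
        void *ret;
        if (pthread_join(tid[i], &ret) != 0 || (intptr_t)ret < 0)
            failed = 1;
        else
            total += (long)(intptr_t)ret;
    }

    return (started == NTHREADS && !failed) ? total : -1;
}
```

With n = 100 each worker computes 5050, so sum_in_threads(100) gives
20200. Nothing here needs a lock, because nothing is shared beyond the
address space itself.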

Rich