[RFA] Factorize killing the children in linux-ptrace.c, and fix a 'process leak'.

Discussion:

Philippe Waroquiers

2018-11-03 20:19:29 UTC

Running the gdb testsuite under Valgrind started to fail after 100+ tests,
due to out of memory caused by lingering processes.

The lingering processes are caused by the combination
of a limitation in Valgrind signal handling when using PTRACE_TRACEME
and a (minor) bug in GDB.

The Valgrind limitation is : when a process is ptraced and raises
a signal, Valgrind will replace the raised signal by SIGSTOP as other
signals are masked by Valgrind when executing a system call.
Removing this limitation seems far to be trivial, valgrind signal
handling is very complex.

Due to this valgrind limitation, GDB linux_ptrace_test_ret_to_nx gets
a SIGSTOP signal instead of the expected SIGTRAP or SIGSEGV.
In such a case, linux_ptrace_test_ret_to_nx does an early return, but
does not kill the child (running under valgrind), child stays in a STOP-ped
state.
These lingering processes then eat the available system memory,
till launching a new process starts to fail.

This patch fixes the GDB minor bug by killing the child in case
linux_ptrace_test_ret_to_nx does an early return.

nat/linux-ptrace.c has 3 different logics to kill a child process.
So, this patch factorizes killing a child in the function kill_child.

The 3 different logics are:
* linux_ptrace_test_ret_to_nx is calling both kill (child, SIGKILL)
and ptrace (PTRACE_KILL, child, ...), and then is calling once
waitpid.
* linux_check_ptrace_features is calling ptrace (PTRACE_KILL, child, ...)
+ my_waitpid in a loop, as long as the waitpid status was WIFSTOPPED.
* linux_test_for_tracefork is calling once ptrace (PTRACE_KILL, child, ...)
+ my_waitpid.

The linux ptrace documentation indicates that PTRACE_KILL is deprecated,
and tells to not use it, as it might return success but not kill the tracee.
The documentation indicates to send SIGKILL directly.

I suspect that linux_ptrace_test_ret_to_nx calls both kill and ptrace just
to be sure ...
I suspect that linux_check_ptrace_features calls ptrace in a loop
to bypass the PTRACE_KILL limitation.
And it looks like linux_test_for_tracefork does not handle the PTRACE_KILL
limitation.
Also, 2 of the 3 logics are calling my_waitpid, which seems better,
as this is protecting the waitpid syscall against EINTR.

So, the logic in kill_child is just using kill (child, SIGKILL)
+ my_waitpid, and then does a few verifications to see everything worked
accordingly to the plan.

Tested on Debian/x86_64.

2018-11-03 Philippe Waroquiers <***@skynet.be>

* nat/linux-ptrace.c (kill_child): New function.
(linux_ptrace_test_ret_to_nx): Use kill_child instead of local code.
Add a call to kill_child in case of early return after fork.
(linux_check_ptrace_features): Use kill_child instead of local code.
(linux_test_for_tracefork): Likewise.
---
gdb/nat/linux-ptrace.c | 77 ++++++++++++++++++++++--------------------
1 file changed, 40 insertions(+), 37 deletions(-)

diff --git a/gdb/nat/linux-ptrace.c b/gdb/nat/linux-ptrace.c
index 1f21ef03a3..49a1d011bd 100644
--- a/gdb/nat/linux-ptrace.c
+++ b/gdb/nat/linux-ptrace.c
@@ -79,6 +79,39 @@ EXTERN_C void linux_ptrace_test_ret_to_nx_instr (void);

#endif /* defined __i386__ || defined __x86_64__ */

+/* Kill CHILD. WHO is used to report warnings. */
+
+static void
+kill_child (pid_t child, const char *who)
+{
+ pid_t got_pid;
+ int kill_status;
+
+ if (kill (child, SIGKILL) != 0)
+ {
+ warning (_("%s: failed to kill child pid %ld %s\n"),
+ who, (long) child, safe_strerror (errno));
+ return;
+ }
+
+ errno = 0;
+ got_pid = my_waitpid (child, &kill_status, 0);
+ if (got_pid != child)
+ {
+ warning (_("%s: "
+ "kill waitpid returned %ld: %s"),
+ who, (long) got_pid, safe_strerror (errno));
+ return;
+ }
+ if (!WIFSIGNALED (kill_status))
+ {
+ warning (_("%s: "
+ "kill status %d is not WIFSIGNALED!"),
+ who, kill_status);
+ return;
+ }
+}
+
/* Test broken off-trunk Linux kernel patchset for NX support on i386. It was
removed in Fedora kernel 88fa1f0332d188795ed73d7ac2b1564e11a0b4cd.

@@ -91,7 +124,7 @@ linux_ptrace_test_ret_to_nx (void)
pid_t child, got_pid;
gdb_byte *return_address, *pc;
long l;
- int status, kill_status;
+ int status;
elf_gregset_t regs;

return_address
@@ -169,6 +202,7 @@ linux_ptrace_test_ret_to_nx (void)
{
warning (_("linux_ptrace_test_ret_to_nx: status %d is not WIFSTOPPED!"),
status);
+ kill_child (child, "linux_ptrace_test_ret_to_nx");
return;
}

@@ -178,6 +212,7 @@ linux_ptrace_test_ret_to_nx (void)
warning (_("linux_ptrace_test_ret_to_nx: "
"WSTOPSIG %d is neither SIGTRAP nor SIGSEGV!"),
(int) WSTOPSIG (status));
+ kill_child (child, "linux_ptrace_test_ret_to_nx");
return;
}

@@ -195,26 +230,7 @@ linux_ptrace_test_ret_to_nx (void)
# error "!__i386__ && !__x86_64__"
#endif

- kill (child, SIGKILL);
- ptrace (PTRACE_KILL, child, (PTRACE_TYPE_ARG3) NULL,
- (PTRACE_TYPE_ARG4) NULL);
-
- errno = 0;
- got_pid = waitpid (child, &kill_status, 0);
- if (got_pid != child)
- {
- warning (_("linux_ptrace_test_ret_to_nx: "
- "PTRACE_KILL waitpid returned %ld: %s"),
- (long) got_pid, safe_strerror (errno));
- return;
- }
- if (!WIFSIGNALED (kill_status))
- {
- warning (_("linux_ptrace_test_ret_to_nx: "
- "PTRACE_KILL status %d is not WIFSIGNALED!"),
- status);
- return;
- }
+ kill_child (child, "linux_ptrace_test_ret_to_nx");

/* + 1 is there as x86* stops after the 'int3' instruction. */
if (WSTOPSIG (status) == SIGTRAP && pc == return_address + 1)
@@ -352,16 +368,8 @@ linux_check_ptrace_features (void)

linux_test_for_exitkill (child_pid);

- /* Clean things up and kill any pending children. */
- do
- {
- ret = ptrace (PTRACE_KILL, child_pid, (PTRACE_TYPE_ARG3) 0,
- (PTRACE_TYPE_ARG4) 0);
- if (ret != 0)
- warning (_("linux_check_ptrace_features: failed to kill child"));
- my_waitpid (child_pid, &status, 0);
- }
- while (WIFSTOPPED (status));
+ /* Kill child_pid. */
+ kill_child (child_pid, "linux_check_ptrace_features");
}

/* Determine if PTRACE_O_TRACESYSGOOD can be used to catch
@@ -444,12 +452,7 @@ linux_test_for_tracefork (int child_pid)

/* Do some cleanup and kill the grandchild. */
my_waitpid (second_pid, &second_status, 0);
- ret = ptrace (PTRACE_KILL, second_pid, (PTRACE_TYPE_ARG3) 0,
- (PTRACE_TYPE_ARG4) 0);
- if (ret != 0)
- warning (_("linux_test_for_tracefork: "
- "failed to kill second child"));
- my_waitpid (second_pid, &status, 0);
+ kill_child (second_pid, "linux_test_for_tracefork");
}
}
else

--
2.19.1

Simon Marchi

2018-11-09 21:44:19 UTC

Permalink

Post by Philippe Waroquiers
Running the gdb testsuite under Valgrind started to fail after 100+ tests,
due to out of memory caused by lingering processes.
The lingering processes are caused by the combination
of a limitation in Valgrind signal handling when using PTRACE_TRACEME
and a (minor) bug in GDB.
The Valgrind limitation is : when a process is ptraced and raises
a signal, Valgrind will replace the raised signal by SIGSTOP as other
signals are masked by Valgrind when executing a system call.
Removing this limitation seems far to be trivial, valgrind signal
handling is very complex.
Due to this valgrind limitation, GDB linux_ptrace_test_ret_to_nx gets
a SIGSTOP signal instead of the expected SIGTRAP or SIGSEGV.
In such a case, linux_ptrace_test_ret_to_nx does an early return, but
does not kill the child (running under valgrind), child stays in a STOP-ped
state.
These lingering processes then eat the available system memory,
till launching a new process starts to fail.
This patch fixes the GDB minor bug by killing the child in case
linux_ptrace_test_ret_to_nx does an early return.
nat/linux-ptrace.c has 3 different logics to kill a child process.
So, this patch factorizes killing a child in the function kill_child.
* linux_ptrace_test_ret_to_nx is calling both kill (child, SIGKILL)
and ptrace (PTRACE_KILL, child, ...), and then is calling once
waitpid.
* linux_check_ptrace_features is calling ptrace (PTRACE_KILL, child, ...)
+ my_waitpid in a loop, as long as the waitpid status was WIFSTOPPED.
* linux_test_for_tracefork is calling once ptrace (PTRACE_KILL, child, ...)
+ my_waitpid.
The linux ptrace documentation indicates that PTRACE_KILL is deprecated,
and tells to not use it, as it might return success but not kill the tracee.
The documentation indicates to send SIGKILL directly.
I suspect that linux_ptrace_test_ret_to_nx calls both kill and ptrace just
to be sure ...
I suspect that linux_check_ptrace_features calls ptrace in a loop
to bypass the PTRACE_KILL limitation.
And it looks like linux_test_for_tracefork does not handle the PTRACE_KILL
limitation.
Also, 2 of the 3 logics are calling my_waitpid, which seems better,
as this is protecting the waitpid syscall against EINTR.
So, the logic in kill_child is just using kill (child, SIGKILL)
+ my_waitpid, and then does a few verifications to see everything worked
accordingly to the plan.
Tested on Debian/x86_64.
* nat/linux-ptrace.c (kill_child): New function.
(linux_ptrace_test_ret_to_nx): Use kill_child instead of local code.
Add a call to kill_child in case of early return after fork.
(linux_check_ptrace_features): Use kill_child instead of local code.
(linux_test_for_tracefork): Likewise.

This makes sense to me, but I'd like to invoke pedro(), see if he sees
anythin

Pedro Alves

2018-11-10 18:29:53 UTC

Permalink

Post by Simon Marchi

This makes sense to me, but I'd like to invoke pedro(), see if he sees
anything wrong with it.

There's some long and ugly history around PTRACE_KILL vs SIGKILL.
See gdb/linux-nat.c kill_wait_one_lwp and kill_one_lwp, and the comments
within those functions. I don't know whether the kernels those were for
are relevant any more. Likely using git blame and finding the corresponding
patch posts would shed some more light. Whatever we do, it'd be nice to
make gdb/linux-nat.c use the unified code as well.

Thanks,
Pedro Alves

Philippe Waroquiers

2018-11-11 17:24:35 UTC

Permalink

Post by Pedro Alves
There's some long and ugly history around PTRACE_KILL vs SIGKILL.
See gdb/linux-nat.c kill_wait_one_lwp and kill_one_lwp, and the comments
within those functions. I don't know whether the kernels those were for
are relevant any more. Likely using git blame and finding the corresponding
patch posts would shed some more light. Whatever we do, it'd be nice to
make gdb/linux-nat.c use the unified code as well.

I have done some digging in the source code and searched some
more info using git blame.
=> I found 7 different linux logics related to kill(SIGKILL)
and ptrace(PTRACE_KILL).
I found only one identification of kernels giving problems.
ChangeLogs/commit msg were not really giving much
background information : these are telling what is being changed,
not why it is being changed.
linux-low.c has the most interesting comments explaining
the kill related problems.

Below is some updated text with the found info (sorry, you
will have to re-read some parts already in the RFA).

I have to admit that a 'linux-ptrace.c only factorization'
as suggested in the RFA (at least as a first step, till we have
validated with it that a simpler kill logic is ok on various platforms)
is somewhat less frightening me than reducing all 7 different
logics to a single one.
(there is a table at the end of this mail that summarises all
7 logics).

I must say that all this is killing me :).
Maybe I would feel better if I would just have added my own
local new 8th kill logic where I need it :).

Philippe

nat/linux-ptrace.c has 3 different logics to kill a child process.
So, this patch factorizes killing a child in the function kill_child.

The 3 different logics are:
* linux_ptrace_test_ret_to_nx is calling both kill (child, SIGKILL)
and ptrace (PTRACE_KILL, child, ...), and then is calling once
waitpid.
* linux_check_ptrace_features is calling ptrace (PTRACE_KILL, child, ...)
+ my_waitpid in a loop, as long as the waitpid status was WIFSTOPPED.
* linux_test_for_tracefork is calling once ptrace (PTRACE_KILL, child, ...)
+ my_waitpid.

The linux ptrace documentation indicates that PTRACE_KILL is deprecated,
and tells to not use it, as it might return success but not kill the tracee.
The documentation indicates to send SIGKILL directly.

I suspect that linux_ptrace_test_ret_to_nx calls both kill and ptrace just
to be sure ...
git blame indicates that the call to kill(SIGKILL) was added in addition to the
pre-existing ptrace PTRACE_KILL. The commit log also indicates to ignore
PTRACE_KILL errors. But there is no explanation why adding the SIGKILL call
was needed, and why PTRACE_KILL was kept.

I suspect that linux_check_ptrace_features calls ptrace in a loop
to bypass the PTRACE_KILL limitation.
git blame indicates that this function was heavily based on
linux-nat.c:linux_test_for_tracefork (which was moved to linux-ptrace.c,
see next paragraph). I found no log/comments about the killing subtilities.

linux_test_for_tracefork seems to not handle the PTRACE_KILL
limitation/has not been updated like linux_check_ptrace_features
(which has a loop).

Also, 2 of the 3 logics are calling my_waitpid, which seems better,
as this is protecting the waitpid syscall against EINTR.

We also find a bunch of other kill logics in gdbserver/linux-low.c and
linux-nat.c and linux-fork.c.

gdbserver/linux-low.c has linux_kill_one_lwp, that has an extensive
comment explaining why it uses both PTRACE_KILLME and SIGKILL
Note however that the gdbserver code does the SIGKILL using
'syscall (__NR_tkill', not using 'kill (... SIGKILL).
gdbserver/linux-low.c does a loop calling both PTRACE_KILLME and SIGKILL,
but has a comment telling that the loop is most likely unnecessary.

Also, gdbservers/linux-low.c handles the case of waiting for "clone" children,
indicating that __WALL depends on SIGCHLD, and killing a stopped process
does not generate this signal.
It also indicates it supports old kernels not providing clone(CLONE_THREAD),
by sending a SIGKILL for each thread (not only one for the process).
There is also a comment telling that only PTRACE_KILL was used for years,
and that paranoia tells to do both SIGKILL and PTRACE_KILL, in case
PTRACE_KILL might work better on old kernels (the comment says it is dubious
there is such an old kernel, but explains that this is the paranoia).
linux-low.c also has a special logic to kill the thread that has its
lwpid == pid, because killing the parent before the children can
cause zombies on kernels at least 2.6.0-test7 ... 2.6.8-rc4.

linux-nat.c first uses SIGKILL, and then PTRACE_KILL. A comment (from 2011)
is telling that PTRACE_KILL may resume the inferior (so the need for SIGKILL).
Another comment (also 2011) tells that some kernels ignore even SIGKILL
for processes under ptrace (so the reason for keeping the PTRACE_KILL).
Note that linux-nat.c also loops (like linux-low.c), but indicates
the loop is needed as the linux kernel sometimes fails to kill a thread
after PTRACE_KILL.
To the contrary of linux-low.c, linux-nat.c uses waitpid (__WALL) rather
than do 2 successive waitpid calls like linux-low.c.
linux-nat.c also stops all threads before killing them, with a comment
explaining that this is needed to have PTRACE_KILL working.

linux-fork.c has also its own duplicated kill logics.
One uses (only) SIGKILL (with a comment telling this always works,
contrary to PTRACE_KILL).
The other logic (related to killing checkpoints) only uses PTRACE_KILL.
This logic does not call waitpid, instead it does an inferior call
to have waitpid called by an inferior (no idea why, must be something
special related to checkpoint implementation).

Below is a table summarising what has been found.
loop&stopcond is '-' if the logic at 'where' does not loop, otherwise
it details when the loop continues. SIGKILL and PTRACE_KILL column contains
a YES if the logic uses them, otherwise '-' indicates
absence of call. A 'V' indicates the return status is verified,
otherwise errors are ignored. TK indicates SIGKILL is sent with __NR_tkill,
not with the usual kill(SIGKILL).

waitpid gives some details about how the waitpid is done:
'MY' indicates my_waitpid is called (to protect against EINTR),
otherwise, it is a direct call to waitpid.
Note that the return status of waitpid is not systematically checked,
or is not checked the same way or errors are not necessarily reported.
Flags are 0, or the specifically used flags. 2 set of flags separated
by a , means there are 2 calls to waitpid.


where                        loop&cond        SIGKILL PTRACE_KILL waitpid
--------------------         -------------    ------- ----------- -------
linux-ptrace.c
linux_ptrace_test_ret_to_nx -                YES     YES         0
linux_check_ptrace_features WIFSTOPPED       -       YES V       MY 0
linux_test_for_tracefork    -                -       YES V       MY 0

linux-low.c                  WIFSTOPPED       YES TK  YES         MY 0,__WCLONE

linux-nat.c                  waitpid == pid   YES TK  YES         MY __WALL

linux-fork.c
linux_fork_killall          WIFSTOPPED       YES     -           0
delete_checkpoint_command   -                -       YES V       (infcall) 0

Philippe Waroquiers

2018-11-25 23:13:34 UTC

Permalink

Ping ?
The patch discussed here is already a first improvement
(which makes running tests under valgrind more comfortable).
Some more factorisation might be done with the 4 other
places where children are killed, but these seems more risky.

So, maybe this patch could be pushed as a first low risk
step towards the good direction ?

Thanks

Philippe

Post by Philippe Waroquiers

I have done some digging in the source code and searched some
more info using git blame.
=> I found 7 different linux logics related to kill(SIGKILL)
and ptrace(PTRACE_KILL).
I found only one identification of kernels giving problems.
ChangeLogs/commit msg were not really giving much
background information : these are telling what is being changed,
not why it is being changed.
linux-low.c has the most interesting comments explaining
the kill related problems.
Below is some updated text with the found info (sorry, you
will have to re-read some parts already in the RFA).
I have to admit that a 'linux-ptrace.c only factorization'
as suggested in the RFA (at least as a first step, till we have
validated with it that a simpler kill logic is ok on various platforms)
is somewhat less frightening me than reducing all 7 different
logics to a single one.
(there is a table at the end of this mail that summarises all
7 logics).
I must say that all this is killing me :).
Maybe I would feel better if I would just have added my own
local new 8th kill logic where I need it :).
Philippe
nat/linux-ptrace.c has 3 different logics to kill a child process.
So, this patch factorizes killing a child in the function kill_child.
* linux_ptrace_test_ret_to_nx is calling both kill (child, SIGKILL)
and ptrace (PTRACE_KILL, child, ...), and then is calling once
waitpid.
* linux_check_ptrace_features is calling ptrace (PTRACE_KILL, child, ...)
+ my_waitpid in a loop, as long as the waitpid status was WIFSTOPPED.
* linux_test_for_tracefork is calling once ptrace (PTRACE_KILL, child, ...)
+ my_waitpid.
The linux ptrace documentation indicates that PTRACE_KILL is deprecated,
and tells to not use it, as it might return success but not kill the tracee.
The documentation indicates to send SIGKILL directly.
I suspect that linux_ptrace_test_ret_to_nx calls both kill and ptrace just
to be sure ...
git blame indicates that the call to kill(SIGKILL) was added in addition to the
pre-existing ptrace PTRACE_KILL. The commit log also indicates to ignore
PTRACE_KILL errors. But there is no explanation why adding the SIGKILL call
was needed, and why PTRACE_KILL was kept.
I suspect that linux_check_ptrace_features calls ptrace in a loop
to bypass the PTRACE_KILL limitation.
git blame indicates that this function was heavily based on
linux-nat.c:linux_test_for_tracefork (which was moved to linux-ptrace.c,
see next paragraph). I found no log/comments about the killing subtilities.
linux_test_for_tracefork seems to not handle the PTRACE_KILL
limitation/has not been updated like linux_check_ptrace_features
(which has a loop).
Also, 2 of the 3 logics are calling my_waitpid, which seems better,
as this is protecting the waitpid syscall against EINTR.
We also find a bunch of other kill logics in gdbserver/linux-low.c and
linux-nat.c and linux-fork.c.
gdbserver/linux-low.c has linux_kill_one_lwp, that has an extensive
comment explaining why it uses both PTRACE_KILLME and SIGKILL
Note however that the gdbserver code does the SIGKILL using
'syscall (__NR_tkill', not using 'kill (... SIGKILL).
gdbserver/linux-low.c does a loop calling both PTRACE_KILLME and SIGKILL,
but has a comment telling that the loop is most likely unnecessary.
Also, gdbservers/linux-low.c handles the case of waiting for "clone" children,
indicating that __WALL depends on SIGCHLD, and killing a stopped process
does not generate this signal.
It also indicates it supports old kernels not providing clone(CLONE_THREAD),
by sending a SIGKILL for each thread (not only one for the process).
There is also a comment telling that only PTRACE_KILL was used for years,
and that paranoia tells to do both SIGKILL and PTRACE_KILL, in case
PTRACE_KILL might work better on old kernels (the comment says it is dubious
there is such an old kernel, but explains that this is the paranoia).
linux-low.c also has a special logic to kill the thread that has its
lwpid == pid, because killing the parent before the children can
cause zombies on kernels at least 2.6.0-test7 ... 2.6.8-rc4.
linux-nat.c first uses SIGKILL, and then PTRACE_KILL. A comment (from 2011)
is telling that PTRACE_KILL may resume the inferior (so the need for SIGKILL).
Another comment (also 2011) tells that some kernels ignore even SIGKILL
for processes under ptrace (so the reason for keeping the PTRACE_KILL).
Note that linux-nat.c also loops (like linux-low.c), but indicates
the loop is needed as the linux kernel sometimes fails to kill a thread
after PTRACE_KILL.
To the contrary of linux-low.c, linux-nat.c uses waitpid (__WALL) rather
than do 2 successive waitpid calls like linux-low.c.
linux-nat.c also stops all threads before killing them, with a comment
explaining that this is needed to have PTRACE_KILL working.
linux-fork.c has also its own duplicated kill logics.
One uses (only) SIGKILL (with a comment telling this always works,
contrary to PTRACE_KILL).
The other logic (related to killing checkpoints) only uses PTRACE_KILL.
This logic does not call waitpid, instead it does an inferior call
to have waitpid called by an inferior (no idea why, must be something
special related to checkpoint implementation).
Below is a table summarising what has been found.
loop&stopcond is '-' if the logic at 'where' does not loop, otherwise
it details when the loop continues. SIGKILL and PTRACE_KILL column contains
a YES if the logic uses them, otherwise '-' indicates
absence of call. A 'V' indicates the return status is verified,
otherwise errors are ignored. TK indicates SIGKILL is sent with __NR_tkill,
not with the usual kill(SIGKILL).
'MY' indicates my_waitpid is called (to protect against EINTR),
otherwise, it is a direct call to waitpid.
Note that the return status of waitpid is not systematically checked,
or is not checked the same way or errors are not necessarily reported.
Flags are 0, or the specifically used flags. 2 set of flags separated
by a , means there are 2 calls to waitpid.

where                        loop&cond        SIGKILL PTRACE_KILL waitpid
--------------------         -------------    ------- ----------- -------
linux-ptrace.c
linux_ptrace_test_ret_to_nx -                YES     YES         0
linux_check_ptrace_features WIFSTOPPED       -       YES V       MY 0
linux_test_for_tracefork    -                -       YES V       MY 0

linux-low.c                  WIFSTOPPED       YES TK  YES         MY 0,__WCLONE
linux-nat.c                  waitpid == pid   YES TK  YES         MY __WALL
linux-fork.c
linux_fork_killall          WIFSTOPPED       YES     -           0
delete_checkpoint_command   -                -       YES V       (infcall) 0