by YoungTimes

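C++ Multithreading: Debugging with GDB

1. GDB Multithreaded Debugging Basics

Let's start with a simple scenario: a husband and wife both work to support the family. The husband earns 6 dollars a day and the wife earns 3 dollars a day (life isn't easy, haha), and each day they deposit what they earned into a shared account. The C++ example is as follows: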

#include <iostream>
#include <thread>
#include <mutex>
#include <unistd.h>
#include <string>

using namespace std;

mutex _mutex;

static int money = 0;

void earn_money(string name, int num) {  // by value: std::thread copies its arguments, so a non-const reference will not compile
    while(1){
        sleep(2);
        lock_guard<mutex> lock(_mutex);
        money += num;
        cout << name << " earn:" << num << endl;
        cout << "total money:" << money << endl;
    }
}

int main() {
    thread person_wife(earn_money, "wife", 3);
    thread person_husband(earn_money, "husband", 6);
    person_wife.join();
    person_husband.join();  // join each thread exactly once
    
    return 0;
}

Compile the code:

g++ -std=c++11 hard_life.cpp -o hard_life -lpthread -g

Run the program:

wife earn:3
total money:3
husband earn:6
total money:9
wife earn:3
total money:12
husband earn:6
total money:18
wife earn:3
total money:21
husband earn:6
total money:27
...

Start a GDB session:

gdb ./hard_life

GNU gdb (Ubuntu 8.1-0ubuntu3.1) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./hard_life...done.
(gdb) 

1.1 Setting a breakpoint: break hard_life.cpp:18

Set a breakpoint in GDB and run the program:

(gdb) break hard_life.cpp:18
Breakpoint 1 at 0x13f4: file hard_life.cpp, line 18.
(gdb) run
Starting program: /home/xxxxxx/Documents/C++/GDB/hard_life 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff6e85700 (LWP 7356)]
[New Thread 0x7ffff6684700 (LWP 7357)]
[Switching to Thread 0x7ffff6e85700 (LWP 7356)]

Thread 2 "hard_life" hit Breakpoint 1, earn_money (name="wife", num=3) at hard_life.cpp:18
18              cout << name << " earn:" << num << endl;

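1.2 Listing threads: info threads

Use info threads to list the program's threads. Here the wife thread is stopped at the breakpoint, and the husband thread is waiting for the wife thread to release the lock.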

(gdb) info threads
  Id   Target Id         Frame 
  1    Thread 0x7ffff7fce740 (LWP 7352) "hard_life" 0x00007ffff7bbed2d in __GI___pthread_timedjoin_ex (threadid=140737335809792, 
    thread_return=0x0, abstime=0x0, block=<optimized out>) at pthread_join_common.c:89
* 2    Thread 0x7ffff6e85700 (LWP 7356) "hard_life" earn_money (name="wife", num=3) at hard_life.cpp:18
  3    Thread 0x7ffff6684700 (LWP 7357) "hard_life" __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
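
In the listing above every thread is labelled "hard_life", so the wife and husband threads can only be told apart by their frames. One possible way to make the listing easier to read is to give each thread its own name, which GDB on Linux shows in the same column. The following is a minimal sketch, assuming Linux with glibc (pthread_setname_np and pthread_getname_np are non-portable extensions); the helper names name_current_thread, current_thread_name, and run_named_worker are hypothetical and not part of the original program.

```cpp
#include <pthread.h>  // pthread_setname_np / pthread_getname_np (glibc extensions)
#include <string>
#include <thread>

// Hypothetical helper: names the calling thread. Linux limits thread names
// to 15 characters plus the terminating NUL; longer names are rejected.
bool name_current_thread(const std::string& name) {
    return pthread_setname_np(pthread_self(), name.c_str()) == 0;
}

// Hypothetical helper: reads back the calling thread's name.
std::string current_thread_name() {
    char buf[16] = {0};  // 16 bytes is the minimum buffer size glibc accepts
    pthread_getname_np(pthread_self(), buf, sizeof(buf));
    return buf;
}

// Starts a worker that names itself, then returns the name the worker saw.
std::string run_named_worker(const std::string& name) {
    std::string seen;
    std::thread worker([&] {
        name_current_thread(name);
        seen = current_thread_name();
    });
    worker.join();  // after join(), reading `seen` from this thread is safe
    return seen;
}
```

In the example program, calling name_current_thread(name) as the first statement of earn_money() would be enough for the two worker threads to show up under their own names in info threads.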

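1.3 Switching threads: thread thread_Id

thread xxx switches the thread that GDB is currently focused on, where xxx is the thread Id shown by info threads. Because the wife thread is stopped inside the locked region and the husband thread is waiting for the lock, switching to thread 3 shows the husband thread still blocked in the lock wait.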

(gdb) thread 3
[Switching to thread 3 (Thread 0x7ffff6684700 (LWP 7357))]
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
135     in ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S

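1.4 Printing the stacks of all threads: thread apply all bt

When a program dumps core or deadlocks, tracking down the cause often requires seeing what the other threads are doing. thread apply all bt prints the backtrace of every thread: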

Thread 3 (Thread 0x7ffff6684700 (LWP 7357)):
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007ffff7bc0023 in __GI___pthread_mutex_lock (mutex=0x555555758140 <_mutex>) at ../nptl/pthread_mutex_lock.c:78
#2  0x000055555555536f in __gthread_mutex_lock (__mutex=0x555555758140 <_mutex>)
    at /usr/include/x86_64-linux-gnu/c++/7/bits/gthr-default.h:748
#3  0x000055555555575a in std::mutex::lock (this=0x555555758140 <_mutex>) at /usr/include/c++/7/bits/std_mutex.h:103
#4  0x00005555555557b6 in std::lock_guard<std::mutex>::lock_guard (this=0x7ffff6683d40, __m=...)
    at /usr/include/c++/7/bits/std_mutex.h:162
#5  0x00005555555553e3 in earn_money (name="husband", num=6) at hard_life.cpp:16
#6  0x0000555555555dfe in std::__invoke_impl<void, void (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int> (
    __f=@0x55555576b010: 0x5555555553a7 <earn_money(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int)>) at /usr/include/c++/7/bits/invoke.h:60
#7  0x0000555555555876 in std::__invoke<void (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int> (
    __fn=@0x55555576b010: 0x5555555553a7 <earn_money(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int)>) at /usr/include/c++/7/bits/invoke.h:95
#8  0x00005555555563f7 in std::thread::_Invoker<std::tuple<void (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int> >::_M_invoke<0ul, 1ul, 2ul> (this=0x55555576afe8) at /usr/include/c++/7/thread:234
#9  0x000055555555637c in std::thread::_Invoker<std::tuple<void (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int> >::operator() (
    this=0x55555576afe8) at /usr/include/c++/7/thread:243
#10 0x000055555555634c in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int> > >::_M_run (this=0x55555576afe0) at /usr/include/c++/7/thread:186
#11 0x00007ffff78ea6df in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#12 0x00007ffff7bbd6db in start_thread (arg=0x7ffff6684700) at pthread_create.c:463
#13 0x00007ffff734588f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 2 (Thread 0x7ffff6e85700 (LWP 7356)):
#0  earn_money (name="wife", num=3) at hard_life.cpp:18
#1  0x0000555555555dfe in std::__invoke_impl<void, void (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int> (
    __f=@0x55555576aea0: 0x5555555553a7 <earn_money(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int)>) at /usr/include/c++/7/bits/invoke.h:60
#2  0x0000555555555876 in std::__invoke<void (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int> (
    __fn=@0x55555576aea0: 0x5555555553a7 <earn_money(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int)>) at /usr/include/c++/7/bits/invoke.h:95
#3  0x00005555555563f7 in std::thread::_Invoker<std::tuple<void (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int> >::_M_invoke<0ul, 1ul, 2ul> (this=0x55555576ae78) at /usr/include/c++/7/thread:234
#4  0x000055555555637c in std::thread::_Invoker<std::tuple<void (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int> >::operator() (
    this=0x55555576ae78) at /usr/include/c++/7/thread:243
#5  0x000055555555634c in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int> > >::_M_run (this=0x55555576ae70) at /usr/include/c++/7/thread:186
#6  0x00007ffff78ea6df in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00007ffff7bbd6db in start_thread (arg=0x7ffff6e85700) at pthread_create.c:463
#8  0x00007ffff734588f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 1 (Thread 0x7ffff7fce740 (LWP 7352)):
#0  0x00007ffff7bbed2d in __GI___pthread_timedjoin_ex (threadid=140737335809792, thread_return=0x0, abstime=0x0, 
    block=<optimized out>) at pthread_join_common.c:89
#1  0x00007ffff78ea933 in std::thread::join() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2  0x000055555555557f in main () at hard_life.cpp:26

2. Debugging a Multithreaded Deadlock with GDB

Now we modify the earn_money() function to create a deadlock.

void earn_money(string name, int num) {
    while(1){
        sleep(2);
        // lock_guard<mutex> lock(_mutex);
        _mutex.lock();

        money += num;
        cout << name << " earn:" << num << endl;
        cout << "total money:" << money << endl;

        // forget to unlock()
    }
}

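Run the program in GDB and you can see that it hangs: the first thread to take the lock never releases it, so every later call to _mutex.lock() blocks forever.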
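
The effect of the forgotten unlock() can be reproduced in isolation without hanging the process. The sketch below is only an illustration under assumptions: it uses try_lock(), which never blocks, in place of the blocking lock() from the example, and the function names other_thread_can_lock and demo_forgotten_unlock are hypothetical.

```cpp
#include <mutex>
#include <thread>

// Returns true if a second thread was able to acquire `m`.
// try_lock() never blocks, so this probe cannot hang.
bool other_thread_can_lock(std::mutex& m) {
    bool acquired = false;
    std::thread probe([&] {
        acquired = m.try_lock();
        if (acquired) m.unlock();
    });
    probe.join();
    return acquired;
}

// Mimics one iteration of the buggy earn_money(): the mutex is locked and
// not released while another thread tries to get in.
bool demo_forgotten_unlock() {
    std::mutex m;
    m.lock();                                 // as in the buggy loop body
    bool reachable = other_thread_can_lock(m);
    m.unlock();                               // cleanup only: a locked mutex must not be destroyed
    return reachable;
}
```

demo_forgotten_unlock() returns false: while the mutex is held, no other thread can acquire it. In the real program the other thread calls lock() instead, which is why it blocks instead of merely failing.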

2.1 The first debugging method

(gdb) r
Starting program: /home/xxxxxx/Documents/C++/GDB/hard_life 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff6e85700 (LWP 16989)]
[New Thread 0x7ffff6684700 (LWP 16990)]
wife earn:3
total money:3

While the program is running, press Ctrl+C:

^C
Thread 1 "hard_life" received signal SIGINT, Interrupt.
0x00007ffff7bbed2d in __GI___pthread_timedjoin_ex (threadid=140737335809792, thread_return=0x0, abstime=0x0, block=<optimized out>)
    at pthread_join_common.c:89
89      pthread_join_common.c: No such file or directory.

Inspect the current thread's stack:

(gdb) info stack
#0  0x00007ffff7bbed2d in __GI___pthread_timedjoin_ex (threadid=140737335809792, thread_return=0x0, abstime=0x0, 
    block=<optimized out>) at pthread_join_common.c:89
#1  0x00007ffff78ea933 in std::thread::join() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2  0x00005555555554c3 in main () at hard_life.cpp:40

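info threads lists all threads and their Ids. The thread marked with * is the one currently selected in GDB, that is, the one commands such as bt apply to; it does not mean the other threads are blocked. To find blocked threads, look at the frame shown for each thread, for example a lock wait.

thread apply all bt: thread apply all <command> makes GDB run the given command on every thread; with bt, it prints the full backtrace of each thread.

A thread whose backtrace contains __lll_lock_wait is blocked waiting for a mutex and is a likely participant in the deadlock.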

2.2 The second debugging method

Start the program first, then find its process ID with ps -aux | grep "hard_life".

ps -aux| grep "hard_life"
xxxx  20785  0.0  0.0  98104  1836 pts/2    Sl+  11:29   0:00 ./hard_life

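Attach GDB to the process in one of three ways: run gdb -p <PID> from the shell; start gdb and then run attach <PID> at the prompt; or run gdb <executable> <PID>. (Note: on systems that restrict ptrace, such as Ubuntu in its default configuration, attaching to an already running process may require root privileges.)

gdb -p 20785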
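
As an alternative to searching the ps output, the program can report its own process ID at startup. This is only a sketch, assuming a POSIX system where getpid() is available; gdb_attach_hint is a hypothetical helper that is not part of the original program.

```cpp
#include <unistd.h>  // getpid()
#include <sstream>
#include <string>

// Hypothetical helper: builds the shell command that attaches GDB to the
// current process, using GDB's -p option.
std::string gdb_attach_hint() {
    std::ostringstream out;
    out << "gdb -p " << getpid();
    return out.str();
}
```

Printing gdb_attach_hint() to std::cerr at the top of main() gives a line that can be copied into another terminal when the program hangs.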

Then run:

(gdb) info threads
(gdb) thread apply all bt

From the output, pick the threads whose stacks contain lock_wait and investigate them.

Thread 3 (Thread 0x7f8214a8e700 (LWP 20787)):
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f8215fca023 in __GI___pthread_mutex_lock (
    mutex=0x55a6770f7140 <_mutex>) at ../nptl/pthread_mutex_lock.c:78
#2  0x000055a676ef431f in __gthread_mutex_lock (
    __mutex=0x55a6770f7140 <_mutex>)
    at /usr/include/x86_64-linux-gnu/c++/7/bits/gthr-default.h:748
#3  0x000055a676ef469e in std::mutex::lock (this=0x55a6770f7140 <_mutex>)
    at /usr/include/c++/7/bits/std_mutex.h:103
#4  0x000055a676ef434d in earn_money (name="husband", num=6)
    at hard_life.cpp:17

For example, Thread 3 here contains the lock_wait function, so we switch to it with thread 3 and run bt to view its call stack and locate the problem.

(gdb) bt
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f8215fca023 in __GI___pthread_mutex_lock (
    mutex=0x55a6770f7140 <_mutex>) at ../nptl/pthread_mutex_lock.c:78
#2  0x000055a676ef431f in __gthread_mutex_lock (
    __mutex=0x55a6770f7140 <_mutex>)
    at /usr/include/x86_64-linux-gnu/c++/7/bits/gthr-default.h:748
#3  0x000055a676ef469e in std::mutex::lock (this=0x55a6770f7140 <_mutex>)
    at /usr/include/c++/7/bits/std_mutex.h:103
#4  0x000055a676ef434d in earn_money (name="husband", num=6)
    at hard_life.cpp:17
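
The backtrace points at the _mutex.lock() call in earn_money(), which is never paired with an unlock(). One way to fix it is to go back to an RAII guard, as in the first version of the program. The sketch below is a bounded variant written for illustration: the names account_mutex, account, earn_money_fixed, and simulate are hypothetical stand-ins, the infinite loop is replaced by a fixed number of days, and the sleep and printing are left out so the result can be checked.

```cpp
#include <mutex>
#include <thread>

std::mutex account_mutex;  // plays the role of _mutex
int account = 0;           // plays the role of money

// Deposits `num` once per day for `days` days. The lock_guard releases the
// mutex at the end of every iteration, even if the body throws.
void earn_money_fixed(int num, int days) {
    for (int day = 0; day < days; ++day) {
        std::lock_guard<std::mutex> lock(account_mutex);
        account += num;
    }
}

// Runs the wife (3 per day) and husband (6 per day) threads to completion
// and returns the final balance.
int simulate(int days) {
    account = 0;
    std::thread wife(earn_money_fixed, 3, days);
    std::thread husband(earn_money_fixed, 6, days);
    wife.join();
    husband.join();
    return account;
}
```

Because both threads finish and every deposit is made under the lock, simulate(days) returns 9 * days.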
