2017-09-25 80 views
0

Intercepting the Linux pthread_create function causes the JVM/SSH to crash

I am trying to intercept pthread_create on Ubuntu 14.04. The code looks like this:

struct thread_param{ 
    void * args; 
    void *(*start_routine) (void *); 
}; 

typedef int(*P_CREATE)(pthread_t *thread, const pthread_attr_t *attr,void * 
    (*start_routine) (void *), void *arg); 

void *intermedia(void * arg){ 

struct thread_param *temp; 
temp=(struct thread_param *)arg; 
//do some other things 
return temp->start_routine(temp->args); 
} 

int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void * 
(*start_routine)(void *), void *arg){ 
    static void *handle = NULL; 
    static P_CREATE old_create=NULL; 
    if(!handle) 
    { 
     handle = dlopen("libpthread.so.0", RTLD_LAZY); 
     old_create = (P_CREATE)dlsym(handle, "pthread_create"); 
    } 
    struct thread_param temp; 
    temp.args=arg; 
    temp.start_routine=start_routine; 

    int result=old_create(thread,attr,intermedia,(void *)&temp); 
//  int result=old_create(thread,attr,start_routine,arg); 
    return result; 
} 

It works fine with my own pthread_create test case (written in C). But when I use it with Hadoop on the JVM, I get an error report like this:

Starting namenodes on [ubuntu] 
ubuntu: starting namenode, logging to /home/yangyong/work/hadooptrace/hadoop-2.6.5/logs/hadoop-yangyong-namenode-ubuntu.out 
ubuntu: starting datanode, logging to /home/yangyong/work/hadooptrace/hadoop-2.6.5/logs/hadoop-yangyong-datanode-ubuntu.out 
ubuntu: /home/yangyong/work/hadooptrace/hadoop-2.6.5/sbin/hadoop-daemon.sh: line 131: 7545 Aborted     (core dumped) nohup nice -n 
$HADOOP_NICENESS $hdfsScript --config $HADOOP_CONF_DIR $command "$@" > "$log" 2>&1 < /dev/null 
Starting secondary namenodes [0.0.0.0 
# 
# A fatal error has been detected by the Java Runtime Environment: 
# 
# SIGSEGV (0xb) at pc=0x0000000000000000, pid=7585, tid=140445258151680 
# 
# JRE version: OpenJDK Runtime Environment (7.0_121) (build 1.7.0_121-b00) 
# Java VM: OpenJDK 64-Bit Server VM (24.121-b00 mixed mode linux-amd64 compressed oops) 
# Derivative: IcedTea 2.6.8 
# Distribution: Ubuntu 14.04 LTS, package 7u121-2.6.8-1ubuntu0.14.04.1 
# Problematic frame: 
# C 0x0000000000000000 
# 
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again 
# 
# An error report file with more information is saved as: 
# /home/yangyong/work/hadooptrace/hadoop-2.6.5/hs_err_pid7585.log 
# 
# If you would like to submit a bug report, please include 
# instructions on how to reproduce the bug and visit: 
# http://icedtea.classpath.org/bugzilla 
#] 
A: ssh: Could not resolve hostname a: Name or service not known 
#: ssh: Could not resolve hostname #: Name or service not known 
fatal: ssh: Could not resolve hostname fatal: Name or service not known 
been: ssh: Could not resolve hostname been: Name or service not known 
#: ssh: Could not resolve hostname #: Name or service not known 
#: ssh: Could not resolve hostname #: Name or service not known 
#: ssh: Could not resolve hostname #: Name or service not known 
^COpenJDK: ssh: Could not resolve hostname openjdk: Name or service not known 
detected: ssh: Could not resolve hostname detected: Name or service not known 
version:: ssh: Could not resolve hostname version:: Name or service not known 
JRE: ssh: Could not resolve hostname jre: Name or service not known 

Is there something wrong with my code? Or is this caused by something else, such as a protection mechanism in the JVM or SSH? Thanks.

+0

Here is another example of a similar error: [link](https://sourceware.org/ml/glibc-linux/2001-q1/msg00048.html) – Chalex

Answers

0

This code causes the child thread to receive an invalid arg value:

struct thread_param temp; 
    temp.args=arg; 
    temp.start_routine=start_routine; 

    int result=old_create(thread,attr,intermedia,(void *)&temp); 
//  int result=old_create(thread,attr,start_routine,arg); 
    return result; // <-- temp and its contents are now invalid 

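temp is a local variable on the stack of your pthread_create() wrapper, so it is not guaranteed to still exist when the new thread runs: the parent's call to your pthread_create() may already have returned, which invalidates the values temp contained.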
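One possible fix, sketched here rather than taken from the original post, is to allocate the parameter record on the heap and let the new thread free it. The sketch assumes glibc on Linux. It looks up the real function with `dlsym(RTLD_NEXT, ...)` instead of `dlopen`, and the error codes returned on failure (`ENOSYS`, `EAGAIN`) are my own choice. `demo_increment` is a hypothetical worker that exists only so the wrapper can be tried out.

```c
#define _GNU_SOURCE             /* exposes RTLD_NEXT in <dlfcn.h> on glibc */
#include <dlfcn.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Hand-off record. It lives on the heap so that it outlives the
 * pthread_create() call that fills it in. */
struct thread_param {
    void *args;
    void *(*start_routine)(void *);
};

typedef int (*P_CREATE)(pthread_t *, const pthread_attr_t *,
                        void *(*)(void *), void *);

static P_CREATE old_create;
static pthread_once_t old_create_once = PTHREAD_ONCE_INIT;

/* Resolve the next pthread_create in the library search order. */
static void resolve_old_create(void)
{
    old_create = (P_CREATE)dlsym(RTLD_NEXT, "pthread_create");
}

/* Runs in the new thread: copy the record, release it, then call the
 * routine the caller originally asked for. */
static void *intermedia(void *arg)
{
    struct thread_param param = *(struct thread_param *)arg;
    free(arg);
    /* do some other things */
    return param.start_routine(param.args);
}

int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                   void *(*start_routine)(void *), void *arg)
{
    pthread_once(&old_create_once, resolve_old_create);
    if (old_create == NULL)
        return ENOSYS;          /* assumption: no real implementation found */

    struct thread_param *param = malloc(sizeof *param);
    if (param == NULL)
        return EAGAIN;          /* out of resources, as POSIX allows */
    param->args = arg;
    param->start_routine = start_routine;

    int result = old_create(thread, attr, intermedia, param);
    if (result != 0)
        free(param);            /* no thread started, so nobody else frees it */
    return result;
}

/* Hypothetical worker, used only to exercise the wrapper. */
void *demo_increment(void *arg)
{
    ++*(int *)arg;
    return arg;
}
```

Built as a shared object (for example `gcc -shared -fPIC -o libintercept.so intercept.c -ldl -lpthread`, where both file names are placeholders) it can be loaded with `LD_PRELOAD`. Ownership of the record passes to the new thread on success, which is why the wrapper frees it only when thread creation fails.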

+0

Thanks! That solved the problem! – Chalex

0

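There are a bunch of problems in your code. I don't know which of them (if any) cause the problem you are seeing, but you should definitely fix them.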

First, you can turn on core dumps (usually with ulimit -c unlimited) and load the core into GDB. See what the backtrace points to.

Don't dlopen pthreads. Instead, you should be able to just use dlsym(RTLD_NEXT, "pthread_create").
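As a rough illustration of the `RTLD_NEXT` lookup, the sketch below resolves the real function exactly once and checks for failure. It assumes glibc, where `RTLD_NEXT` is only declared when `_GNU_SOURCE` is defined before the first include. The names `get_real_pthread_create`, `real_create`, and `resolve_real_create` are hypothetical helpers, not part of any library.

```c
#define _GNU_SOURCE             /* must precede every #include to expose RTLD_NEXT */
#include <dlfcn.h>
#include <pthread.h>
#include <stdio.h>

typedef int (*P_CREATE)(pthread_t *, const pthread_attr_t *,
                        void *(*)(void *), void *);

static P_CREATE real_create;    /* stays NULL if the lookup fails */
static pthread_once_t resolve_once = PTHREAD_ONCE_INIT;

static void resolve_real_create(void)
{
    dlerror();                  /* clear any stale error state */
    void *sym = dlsym(RTLD_NEXT, "pthread_create");
    const char *err = dlerror();
    if (err != NULL || sym == NULL) {
        fprintf(stderr, "dlsym(pthread_create) failed: %s\n",
                err ? err : "symbol is NULL");
        return;
    }
    /* Store through an object pointer, the idiom shown in the dlsym
     * manual page, instead of casting void * to a function pointer. */
    *(void **)&real_create = sym;
}

/* Hypothetical accessor: returns the next pthread_create after the
 * calling object in the search order, or NULL if none was found. */
P_CREATE get_real_pthread_create(void)
{
    pthread_once(&resolve_once, resolve_real_create);
    return real_create;
}
```

Because `pthread_once` guards the lookup, several threads can call the accessor concurrently without racing on `real_create`.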

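However, the most likely source of your trouble is storing the original arguments in a global variable. This means that if someone (say, the Java runtime) starts lots of threads at the same time, you will mix up which thread is meant to do what.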

+0

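Thanks for your answer. On the first point, I am not very familiar with debugging in gdb; I turned core dumps on afterwards, but I still could not figure out the problem. On the second point, if I just use dlsym(RTLD_NEXT, "pthread_create"), it throws a warning and the JVM still crashes. On the third point, I am not sure which variable is global. Anyway, thanks for your prompt response. – Chalex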