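Accessing the address of a string in GCC inline assembly

I have written the assembly code below to convert a string from lowercase to uppercase. It does not fully work because I cannot access the address of the string I am converting. Why doesn't this code work?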

#include <stdio.h>

int convert(char *str)
{
    char *ptr;
    __asm__ __volatile__ ("movl (%1),%%ebx;"
                          "subl $1,%%ebx;"
                          "movl %%ebx,%0;"
                          "REPEAT: addl $1,%%ebx;"
                          "testl %%ebx,%%ebx;"
                          "je END;"
                          "movzbl 0(%%ebx),%%ecx;"
                          "cmpl $97, %%ecx;"
                          "jb END;"
                          "cmpl $122,%%ecx;"
                          "ja END;"
                          "subb $32,0(%%ebx);"
                          "jmp REPEAT;"
                          "END: movl %%ebx,(%0);"
                          :"=r" (ptr)
                          :"r" (str)
                         );
    printf("converted string =%s\n", str);
}

int main()
{
    int i;
    char str[] = "convert";

    i = convert(str);
    return 0;
}

2014-11-05 goal4321

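What is your question? Please ask a question. – fuz 2014-11-05 16:47:07

@FUZxxi: I can't access the address of the string, and the code above doesn't work! – goal4321 2014-11-05 16:55:40

At what point do you specify 'ptr' or 'str'? – fuz 2014-11-05 16:56:50

Here is my solution, which differs slightly from the code above; thanks to FUZxxi for pointing me in the right direction. I should say that looking at the generated assembly helps a great deal in solving this kind of problem: it can be hard to read, but it shows you what the actual issue is. I have written enough comments for anyone who wants to understand what I was trying to achieve.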

/* code to convert from lower case to upper case */
int convert(char *str)
{
    __asm__ __volatile__ ("movl %1,%%ebx;"         // Get the address of str
                          "subl $1,%%ebx;"
                          "REPEAT: addl $1,%%ebx;"
                          "movl 0(%%ebx),%%edx"    // Move the contents to edx
                          "movzbl %%dl,%%ecx;"     // moving last character to ecx
                          "testl %%ecx,%%ecx;"     // compare if it's null
                          "je END;"
                          "cmpl $97, %%ecx;"
                          "jb END;"
                          "cmpl $122,%%ecx;"
                          "ja END;"
                          "subb $32,(%%ebx);"      // if its lower case, subtract 32
                          "jmp REPEAT;"
                          "END:;"
                          :              // No output specified
                          :"r" (str)     //input
                          :"ecx","edx"   //clobbers
                         );
    printf("converted string =%s\n", str);
}

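The code above should work if you compile it with the "gcc -m32" option when building on amd64. When I did that, I ran into this error:
"fatal error: sys/cdefs.h: No such file or directory"

Solution: install the package libc6-dev-i386.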

2014-11-06 07:53:22 goal4321

Note that this is not an efficient solution. If you want to do this fast (which is mostly what we want), use vector instructions. – goal4321 2014-11-28 17:21:53
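As a rough illustration of the vector-instruction idea in the comment above, here is a minimal sketch using SSE2 intrinsics rather than inline assembly. The function name to_upper_sse2 and the explicit length parameter are assumptions made for this example; nothing like it appears in the thread. On targets without SSE2 the sketch simply falls back to the scalar loop.

```c
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: uppercase the first n bytes of s in place. */
static void to_upper_sse2(char *s, size_t n)
{
    size_t i = 0;
#if defined(__SSE2__)
    const __m128i below_a = _mm_set1_epi8('a' - 1);
    const __m128i above_z = _mm_set1_epi8('z' + 1);
    const __m128i diff    = _mm_set1_epi8(0x20);

    /* Process 16 bytes per iteration. */
    for (; i + 16 <= n; i += 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)(s + i));
        /* Signed compares: bytes >= 0x80 are negative and never match. */
        __m128i is_lower = _mm_and_si128(_mm_cmpgt_epi8(v, below_a),
                                         _mm_cmplt_epi8(v, above_z));
        /* Subtract 32 only in the lanes that hold 'a'..'z'. */
        v = _mm_sub_epi8(v, _mm_and_si128(is_lower, diff));
        _mm_storeu_si128((__m128i *)(s + i), v);
    }
#endif
    /* Scalar tail (or the whole string when SSE2 is unavailable). */
    for (; i < n; i++)
        if (s[i] >= 'a' && s[i] <= 'z')
            s[i] = (char)(s[i] - 32);
}
```

A caller would pass strlen(str) as n. Whether this actually beats what the compiler generates from a plain C loop at -O2 is something to measure, not assume.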

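The code in the accepted answer seems to have a few problems:

As written, this code will not compile (it references %1 when there is only one operand), and the 4th asm line is missing its terminator.
This code does not correctly handle strings like "aBc".
This code does not use the "memory" clobber, even though it modifies memory.
This code (still) modifies a register that is not listed as clobbered (ebx).
It does not work on x64.

How about something more like this: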
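The "aBc" point can be illustrated with a plain C model of the two loop shapes. This is only a sketch of the control flow, not the asm itself, and both function names are made up for the example: the first mirrors a loop that jumps to END at the first character outside 'a'..'z', the second skips such characters and keeps going until the terminating null.

```c
/* Hypothetical model: stop at the first character that is not 'a'..'z'. */
static void upper_stop_early(char *s)
{
    for (; *s >= 'a' && *s <= 'z'; s++)
        *s = (char)(*s - 32);
}

/* Hypothetical model: skip other characters, stop only at the null. */
static void upper_skip_others(char *s)
{
    for (; *s != '\0'; s++)
        if (*s >= 'a' && *s <= 'z')
            *s = (char)(*s - 32);
}
```

With the input "aBc", the first version leaves "ABc" (it stops at 'B'), while the second produces "ABC".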

char *convert(char *str) 
{ 
    char *res = str; 
    char temp; 

    __asm__ __volatile__ (
        "dec %[str]\n"
        "REPEAT:\n\t"
        "inc %[str]\n\t"
        "movb (%[str]), %[temp]\n\t"   /* Read the next char */
        "testb %[temp], %[temp]\n\t"
        "jz END\n\t"                   /* Is the char null? */
        "cmpb $97, %[temp]\n\t"        /* >= 'a'? */
        "jb REPEAT\n\t"
        "cmpb $122, %[temp]\n\t"       /* <= 'z'? */
        "ja REPEAT\n\t"
        "subb $32, %[temp]\n\t"        /* Convert to uppercase */
        "mov %[temp], (%[str])\n\t"    /* Write back to memory */
        "jmp REPEAT\n"
        "END:\n"
        : [str] "+r" (str), [temp] "=r" (temp)
        : /* no inputs */
        : "memory"
    );

    /* Note that at this point, str points to the null. 
     str - res is the length. */ 

    return res; 
}

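This code:

Uses fewer registers (2 vs. 4).
By using "=r" (temp), we let the compiler pick the best register to use instead of forcing a specific one.
Reads memory only once (instead of twice).
Returns a pointer to the string (instead of returning nothing?).
IMO, %[temp] and %[str] are slightly easier to read than %1.
Using \n\t (instead of ;) makes the output of gcc -S easier to read.
This code modifies str (which is why it is listed as "+r").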
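One caveat, offered as a side note and not as part of the original answer: named labels such as REPEAT and END can be emitted more than once if the compiler inlines or duplicates the asm statement, and the assembler may then report a duplicate symbol. A possible variant uses GNU as numeric local labels (1:, 2:, referenced as 1b and 2f). The name convert_local_labels, the "q" constraint (to guarantee a byte-addressable register on 32-bit x86), and the plain C fallback for non-x86 targets are all assumptions of this sketch.

```c
/* Hypothetical variant of convert() using numeric local labels. */
static char *convert_local_labels(char *str)
{
    char *res = str;
#if defined(__i386__) || defined(__x86_64__)
    char temp;

    __asm__ __volatile__ (
        "dec %[str]\n"
        "1:\n\t"
        "inc %[str]\n\t"
        "movb (%[str]), %[temp]\n\t"   /* Read the next char */
        "testb %[temp], %[temp]\n\t"
        "jz 2f\n\t"                    /* Null terminator: done */
        "cmpb $97, %[temp]\n\t"        /* >= 'a'? */
        "jb 1b\n\t"
        "cmpb $122, %[temp]\n\t"       /* <= 'z'? */
        "ja 1b\n\t"
        "subb $32, %[temp]\n\t"        /* Convert to uppercase */
        "movb %[temp], (%[str])\n\t"   /* Write back to memory */
        "jmp 1b\n"
        "2:\n"
        : [str] "+r" (str), [temp] "=q" (temp)
        : /* no inputs */
        : "memory", "cc"
    );
#else
    /* Fallback for non-x86 targets: same logic in plain C. */
    for (; *str != '\0'; str++)
        if (*str >= 'a' && *str <= 'z')
            *str = (char)(*str - 32);
#endif
    return res;
}
```

Here 1b means "the nearest label 1 looking backward" and 2f means "the nearest label 2 looking forward", so the same asm text can appear several times in one object file without clashing.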

Alternatively, if you really want to get fancy, write the code in C and then look at the output of gcc -O2 -S.
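For reference, a plain C version to feed to gcc -O2 -S might look like the sketch below. The name convert_c is chosen here only to avoid clashing with the asm version; it is not from the thread.

```c
/* Hypothetical plain C equivalent: uppercase str in place and return it. */
char *convert_c(char *str)
{
    for (char *p = str; *p != '\0'; p++)
        if (*p >= 'a' && *p <= 'z')
            *p = (char)(*p - ('a' - 'A'));   /* 'a' - 'A' == 32 in ASCII */
    return str;
}
```

Comparing the compiler's assembly for this function with the hand-written asm is a quick way to see which registers and instructions the compiler would pick on its own.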

2014-11-08 07:11:41
