Integer class wrapper performance

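I'm looking to rework an existing fixed-point library. Currently the library is just namespaced functions operating on 32-bit signed integers. I'd like to turn this around and create a fixed-point class that wraps an integer, but I don't want to pay any performance penalty associated with such a fine-grained class, since performance is an issue for the use case. Since the prospective class has such simple data requirements and owns no resources, I thought it might be possible to make the class "value oriented", leveraging non-modifying operations and passing instances by value where reasonable. If implemented, this would be a simple class, not part of a hierarchy.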

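I'm wondering whether it is possible to write an integer wrapper class in such a way that no real performance penalty is incurred compared to using raw integers. I'm almost confident that this is the case, but I don't know enough about the compilation process to just jump into it.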

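I know it's said that STL iterators are compiled down to simple pointer operations, and I would like to achieve something similar for plain integer operations.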

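The library will be updated to C++11 as part of the project anyway, so I'm hoping that, at least with constexpr and other new features such as rvalue references, I can push the performance of this class close to that of pure integer operations.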
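To make the idea concrete, here is a minimal sketch of what such a value-oriented wrapper might look like in C++11. The class name `Fixed`, the Q16.16 layout (16 fractional bits), and the factory names are assumptions made for illustration; nothing here is taken from the actual library.

```cpp
#include <cstdint>

// Hypothetical Q16.16 fixed-point wrapper. The name `Fixed`, the 16 fractional
// bits, and the factory names are illustrative assumptions only.
class Fixed {
    std::int32_t raw_;
    explicit constexpr Fixed(std::int32_t raw) : raw_(raw) {}

public:
    static constexpr int kFracBits = 16;

    constexpr Fixed() : raw_(0) {}

    // Named factories keep raw bit patterns and whole numbers apart.
    static constexpr Fixed from_raw(std::int32_t raw) { return Fixed(raw); }
    // Caller must keep n small enough that the scaled value fits in 32 bits.
    static constexpr Fixed from_int(std::int32_t n) { return Fixed(n * (1 << kFracBits)); }

    constexpr std::int32_t raw() const { return raw_; }

    // Non-modifying operations: operands are taken by value, a new value is returned.
    friend constexpr Fixed operator+(Fixed a, Fixed b) { return Fixed(a.raw_ + b.raw_); }
    friend constexpr Fixed operator-(Fixed a, Fixed b) { return Fixed(a.raw_ - b.raw_); }
    friend constexpr Fixed operator*(Fixed a, Fixed b) {
        // Widen to 64 bits so the intermediate product does not overflow,
        // then shift back down to Q16.16.
        return Fixed(static_cast<std::int32_t>(
            (static_cast<std::int64_t>(a.raw_) * b.raw_) >> kFracBits));
    }
    friend constexpr bool operator==(Fixed a, Fixed b) { return a.raw_ == b.raw_; }
};

// Evaluated entirely at compile time: 3 * 2 == 6 in Q16.16.
static_assert(Fixed::from_int(3) * Fixed::from_int(2) == Fixed::from_int(6),
              "constexpr multiplication works");
```

Every member is a single-expression inline function, which is the shape an optimizer should find easiest to reduce to the underlying integer instruction. Whether it actually does so depends on the compiler and flags.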

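Additionally, any recommendations for benchmarking the performance differences between the two implementations would be appreciated.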
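One possible starting point for such a benchmark is sketched below. It runs the same workload once through a free function on raw integers and once through a wrapper type, and times each with `std::chrono::steady_clock`. The names `mul_raw`, `Fx`, `sum_raw`, `sum_wrapped`, and `seconds_for` are hypothetical stand-ins, and a micro-benchmark like this is only a rough guide, since the optimizer may vectorize or fold parts of the loop.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for the existing namespace-level function (Q16.16 multiply).
inline std::int32_t mul_raw(std::int32_t a, std::int32_t b) {
    return static_cast<std::int32_t>((static_cast<std::int64_t>(a) * b) >> 16);
}

// Hypothetical stand-in for the proposed wrapper class.
struct Fx {
    std::int32_t raw;
    friend Fx operator*(Fx a, Fx b) { return Fx{mul_raw(a.raw, b.raw)}; }
};

const std::int32_t kScale = 3 << 15;  // 1.5 in Q16.16

// Same workload, raw integers. Unsigned accumulation wraps without undefined behaviour.
std::uint32_t sum_raw(std::int32_t n) {
    std::uint32_t sum = 0;
    for (std::int32_t i = 0; i < n; ++i)
        sum += static_cast<std::uint32_t>(mul_raw(i, kScale));
    return sum;
}

// Same workload, going through the wrapper.
std::uint32_t sum_wrapped(std::int32_t n) {
    std::uint32_t sum = 0;
    for (std::int32_t i = 0; i < n; ++i)
        sum += static_cast<std::uint32_t>((Fx{i} * Fx{kScale}).raw);
    return sum;
}

// Times one call of f(n) and hands the result back so the work cannot be discarded.
template <typename F>
double seconds_for(F f, std::int32_t n, std::uint32_t& result) {
    auto start = std::chrono::steady_clock::now();
    result = f(n);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling `seconds_for(sum_raw, n, r)` and `seconds_for(sum_wrapped, n, r)` several times with a large `n`, using identical compiler flags, and checking that both results match gives a first impression. Comparing the generated assembly, as the answers here do, is usually the more reliable check.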

Source

2012-03-12 metatheorem

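`constexpr` will be useful; I doubt you'll have any use for rvalue references, since your class will be cheap to copy in the first place. Use plenty of inlining, keep your constructors and destructors trivial, and you should see good performance. – 2012-03-12 05:01:50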
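The "keep it trivial" advice can be checked at compile time. The sketch below uses a hypothetical minimal wrapper named `Wrapped` and a few standard `<type_traits>` checks; note that, as far as I know, libstdc++ only gained `std::is_trivially_copyable` in GCC 5, so older toolchains may need to drop that line.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical minimal wrapper, standing in for the real fixed-point class.
struct Wrapped {
    std::int32_t raw;
    constexpr explicit Wrapped(std::int32_t r) : raw(r) {}
};

// The wrapper should be exactly as large as the integer it holds.
static_assert(sizeof(Wrapped) == sizeof(std::int32_t), "no padding or hidden members");
// Copies are plain bit copies, so passing by value stays cheap.
static_assert(std::is_trivially_copyable<Wrapped>::value, "trivially copyable");
// No destructor work is generated when a value goes out of scope.
static_assert(std::is_trivially_destructible<Wrapped>::value, "trivially destructible");
// Layout matches a plain C struct holding one int.
static_assert(std::is_standard_layout<Wrapped>::value, "standard layout");
```

If someone later adds a virtual function or a non-trivial destructor, these assertions fail at compile time instead of the cost showing up silently.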

What's interesting about this question is that it is so compiler dependent. Using Clang/LLVM:

#include <iostream> 
using namespace std; 

inline int foo(int a) { return a << 1; } 

struct Bar 
{ 
    int a; 

    Bar(int x) : a(x) {} 

    Bar baz() { return a << 1; } 
}; 

void out(int x) __attribute__ ((noinline)); 
void out(int x) { cout << x; } 

void out(Bar x) __attribute__ ((noinline)); 
void out(Bar x) { cout << x.a; } 

void f1(int x) __attribute__ ((noinline));
void f1(int x) { out(foo(x)); }

void f2(Bar b) __attribute__ ((noinline));
void f2(Bar b) { out(b.baz()); }

int main(int argc, char** argv) 
{ 
    f1(argc); 
    f2(argc); 
}

gives the following IR:

define void @_Z3outi(i32 %x) uwtable noinline { 
    %1 = tail call %"class.std::basic_ostream"* 
       @_ZNSolsEi(%"class.std::basic_ostream"* @_ZSt4cout, i32 %x) 
    ret void 
} 

define void @_Z3out3Bar(i32 %x.coerce) uwtable noinline { 
    %1 = tail call %"class.std::basic_ostream"* 
       @_ZNSolsEi(%"class.std::basic_ostream"* @_ZSt4cout, i32 %x.coerce) 
    ret void 
} 

define void @_Z2f1i(i32 %x) uwtable noinline { 
    %1 = shl i32 %x, 1 
    tail call void @_Z3outi(i32 %1) 
    ret void 
} 

define void @_Z2f23Bar(i32 %b.coerce) uwtable noinline { 
    %1 = shl i32 %b.coerce, 1 
    tail call void @_Z3out3Bar(i32 %1) 
    ret void 
}

And unsurprisingly, the generated assembly is identical:

.globl _Z2f1i 
    .align 16, 0x90 
    .type _Z2f1i,@function 
_Z2f1i:         # @_Z2f1i 
.Ltmp6: 
    .cfi_startproc 
# BB#0: 
    addl %edi, %edi 
    jmp _Z3outi     # TAILCALL 
.Ltmp7: 
    .size _Z2f1i, .Ltmp7-_Z2f1i 
.Ltmp8: 
    .cfi_endproc 
.Leh_func_end2: 


    .globl _Z2f23Bar 
    .align 16, 0x90 
    .type _Z2f23Bar,@function 
_Z2f23Bar:        # @_Z2f23Bar 
.Ltmp9: 
    .cfi_startproc 
# BB#0: 
    addl %edi, %edi 
    jmp _Z3out3Bar    # TAILCALL 
.Ltmp10: 
    .size _Z2f23Bar, .Ltmp10-_Z2f23Bar 
.Ltmp11: 
    .cfi_endproc 
.Leh_func_end3:

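In general, as long as the methods of the class are inlined, the `this` parameter and references can easily be elided. I don't quite see how GCC could mess this up.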

Source

2012-03-12 09:17:48

"How I Learned to Stop Worrying and Trust the Compiler", I guess. – metatheorem 2012-03-15 03:43:18

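@metatheorem: In general you should trust the compiler with the details; to improve performance, you are better off focusing your effort on algorithms and I/O (disk, network, database...). If the computation is slow, then you will have to profile it, check whether the bottleneck is CPU or memory, and let the profiler guide your attempts to improve things there; but having to go down to this level is very rare. – 2012-03-15 08:38:10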

Implementing fixed-point arithmetic with value semantics will yield poorer performance, because...

#include <iostream> 
using namespace std; 

inline int foo(int a) { return a << 1; } 

struct Bar 
{ 
    int a; 

    Bar(int x) : a(x) {} 

    Bar baz() { return a << 1; } 
}; 

void out(int x) __attribute__ ((noinline)); 
void out(int x) { cout << x; } 

void out(Bar x) __attribute__ ((noinline)); 
void out(Bar x) { cout << x.a; } 

void f1(int x) __attribute__ ((noinline));
void f1(int x) { out(foo(x)); }

void f2(Bar b) __attribute__ ((noinline));
void f2(Bar b) { out(b.baz()); }

int main(int argc, char** argv) 
{ 
    f1(argc); 
    f2(argc); 
}

Now, let's look at the disassembly of f1 and f2...

00000000004006e0 <f1(int)>: 
    4006e0: 01 ff     add edi,edi 
    4006e2: e9 d9 ff ff ff   jmp 4006c0 <out(int)> 
    4006e7: 66 0f 1f 84 00 00 00 nop WORD PTR [rax+rax*1+0x0] 
    4006ee: 00 00 

00000000004006f0 <f2(Bar)>: 
    4006f0: 48 83 ec 08    sub rsp,0x8 
    4006f4: 01 ff     add edi,edi 
    4006f6: e8 d5 ff ff ff   call 4006d0 <out(Bar)> 
    4006fb: 48 83 c4 08    add rsp,0x8 
    4006ff: c3      ret

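As you can see, f2 does some extra fiddling with the stack pointer, which also prevents the call from being turned into a tail jump, so the ret cannot be eliminated.

(This is with g++ 4.6.1 at -O3.)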

Source

2012-03-12 06:04:39

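I think you should make your point more explicit with respect to the original question. I suppose you mean "implementing fixed-point arithmetic with value semantics will yield poorer performance, because..." – 2012-03-12 06:17:03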

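Half the time I see results like this, there is some subtle issue I've overlooked (which can be corrected once it is found), rather than simply "the compiler doesn't produce code as good as you think it should". ... – Hurkyl 2012-03-12 06:47:11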

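This is a concise, if vague, answer. I got the same results after disassembling and playing with your code. Any idea what is preventing the optimization? – metatheorem 2012-03-12 08:36:16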
