2017-05-26

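LLVM: How do I keep track of the data type of a Value* at run time for a typeless language?

I am implementing a typeless programming language that uses LLVM to generate the backend code. To keep track of the current type of a particular variable, I use a struct, StructTy_struct_datatype_t, which is defined as: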

PointerTy_8 = PointerType::get(IntegerType::get(TheContext, 8), 0); 

StructTy_struct_datatype_t = StructType::create(TheContext, "struct.datatype_t"); 
std::vector<Type *> StructTy_struct_datatype_t_fields; 
StructTy_struct_datatype_t_fields.push_back(IntegerType::get(TheContext, 32)); 
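StructTy_struct_datatype_t_fields.push_back(PointerTy_8);
StructTy_struct_datatype_t->setBody(StructTy_struct_datatype_t_fields, /*isPacked=*/false); // attach the fields to the named struct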

// which represents the struct 
typedef struct datatype_t { 
    int type; // holds an integer that tells me the type (1 = int, 2 = float, ...) 
    void* v; // holds a pointer to the actual value 
} datatype_t; 

Then, suppose I have a function like this:

def function_add(a, b) { 
    return a + b; 
} 

I want this function to be able to accept:

  • function_add(1, 1); // returns 2; (int)
  • function_add(1.0, 1.0); // returns 2.0 (float)
  • function_add("str1", "str2"); // returns "str1str2" (string)

The code that handles the binary operation, i.e. a + b, is as follows:

Value* L = lhs_codegen_elements.back(); 
Value* R = rhs_codegen_elements.back(); 

if (!L || !R) { 
    logError("L or R are undefined"); 
    return codegen; 
} 

AllocaInst* lptr_datatype = (AllocaInst*)((LoadInst*)L)->getPointerOperand(); 
AllocaInst* rptr_datatype = (AllocaInst*)((LoadInst*)R)->getPointerOperand(); 

ConstantInt* const_int32_0 = ConstantInt::get(TheContext, APInt(32, StringRef("0"), 10)); 
ConstantInt* const_int32_1 = ConstantInt::get(TheContext, APInt(32, StringRef("1"), 10)); 

GetElementPtrInst* lptr_type = 
    GetElementPtrInst::Create(StructTy_struct_datatype_t, lptr_datatype, {const_int32_0, const_int32_0}, "type"); 
GetElementPtrInst* rptr_type = 
    GetElementPtrInst::Create(StructTy_struct_datatype_t, rptr_datatype, {const_int32_0, const_int32_0}, "type"); 

GetElementPtrInst* lptr_v = 
    GetElementPtrInst::Create(StructTy_struct_datatype_t, lptr_datatype, {const_int32_0, const_int32_1}, "v"); 
GetElementPtrInst* rptr_v = 
    GetElementPtrInst::Create(StructTy_struct_datatype_t, rptr_datatype, {const_int32_0, const_int32_1}, "v"); 

LoadInst* lload_inst_type = load_inst_codegen(TYPE_INT, lptr_type); 
LoadInst* rload_inst_type = load_inst_codegen(TYPE_INT, rptr_type); 

LoadInst* lload_inst_v = load_inst_codegen(TYPE_VOID_POINTER, lptr_v); 
LoadInst* rload_inst_v = load_inst_codegen(TYPE_VOID_POINTER, rptr_v); 

CmpInst* cond1 = 
    new ICmpInst(ICmpInst::ICMP_EQ, lload_inst_type, ConstantInt::get(TheContext, APInt(32, TYPE_DOUBLE))); 

Function* function_bb = bb->getParent(); // bb is a BasicBlock, so dyn_cast<Function>(bb) would always yield nullptr

BasicBlock* label_if_then_double = BasicBlock::Create(TheContext, "if.then.double", function_bb); 
BasicBlock* label_if_then_long = BasicBlock::Create(TheContext, "if.then.long", function_bb); 

BranchInst* branch_inst = BranchInst::Create(label_if_then_double, label_if_else, cond1, bb); 

L->dump(); // %load_inst = load %struct.datatype_t, %struct.datatype_t* %alloca_datatype_v, align 8 
R->dump(); // %load_inst = load %struct.datatype_t, %struct.datatype_t* %alloca_datatype_v1, align 8 

L->getType()->dump(); // %struct.datatype_t = type { i32, i8* } 
R->getType()->dump(); // %struct.datatype_t = type { i32, i8* } 

lload_inst_type->dump(); // %load_inst = load i32, i32* %type, align 4 
rload_inst_type->dump(); // %load_inst = load i32, i32* %type, align 4 

lload_inst_v->dump(); // %load_inst = load i8*, i8** %v, align 8 
rload_inst_v->dump(); // %load_inst = load i8*, i8** %v, align 8 

if (op == '+') { 
    // issue: how to take the decision without knowing the type lload_inst_v holds 
    BinaryOperator::Create(Instruction::FAdd, lload_inst_v, rload_inst_v, "add", label_if_then_double); 
    // or 
    BinaryOperator::Create(Instruction::Add, lload_inst_v, rload_inst_v, "add", label_if_then_long); 
} 

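So the problem is that I need to know which type lload_inst_type and rload_inst_type hold, so that I can switch between the LLVM API calls, for example BinaryOperator::Create(Instruction::FAdd, ...) for floats and BinaryOperator::Create(Instruction::Add, ...) for ints.

However, I have just realized that I cannot find out the value held by an AllocaInst or a LoadInst while generating the backend code (at least I do not know how to do it).

  • How can I keep track of the data type of a Value* at run time?
  • Have I chosen the wrong strategy for implementing a typeless language?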

Answer

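If your source language is typeless (dynamically typed), that has to be hidden from LLVM, because LLVM's IR is statically typed. You will have to design a way to keep track of types at run time, perhaps some kind of enum-tagged object system. The code for each operation or function call then has to inspect the tags of the values passed in at run time and choose the appropriate implementation to call.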
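As a rough illustration of such an enum-tagged object system, here is a minimal sketch in plain C++ with no LLVM involved. The names Tag, Boxed, make_int, make_double, make_string and runtime_add are hypothetical and not part of LLVM or of the question's code. Rejecting mixed operand types is also just an assumption; a real language might promote int to double instead.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical runtime tags; the set of types is an assumption for this sketch.
enum class Tag { Int, Double, String };

// A boxed runtime value: the tag says which member is meaningful.
struct Boxed {
    Tag tag = Tag::Int;
    long i = 0;
    double d = 0.0;
    std::string s;
};

Boxed make_int(long v)           { Boxed b; b.tag = Tag::Int;    b.i = v;            return b; }
Boxed make_double(double v)      { Boxed b; b.tag = Tag::Double; b.d = v;            return b; }
Boxed make_string(std::string v) { Boxed b; b.tag = Tag::String; b.s = std::move(v); return b; }

// The "+" of the source language: inspect both tags at run time
// and pick the matching operation.
Boxed runtime_add(const Boxed& a, const Boxed& b) {
    if (a.tag != b.tag)
        throw std::runtime_error("operand types differ");
    switch (a.tag) {
        case Tag::Int:    return make_int(a.i + b.i);
        case Tag::Double: return make_double(a.d + b.d);
        case Tag::String: return make_string(a.s + b.s);
    }
    throw std::runtime_error("unknown tag");
}
```

The key point is that the choice between integer addition, floating-point addition and string concatenation is made by a branch on the tag while the program runs, not by the compiler while it emits code.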

LLVM does not provide any of this functionality; it has to be handled by your language's own run-time type system.
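One possible shape for that run-time support is a small helper library with a C ABI that the generated code calls, so the backend never has to choose between FAdd and Add at compile time. The sketch below reuses the { i32, i8* } layout from the question. The function name rt_add, the tag constant TAG_STRING = 3, the use of long and double as payload types, and the malloc-based allocation are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Tag values follow the comment in the question (1 = int, 2 = float);
// TAG_STRING = 3 is an assumption made for this sketch.
enum { TAG_INT = 1, TAG_FLOAT = 2, TAG_STRING = 3 };

// Same layout as %struct.datatype_t = type { i32, i8* }.
extern "C" typedef struct datatype_t {
    int type;   // runtime type tag
    void* v;    // pointer to the actual value
} datatype_t;

// Hypothetical runtime entry point: stores lhs + rhs in *out.
// Returns 0 on success, -1 on mismatched or unknown tags or failed allocation.
extern "C" int rt_add(const datatype_t* lhs, const datatype_t* rhs, datatype_t* out) {
    if (lhs->type != rhs->type) return -1;
    switch (lhs->type) {
    case TAG_INT: {
        long* r = static_cast<long*>(std::malloc(sizeof(long)));
        if (!r) return -1;
        *r = *static_cast<const long*>(lhs->v) + *static_cast<const long*>(rhs->v);
        out->type = TAG_INT; out->v = r;
        return 0;
    }
    case TAG_FLOAT: {
        double* r = static_cast<double*>(std::malloc(sizeof(double)));
        if (!r) return -1;
        *r = *static_cast<const double*>(lhs->v) + *static_cast<const double*>(rhs->v);
        out->type = TAG_FLOAT; out->v = r;
        return 0;
    }
    case TAG_STRING: {
        const char* a = static_cast<const char*>(lhs->v);
        const char* b = static_cast<const char*>(rhs->v);
        std::size_t la = std::strlen(a), lb = std::strlen(b);
        char* r = static_cast<char*>(std::malloc(la + lb + 1));
        if (!r) return -1;
        std::memcpy(r, a, la);
        std::memcpy(r + la, b, lb + 1);  // also copies the trailing '\0'
        out->type = TAG_STRING; out->v = r;
        return 0;
    }
    default:
        return -1;
    }
}
```

With a helper like this, the code generator for a + b would declare rt_add in the module and emit a call that passes pointers to the two datatype_t values plus a result slot. Ownership of the allocated payloads is ignored here; a real runtime would need garbage collection or reference counting.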