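OrientDB slow edge creation with ETL

I am new to OrientDB, with a little experience of Neo4j, and I am running into performance problems when loading and creating edges with the OETL.BAT tool. I need to create approximately 4.4 million edges between nodes (about 42 million, not all of which are being used at this stage). The "Customer" vertices are already loaded. The edge list I am loading is very simple (see below), just a source and a target ID for each edge, and is meant to model payments between customers. According to the ETL tool, my current throughput is 23-30 per second. I am using a CSV file rather than a JDBC connection to my RDBMS, and I am running in "plocal" mode.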
Is there a faster way to do this? Or am I perhaps going about it the wrong way?
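To put the reported throughput in perspective, here is a rough back-of-the-envelope estimate of the total load time. It is only a sketch: it assumes the rate stays constant at 23-30 edges per second for the whole run, and the function name `estimate_hours` is made up for illustration.

```python
def estimate_hours(edge_count: int, edges_per_second: float) -> float:
    """Return the estimated load time in hours at a constant rate (an assumption)."""
    seconds = edge_count / edges_per_second
    return seconds / 3600


EDGES = 4_400_000  # approximate number of edges to create

if __name__ == "__main__":
    # Best and worst case of the throughput reported by the ETL tool.
    for rate in (30, 23):
        print(f"{rate} edges/s -> about {estimate_hours(EDGES, rate):.1f} hours")
```

At these rates the import would run for roughly 41 to 53 hours, which is why I am looking for a faster approach.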
Customer - Vertex: CISNumber, Name
PAID - Edge: SourceCISNumber, DestCISNumber, Amount, TransactionCount
Thanks
{
  "source": { "file": { "path": "/datafiles/PersonalCustomers/Edges.csv" } },
  "extractor": { "row": {} },
  "transformers": [
    {"csv": {} },
    {"merge": {"joinFieldName": "SourceCISNumber", "lookup": "Customer.CISNumber"} },
    {"vertex": {"class": "Customer", "skipDuplicates": true} },
    { "edge":
      {
        "class": "PAID",
        "joinFieldName": "DestCISNumber",
        "lookup": "Customer.CISNumber",
        "unresolvedLinkAction": "SKIP",
        "edgeFields":
        {
          "Volume": "${input.Transactioncount}",
          "Value": "${input.Amount}"
        }
      }
    },
    {"field": {"fieldNames": ["SourceCISNumber", "DestCISNumber", "Transactioncount", "Amount"], "operation": "remove" } }
  ],
  "loader": {
    "orientdb": {
      "dbURL": "plocal:/orientdb/databases/Customers",
      "dbType": "graph",
      "batchCommit": 500,
      "useLightweightEdges" : true,
      "classes": [
        {"name": "PAID", "extends": "E"},
      ]
    },
    "indexes": [
      {"class":"Customer", "fields":["CISNumber:long"] }
    ]
  }
}
You can take a look at this [question](http://stackoverflow.com/questions/37053190/orientdb-fastest-batchimport/37065876#37065876).