2016-03-03 92 views
1

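Neo4j: OutOfMemoryError when running a Cypher query from neo4j-shell

I have a Neo4j database with about 130K nodes and roughly 17M relationships between them. My computer runs Windows 10 with 16GB of RAM, of which 10GB (max) is allocated to the Neo4j-shell heap.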

I want to run a query with neo4j-shell from the command prompt and redirect the results to a csv file. The command I use is:

Neo4jShell -v -file query.cql > results.csv 

The query has the form:

MATCH (subject)-[:type1]->(intNode1:label)<-[:type2]-(intNode2:label)<-[:type3]-(object) RETURN subject.property1, object.property1; 

The problem is that whenever I run this query, I get an OutOfMemory error (see the error message at the bottom).

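Does anyone have suggestions for how to run a query like this successfully? Given the size of the graph database, I feel that 10GB of RAM should be enough for this kind of query.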

The error message I get is:

ERROR (-v for expanded information): 
     Error occurred in server thread; nested exception is: 
     java.lang.OutOfMemoryError: GC overhead limit exceeded 
java.rmi.ServerError: Error occurred in server thread; nested exception is: 
     java.lang.OutOfMemoryError: GC overhead limit exceeded 
     at sun.rmi.server.UnicastServerRef.dispatch(Unknown Source) 
     at sun.rmi.transport.Transport$2.run(Unknown Source) 
     at sun.rmi.transport.Transport$2.run(Unknown Source) 
     at java.security.AccessController.doPrivileged(Native Method) 
     at sun.rmi.transport.Transport.serviceCall(Unknown Source) 
     at sun.rmi.transport.tcp.TCPTransport.handleMessages(Unknown Source) 
     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(Unknown Source) 
     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.access$400(Unknown Source) 
     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$1.run(Unknown Source) 
     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$1.run(Unknown Source) 
     at java.security.AccessController.doPrivileged(Native Method) 
     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(Unknown Source) 
     at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) 
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) 
     at java.lang.Thread.run(Unknown Source) 
     at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(Unknown Source) 
     at sun.rmi.transport.StreamRemoteCall.executeCall(Unknown Source) 
     at sun.rmi.server.UnicastRef.invoke(Unknown Source) 
     at java.rmi.server.RemoteObjectInvocationHandler.invokeRemoteMethod(Unknown Source) 
     at java.rmi.server.RemoteObjectInvocationHandler.invoke(Unknown Source) 
     at com.sun.proxy.$Proxy1.interpretLine(Unknown Source) 
     at org.neo4j.shell.impl.AbstractClient.evaluate(AbstractClient.java:149) 
     at org.neo4j.shell.impl.AbstractClient.evaluate(AbstractClient.java:133) 
     at org.neo4j.shell.StartClient.executeCommandStream(StartClient.java:393) 
     at org.neo4j.shell.StartClient.grabPromptOrJustExecuteCommand(StartClient.java:372) 
     at org.neo4j.shell.StartClient.startRemote(StartClient.java:330) 
     at org.neo4j.shell.StartClient.start(StartClient.java:196) 
     at org.neo4j.shell.StartClient.main(StartClient.java:135) 
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded 

Answers

0

The solution when I had this problem was to increase the following settings in neo4j.properties:

# Default values for the low-level graph engine 

    cache_type=none 
    neostore.nodestore.db.mapped_memory=50M 
    neostore.relationshipstore.db.mapped_memory=500M 
    neostore.propertystore.db.mapped_memory=100M 
    neostore.propertystore.db.strings.mapped_memory=100M 
    neostore.propertystore.db.arrays.mapped_memory=0M 

Also try increasing these lines in neo4j-wrapper.conf:

#wrapper.java.initmemory=2048 
#wrapper.java.maxmemory=2048 

One more thing: do you have indexes on the nodes you are querying?
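For reference, a minimal sketch of creating an index by hand, assuming the Neo4j 2.x/3.x schema syntax and reusing the placeholder names `label` and `property1` from the question. Note that an index like this speeds up lookups by property value; the query in the question does not filter on a property, so it may not change much there.

```cypher
// Assumption: Neo4j 2.x/3.x syntax; `label` and `property1` are the
// placeholder names from the question, not real schema names.
CREATE INDEX ON :label(property1);
```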

+0

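I just imported the database with Neo4j's import tool. I'm not sure whether that creates indexes, or whether I need to do it manually? Thanks for the suggestions. I'll try them when I get to my home computer and see whether they work. – 12MonthsASlav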

+0

Also, how much RAM does your computer have? That way I know how to set my memory settings. – 12MonthsASlav

+0

I added #wrapper.java.initmemory = 2048 #wrapper.java.maxmemory = 2048 – NKD

0

You can give Neo4jShell more heap (using the JAVA_OPTS=-Xmx4G -Xms4G -Xmn1G environment variable).

Did you try running the query with PROFILE? Since you don't have any restrictions, I think you could be spanning billions of paths.
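As a rough illustration of the scale: 17M relationships across 130K nodes averages out to about 130 relationships per node, so each extra hop in the pattern can multiply the number of candidate paths by a factor of that order. Below is a hedged sketch of how the plan could be inspected, assuming a Neo4j version that supports both the EXPLAIN and PROFILE keywords; the LIMIT value is an arbitrary choice for illustration.

```cypher
// EXPLAIN shows the planned operators and estimated row counts
// without actually running the query.
EXPLAIN
MATCH (subject)-[:type1]->(intNode1:label)<-[:type2]-(intNode2:label)<-[:type3]-(object)
RETURN subject.property1, object.property1;

// PROFILE runs the query and reports actual rows and db hits per operator.
// The LIMIT (an assumed value) keeps the run small enough to finish.
PROFILE
MATCH (subject)-[:type1]->(intNode1:label)<-[:type2]-(intNode2:label)<-[:type3]-(object)
RETURN subject.property1, object.property1
LIMIT 1000;
```

Because of the LIMIT, the profiled row counts only cover part of the work, but the plan still shows which operator starts the scan and where the row count grows fastest.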

You are missing labels on subject and object, which causes the query planner to run a full graph scan.

MATCH (subject:label)-[:type1]->(intNode1:label) 
       <-[:type2]-(intNode2:label) 
       <-[:type3]-(object:label) 
WITH distinct subject, object 
RETURN subject.property1, object.property1; 

You should also reduce the intermediate cardinality and the output cardinality.

MATCH (subject:label)-[:type1]->(intNode1:label) 
       <-[:type2]-(intNode2:label) 
WITH distinct subject, intNode2 
MATCH (intNode2)<-[:type3]-(object:label) 
WITH distinct subject, object 
RETURN subject.property1, object.property1; 

Even better:

MATCH (subject:label)-[:type1]->(intNode1:label) 
       <-[:type2]-(intNode2:label) 
WITH intNode2, collect(distinct subject) as subjects 
MATCH (intNode2)<-[:type3]-(object:label) 
WITH distinct object, subjects 
UNWIND subjects as subject 
RETURN subject.property1, object.property1; 
+1

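Right, I changed the heap size to 10GB, but that is about as much as my computer can handle. I deliberately left the labels off subject and object because I don't know beforehand what they should be. The goal of the query is to find all unique combinations of subjects and objects involved in the specified pattern and return their properties. Also, I did not run the query with PROFILE. How would that help exactly? – 12MonthsASlav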