Fisher-Yates shuffle in Python
I have the following dataset (this is a sample):
ID Sub1 Sub2 Sub3 Sub4
Creb3l1 10.14 9.67 10.14 10.42
Chchd6 11.25 10.74 10.80 11.07
Arih1 9.91 9.25 10.20 9.34
Prpf8 11.54 11.58 11.14 11.36
Rfng 11.71 11.56 10.81 10.72
Rnf114 12.66 12.60 12.59 12.56
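I want to perform a Fisher-Yates shuffle on this dataset 10 times (i.e. write 10 output files, each containing one randomisation of the data produced with the Fisher-Yates shuffle).
I wrote this code: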
import sys
import itertools
from itertools import permutations

for line in open(sys.argv[1]).readlines()[2:]:
    line = line.strip().split()
    ID = line[0]
    expression_values = line[1:]
    for shuffle in permutations(expression_values):
        print shuffle
The output of this code looks like this (sample):
('11.25', '10.74', '10.80', '11.07')
('11.25', '10.74', '11.07', '10.80')
('11.25', '10.80', '10.74', '11.07')
('11.25', '10.80', '11.07', '10.74')
('11.25', '11.07', '10.74', '10.80')
('11.25', '11.07', '10.80', '10.74')
('10.74', '11.25', '10.80', '11.07')
('10.74', '11.25', '11.07', '10.80')
('10.74', '10.80', '11.25', '11.07')
('10.74', '10.80', '11.07', '11.25')
('10.74', '11.07', '11.25', '10.80')
('10.74', '11.07', '10.80', '11.25')
('10.80', '11.25', '10.74', '11.07')
('10.80', '11.25', '11.07', '10.74')
('10.80', '10.74', '11.25', '11.07')
('10.80', '10.74', '11.07', '11.25')
('10.80', '11.07', '11.25', '10.74')
('10.80', '11.07', '10.74', '11.25')
('11.07', '11.25', '10.74', '10.80')
('11.07', '11.25', '10.80', '10.74')
('11.07', '10.74', '11.25', '10.80')
('11.07', '10.74', '10.80', '11.25')
('11.07', '10.80', '11.25', '10.74')
('11.07', '10.80', '10.74', '11.25')
('9.91', '9.25', '10.20', '9.34')
('9.91', '9.25', '9.34', '10.20')
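I am having trouble with the specific step of producing one block of randomised data (e.g. one set of 7 Fisher-Yates randomised lines that I can write to a file). If someone could show me how to edit the code above to generate 10 output files, each containing 7 lines of text (i.e. the same number of lines as the input file) and each holding one Fisher-Yates shuffled set of values, I would appreciate it.
Edit 1: I have tried this a few different ways, for example with the code below: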
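For reference, here is a minimal sketch of the Fisher-Yates shuffle applied to a single row of expression values. The function name `fisher_yates_shuffle` and its `rng` parameter are illustrative assumptions, not part of the code above. In practice the standard library's `random.shuffle` would usually serve the same purpose.

```python
import random

def fisher_yates_shuffle(values, rng=random):
    """Return a new list holding `values` in a random order (hypothetical helper)."""
    shuffled = list(values)  # copy so the caller's data is left untouched
    # Walk from the last index down to 1, swapping each position with a
    # randomly chosen position at or before it.
    for i in range(len(shuffled) - 1, 0, -1):
        j = rng.randint(0, i)  # randint is inclusive on both ends
        shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
    return shuffled

row = ['11.25', '10.74', '10.80', '11.07']  # the Chchd6 values from the sample
shuffled_row = fisher_yates_shuffle(row)
print(shuffled_row)
```

Each call returns one randomised ordering of the row, which is different from `itertools.permutations`, which enumerates every possible ordering.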
for line in open(sys.argv[1]).readlines()[2:]:
    line = line.strip().split()
    gene_name = line[0]
    expression_values = line[1:]
    RandomList = []
    for shuffle in permutations(expression_values):
        while len(RandomList) <10:
            RandomList.append(shuffle)
    print RandomList
I thought this would give me back 10 randomisations per line. Instead it gives me back the same ordering, repeated 10 times, for each line:
[('11.25', '10.74', '10.80', '11.07'), ('11.25', '10.74', '10.80', '11.07'), ('11.25', '10.74', '10.80', '11.07'), ('11.25', '10.74', '10.80', '11.07'), ('11.25', '10.74', '10.80', '11.07'), ('11.25', '10.74', '10.80', '11.07'), ('11.25', '10.74', '10.80', '11.07'), ('11.25', '10.74', '10.80', '11.07'), ('11.25', '10.74', '10.80', '11.07'), ('11.25', '10.74', '10.80', '11.07')]
[('9.91', '9.25', '10.20', '9.34'), ('9.91', '9.25', '10.20', '9.34'), ('9.91', '9.25', '10.20', '9.34'), ('9.91', '9.25', '10.20', '9.34'), ('9.91', '9.25', '10.20', '9.34'), ('9.91', '9.25', '10.20', '9.34'), ('9.91', '9.25', '10.20', '9.34'), ('9.91', '9.25', '10.20', '9.34'), ('9.91', '9.25', '10.20', '9.34'), ('9.91', '9.25', '10.20', '9.34')]
[('11.54', '11.58', '11.14', '11.36'), ('11.54', '11.58', '11.14', '11.36'), ('11.54', '11.58', '11.14', '11.36'), ('11.54', '11.58', '11.14', '11.36'), ('11.54', '11.58', '11.14', '11.36'), ('11.54', '11.58', '11.14', '11.36'), ('11.54', '11.58', '11.14', '11.36'), ('11.54', '11.58', '11.14', '11.36'), ('11.54', '11.58', '11.14', '11.36'), ('11.54', '11.58', '11.14', '11.36')]
[('11.71', '11.56', '10.81', '10.72'), ('11.71', '11.56', '10.81', '10.72'), ('11.71', '11.56', '10.81', '10.72'), ('11.71', '11.56', '10.81', '10.72'), ('11.71', '11.56', '10.81', '10.72'), ('11.71', '11.56', '10.81', '10.72'), ('11.71', '11.56', '10.81', '10.72'), ('11.71', '11.56', '10.81', '10.72'), ('11.71', '11.56', '10.81', '10.72'), ('11.71', '11.56', '10.81', '10.72')]
[('12.66', '12.60', '12.59', '12.56'), ('12.66', '12.60', '12.59', '12.56'), ('12.66', '12.60', '12.59', '12.56'), ('12.66', '12.60', '12.59', '12.56'), ('12.66', '12.60', '12.59', '12.56'), ('12.66', '12.60', '12.59', '12.56'), ('12.66', '12.60', '12.59', '12.56'), ('12.66', '12.60', '12.59', '12.56'), ('12.66', '12.60', '12.59', '12.56'), ('12.66', '12.60', '12.59', '12.56')]
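A likely explanation for the repeated tuples, shown as a small sketch. `permutations` yields orderings in a fixed sequence that starts with the input order itself, and the inner `while` loop fills the list completely during the first pass of the `for` loop. The names `repeated` and `random_list` are illustrative, and using `random.shuffle` on a copy of the row is one possible alternative rather than the only fix.

```python
import random
from itertools import permutations

row = ['11.25', '10.74', '10.80', '11.07']

# permutations() is deterministic: its first result is the input order itself.
first = next(permutations(row))

# Reproduce the loop from Edit 1: the while loop runs to completion during
# the first pass of the for loop, so every entry is the first permutation.
repeated = []
for shuffle in permutations(row):
    while len(repeated) < 10:
        repeated.append(shuffle)

# One alternative: shuffle a fresh copy of the row for each randomisation.
random_list = []
for _ in range(10):
    copy = list(row)
    random.shuffle(copy)  # shuffles the list in place
    random_list.append(tuple(copy))
```

With only four values there are 24 possible orderings, so some of the 10 random draws in `random_list` may coincide.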
Edit 2: Sean: thank you very much for your help. I do know how to write to a file in general, for example I can say:
for i in range(10):
    output_file = "random." + str(i)
    open_output_file = open(output_file, 'a')
    ***for each line of the randomised array***:
        open_output_file.write(line + "\n")
    open_output_file.close()
The problem I have with writing the files is that I cannot even get what I want printed to the screen first. For example, if I run this code:
import sys
import itertools
from itertools import permutations

for i in range(10):
    for line in open(sys.argv[1]).readlines()[2:]:
        line = line.strip().split()
        gene_name = line[0]
        expression_values = line[1:]
        for shuffle in permutations(expression_values):
            print shuffle[:6]
    print "***"
    i +=1
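I would expect the output to be 7 randomised lines, followed by "***", then another 7 randomised lines, and so on 10 times. Instead it prints every permutation of every line.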
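One way the desired result could be produced is sketched below, under several assumptions: the header line is copied unchanged, the values are shuffled within each row while the ID column stays in place, and the output files follow the `random.0` to `random.9` naming from Edit 2. The function `write_shuffled_files` and its `output_dir` parameter are hypothetical names. The sketch relies on `random.shuffle`, which to my knowledge CPython implements as a Fisher-Yates shuffle.

```python
import os
import random
import tempfile

def write_shuffled_files(input_path, output_dir, n_files=10, prefix="random."):
    """Write n_files copies of the table, each with its row values shuffled."""
    with open(input_path) as handle:
        lines = [line.split() for line in handle if line.strip()]
    header, rows = lines[0], lines[1:]
    paths = []
    for i in range(n_files):
        path = os.path.join(output_dir, prefix + str(i))
        with open(path, "w") as out:
            out.write(" ".join(header) + "\n")
            for row in rows:
                values = row[1:]        # copy of everything after the ID column
                random.shuffle(values)  # in-place shuffle of this row only
                out.write(" ".join([row[0]] + values) + "\n")
        paths.append(path)
    return paths

# Small demonstration on a two-row sample written to a temporary directory.
demo_dir = tempfile.mkdtemp()
demo_input = os.path.join(demo_dir, "input.txt")
with open(demo_input, "w") as handle:
    handle.write("ID Sub1 Sub2 Sub3 Sub4\n"
                 "Creb3l1 10.14 9.67 10.14 10.42\n"
                 "Chchd6 11.25 10.74 10.80 11.07\n")
written = write_shuffled_files(demo_input, demo_dir)
```

In a script this might be called as `write_shuffled_files(sys.argv[1], ".")`, which would give one randomised copy of the table per output file rather than every permutation of every row.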
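Which part are you stuck on? Getting the groups of seven? Writing them to a file? There are answers for all of these things. – jonrsharpe
Thanks, I have edited the question. Yes, the output I get is 120 lines printed to the screen/written to a file. I am confused about how to get groups of 7, e.g. print one block of 7 lines at a time and write it to a file (and then do that 10 times). – user1288515
What have you tried? Making a list, perhaps? Acting when it reaches the appropriate length? If you have made an effort, show it. If you haven't, make one! Or just [do some research](http://stackoverflow.com/questions/3992735/python-generator-that-groups-another-iterable-into-groups-of-n). – jonrsharpe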
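The grouping idea raised in the comment above (collect items and act once a block reaches the desired length) could be sketched as a small generator. The name `groups_of` is hypothetical, and this is only one of several ways to chunk an iterable.

```python
from itertools import islice

def groups_of(iterable, n):
    """Yield successive lists of up to n items from iterable (hypothetical helper)."""
    iterator = iter(iterable)
    while True:
        group = list(islice(iterator, n))  # take the next n items, fewer at the end
        if not group:
            return
        yield group

# Fourteen items split into two blocks of seven.
blocks = list(groups_of(range(1, 15), 7))
```

Each yielded block could then be written to its own file, although shuffling every row once per file avoids the need for grouping altogether.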