Python: Append data from a separate file twice to the end of the data in an array

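So I need help with a fairly specific problem. I have .composite files (e.g. RR0063_0011.composite) whose second column (intensity) is read into an array, but before the array is transposed and saved I need to append the date (Modified Julian Date), taken from the second column of a separate file, twice to the end of each row. Example input files: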

Data (.composite) file:

Bin # ..... Intensity

1. -0.234987 
2. 0.87734 
... 
512. -0.65523

File with the Modified Julian Dates:

File the MJD is grabbed from ..... MJD

RR0063_0011.profs 55105.07946 
RR0023_0061.profs 53495.367377 
RR0022_0041.profs 53492.307631

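Here is the code that reads the data into an array and generates the mjd.txt file. All of this works so far; I just need to append the MJD twice to the end of the corresponding .composite row. I know very little Python, but this is my code so far.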

#!/usr/bin/python 
import sys 
import glob 
import numpy as np 
import os 

psrname = sys.argv[1] 
file_list = glob.glob('*.composite') 

cols = [1] 
data = [] 
for f in file_list: 
    # Split the filename from the extension to use later  
    filename = os.path.splitext('{0}'.format(f)) 
    data.append(np.loadtxt(f, usecols=cols)) 
    print data 

# Run 'vap' (a PSRCHIVE command) to grab the MJD from the .profs file for each observation and write out to a file called 'mjd.txt'
os.system('vap -nc mjd ../{0}/{0}.profs >> mjd.txt' .format(filename[0])) 

# Put the MJDs only (from 'mjd.txt') in an array called mjd_array 
mjd_array = np.genfromtxt('mjd.txt', dtype=[('filename_2','S40'),('mjd','f8')]) 

# Check if working 
print mjd_array['mjd'][7] 

arr = np.vstack(data) 

transposed_arr = np.transpose(arr) 
print transposed_arr 

fout = np.savetxt(psrname + '.total', transposed_arr, delimiter=' ')

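The MJDs are not in the same order as the .composite files, and at the end I need to sort all columns by MJD before saving.

Thanks for your help!

Desired output:

Intensity

.....

Intensity

MJD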

-0.234987 
2. 0.87734 
... 
-0.65523 
55105.07946 
55105.07946

Source

2017-07-28 astrokid

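Could you please provide examples of the two input files and the desired output file? – albert

@albert Edited for clarity! Thanks – astrokid

Assuming you don't need the extra 2. in your example output (probably a copy-and-paste error from the sample input), you can first read the dates from the date file and use them as a kind of look-up table: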

import os
import numpy as np


# function to read dates from generated mjd file
# and create look-up table (is a list of lists)
def read_mjd_file():
    with open('mjd.txt') as f:
        lines = f.read().splitlines()
    lines = [line.split() for line in lines]
    return lines


# function for date look-up
# filename is first list element, date is second
def get_date(base_name):
    for l in lines:
        if l[0].startswith(base_name):
            return l[1]


# function to read data from data file
def extract_data(file_name):
    with open(file_name) as f:
        data = np.loadtxt(f, usecols=[1])
    return data


# generate mjd file at first
# os.system(...)


# generate look-up table from mjd file
lines = read_mjd_file()


# walk through all files given in directory and filter for desired file type
for file_name in os.listdir():
    if file_name.endswith('.composite'):
        base_name = file_name.split('.')[0]
        date = get_date(base_name)
        data = extract_data(file_name)
        out = np.append(data, 2*[date])


print(out)

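Since this is only a proof of concept meant to give you a hint, you may adapt this approach to your specific needs. Personally, I prefer os.listdir() to glob.glob(). Also, I don't think you need numpy for this fairly simple task. Python's standard csv module should do the job as well. However, numpy's functions are more convenient. So if you need numpy for further tasks, you can keep it. If not, rewriting the snippet with the csv module shouldn't be a big deal.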
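As a rough illustration of the "no numpy" route, here is a minimal standard-library sketch. It uses plain str.split() rather than the csv module mentioned above, on the assumption that both files are whitespace-separated as in the samples. The function names read_mjd_table, read_second_column and build_row are made up for this example.

```python
import os


def read_mjd_table(path):
    """Map each base name (file name without extension) to its MJD string."""
    table = {}
    with open(path) as f:
        for line in f:
            fields = line.split()
            if len(fields) >= 2:
                base_name = os.path.splitext(fields[0])[0]
                table[base_name] = fields[1]
    return table


def read_second_column(path):
    """Return the second whitespace-separated column of a file as strings."""
    with open(path) as f:
        rows = (line.split() for line in f)
        return [fields[1] for fields in rows if len(fields) >= 2]


def build_row(composite_path, mjd_table):
    """Second column of one .composite file with its MJD appended twice."""
    base_name = os.path.splitext(os.path.basename(composite_path))[0]
    return read_second_column(composite_path) + 2 * [mjd_table[base_name]]
```

A dictionary keyed by base name replaces the linear search in get_date, and a missing entry raises a KeyError instead of silently returning None.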

mjd.txt looks like:

RR0063_0011.profs 55105.07946 
RR0023_0061.profs 53495.367377 
RR0022_0041.profs 53492.307631

RR0023_0061.composite looks like:

1. -0.234987 
2. 0.87734 
512. -0.65523

The output (variable out) is an np.array:

['-0.234987' '0.87734' '-0.65523' '53495.367377' '53495.367377']
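The array above holds strings because the date was read as text. The question also asks for everything to be sorted by MJD before saving, so here is one possible sketch of that step. It assumes every row has the same length (intensities followed by the MJD twice). The helper name stack_and_sort and the second row's intensities are invented for illustration.

```python
import numpy as np


def stack_and_sort(rows):
    """Stack equal-length rows and sort them by the last column (the MJD)."""
    arr = np.array(rows, dtype=float)   # strings such as '0.87734' become floats
    order = np.argsort(arr[:, -1])      # row order that sorts by MJD
    return arr[order]


# Two example rows: three intensities followed by the MJD twice.
# The intensities in the second row are made up.
rows = [
    ['-0.234987', '0.87734', '-0.65523', '55105.07946', '55105.07946'],
    ['0.1', '0.2', '0.3', '53495.367377', '53495.367377'],
]
sorted_arr = stack_and_sort(rows)
print(sorted_arr)
```

Sorting the rows before transposing gives the same result as sorting the columns afterwards, so sorted_arr.T could then be passed to np.savetxt in the same way as transposed_arr in the question's script.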

Source

2017-07-28 15:06:29 albert

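Thank you very much! But I'm a bit confused: the file containing the data I need is a .composite file. The composite files don't contain the MJD, which is why I grab it from the .profs files in other directories. – astrokid

@astrokid: It seems I still don't understand your file structure and the specific file contents. How many files do you need to look at to get all the information you need? Do the .profs files and the .composite files share the same file name (with different extensions) and form a pair? If so, you need to look up the timestamp by the file's base name rather than the full file name. Could you clarify? If necessary, I'll change my snippet after your clarification. – albert

So, in the directory I'm in I only have the composite files, which is the first file I gave. The MJD file is created in the line containing os.system, where the profs file is accessed in another directory specific to the name of the file currently in the queue. For example: open RR0001_0001.composite, read the second column, vap grabs the MJD from the RR0001_0001.profs file in directory RR0001_0001, then move on to RR0002_0002 and so on. – astrokid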
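The pairing described in this last comment (one .profs file per .composite file, in a sibling directory named after the base name) could be sketched as follows. The helper vap_command is hypothetical. It only builds the argument list, reusing the arguments from the os.system call in the question, and does not run anything.

```python
import os


def vap_command(composite_name):
    """Build the vap call for the .profs file paired with one .composite file."""
    base_name = os.path.splitext(composite_name)[0]
    profs_path = os.path.join('..', base_name, base_name + '.profs')
    # Same arguments as the os.system call in the question.
    return ['vap', '-nc', 'mjd', profs_path]


print(vap_command('RR0001_0001.composite'))
```

Calling this once per file inside the loop over the .composite files, and passing each list to subprocess.run with the output appended to mjd.txt, would produce one MJD line per observation. Actually running it requires PSRCHIVE to be installed.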
