Extracting non-uniform data from a text file


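I have a data file in which each line contains the values of nine variables: x, y, z, v_x, v_y, v_z, m, ID, V. I'm writing a program that extracts only the x, y, and z values from the data file. I'm fairly new to this kind of programming, and I've run into trouble because the values don't always have the same length. Here is a sample from the data file (x, y, z columns only):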

2501.773926 1701.783081 211.1383057
1140.961426 4583.300781 322.4959412
1194.471313 5605.764648 1377.315552
506.1424866 6037.965332 1119.67041
213.5106354 5788.785156 2340.610352
59.43727493 5914.666016 2357.921143
1223.028564 4292.818848 3007.292725
4445.61377 3684.48999 2903.169189
5649.732422 4596.819824 2661.301025
5741.396973 5503.06543 2412.082031
4806.246094 5587.194336 2676.126465
4855.521973 5482.893066 2743.014648
5190.890625 5399.349121 1549.1698

Note that in most cases each number is 11 characters long, but not always. Here is the code I wrote:

#include <cmath> 
#include <cstdlib> 
#include <fstream> 
#include <iostream> 
#include <string> 
#include <vector> 

using namespace std; 

// data created by Gadget2 
const string gadget_data("particles_64cubed.txt"); 

int main() 
{ 

cout << "GADGET2: Extracting Desired Data From ASCII File." << endl; 

// declaring vectors to store the data 
int bins = 135000000; // 512^3 particles = 134,217,728 particles 
vector<double> x(bins), y(bins), z(bins); 


// read the data file 
ifstream data_file(gadget_data.c_str()); 
if (data_file.fail()) 
{ 
    cerr << "Cannot open " << gadget_data << endl; 
    exit(EXIT_FAILURE); 
} 
else 
    cout << "Reading data file: " << gadget_data << endl; 
string line; 
int particles = 0; 
while (getline(data_file, line)) 
{ 
    string x_pos = line.substr(0, 11); 
    double x_val = atof(x_pos.c_str()); // atof converts string to double 
    string y_pos = line.substr(12, 11); 
    double y_val = atof(y_pos.c_str()); 
    string z_pos = line.substr(24, 11); 
    double z_val = atof(z_pos.c_str()); 

    if (particles < bins) 
    { 
     x[particles] = x_val; 
     y[particles] = y_val; 
     z[particles] = z_val; 
     ++particles; 
    } 
} 
data_file.close(); 
cout << "Stored " << particles << " particles in positions_64.dat" << endl; 

vector<double> x_values, y_values, z_values; 
for (int i = 0; i < particles; i++) 
{ 
    x_values.push_back(x[i]); 
    y_values.push_back(y[i]); 
    z_values.push_back(z[i]); 
} 

// write desired data to file 
ofstream new_file("positions_64.dat"); 
for (int i = 0; i < x_values.size(); i++) 
    new_file << x_values[i] << '\t' << y_values[i] << '\t' << z_values[i] << endl; 
new_file.close(); 
cout << "Wrote desired data to file: " << "positions_64.dat" << endl; 

}

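Because the values don't have a constant length, the code obviously fails. Does anyone know another way to do this? Perhaps something other than substr that doesn't rely on a fixed character count and instead splits the values on the whitespace between them? Any help would be greatly appreciated. Thanks!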

Source

2017-02-14 Leigh K

Find the delimiter and split the data on it rather than on a fixed number of characters.

See [this answer](http://stackoverflow.com/a/236803/5447209), which provides a function for splitting a string on a user-supplied delimiter. – Jvinniec
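
For illustration, here is a minimal sketch of delimiter-based splitting. It is one common way to do it and is not necessarily the code from the linked answer; the function name `split` and the choice to drop empty tokens are assumptions made for this example.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` on `delim` and returns the non-empty pieces.
// Empty tokens (from repeated or trailing delimiters) are skipped.
std::vector<std::string> split(const std::string& text, char delim) {
    std::vector<std::string> tokens;
    std::istringstream stream(text);
    std::string token;
    // getline with a third argument reads up to the next `delim` instead of '\n'.
    while (std::getline(stream, token, delim)) {
        if (!token.empty()) {
            tokens.push_back(token);
        }
    }
    return tokens;
}
```

Calling `split(line, ' ')` on a line of the data file yields the columns as strings, and the first three can then be converted with `std::stod`. Unlike `substr` with fixed offsets, this does not depend on how many characters each value has.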

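I notice you're already using ifstream and getline to read the file. Why fall back to cutting the line into N-character chunks and calling atof on them? iostreams can read and write ints, doubles, and so on directly, as the usual cin and cout examples show.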

The istringstream class makes this easy:

std::istringstream input(line); // line is std::string from getline() 
double x,y,z; 
if(input >> x >> y >> z) // just this! and it's already a simple error check 
    ; // do something with x,y,z 
else 
    ; // handle the error

It should just work: you're already reading line by line, and the data is separated by whitespace, which operator>> skips by default.

FYI: istringstream

Source

2017-02-14 19:56:41 quetzalcoatl

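That did the job, thank you very much! The only problem is that I'm losing some significant figures from the original data. The resulting data file has at most two decimal places instead of seven. Any ideas?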

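@LeighK: I don't recall operator `>>` having any such problem. It should read each number in turn and try to store it in a double, so the only precision loss you should see is that of double itself. I think it's more likely caused by the default precision setting of operator `<<`, i.e. when writing the file. Try `new_file << setprecision(16) << x_values[i]` or something similar. See [this example I just made](https://ideone.com/nD0Ph9), which shows how an unconfigured `cout` cuts doubles to 6 digits. I think your `ofstream new_file` behaves the same way. – quetzalcoatl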
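
The effect described in this comment can be sketched as follows. The helper name `format_position` and its `digits` parameter are assumptions for this example; it writes to a string stream so the result is easy to inspect, but an `ofstream` is formatted the same way.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats x, y, z as one tab-separated line using
// `digits` significant digits. The precision set by std::setprecision
// persists on the stream, so it applies to all three values.
std::string format_position(double x, double y, double z, int digits) {
    std::ostringstream out;
    out << std::setprecision(digits) << x << '\t' << y << '\t' << z;
    return out.str();
}
```

With the stream default of 6 significant digits, 2501.773926 is written as 2501.77, which matches the "at most two decimal places" seen for values with four digits before the decimal point. With `digits` set to 10, the same value is written as 2501.773926.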

setprecision did it! Thank you very much :)
