2013-04-10

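Reading data from a text file into arrays

I have a large amount of data in text files that I need to analyze. My problem is that I don't know how to write a program that reads the data I want.

What I have are text files organized like this: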

 Event: 23365 
    line 2 
    Q1: x,y,z= 263.25 -25.112 0.68342 
    Q2: x,y,z= 263.25 -25.112 0.68342 
    (blank line) 
    -next entry organized the same begins- 

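So what I want to do is get each of these variables into arrays somehow (one array per variable), so that I can do math on them.

Oh, and I'm coding in C.

I have no experience with file input, so I'm pretty much at a loss. I've searched for tutorials, but they haven't been much help. So basically I need to scan through a text file somehow and pick out the numbers.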

Can you post an actual example of the text file with exact data? The format needs to be known precisely before you start trying to parse it. – 2013-04-10 02:59:00

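In C, the best function to use is probably 'fscanf()' (see, for example, http://www.cplusplus.com/reference/cstdio/fscanf/), but an actual sample of part of your data file would help us advise you. Also, there are many other languages in which text processing is easier than in C, notably Python or Perl (if they are available to you). – Simon 2013-04-10 02:59:55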

Uh, I actually only have experience with C, so those aren't really an option for me. – user2264247 2013-04-10 03:13:20
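
For readers who want to try the scanf-family route suggested in the comment above, here is a minimal sketch. It reads whole lines with fgets() and picks them apart with sscanf(), a close relative of fscanf() that makes it easier to skip lines that don't match. It assumes the file looks exactly like the sample in the question (an "Event:" line, one free-form line, then a Q1 and a Q2 line). The names parse_event_line, parse_q_line and read_events, and the 256-character line limit, are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Try to read "Event: <number>" from one line; returns 1 on success. */
int parse_event_line(const char *line, long *id)
{
    return sscanf(line, " Event: %ld", id) == 1;
}

/* Try to read "Q<n>: x,y,z= <x> <y> <z>" from one line; returns 1 on success.
   %*d reads the number after Q and throws it away. */
int parse_q_line(const char *line, double *x, double *y, double *z)
{
    return sscanf(line, " Q%*d: x,y,z= %lf %lf %lf", x, y, z) == 3;
}

/* Read events from fp into parallel arrays (one array per variable).
   An event counts as complete once its Q1 and Q2 lines have been seen.
   Returns the number of complete events stored, at most cap. */
size_t read_events(FILE *fp, size_t cap,
                   long id[], double q1[][3], double q2[][3])
{
    char line[256];      /* assumed maximum line length */
    size_t n = 0;        /* complete events so far */
    int have_event = 0;  /* has an "Event:" line been seen for slot n? */
    int q_seen = 0;      /* Q lines read for the current event */

    assert(fp != NULL);

    while (n < cap && fgets(line, sizeof line, fp) != NULL) {
        long ev;
        double v[3];

        if (parse_event_line(line, &ev)) {
            id[n] = ev;
            have_event = 1;
            q_seen = 0;
        } else if (have_event && parse_q_line(line, &v[0], &v[1], &v[2])) {
            double *dst = (q_seen == 0) ? q1[n] : q2[n];
            dst[0] = v[0];
            dst[1] = v[1];
            dst[2] = v[2];
            if (++q_seen == 2) {  /* both Q lines read: event is complete */
                n++;
                have_event = 0;
            }
        }
        /* any other line (the free-form line, blank lines) is ignored */
    }
    return n;
}
```

A caller would open the file with fopen(), check the result for NULL, declare arrays such as `long id[1000]; double q1[1000][3], q2[1000][3];`, and pass them to read_events(). After that, q1[i][0] is the x value of Q1 for event i, and so on.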

Answers

OK, this isn't very pretty, but here's an example of one way to parse the data:

// you can't get away from pointers in C, so you might as well force yourself to use them 
// till they make sense. Here are some examples: 

// pointer of type data 
// data* d; 

// d is assigned the address of a memory space the size of a data container 
// d = (data*)malloc(sizeof(data)); 

// d can be dereferenced with *d. 
// (*d).identifier is the same as d->identifier 

// the memory space you got earlier can (and should) be freed 
// free(d); 
// if you free d and malloced d->extra, d->extra must be freed first 



#include <stdio.h> 
#include <ctype.h> 
#include <stdlib.h> 
#include <string.h> 

// define a type to hold information for each event 
typedef struct 
{ 
    char* identifier; 
    char* extra; 
    double var[6]; 
} data; 

// returns a pointer to a new data 
data* 
new_data() 
{ 
    // You are going to eventually need to know how malloc works 
    // or at least what it does. Just google it. For now, it requests 
    // memory from the heap to use 
    data* d = (data*)malloc(sizeof(data));    // request memory the size of the type data 
    d->identifier = (char*)malloc(sizeof(char)*128); // request some space for the c string identifier 
    memset(d->identifier,'\0',sizeof(char)*128);  // set identifier to all 0's (for null terminated 
                 // strings this means that I can replace the first zero's 
                 // and wherever I stop the c string ends) 
    d->extra = (char*)malloc(sizeof(char)*128);   // request some space for the c string extra 
    memset(d->extra,'\0',sizeof(char)*128);    // set extra to all 0's 
    return d;           // return the pointer 
} 

int main(void) 
{ 
    FILE *fp; // pointer to file object 
    int c;  // must be an int, not a char: fgetc returns an int so that EOF can be told apart from real chars 
    int count = 0; // This program is a state machine and count represents a state 
    char* word = malloc(sizeof(char)*128); // holds several chars 
    int wordPosition = 0; // position of the next char to be written in word 

    fp = fopen("datafile.txt", "r"); // fp is assigned the file datafile.txt in read-only mode 
    if(fp == NULL) // fopen returns NULL when the file can't be opened 
    { 
     perror("datafile.txt"); 
     free(word); 
     return 1; 
    } 
    memset(word,'\0',sizeof(char)*128); // set word to all 0's 

    data* d = new_data(); // get a new data container to write information to 

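    for(;;) // loops until the EOF check at the bottom of the body breaks out 
    { 
     c = fgetc(fp); // get a new char each loop 
     if(!isspace(c) && c != EOF) // if the char isn't white space or the End Of File 
     { 
      if(wordPosition < 127) // leave room for the terminating '\0' 
       word[wordPosition++] = c; // add the char to word 
     } 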
     else 
     { 
      if(strlen(word) != 0) // skip if word is empty (for example if there were two spaces in a row) 
      { 
       switch(count) // determine the state 
       { 
        case(0): 
         // for case 0, you want the word that isn't "Event:", so 
         // count will stay at 0 and add every word that isn't "Event:" 
         // as long as there is only one other word then this will result 
         // in what you want 
         if(!(strcmp(word, "Event:") == 0)) 
          strcpy(d->identifier, word); 
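         // when the word is ended by a new line, go to state 1 
         // ('\n' is a white space; note this assumes there are no trailing 
         // spaces between the last word and the end of the line) 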
         if(c=='\n') 
          count++; 
         break; 
        case(1): 
         // for case 1 you just want the words on the line, so every word just add 
         // to extra with a space after it. Not the neatest way to do it. 
         strcat(d->extra, word); 
         strcat(d->extra, " "); 
         if(c=='\n') 
          count++; 
         break; 
        case(2): // for case 2 - 7 you just want the numbers so you can do something with them 
        case(3): // so if the first character of the word is a digit or '-' (negative) then 
        case(4): // add it to var[]. An easy way to know which one is just count-2. 
        case(5): // When a new number is added, go to the next state. 
        case(6): // Then test if count == 8. If so, you want to reset. 
        case(7): // skipping case 2-6 is simply saying for each of these go to case 7. 
          // that's why you need to break after a case you don't want to continue. 
         if(isdigit(word[0]) || word[0]=='-') 
         { 
          d->var[count-2] = atof(word); 
          count++; 
         } 
         if (count == 8) 
         { 
          // here is where you would do something different with the data you have. 
          // I imagine you would want an array of data's that you would just add this to 
          // for example, at the beginning of main have 
          // data* events[MAX_EVENTS]; 
          // int eventCount = 0; 
          // and here do something like 
          // events[eventCount++] = d; 
          // remember that eventCount++ gets the value first and then increments 
          // after the instruction is over 

          printf("%s\n",d->identifier); 
          printf("%s\n",d->extra); 
          printf("%f\n",d->var[0]); 
          printf("%f\n",d->var[1]); 
          printf("%f\n",d->var[2]); 
          printf("%f\n",d->var[3]); 
          printf("%f\n",d->var[4]); 
          printf("%f\n",d->var[5]); 

          // set the state back to the beginning 
          count = 0; 

          // if you were doing something with the data, then don't free yet. 
          free(d->identifier); 
          free(d->extra); 
          free(d); 

          // make a new data and start over for the next event. 
          d = new_data(); 
         } 
         break; 
       } 

       // clear the word and set wordPosition to the beginning 
       memset(word,'\0',sizeof(char)*128); 
       wordPosition = 0; 
      } 
     } 
     // check if the end of the file is reached and if so then exit the loop 
     if(c == EOF) 
      break; 
    } 
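    fclose(fp); 

    // release the word buffer and the last data container, which was 
    // allocated for an event that never got completed 
    free(word); 
    free(d->identifier); 
    free(d->extra); 
    free(d); 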

    // here you would do something like 
    // for(i = 0; i < eventCount; i++) 
    //  total1 += events[i]->var[1]; 
    // printf("The total of var1 is: %f\n",total1); 

    return 0; 
} 
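
The commented-out hints in the code above (an events array, eventCount, and the total1 loop) can be sketched roughly as follows. This is only an illustration under some assumptions: it stores the six numbers of each event by value instead of keeping data* pointers as the hints do, which sidesteps the question of who frees what. The names event_vars, event_list, event_list_add and event_list_total, and the MAX_EVENTS value, are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EVENTS 1000  /* assumed upper bound on events in one file */

/* Simplified stand-in for the answer's data type: just the six numbers. */
typedef struct {
    double var[6];
} event_vars;

/* Fixed-capacity list of events plus the number of slots in use. */
typedef struct {
    event_vars items[MAX_EVENTS];
    size_t count;
} event_list;

/* Copy one event's six values into the list.
   Returns 1 on success, 0 if the list is already full. */
int event_list_add(event_list *list, const double var[6])
{
    size_t i;

    if (list->count >= MAX_EVENTS)
        return 0;
    for (i = 0; i < 6; i++)
        list->items[list->count].var[i] = var[i];
    list->count++;
    return 1;
}

/* Sum variable number k (0..5) over all stored events. */
double event_list_total(const event_list *list, size_t k)
{
    double total = 0.0;
    size_t i;

    assert(k < 6);
    for (i = 0; i < list->count; i++)
        total += list->items[i].var[k];
    return total;
}
```

In the parser above, the place marked "here is where you would do something different" could call event_list_add(&list, d->var) on a list whose count was set to 0 at the start of main, and after fclose the total of, say, var[1] would be event_list_total(&list, 1). Because the list is fairly large, declaring it static or at file scope keeps it off the stack.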

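Thanks! This gives me almost everything I need. Could you explain the code a bit? I'm a complete noob with pointers, and I've also avoided manipulating characters. What does the whole block in braces after new_data() do? --------- Do I understand correctly that case 0 scans for the word "Event", copies the line, and moves count on to the next line, and that case 1 drops any characters on the other line? ---------------- And once the first two lines have been skipped and six numbers have been counted, count == 7, so you print in case 7? – user2264247 2013-04-10 07:04:24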

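I've added comments that will hopefully clear things up. I agree with what Simon said: you would probably spend less time and have fewer headaches by downloading Python and going through some tutorials. It would be easier and take less than half the code. But if you stick with C, it's still very doable. – Alden 2013-04-10 14:33:06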
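
On the pointer question raised in the comments, here is a small sketch of what a function like new_data() does and how the matching cleanup looks. It is a variation on the answer's code, not a copy of it: it uses calloc(), which hands back memory already filled with zero bytes, in place of malloc() followed by memset(), and it checks for allocation failure. The names data_create and data_destroy are made up for this sketch.

```c
#include <stdlib.h>

/* Same layout as the answer's data type. */
typedef struct {
    char *identifier;
    char *extra;
    double var[6];
} data;

/* Allocate one data container plus its two 128-byte string buffers.
   Returns NULL if any allocation fails. */
data *data_create(void)
{
    data *d = calloc(1, sizeof *d);  /* room for one struct, zero-filled */

    if (d == NULL)
        return NULL;

    d->identifier = calloc(128, 1);  /* zero-filled, so already an empty C string */
    d->extra = calloc(128, 1);

    if (d->identifier == NULL || d->extra == NULL) {
        free(d->identifier);         /* free(NULL) is allowed and does nothing */
        free(d->extra);
        free(d);
        return NULL;
    }
    return d;                        /* the caller now owns all three blocks */
}

/* Release everything data_create() allocated.
   The inner buffers go first, because after free(d) the pointers
   d->identifier and d->extra can no longer be read. */
void data_destroy(data *d)
{
    if (d == NULL)
        return;
    free(d->identifier);
    free(d->extra);
    free(d);
}
```

Here d holds an address, d->identifier is shorthand for (*d).identifier, and each block obtained from calloc() or malloc() needs exactly one matching free().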