2009-11-18 231 views
29

How do I remove duplicates from a file using COBOL?

The input file has these records: 8712351, 8712353, 8712353, 8712354, 8712356, 8712352, 8712355, 8712352, 8712355

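Using COBOL, I need to remove the duplicates from the file above and write the result to an output file. I have written simple logic that reads the records and writes them to the output file.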

I need logic that removes the duplicates (for example 8712353 and 8712352) from the file above. Here is the program logic:

    IDENTIFICATION DIVISION.
    PROGRAM-ID. RemoveDup.
    ENVIRONMENT DIVISION.
    INPUT-OUTPUT SECTION.
    FILE-CONTROL.
        SELECT INPUTFILEDUP ASSIGN TO 'C:\Cobol\INPUTFILEDUP.txt'
            ORGANIZATION IS LINE SEQUENTIAL.
        SELECT OUTFILEDUP ASSIGN TO 'C:\Cobol\OUTFILEDUP.txt'
            ORGANIZATION IS LINE SEQUENTIAL.

    DATA DIVISION.

    FILE SECTION.
    FD INPUTFILEDUP.
    01 INPUTFILEDUPREC.
        88 EOFINPUTFILEDUP VALUE HIGH-VALUES.
        02 INPUTFILEID  PIC 9(07).

    FD OUTFILEDUP.
    01 OUTFILEDUPREC   PIC 9(07).

    WORKING-STORAGE SECTION.
    77 WS-VARIABLE   PIC 9(09).
    77 REC-NOT-MATCH   PIC 9(01).
    77 CUR-VARIABLE   PIC 9(09).

    PROCEDURE DIVISION.
    BEGIN.
        OPEN INPUT INPUTFILEDUP
        OPEN OUTPUT OUTFILEDUP

        *> Priming read, then copy every record until end of file.
        *> No duplicate check yet: each record is written as-is.
        READ INPUTFILEDUP
            AT END SET EOFINPUTFILEDUP TO TRUE
        END-READ
        PERFORM UNTIL (EOFINPUTFILEDUP)
            WRITE OUTFILEDUPREC FROM INPUTFILEID
            READ INPUTFILEDUP
                AT END SET EOFINPUTFILEDUP TO TRUE
            END-READ
        END-PERFORM

        CLOSE INPUTFILEDUP
        CLOSE OUTFILEDUP
        STOP RUN.

I sorted the input file (8712351, 8712353, 8712353, 8712354, 8712356, 8712352, 8712355, 8712352, 8712355) in ascending order and it works. The modified code is below.

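However, suppose my file is not in ascending or descending order. Then I need to write sorting logic before removing the dups. Could you please update my code below? This is what I tried, but it does not fully work when the input file looks like this: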

8712351,8712353,8712353,8712354,8712356,8712352,8712355,8712352,8712355

    IDENTIFICATION DIVISION.
    PROGRAM-ID. RemoveDup2.
    ENVIRONMENT DIVISION.
    INPUT-OUTPUT SECTION.
    FILE-CONTROL.
        SELECT INPUTFILEDUP ASSIGN TO 'C:\Cobol\INPUTFILEDUP.txt'
            ORGANIZATION IS LINE SEQUENTIAL.
        SELECT OUTFILEDUP ASSIGN TO 'C:\Cobol\OUTFILEDUP.txt'
            ORGANIZATION IS LINE SEQUENTIAL.

    DATA DIVISION.

    FILE SECTION.
    FD INPUTFILEDUP.
    01 INPUTFILEDUPREC.
        88 EOFINPUTFILEDUP VALUE HIGH-VALUES.
        02 INPUTFILEID  PIC 9(07).

    FD OUTFILEDUP.
    01 OUTFILEDUPREC   PIC 9(07).

    WORKING-STORAGE SECTION.
    77 WS-VARIABLE   PIC 9(09) VALUE ZERO.
    77 REC-NOT-MATCH   PIC 9(01).
    77 CUR-VARIABLE   PIC 9(7) VALUE ZERO.

    PROCEDURE DIVISION.
    BEGIN.
        OPEN INPUT INPUTFILEDUP
        OPEN OUTPUT OUTFILEDUP

        READ INPUTFILEDUP
            AT END SET EOFINPUTFILEDUP TO TRUE
        END-READ
        *> WS-VARIABLE holds the last ID written, so only duplicates
        *> that sit next to each other (sorted input) are caught.
        PERFORM UNTIL (EOFINPUTFILEDUP)
            IF INPUTFILEID NOT EQUAL TO WS-VARIABLE
                MOVE INPUTFILEID TO WS-VARIABLE
                WRITE OUTFILEDUPREC FROM INPUTFILEID
            ELSE
                DISPLAY "DUPLICATE FOUND " INPUTFILEID
            END-IF
            READ INPUTFILEDUP
                AT END SET EOFINPUTFILEDUP TO TRUE
            END-READ
        END-PERFORM

        CLOSE INPUTFILEDUP
        CLOSE OUTFILEDUP
        STOP RUN.

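WOW, a new favorite tag! :) A question about the data you are removing duplicates from: will numbers such as 8712351 all fall within a relatively compact range, for example 8700000-8800000? Or could the numbers vary over a huge range, from 1 to N? – 2009-11-18 19:10:46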

Answers

2

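When Organization is Sequential, the record that is deleted is the last record read. The Delete statement is valid only when the last operation on the file was a successful Read statement. If it was not, Delete returns a File Status value of 43. Because Delete cannot return a File Status value beginning with 2 when the file is Open for Sequential access, coding Invalid Key on such a Delete is not allowed.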

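When Dynamic or Random access is selected for the file, the Delete statement, like Rewrite, becomes a little less restrictive. The record being deleted does not need to have been read beforehand. Simply fill in the primary Key information in the file's record description and issue the Delete statement. If the record does not exist, a File Status of 23 is returned and an Invalid Key condition exists.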

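From page 274 of Sams Teach Yourself COBOL in 24 Hours (which I just pulled down from my bookshelf). So in your case, you could probably set up your records sorted by INPUTFILEID, note when you come across the first occurrence of a given INPUTFILEID, and Delete accordingly (after writing it to the output file).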

1

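If you sort the file externally before reading it with the COBOL program, you can remove the duplicates with the SORT keyword EQUALS. If you sort the file before the COBOL program runs but do not remove the duplicates there, a simple IF statement and a save field will let you remove the dups.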

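Set up a save field for INPUTFILEID. After the read, if inputfileid equals inputfileid-save, read again. If it does not, write the record, then move inputfileid to inputfileid-save. You will have to break up your current PERFORM to do this.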

If you don't fully understand what I'm saying, just let me know and I will help you change the code.

5

It finally works. Here is the code:

    IDENTIFICATION DIVISION.
    PROGRAM-ID. RemoveDup2.
    ENVIRONMENT DIVISION.
    INPUT-OUTPUT SECTION.
    FILE-CONTROL.
        SELECT INPUTFILEDUP ASSIGN TO 'C:\Cobol\INPUTFILEDUP.txt'
            ORGANIZATION IS LINE SEQUENTIAL.
        SELECT OUTFILEDUP ASSIGN TO 'C:\Cobol\OUTFILEDUP.txt'
            ORGANIZATION IS LINE SEQUENTIAL.
        SELECT WorkFile ASSIGN TO "WORK.TMP".

    DATA DIVISION.

    FILE SECTION.
    FD INPUTFILEDUP.
    01 INPUTFILEDUPREC.
        88 EOFINPUTFILEDUP VALUE HIGH-VALUES.
        02 INPUTFILEID  PIC 9(07).

    FD OUTFILEDUP.
    01 OUTFILEDUPREC   PIC 9(07).

    SD WorkFile.
    01 WORKREC.
        02 WINPUTFILEID  PIC 9(07).

    WORKING-STORAGE SECTION.
    77 WS-VARIABLE   PIC 9(09) VALUE ZERO.
    77 REC-NOT-MATCH   PIC 9(01).
    77 CUR-VARIABLE   PIC 9(7) VALUE ZERO.

    PROCEDURE DIVISION.
    BEGIN.
        *> Sort the input file in place so that equal IDs end up
        *> next to each other. This overwrites INPUTFILEDUP.
        SORT WorkFile ON ASCENDING KEY WINPUTFILEID
            USING INPUTFILEDUP GIVING INPUTFILEDUP

        OPEN INPUT INPUTFILEDUP
        OPEN OUTPUT OUTFILEDUP

        READ INPUTFILEDUP
            AT END SET EOFINPUTFILEDUP TO TRUE
        END-READ
        *> Write a record only when its ID differs from the last one written.
        PERFORM UNTIL (EOFINPUTFILEDUP)
            IF INPUTFILEID NOT EQUAL TO WS-VARIABLE
                MOVE INPUTFILEID TO WS-VARIABLE
                WRITE OUTFILEDUPREC FROM INPUTFILEID
            ELSE
                DISPLAY "DUPLICATE FOUND " INPUTFILEID
            END-IF
            READ INPUTFILEDUP
                AT END SET EOFINPUTFILEDUP TO TRUE
            END-READ
        END-PERFORM

        CLOSE INPUTFILEDUP
        CLOSE OUTFILEDUP

        STOP RUN.
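
Because the SORT above names INPUTFILEDUP in both USING and GIVING, the original input file is rewritten in sorted order. If that is not wanted, one possible variant is to let the SORT hand its sorted records to an OUTPUT PROCEDURE and apply the save-field check there. This is only a sketch, not tested against any particular compiler: it assumes free-format source and a compiler that accepts a paragraph name as an OUTPUT PROCEDURE (COBOL-85 style), and the names RemoveDup3, INPUTFILEID-SAVE, FIRST-RECORD-SWITCH, SORT-EOF-SWITCH and WRITE-UNIQUE are made up for illustration.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. RemoveDup3.
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    SELECT INPUTFILEDUP ASSIGN TO 'C:\Cobol\INPUTFILEDUP.txt'
        ORGANIZATION IS LINE SEQUENTIAL.
    SELECT OUTFILEDUP ASSIGN TO 'C:\Cobol\OUTFILEDUP.txt'
        ORGANIZATION IS LINE SEQUENTIAL.
    SELECT WorkFile ASSIGN TO "WORK.TMP".

DATA DIVISION.
FILE SECTION.
FD INPUTFILEDUP.
01 INPUTFILEDUPREC.
    02 INPUTFILEID  PIC 9(07).

FD OUTFILEDUP.
01 OUTFILEDUPREC   PIC 9(07).

SD WorkFile.
01 WORKREC.
    02 WINPUTFILEID  PIC 9(07).

WORKING-STORAGE SECTION.
*> Hypothetical helper fields, not part of the original program.
01 INPUTFILEID-SAVE     PIC 9(07) VALUE ZERO.
01 FIRST-RECORD-SWITCH  PIC X     VALUE 'Y'.
    88 FIRST-RECORD     VALUE 'Y'.
01 SORT-EOF-SWITCH      PIC X     VALUE 'N'.
    88 SORT-EOF         VALUE 'Y'.

PROCEDURE DIVISION.
BEGIN.
    *> SORT opens, reads and closes INPUTFILEDUP itself (USING);
    *> the input file is left untouched.
    SORT WorkFile ON ASCENDING KEY WINPUTFILEID
        USING INPUTFILEDUP
        OUTPUT PROCEDURE IS WRITE-UNIQUE.
    STOP RUN.

WRITE-UNIQUE.
    OPEN OUTPUT OUTFILEDUP
    *> RETURN fetches the next record from the sorted work file.
    RETURN WorkFile
        AT END SET SORT-EOF TO TRUE
    END-RETURN
    PERFORM UNTIL SORT-EOF
        *> The first record is always written; after that a record
        *> is written only when its ID differs from the saved one.
        IF FIRST-RECORD OR WINPUTFILEID NOT = INPUTFILEID-SAVE
            MOVE WINPUTFILEID TO INPUTFILEID-SAVE
            MOVE 'N' TO FIRST-RECORD-SWITCH
            WRITE OUTFILEDUPREC FROM WINPUTFILEID
        END-IF
        RETURN WorkFile
            AT END SET SORT-EOF TO TRUE
        END-RETURN
    END-PERFORM
    CLOSE OUTFILEDUP.
```

The first-record switch is there so that an ID of 0000000 is not mistaken for a duplicate of the zero-initialised save field. If your compiler only accepts section names for OUTPUT PROCEDURE, WRITE-UNIQUE would have to become a SECTION.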
1

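sort is standard on these operating systems, so handing the job off to it follows the DRY principle: the -t switch sets the field separator and -u keeps only the unique lines. It is written in C.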