2016-11-14

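SAS DO loop is excluding rows during processing

I have the following code. I am trying to test a paragraph (descr) against a list of keywords (key_words). When I run this code, the log shows that all of the values for the array are read in, but only 2 of the 20,000 rows are tested in the DO loop (do i = 1 to 100). Any suggestions on how to fix this?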

data JE.KeywordMatchTemp1; 
    set JE.JEMasterTemp end=eof; 
    if _n_ = 1 then do i = 1 by 1 until (eof); 
    set JE.KeyWords; 
    array keywords[100] $30 _temporary_; 
    keywords[i] = Key_Words; 
    end; 
    match = 0; 
    do i = 1 to 100; 
    if index(descr, keywords[i]) then match = 1; 
    end; 
    drop i; 
run; 

Answers


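Your problem is that your end=eof is in the wrong place.

Here is a simple example that computes the "rank" of each respondent's age value.

Look at where I put end=eof. It is there because you need it to control the array-filling operation. Otherwise, your loop, do i = 1 by 1 until (eof);, does not do what it says: it never actually terminates on eof, because eof is never true inside the loop (it is defined on the first set statement, not the one the loop is reading). Instead, the loop ends because it reads past the end of the dataset, which is exactly what you do not want.

That is what end=eof is doing here: it stops the loop from trying to pull another row once the dataset that fills the array is exhausted, which would otherwise terminate the entire data step. Any time you see a data step terminate after 2 iterations, this is very likely the cause. It is a very common problem.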

data class_ranks; 
    set sashelp.class; *This dataset you are okay iterating over until the end of the dataset and then quitting the data step, like a normal data step.; 
    array ages[19] _temporary_; 
    if _n_=1 then do; 
        do _i = 1 by 1 until (eof); *iterate until the end of the *second* set statement; 
            set sashelp.class end=eof; *see here? This eof is telling this loop when to stop. It is okay that it is not created until after the loop is.; 
            ages[_i] = age; 
        end; 
        call sortn(of ages[*]); *ordering the ages loaded by number so they are in proper order for doing the trivial rank task; 
    end; 
    age_rank = whichn(age,of ages[*]); *determine where in the list the age falls. For a real version of this task you would have to check whether this ever happens, and if not you would have to have logic to find the nearest point or whatnot.; 
run; 
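
Applied to the code in the question, the same pattern might look like the sketch below. It is not the asker's tested solution. The two WORK datasets and their rows are invented stand-ins for JE.KeyWords and JE.JEMasterTemp so the example runs on its own. The nkeys counter and the strip() call are additions that are not in the original code. nkeys keeps the search loop from scanning unused array slots. strip() is there because index() also searches for the trailing blanks that pad a $30 value. The sketch assumes the keyword table has at least one row and no more than 100.

```sas
/* Hypothetical sample data standing in for JE.KeyWords */
data work.KeyWords;
    length Key_Words $30;
    input Key_Words $;
    datalines;
refund
reversal
adjust
;
run;

/* Hypothetical sample data standing in for JE.JEMasterTemp */
data work.JEMasterTemp;
    length descr $200;
    infile datalines truncover;
    input descr $char200.;
    datalines;
Monthly rent payment
Customer refund issued
Manual adjustment to accrual
;
run;

data work.KeywordMatchTemp1;
    set work.JEMasterTemp;                  /* main input: no end= here */
    array keywords[100] $30 _temporary_;    /* assumes at most 100 keywords */
    retain nkeys 0;                         /* added: number of keywords loaded */
    if _n_ = 1 then do i = 1 by 1 until (eof);
        set work.KeyWords end=eof;          /* eof now tracks the keyword table */
        keywords[i] = Key_Words;
        nkeys = i;
    end;
    match = 0;
    do i = 1 to nkeys;
        /* strip() removes the padding blanks; index() is case-sensitive */
        if index(descr, strip(keywords[i])) > 0 then match = 1;
    end;
    drop i nkeys Key_Words;
run;
```

With this sample data, the first description contains none of the keywords and the other two each contain one, so the output should have three rows with match set to 0, 1 and 1.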

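Thanks! One more thing, if I may. It looks like the DO loop in the second part of the code does not stop when the condition is met. Any reason why that might be happening?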


Not stopping when 'i = 100'? Or not stopping when 'match = 1'? The latter would not stop it; why would it? – Joe
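
To illustrate the point in the comment above: assigning match = 1 only sets a variable, so an iterative DO loop keeps running to its upper bound unless the loop itself is told to stop. One way to do that is an UNTIL clause on the DO statement; a LEAVE statement inside the IF would be another. The sketch below is a minimal illustration under assumptions: the dataset name, the three keywords and the description are all hypothetical.

```sas
data work.early_exit;
    /* Hypothetical keywords and description, for illustration only */
    array keywords[3] $30 _temporary_ ('refund', 'reversal', 'adjust');
    length descr $200;
    descr = 'Customer refund issued';
    match = 0;
    /* The UNTIL clause is checked at the bottom of each pass,  */
    /* so the loop ends right after the first keyword is found. */
    do i = 1 to dim(keywords) until (match = 1);
        if index(descr, strip(keywords[i])) > 0 then match = 1;
    end;
    keep descr match;
run;
```

Stopping early does not change the value of match. It only saves the remaining comparisons, which can add up over 20,000 rows and 100 keywords.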