2016-08-03 88 views
0

我是新的Excel VBA代码,我需要帮助优化此代码。它完全符合我的要求,但需要将近30秒才能运行,这对最终用户来说是不可接受的。Optomizing Excel多维循环的VBA

目的是评估一个单词与输入是句子的频率。在“Raw”表中,第一列是整个句子。第二个是句子中有多少单词的计数。第三百是句子中的第一,第二,第三......字。一次最多可分析1000个句子。

然后,只有它们是唯一的,它才会粘贴到“OneColumn”的第一列中。我尝试粘贴所有内容,然后删除重复内容,但运行时间大约为45秒。

我当然愿意用其他方式来分析一个单词的使用频率,但我无法弄清楚如何让它在单元格内检查而不打破它们。

我将不胜感激任何帮助。

Option Explicit 

Sub ListUniqueWords() 
Dim i As Integer 
Dim j As Integer 
Dim k As Integer 

Dim StartTime As Double 
Dim SecondsElapsed As Double 
    StartTime = Timer 

i = 2 
j = 3 
k = 2 

'i=row j=column k=paste into row 

    Do While i < 1001 
    j = 3 
      Do While j < 103 
          If Sheets("Raw").Cells(i, j).Value <> "" And Sheets("Raw").Cells(i, j).Value <> " " And Sheets("OneColumn").Range("A2:A2000").Find(Sheets("Raw").Cells(i, j), LookAt:=xlWhole) Is Nothing Then 
            Worksheets("Raw").Activate 
            Cells(i, j).Select 
            Selection.Copy 
            Worksheets("OneColumn").Activate 
            Cells(k, 1).Activate 
            ActiveCell.PasteSpecial Paste:=xlPasteValues 
            k = k + 1 
            j = j + 1 
           Else 
            j = j + 1 
           End If 
      Loop 
      i = i + 1 
    Loop 
SecondsElapsed = Round(Timer - StartTime, 2) 
    MsgBox "This code ran successfully in " & SecondsElapsed & " seconds", vbInformation 

End Sub 
+1

如果您的代码正常工作,但您希望它得到改进,那么您应该考虑删除此处的问题并将其发布到http://codereview.stackexchange.com/。 –

+0

我甚至不知道存在。非常感谢你John! –

+0

知道该网站存在是有用的,但现在你已经至少有一个答案,你应该把它留在这里,因为交叉发布是不受欢迎的。 –

回答

0

我打算假定所有的句子都是单行,并且在单词之间包含一个空格。将名为“输出”的工作表添加到工作簿中。在单元格A1中键入一个标题(例如“Word”),并在单元格B2中键入一个标题(例如“Count”)。下面的句子将输出A列中的单词和B列中单词的计数,然后进行排序,最常用的是排在最前面。根据你有多少数据,这应该需要一两秒钟才能运行。

注意:您需要添加对“Microsoft Scripting Runtime”库的引用。

Sub Example() 
Dim X As Variant, S As Variant, key As Variant 
Dim str As String 
Dim oDict As Scripting.Dictionary 
Dim i As Double, j As Double, k As Double 
Dim Anchor As Range 

Set oDict = New Scripting.Dictionary 

With ThisWorkbook 
    'Clear past output 
    With .Sheets("Output") 
     .Range("a2:" & .Cells(.Rows.Count, .Columns.Count).Address).ClearContents 
    End With 

'Fill array with text to search 
    With .Sheets("Raw") 
     X = .Range("a2:a" & .Range("a" & .Rows.Count).End(xlUp).Row).Value2 
    End With 
End With 

For i = LBound(X,1) To UBound(X,1) 
    S = Split(X(i,1), " ") 

    For j = LBound(S, 1) To UBound(S, 1) 
     If oDict.Exists(S(j)) Then 
      oDict.Item(S(j)) = oDict.Item(S(j)) + 1 
     Else 
      oDict.Add S(j), 1 
     End If 
    Next j 
Next i 

'Output results to sheet "Output" 
With ThisWorkbook.Sheets("Output") 
For Each key In oDict.Keys 
    Set Anchor = .Range("a" & Rows.Count).End(xlUp).Offset(1, 0) 
    Anchor = key 
    Anchor.Offset(0, 1) = oDict.Item(key) 
Next key 

.Range("a1:" & .Range("a" & .Rows.Count).End(xlUp).Offset(0, 1).Address).Sort .Range("b:b"), xlDescending 
End With 

End Sub 

编辑:

这是我完全的,纯粹的代码。请注意,工作簿和工作表参考不会根据您的目的进行更新。要使用RegExp,您需要添加对“Microsoft VBScript Regular Expressions 5.5”库的引用。我使用“5.5”,但我相信任何会为此工作。

Sub Example() 
Dim X As Variant, S As Variant, S2 As Variant, S3 As Variant, key As  Variant 
Dim oDict As Scripting.Dictionary 
Dim i As Double, j As Double, k As Double 
Dim Anchor As Range 
Dim oReg As New RegExp 
Dim str As String 
Dim st As Single 

Application.ScreenUpdating = False 


st = Timer 
Set oDict = New Scripting.Dictionary 

With ThisWorkbook 
'Clear past output 
    With .Sheets("Output") 
     .Range("a2:" & .Cells(.Rows.Count, .Columns.Count).Address).ClearContents 
    End With 

    'Fill array with text to search 
    With .Sheets("Input") 
     X = .Range("a2:a" & .Range("a" & .Rows.Count).End(xlUp).Row).Value2 
    End With 
End With 

With oReg 
    .Global = True 
    .IgnoreCase = True 
    .Pattern = "[^\w\s]" 
End With 

For i = LBound(X, 1) + 1 To UBound(X, 1) 
    'Get rid of none letter and white space 
      str = oReg.Replace(X(i, 1), "") 


    S = Split(str, " ") 

    For j = LBound(S, 1) To UBound(S, 1) 
     If oDict.Exists(LCase(S(j))) Then 
      oDict.Item(LCase(S(j))) = oDict.Item(LCase(S(j))) + 1 
     Else 
      oDict.Add LCase(S(j)), 1 
     End If 
    Next j 
Next i 

'Output results to sheet "Output" 
With ThisWorkbook.Sheets("Output") 
    For Each key In oDict.Keys 
     Set Anchor = .Range("a" & Rows.Count).End(xlUp).Offset(1, 0) 
     Anchor = key 
     Anchor.Offset(0, 1) = oDict.Item(key) 
    Next key 

     .Range("a1:" & .Range("a" & .Rows.Count).End(xlUp).Offset(0, 1).Address).Sort .Range("b:b"), xlDescending 
End With 

Debug.Print Timer - st 

Application.ScreenUpdating = True 
End Sub 
+0

这是无法运行。你正在拆分你从未定义的'str',这样就会导致错误。而且,你总是只索引第一个子字符串'S(1)'而不是使用你定义的'j'的循环变量,所以除了每个句子中的第一个单词之外,你永远不会计算任何内容。 – Mikegrann

+0

对我来说,它根本不会产生任何输出。尽管我对许多这些脚本不太熟悉,但仍然可以查明具体问题。感谢Mikegrann指出了潜在的问题。 –

+0

@Comintern它确实进行了编译,因为str在顶部被声明为Variant。这绝对是一种糟糕的形式,但至少它可以在发布答案之前通过编译器检查。至少,VBE编译并运行它(当然没有输出),就像我一样。 – Mikegrann

0

您的函数需要很长时间才能运行,因为您在Excel表格中单元格操作。此方法不会将任何数据拉入RAM内存(快速)。只需将感兴趣的列插入到数组或列表中即可。以与您的功能相同的方式在列表上进行操作。这将大大加速它的运作。 例如,

Dim Whole_Sentence_List As New Collection 
Dim Word_Count_List As New Collection 
Dim Array_of_Words_List As New Collection 

Array_of_Words_List的是,你可以把句子的单词变成一个个而不是3,4,5 ...... 100列数组的集合。玩一段时间的集合,以了解它们的工作方式