2009-12-07 25 views
6

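Excel Find speed vs. VBA binary search?

How good/fast is Excel VBA's Find compared with a binary search? My platform is Office 11 (2003), and I will be searching for strings in column A on three sheets of values. The total row count is about 140,000.

If it is worthwhile, which library functions should I reference to do the sort and then the binary search? Binary searching strings/text reportedly has potential problems.

...one thing must be noted. Using binary search formulas with sorted text requires caution. Aladin A., Excel MVP

Excel Find: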

Worksheets(1).Range("A:A").Find("PN-String-K9", LookIn:=xlValues, LookAt:=xlWhole) 
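
A minimal sketch of how the result of that call might be checked, since Find returns Nothing when there is no match. The helper name FindPartRow and its sheet-index parameter are hypothetical and not part of the original post.

```vba
' Hypothetical helper: returns the row of the first exact match in
' column A of the given sheet, or 0 when the string is not present.
Public Function FindPartRow(ByVal nSheet As Long, ByVal strPart As String) As Long
    Dim rngHit As Range
    Set rngHit = Worksheets(nSheet).Range("A:A").Find( _
        What:=strPart, LookIn:=xlValues, LookAt:=xlWhole)
    ' Find yields Nothing on a miss, so test before touching the result
    If Not rngHit Is Nothing Then
        FindPartRow = rngHit.Row
    End If
End Function
```

For example, Debug.Print FindPartRow(1, "PN-String-K9") would print the matching row number, or 0 if the part is absent.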

Answers

7

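Very much against my intuition, a VBA binary search strongly outperforms an Excel Find, at least in the scenario below, where 120,000 six-character strings are distributed evenly across 3 worksheets.

Excel Find takes 1 minute 58 seconds;
the VBA binary search takes 36 seconds on my particular machine.

Knowing that the text is ordered evidently outweighs Excel's native advantage. Note Aladin A.'s warning about sort order.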

Option Explicit 

' Call Search to look for a thousand random strings 
' in 3 worksheets of a workbook 

' requires a workbook with 3 sheets and 
' column A populated with values between "000001" to "120000" 
' split evenly 40,000 to a worksheet in ascending order. 
' They must be text, not numbers. 

Private Const NUM_ROWS As Long = 120000 
Private Const SHEET_1 As String = "Sheet1" 
Private Const SHEET_2 As String = "Sheet2" 
Private Const SHEET_3 As String = "Sheet3" 

' This uses VBA Binary Search 
Public Sub Search() 
    Worksheets(SHEET_1).Range("B:B").ClearContents 
    Worksheets(SHEET_2).Range("B:B").ClearContents 
    Worksheets(SHEET_3).Range("B:B").ClearContents 
    DoSearch True  ' change to False to test Excel search 
End Sub 

' Searches for a thousand values using binary or excel search depending on 
' value of bBinarySearch 
Public Sub DoSearch(ByVal bBinarySearch As Boolean) 
    Debug.Print Now 
    Dim ii As Long 

    For ii = 1 To 1000 
     Dim rr As Long 
     Dim bFound As Boolean 
     rr = Int((NUM_ROWS) * Rnd + 1) 
     If bBinarySearch Then 
      Dim strSheetName As String 
      Dim nRow As Long 
      bFound = BinarySearch(MakeSearchArg(rr), strSheetName, nRow) 
      If bFound Then 
       Worksheets(strSheetName).Activate 
       Cells(nRow, 1).Activate 
      End If 
     Else 
      ' Try each sheet in turn; ExcelSearch activates the hit 
      bFound = ExcelSearch(SHEET_1, MakeSearchArg(rr)) 
      If Not bFound Then bFound = ExcelSearch(SHEET_2, MakeSearchArg(rr)) 
      If Not bFound Then bFound = ExcelSearch(SHEET_3, MakeSearchArg(rr)) 
     End If 
     ' Mark the cell only when the search actually succeeded 
     If bFound Then ActiveCell.Offset(0, 1).Value = "FOUND" 
    Next 
    Debug.Print Now 

End Sub 

' look for one cell value using Excel Find 
Private Function ExcelSearch(ByVal strWorksheet As String _ 
    , ByVal strSearchArg As String) As Boolean 
    On Error GoTo Err_Exit 
    Worksheets(strWorksheet).Activate 
    ' Find returns Nothing on a miss, so .Activate raises an error handled at Err_Exit 
    Worksheets(strWorksheet).Range("A:A").Find(What:=strSearchArg, LookIn:=xlValues, _ 
     LookAt:=xlWhole, SearchOrder:=xlByRows, SearchDirection:=xlNext, MatchCase:=True, _ 
     SearchFormat:=False).Activate 
    ExcelSearch = True 
    Exit Function 
Err_Exit: 
    ExcelSearch = False 
End Function 

' Look for value using a vba based binary search 
' returns true if the search argument is found in the workbook 
' strSheetName contains the name of the worksheet on exit and nRow gives the row 
Private Function BinarySearch(ByVal strSearchArg As String _ 
    , ByRef strSheetName As String, ByRef nRow As Long) As Boolean 
    Dim nFirst As Long, nLast As Long 
    nFirst = 1 
    nLast = NUM_ROWS 
    Do While True 
     Dim nMiddle As Long 
     Dim strValue As String 
     If nFirst > nLast Then 
      Exit Do  ' Failed to find search arg 
     End If 
     nMiddle = nFirst + (nLast - nFirst) \ 2  ' integer midpoint, always within nFirst..nLast 
     SheetNameAndRowFromIdx nMiddle, strSheetName, nRow 
     strValue = Worksheets(strSheetName).Cells(nRow, 1) 
     If strSearchArg < strValue Then 
      nLast = nMiddle - 1 
     ElseIf strSearchArg > strValue Then 
      nFirst = nMiddle + 1 
     Else 
      BinarySearch = True 
      Exit Do 
     End If 
    Loop 
End Function 

' convert 1 -> "000001", 120000 -> "120000", etc 
Private Function MakeSearchArg(ByVal nArg As Long) As String 
    MakeSearchArg = Right(CStr(nArg + 1000000), 6) 
End Function 

' converts some number to a worksheet name and a row number 
' This is dependent on the worksheets being named Sheet1, Sheet2, Sheet3 
' and containing an equal number of values in each sheet where 
' the total number of values is NUM_ROWS 
Private Sub SheetNameAndRowFromIdx(ByVal nIdx As Long _ 
    , ByRef strSheetName As String, ByRef nRow As Long) 
    If nIdx <= NUM_ROWS/3 Then 
     strSheetName = SHEET_1 
     nRow = nIdx 
    ElseIf nIdx > (NUM_ROWS/3) * 2 Then 
     strSheetName = SHEET_3 
     nRow = nIdx - (NUM_ROWS/3) * 2 
    Else 
     strSheetName = SHEET_2 
     nRow = nIdx - (NUM_ROWS/3) 
    End If 
End Sub 

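Thanks. In a test case searching for 1000 samples among 52,000 possibilities (single sheet), the Excel search took 17 seconds and the binary search took 5.5 seconds. The rub is that the binary search fails 25% of the time. I think the problem is that Excel's sort order differs from VBA's ">" and "<" comparisons. – ExcelCyclist 2009-12-13 15:13:59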
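
One way to guard against the mismatch described in this comment is to confirm, before searching, that the data is ascending under the same comparison the binary search uses. The sketch below assumes the column has been read into a 1-based, two-dimensional Variant array (what Range.Value returns for a multi-cell range). IsAscending is a hypothetical name, and whether either VBA compare mode reproduces Excel's own sort order for a given data set is something to verify, not something this code guarantees.

```vba
' Hypothetical check: True when every key is <= the next one under eCompare.
' vKeys is assumed to be a 1-based 2-D array, e.g. Range("A1:A52000").Value
Public Function IsAscending(ByRef vKeys As Variant, _
        ByVal eCompare As VbCompareMethod) As Boolean
    Dim i As Long
    For i = LBound(vKeys, 1) + 1 To UBound(vKeys, 1)
        ' StrComp returns 1 when the first string sorts after the second
        If StrComp(CStr(vKeys(i - 1, 1)), CStr(vKeys(i, 1)), eCompare) > 0 Then
            Exit Function   ' out of order: result stays False
        End If
    Next i
    IsAscending = True
End Function
```

With the default Option Compare Binary, the ">" and "<" operators behave like vbBinaryCompare, so IsAscending(vKeys, vbBinaryCompare) returning False would suggest the sheet needs re-sorting (or the search needs a different comparison) before the binary search can be trusted.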

Did a Shell sort of the records, and the binary search works well! For 2000 random samples among 52,000 rows: 36 seconds (Excel Find) versus 11 seconds (binary search). – ExcelCyclist 2009-12-13 23:29:17

3

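I have found that using AutoFilter is much faster than manually searching the records with any method.

I filter, check whether there are any results, and then move on. If anything was found (checked via the result count), I can either search the small filtered portion manually or return all of it.

I use this on about 44,000 records, searching for a list of 100+ parts.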
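
The filter-then-count step described above might look roughly like this. It is only a sketch: the function name CountFilteredMatches is hypothetical, and it assumes column A has a non-empty header in row 1 and that the search string contains no wildcard characters.

```vba
' Hypothetical helper: filters column A for an exact value and returns
' the number of matching data rows (header excluded).
Public Function CountFilteredMatches(ByVal ws As Worksheet, _
        ByVal strPart As String) As Long
    Dim rngData As Range
    Set rngData = ws.Range("A1", ws.Cells(ws.Rows.Count, 1).End(xlUp))

    ws.AutoFilterMode = False                          ' clear any old filter
    rngData.AutoFilter Field:=1, Criteria1:="=" & strPart

    ' SUBTOTAL(3, ...) is COUNTA over rows left visible by the filter;
    ' subtract 1 for the header row, which always stays visible.
    CountFilteredMatches = CLng(Application.WorksheetFunction.Subtotal(3, rngData)) - 1

    ws.AutoFilterMode = False                          ' restore the sheet
End Function
```

A caller would loop over the parts list and only inspect the sheet further when the count is greater than zero.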

If you are not careful, a binary search can easily get stuck in an infinite loop.
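
A sketch of a loop shape that avoids that trap: each pass either returns or moves one bound past the midpoint, so the interval shrinks every time. It assumes the keys were read into a 1-based, two-dimensional Variant array with Range.Value and are sorted under a binary comparison. FindInSortedArray is a hypothetical name.

```vba
' Hypothetical helper: returns the 1-based index of strKey in vKeys,
' or 0 when it is absent. vKeys must be ascending under vbBinaryCompare.
Public Function FindInSortedArray(ByRef vKeys As Variant, _
        ByVal strKey As String) As Long
    Dim nLow As Long, nHigh As Long, nMid As Long
    nLow = LBound(vKeys, 1)
    nHigh = UBound(vKeys, 1)
    Do While nLow <= nHigh
        nMid = nLow + (nHigh - nLow) \ 2      ' integer midpoint
        Select Case StrComp(CStr(vKeys(nMid, 1)), strKey, vbBinaryCompare)
            Case 0
                FindInSortedArray = nMid
                Exit Function
            Case Is < 0
                nLow = nMid + 1               ' key is in the upper half
            Case Else
                nHigh = nMid - 1              ' key is in the lower half
        End Select
    Loop
    ' Not found: the function result keeps its default value of 0
End Function
```

Because nLow only ever increases and nHigh only ever decreases, the condition nLow <= nHigh must eventually fail, which is what rules out the infinite loop.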

3

If you use VLOOKUP with the sorted option, it will probably be faster than your VBA.
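
A rough sketch of that idea from VBA, assuming the key range is a single column already sorted ascending. With the last argument set to True, VLOOKUP does an approximate match and returns the largest value not exceeding the key, so the result has to be compared with the key to confirm an exact hit. ExistsInSortedColumn is a hypothetical name.

```vba
' Hypothetical helper: True when strKey occurs in the sorted range rngKeys.
Public Function ExistsInSortedColumn(ByVal rngKeys As Range, _
        ByVal strKey As String) As Boolean
    Dim vHit As Variant
    ' Application.VLookup returns an error Variant instead of raising
    vHit = Application.VLookup(strKey, rngKeys, 1, True)
    If Not IsError(vHit) Then
        ' Approximate match may return a smaller neighbour, so verify it.
        ' vbTextCompare mirrors the case-insensitive matching of lookups.
        ExistsInSortedColumn = (StrComp(CStr(vHit), strKey, vbTextCompare) = 0)
    End If
End Function
```

For example, ExistsInSortedColumn(Worksheets("Sheet1").Range("A1:A40000"), "012345") would test one sheet of the benchmark workbook. No timing is claimed for this sketch.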