我打算假定所有的句子都是单行,并且在单词之间包含一个空格。将名为“输出”的工作表添加到工作簿中。在单元格A1中键入一个标题(例如“Word”),并在单元格B2中键入一个标题(例如“Count”)。下面的句子将输出A列中的单词和B列中单词的计数,然后进行排序,最常用的是排在最前面。根据你有多少数据,这应该需要一两秒钟才能运行。
注意:您需要添加对“Microsoft Scripting Runtime”库的引用。
Sub Example()
Dim X As Variant, S As Variant, key As Variant
Dim str As String
Dim oDict As Scripting.Dictionary
Dim i As Double, j As Double, k As Double
Dim Anchor As Range
Set oDict = New Scripting.Dictionary
With ThisWorkbook
'Clear past output
With .Sheets("Output")
.Range("a2:" & .Cells(.Rows.Count, .Columns.Count).Address).ClearContents
End With
'Fill array with text to search
With .Sheets("Raw")
X = .Range("a2:a" & .Range("a" & .Rows.Count).End(xlUp).Row).Value2
End With
End With
For i = LBound(X,1) To UBound(X,1)
S = Split(X(i,1), " ")
For j = LBound(S, 1) To UBound(S, 1)
If oDict.Exists(S(j)) Then
oDict.Item(S(j)) = oDict.Item(S(j)) + 1
Else
oDict.Add S(j), 1
End If
Next j
Next i
'Output results to sheet "Output"
With ThisWorkbook.Sheets("Output")
For Each key In oDict.Keys
Set Anchor = .Range("a" & Rows.Count).End(xlUp).Offset(1, 0)
Anchor = key
Anchor.Offset(0, 1) = oDict.Item(key)
Next key
.Range("a1:" & .Range("a" & .Rows.Count).End(xlUp).Offset(0, 1).Address).Sort .Range("b:b"), xlDescending
End With
End Sub
编辑:
这是我完全的,纯粹的代码。请注意,工作簿和工作表参考不会根据您的目的进行更新。要使用RegExp,您需要添加对“Microsoft VBScript Regular Expressions 5.5”库的引用。我使用“5.5”,但我相信任何会为此工作。
Sub Example()
Dim X As Variant, S As Variant, S2 As Variant, S3 As Variant, key As Variant
Dim oDict As Scripting.Dictionary
Dim i As Double, j As Double, k As Double
Dim Anchor As Range
Dim oReg As New RegExp
Dim str As String
Dim st As Single
Application.ScreenUpdating = False
st = Timer
Set oDict = New Scripting.Dictionary
With ThisWorkbook
'Clear past output
With .Sheets("Output")
.Range("a2:" & .Cells(.Rows.Count, .Columns.Count).Address).ClearContents
End With
'Fill array with text to search
With .Sheets("Input")
X = .Range("a2:a" & .Range("a" & .Rows.Count).End(xlUp).Row).Value2
End With
End With
With oReg
.Global = True
.IgnoreCase = True
.Pattern = "[^\w\s]"
End With
For i = LBound(X, 1) + 1 To UBound(X, 1)
'Get rid of none letter and white space
str = oReg.Replace(X(i, 1), "")
S = Split(str, " ")
For j = LBound(S, 1) To UBound(S, 1)
If oDict.Exists(LCase(S(j))) Then
oDict.Item(LCase(S(j))) = oDict.Item(LCase(S(j))) + 1
Else
oDict.Add LCase(S(j)), 1
End If
Next j
Next i
'Output results to sheet "Output"
With ThisWorkbook.Sheets("Output")
For Each key In oDict.Keys
Set Anchor = .Range("a" & Rows.Count).End(xlUp).Offset(1, 0)
Anchor = key
Anchor.Offset(0, 1) = oDict.Item(key)
Next key
.Range("a1:" & .Range("a" & .Rows.Count).End(xlUp).Offset(0, 1).Address).Sort .Range("b:b"), xlDescending
End With
Debug.Print Timer - st
Application.ScreenUpdating = True
End Sub
如果您的代码正常工作,但您希望它得到改进,那么您应该考虑删除此处的问题并将其发布到http://codereview.stackexchange.com/。 –
我甚至不知道存在。非常感谢你John! –
知道该网站存在是有用的,但现在你已经至少有一个答案,你应该把它留在这里,因为交叉发布是不受欢迎的。 –