2017-05-25

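List column headers and get the maximum string length of each column

I'm looking for a script, in PowerShell, VBScript, or Excel VBA, that does the job of an Excel formula. I'm trying to get a list of the column headers and the maximum length of the strings beneath each one.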

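Normally I open the .txt file manually in Excel, which gives me the header names. Then I create an array formula, for example =MAX(LEN(A1:A100000)), which returns the maximum length of the strings in that column. I repeat the same formula for the other columns.

I can no longer do this because the files have grown to 1 GB; I cannot open them anymore, and my desktop crashes. It may also be because they contain more than a million rows, which Excel cannot handle. A friend suggested PowerShell, but my knowledge of it is limited, and I don't know whether this can be done in VBScript or Excel VBA instead.

Thanks in advance for your help.

The code below works for .csv files, but not for delimited .txt files: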

$fileName = "C:\Desktop\EFile.csv"
<#
Sample format of c:\temp\data.csv
"id","name","grade","address"
"1","John","Grade-9","test1"
"2","Ben","Grade-9","test12222"
"3","Cathy","Grade-9","test134343"
#>

# Load every row into memory and collect the column names
$csv = Import-Csv $fileName
$csvHeaders = ($csv | Get-Member -MemberType NoteProperty).Name

# Track the longest value seen so far for each column
$dict = @{}
foreach ($header in $csvHeaders) {
    $dict.Add($header, 0)
}

foreach ($row in $csv) {
    foreach ($header in $csvHeaders) {
        if ($dict[$header] -lt ($row.$header).Length) {
            $dict[$header] = ($row.$header).Length
        }
    }
}
$dict.Keys | ForEach-Object { "key = $_ , Column Length = " + $dict.Item($_) }

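What have you tried, and how did it fail? Ideally, you should provide a [minimal, complete, and verifiable example](https://stackoverflow.com/help/mcve) of what you have tried, and include specific information on how it failed, with error messages and/or erroneous output. SO is not a code-writing service; the best questions are those that provide useful information so that those who answer can guide you to devise your own correct answer. See [How to Ask a Good Question](https://stackoverflow.com/help/how-to-ask).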

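Thanks. I edited my post. I used to do this in Excel: I open the .txt file, get the header names, and insert the formula. But I can no longer do that because Excel crashes, probably because the file has more than a million rows. My desktop ends up crashing.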

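Please reduce your code to the smallest possible example that illustrates your problem and post it here. We are happy to help as long as we know the exact code you are using, without having to ask dozens of questions. – PeterT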

Answers

Here is how I get my data.

$data = @" 
"id","name","grade","address" 
"1","John","Grade-9","test1" 
"2","Ben","Grade-9","test12222" 
"3","Cathy","Grade-9","test134343" 
"@ 
$csv = ConvertFrom-Csv -Delimiter ',' $data 

But you should load your data like this:

$fileName = "C:\Desktop\EFile.csv" 
$csv = Import-Csv -Path $fileName 

Then:

# Extract the header names
$headers = $csv | Get-Member -MemberType NoteProperty | Select-Object -ExpandProperty Name

# Capture output in $result variable
$result = foreach ($header in $headers) {

    # Walk the values in the $header column and keep the longest one.
    # Lengths are compared explicitly: Measure-Object -Maximum on strings
    # compares them as text, so it would not necessarily return the longest.
    $longest = ''
    foreach ($value in ($csv | Select-Object -ExpandProperty $header)) {
        if ($value.Length -gt $longest.Length) { $longest = $value }
    }

    # Generate new object holding the information.
    # This will end up in $result
    [pscustomobject]@{
        Header = $header
        Max    = $longest.Length
        String = $longest
    }
}


# Simple output
$result | Format-Table

This is what I get:

Header  Max String
------  --- ------
address  10 test134343
grade     7 Grade-9
id        1 1
name      5 Cathy
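
The question says the script works for .csv files but not for delimited .txt files. Import-Csv does not look at the file extension, only at the separator, so one likely cause is a delimiter other than the comma. A minimal sketch, assuming the .txt file is tab-delimited (the sample file name and its contents are made up for illustration; swap in '|' or ';' if that is what the real file uses):

```powershell
# Assumption: the .txt export is tab-delimited.
# Hypothetical sample file written to the temp folder for the demonstration.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-tab-delimited.txt'
$lines = @(
    "id`tname`tgrade`taddress",
    "1`tJohn`tGrade-9`ttest1",
    "2`tBen`tGrade-9`ttest12222",
    "3`tCathy`tGrade-9`ttest134343"
)
Set-Content -Path $path -Value $lines

# -Delimiter tells Import-Csv which character separates the fields
$rows = Import-Csv -Path $path -Delimiter "`t"

# The first row's properties give the column names in file order
$headers = $rows[0].PSObject.Properties.Name

# Longest value per column, keyed by header
$maxLengths = [ordered]@{}
foreach ($header in $headers) {
    $lengths = $rows | ForEach-Object { ($_.$header).Length }
    $maxLengths[$header] = [int]($lengths | Measure-Object -Maximum).Maximum
}

$maxLengths

# Remove the sample file again
Remove-Item -Path $path
```

For the sample rows this reports 1, 5, 7, and 10 for id, name, grade, and address. Like the code above, it still loads the whole file into memory, so it only addresses the delimiter, not the file size.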

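Also, if you run into memory problems with large files, you may have to get your hands dirty and use the .NET Framework directly. This code processes one CSV line at a time instead of reading the whole file into memory.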

$fileName = "$env:TEMP\test.csv"
$delimiter = ','

# Open a StreamReader
$reader = [System.IO.File]::OpenText($fileName)

try {
    # Read the header line, split it into an array, and trim away any quotes.
    # -split takes a regex, so the delimiter is escaped (this matters for '|', for example).
    $headers = $reader.ReadLine() -split [regex]::Escape($delimiter) | ForEach-Object { $_.Trim('"''') }

    # Prepare a hashtable for the results
    $result = @{}

    # So long as there's more data, keep running
    while (-not $reader.EndOfStream) {

        # Read a single line and process it as csv
        $csv = $reader.ReadLine() | ConvertFrom-Csv -Header $headers -Delimiter $delimiter

        # If the current value is longer than the one stored under this header, replace the stored entry
        foreach ($header in $headers) {
            $item = $csv | Select-Object -ExpandProperty $header

            if ($result[$header].Maximum -lt $item.Length) {
                $result[$header] = [pscustomobject]@{
                    Header  = $header
                    Maximum = $item.Length
                    String  = $item
                }
            }
        }
    }
}
finally {
    # Clean up our spent resource, even if an error occurred
    $reader.Close()
}

# Simple output
$result.Values | Format-Table
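
Reading line by line keeps memory flat, but splitting on the delimiter assumes that no field contains the delimiter inside quotes, and calling ConvertFrom-Csv once per line can be slow on a file of around 1 GB. One possible alternative, offered as a sketch rather than a tested solution for the asker's files, is the .NET class Microsoft.VisualBasic.FileIO.TextFieldParser, which streams the file and handles quoted fields. The function name Get-MaxColumnLength and the sample file are hypothetical, and the sketch assumes the Microsoft.VisualBasic assembly is available (it ships with Windows PowerShell):

```powershell
Add-Type -AssemblyName Microsoft.VisualBasic

function Get-MaxColumnLength {   # hypothetical helper name
    param(
        [Parameter(Mandatory)][string]$Path,
        [string]$Delimiter = ','
    )

    $parser = New-Object Microsoft.VisualBasic.FileIO.TextFieldParser($Path)
    try {
        $parser.TextFieldType = [Microsoft.VisualBasic.FileIO.FieldType]::Delimited
        $parser.SetDelimiters($Delimiter)
        $parser.HasFieldsEnclosedInQuotes = $true
        # Keep leading/trailing spaces so the lengths match the raw field contents
        $parser.TrimWhiteSpace = $false

        # First record holds the column names
        $headers = $parser.ReadFields()
        $max = @(0) * $headers.Length

        # Stream the remaining records one at a time
        while (-not $parser.EndOfData) {
            $fields = $parser.ReadFields()
            # Ignore extra fields on malformed rows; short rows just update fewer columns
            $count = [Math]::Min($fields.Length, $headers.Length)
            for ($i = 0; $i -lt $count; $i++) {
                if ($fields[$i].Length -gt $max[$i]) { $max[$i] = $fields[$i].Length }
            }
        }

        # Emit one object per column, in file order
        for ($i = 0; $i -lt $headers.Length; $i++) {
            [pscustomobject]@{ Header = $headers[$i]; MaxLength = $max[$i] }
        }
    }
    finally {
        $parser.Close()
    }
}

# Usage with a made-up sample file that has a comma inside a quoted field
$sample = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-quoted.csv'
Set-Content -Path $sample -Value @(
    '"id","name","grade","address"',
    '"1","Doe, John","Grade-9","test1"',
    '"2","Ben","Grade-9","test12222"'
)
$results = Get-MaxColumnLength -Path $sample -Delimiter ','
$results | Format-Table
Remove-Item -Path $sample
```

Only one record and one array of integers are held at a time, so memory use should not grow with the file size. For a tab-delimited .txt file, pass -Delimiter "`t".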