AWS S3 - Python script to get the number of columns in a file


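I have a file stored in an AWS S3 bucket that uses | as the delimiter. I need to get the number of columns in each row of that file, and then print whether all rows have the same number of columns.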

Input file

100|name1|Test 
200|name2|Test45 
300|name3 
400|name4|Test1|subject

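Result

In this case, the counts based on | for each row are 2, 2, 1, and 3 respectively (the number of | characters in each row).

The column counts differ.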

Source

2017-04-21 Rajeev

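It would help to simplify your question. You are asking several things at once: (1) fetching a file from an AWS bucket; (2) counting the columns in a row; (3) comparing the column counts. You probably already know how to do some of these steps. – Cireo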

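As I mentioned in my comment on your question, you are asking several questions at once. Here I will answer only the simple Python part:

"Given a table as list of strings and a delimiter, how can I determine if they have the same number of columns?"

This is fairly straightforward. Since you don't know in advance how many columns to expect, you calibrate on the first row and validate the remaining rows against it.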

def columns_are_consistent(rows, delimiter):
    """
    Returns True if the number of delimiters is the same in every row,
    and False otherwise.
    Note that in general: # columns == # delimiters + 1
    """
    if not rows:  # This could also be "if len(rows) < 2"
        return True  # Can't be inconsistent if there is nothing
    # Calibrate on first row
    expected = rows[0].count(delimiter)
    # Validate remaining rows, note that "all([]) == True"
    return all(row.count(delimiter) == expected for row in rows[1:])
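To also see the per-row counts the question asks for, a small variation can return the counts themselves and compare them afterwards. This is only a sketch: the helper name delimiter_counts is made up for illustration, and the rows are the sample input from the question held in memory rather than read from S3.

```python
def delimiter_counts(rows, delimiter="|"):
    """Return the number of delimiters found in each row, in order."""
    return [row.count(delimiter) for row in rows]


# Sample rows copied from the question (trailing spaces included).
rows = [
    "100|name1|Test ",
    "200|name2|Test45 ",
    "300|name3 ",
    "400|name4|Test1|subject",
]

counts = delimiter_counts(rows)
print(counts)  # [2, 2, 1, 3]

# All rows agree only if there is at most one distinct count.
if len(set(counts)) <= 1:
    print("All rows have the same number of columns.")
else:
    print("The column counts differ.")
```

The number of columns in each row is one more than the delimiter count, so the rows above have 3, 3, 2, and 4 columns.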

Source

2017-04-21 21:52:07 Cireo

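I am new to Python. I understand the logic, but could you explain whether it is enough to call the function (rows, delimiter) with just (filename, |) as arguments? Or do I need to open the file before processing it? – Rajeev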

@Rajeev, if you have a local file like the input file in your question, you can do: print(columns_are_consistent(open(filename).readlines(), '|')). That said, I would advise against cramming it all into one line. – Cireo
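For the part of the question about reading the file, here is a rough sketch of the whole flow spread over several lines. It assumes the third-party boto3 package is installed and AWS credentials are already configured. The function names and the bucket and key values are placeholders for illustration, not anything taken from the thread.

```python
def read_lines_from_file(path):
    """Read a local text file and return its lines without newlines."""
    with open(path) as handle:
        return handle.read().splitlines()


def read_lines_from_s3(bucket, key):
    """Download an S3 object and return its lines without newlines.

    Assumption: boto3 is installed and credentials are configured.
    The import is local so the other helpers work without boto3.
    """
    import boto3

    response = boto3.client("s3").get_object(Bucket=bucket, Key=key)
    return response["Body"].read().decode("utf-8").splitlines()


def same_delimiter_count(lines, delimiter="|"):
    """True if every line has the same number of delimiters."""
    return len({line.count(delimiter) for line in lines}) <= 1


# Hypothetical usage; replace the placeholders with real values:
# lines = read_lines_from_s3("my-bucket", "path/to/input.txt")
# print(same_delimiter_count(lines))
```

Using splitlines() instead of readlines() avoids a trailing empty line being counted as a row with zero delimiters. Blank lines in the middle of the file would still count as rows.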
