Find a file in Python

146

os.walk is the answer; this will find the first match:

import os

def find(name, path):
    for root, dirs, files in os.walk(path):
        if name in files:
            return os.path.join(root, name)

And this will find all matches:

def find_all(name, path):
    result = []
    for root, dirs, files in os.walk(path):
        if name in files:
            result.append(os.path.join(root, name))
    return result

And this will match a pattern:

import os, fnmatch

def find(pattern, path):
    result = []
    for root, dirs, files in os.walk(path):
        for name in files:
            if fnmatch.fnmatch(name, pattern):
                result.append(os.path.join(root, name))
    return result

find('*.txt', '/path/to/dir')

Source

2009-11-12 19:25:29

+1

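Note that these examples will only find files, not directories with the same name. If you want to find **any** object in the directory with that name, you might want to use `if name in files or name in dirs` – 2014-10-17 23:29:51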
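
A minimal sketch of what this comment suggests, assuming you want the first match regardless of whether it is a file or a directory. The name `find_any` is made up for illustration:

```python
import os

def find_any(name, path):
    """Return the first file or directory called `name` under `path`, else None."""
    for root, dirs, files in os.walk(path):
        # os.walk reports subdirectories and files in separate lists,
        # so both have to be checked.
        if name in files or name in dirs:
            return os.path.join(root, name)
    return None
```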

+3

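Beware of case sensitivity. `for name in files:` will fail when looking for `super-photo.jpg` if it is `super-photo.JPG` in the file system. (An hour of my life I would like back ;-) A somewhat messy fix is `if str.lower(name) in [x.lower() for x in files]` – 2014-12-16 22:53:25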
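
One possible way to tidy up that workaround, as a sketch only: compare lower-cased names and return the spelling that is actually on disk. The function name `find_all_ignore_case` is hypothetical:

```python
import os

def find_all_ignore_case(name, path):
    """Return every file under `path` whose name equals `name`, ignoring case."""
    wanted = name.lower()
    result = []
    for root, dirs, files in os.walk(path):
        for candidate in files:
            if candidate.lower() == wanted:
                # Join with the on-disk spelling, not the requested one.
                result.append(os.path.join(root, candidate))
    return result
```

With this, searching for 'super-photo.jpg' would also report a file stored as 'super-photo.JPG'.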

+0

What about using **yield** instead of preparing the result list? ..... `if fnmatch.fnmatch(name, pattern): yield os.path.join(root, name)` – Berci 2015-05-03 21:26:13
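
Spelled out in full, the generator variant proposed in this comment might look like the sketch below. The name `ifind` is an assumption; the body is the pattern-matching answer with `yield` in place of the list:

```python
import fnmatch
import os

def ifind(pattern, path):
    """Lazily yield paths of files under `path` whose name matches `pattern`."""
    for root, dirs, files in os.walk(path):
        for name in files:
            if fnmatch.fnmatch(name, pattern):
                # Hand back each hit as soon as it is found.
                yield os.path.join(root, name)
```

Because results arrive lazily, a caller can stop early, for example `next(ifind('*.txt', '/path/to/dir'), None)` to get only the first match.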

1

See the os module for os.walk or os.listdir.

See also the question "os.walk without digging into directories below" for sample code.
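
As a rough illustration of the os.listdir route (this is not the code from the linked question, and `find_top_level` is a hypothetical name), a search that stays in one directory could be sketched as:

```python
import os

def find_top_level(name, path):
    """Look for `name` directly inside `path`, without descending further."""
    # os.listdir returns the names of both files and directories in `path`.
    if name in os.listdir(path):
        return os.path.join(path, name)
    return None
```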

Source

2009-11-12 19:24:39

2

For a fast, OS-independent search, use scandir:

https://github.com/benhoyt/scandir/#readme

Read http://bugs.python.org/issue11406 for details on why.

Source

2014-06-19 09:36:07

+2

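Specifically, use `scandir.walk()` as in @Nadia's answer. Note that if you are using Python 3.5+, `os.walk()` already has the `scandir.walk()` speedups. Also, [PEP 471](https://www.python.org/dev/peps/pep-0471/) is probably a better document to read for information than that issue. – 2016-12-15 15:29:49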
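
For readers on Python 3.6+, where `os.scandir` is in the standard library and works as a context manager, a direct recursive search could be sketched like this. `scan_find` is an illustrative name, not part of the scandir package:

```python
import os

def scan_find(name, path):
    """Yield paths of files called `name` under `path`, using os.scandir."""
    with os.scandir(path) as entries:
        for entry in entries:
            # Do not follow symlinked directories, to avoid loops.
            if entry.is_dir(follow_symlinks=False):
                yield from scan_find(name, entry.path)
            elif entry.is_file() and entry.name == name:
                yield entry.path
```

Since `os.walk()` is already built on the same machinery in 3.5+, this is mainly useful when you want per-entry control rather than extra speed.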

13

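I used a version of os.walk and, on a larger directory, got times of around 3.5 seconds. I tried two random solutions with no great improvement, then just did: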

import subprocess

paths = [line[2:] for line in subprocess.check_output("find . -iname '*.txt'", shell=True).splitlines()]

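While it is POSIX-only, I got 0.25 seconds.

From this, I believe it is entirely possible to optimise the whole search a lot in a platform-independent way, but this is where I stopped the research.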
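
A variant of the same `find` call that avoids `shell=True` is sketched below. It assumes a `find` binary that supports `-iname` (GNU and BSD find do) and file names without embedded newlines; `find_txt` is a made-up name:

```python
import subprocess

def find_txt(path="."):
    """Return paths matching *.txt (case-insensitive) under `path` via find."""
    # Passing an argument list avoids the shell, so the pattern needs no quoting.
    output = subprocess.check_output(
        ["find", path, "-iname", "*.txt"],
        universal_newlines=True,  # decode the output to str
    )
    return output.splitlines()
```

Unlike the one-liner, the returned paths start with whatever `path` was passed in, so no `[2:]` slicing is needed.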

Source

2014-07-10 15:00:13 kgadek

0

If you are using Python on Ubuntu and only need it to work on Ubuntu, a substantially faster way is to use the terminal's locate program, like this:

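import subprocess

def find_files(file_name):
    command = ['locate', file_name]

    # Run locate and capture its standard output (bytes).
    output = subprocess.Popen(command, stdout=subprocess.PIPE).communicate()[0]
    output = output.decode()

    # splitlines() avoids the empty trailing entry that split('\n') leaves.
    search_results = output.splitlines()

    return search_results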

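search_results is a list of the absolute file paths. This is tens of thousands of times faster than the methods above, and for one search I did it was about 72,000 times faster.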

Source

2017-02-02 15:16:03 SARose

1

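If you are working with Python 2, you have a problem with infinite recursion on Windows caused by self-referring symlinks.

This script avoids following those. Note that it is Windows-specific!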

import os
from scandir import scandir
import ctypes

def is_sym_link(path):
    # http://stackoverflow.com/a/35915819
    # Python 2 and Windows only: relies on unicode() and ctypes.windll.
    FILE_ATTRIBUTE_REPARSE_POINT = 0x0400
    return os.path.isdir(path) and (ctypes.windll.kernel32.GetFileAttributesW(unicode(path)) & FILE_ATTRIBUTE_REPARSE_POINT)

def find(base, filenames):
    hits = []

    def find_in_dir_subdir(direc):
        content = scandir(direc)
        for entry in content:
            if entry.name in filenames:
                hits.append(os.path.join(direc, entry.name))

            elif entry.is_dir() and not is_sym_link(os.path.join(direc, entry.name)):
                try:
                    find_in_dir_subdir(os.path.join(direc, entry.name))
                except UnicodeDecodeError:
                    print("Could not resolve " + os.path.join(direc, entry.name))
                    continue

    if not os.path.exists(base):
        return
    else:
        find_in_dir_subdir(base)

    return hits

It returns a list of all paths that point to files in the filenames list. Usage:

find("C:\\", ["file1.abc", "file2.abc", "file3.abc", "file4.abc", "file5.abc"])
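
For comparison, a Python 3 sketch of the same multi-name search using only the standard library. It assumes that `os.walk`'s default of not following symbolic links is enough for your case; whether that covers every kind of Windows reparse point (junctions, for example) is not verified here. `find_names` is a hypothetical name:

```python
import os

def find_names(base, filenames):
    """Return paths of all files under `base` whose name is in `filenames`."""
    wanted = set(filenames)
    hits = []
    # followlinks=False (the default) keeps os.walk from descending into
    # directories that are reached through symbolic links.
    for root, dirs, files in os.walk(base, followlinks=False):
        hits.extend(os.path.join(root, f) for f in files if f in wanted)
    return hits
```

Unlike the script above, this returns an empty list rather than None when `base` does not exist, and it matches files only, not directories.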

Source

2017-04-13 11:30:43

+0

Please note that this is Windows-specific – Leliel 2017-11-29 20:37:12

+0

@Leliel I have added that to the answer. Thanks for your feedback. – 2017-11-30 21:08:47
