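I'm writing a script that sorts video files into folders by checking the file names for known keywords. As the number of keywords has grown out of control, the script has become very slow and now takes several seconds per file. Sorting files by keyword needs a more database-y solution.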
@echo off
cd /d d:\videos\shorts
if /i not "%cd%"=="d:\videos\shorts" echo invalid shorts dir. && exit /b
:: auto detect folder name via anchor file
for /r %%i in (*spirit*science*chakras*) do set conspiracies=%%~dpi
if not exist "%conspiracies%" echo conspiracies dir missing. && pause && exit /b
for /r %%i in (*modeselektor*evil*) do set musicvideos=%%~dpi
if not exist "%musicvideos%" echo musicvideos dir missing. && pause && exit /b
for %%s in (*) do set "file=%%~nxs" & set "full=%%s" & call :count "%%s"
for %%v in (*) do echo can't sort "%%~nv"
exit /b
:count
set oldfile="%file%"
set newfile=%oldfile:&=and%
if not %oldfile%==%newfile% ren "%full%" %newfile%
set count=0
set words= & rem
echo "%~n1" | findstr /i /c:"music" >nul && set words=%words%, music&& set /a count+=1
echo "%~n1" | findstr /i /c:"official video" >nul && set words=%words%, official video&& set /a count+=2
set words=%words:has, =has %
set words=%words: , =%
if not %count%==0 echo "%file%" has "%words%" %count%p for music videos
set musicvideoscount=%count%
set count=0
set words= & rem
echo "%~n1" | findstr /i /c:"misinform" >nul && set words=%words%, misinform&& set /a count+=1
echo "%~n1" | findstr /i /c:"antikythera" >nul && set words=%words%, antikythera&& set /a count+=2
set words=%words:has, =has %
set words=%words: , =%
if not %count%==0 echo "%file%" has "%words%" %count%p for conspiracies
set conspiraciescount=%count%
set wanted=3
set winner=none
:loop
:: count points and set winner (in case of tie lowest in this list wins, sort accordingly)
if %conspiraciescount%==%wanted% set winner=%conspiracies%
if %musicvideoscount%==%wanted% set winner=%musicvideos%
set /a wanted+=1
if not %wanted%==15 goto loop
if not "%winner%"=="none" move "%full%" "%winner%" >nul && echo "%winner%%file%" && echo.
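Note the "weight value" of each keyword. The script adds up the points for each category, finds the highest total, and moves the file to the folder assigned to that category. It also shows the words it found, and at the end it lists the files it could not sort so that I can add keywords or adjust the weights.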
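The point comparison at the end of :count could also be written as a direct search for the maximum instead of counting `wanted` up from 3 to 14. Below is a minimal sketch, assuming the two `...count` variables have already been filled in by the keyword checks. The score values are made up, and `:consider`, `minimum` and `best` are hypothetical names that are not in the original script. The categories are listed so that the later one wins a tie, as in the original. Unlike the original loop, there is no upper limit of 14 points, and `winner` holds the category name rather than the folder path.

```batch
@echo off
setlocal
rem Made-up scores; in the real script the keyword checks produce these.
set "musicvideoscount=3"
set "conspiraciescount=5"
rem Minimum number of points a category needs before a file is moved.
set "minimum=3"

set "winner=none"
set "best=0"
rem The later category wins a tie because :consider compares with geq.
for %%c in (conspiracies musicvideos) do call :consider %%c
rem prints: winner=conspiracies with 5p
echo winner=%winner% with %best%p
exit /b

:consider
rem Indirect lookup: reads the variable named <category>count into score.
call set "score=%%%1count%%"
if %score% lss %minimum% exit /b
if %score% geq %best% (
    set "best=%score%"
    set "winner=%1"
)
exit /b
```

Adding a seventh folder would then mean adding one name to the `for` list rather than another `if` line.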
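I have cut the number of folders and keywords in this example down to a minimum. The full script has six folders and, with all its keywords, is 64k in size (and growing).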
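With that many keywords, one option that stays in batch is to move the rules out of the script into a plain text file and test each keyword with variable substitution instead of `echo ... | findstr`. The pipe most likely starts extra processes for every keyword, which would explain much of the slowdown. This is only a sketch under assumptions: `keywords.txt` and its `category;weight;keyword` layout are hypothetical, the file is assumed to sit next to the script and outside the video folder, and keywords must not contain `=` or begin with `*` or `~`. File names containing `!` are not handled, because delayed expansion is enabled.

```batch
@echo off
setlocal EnableDelayedExpansion
rem keywords.txt (hypothetical), one rule per line, for example:
rem   musicvideos;1;music
rem   musicvideos;2;official video
rem   conspiracies;1;misinform
rem   conspiracies;2;antikythera
rem Removing a keyword from the name changes the name only if the name
rem contains that keyword. The substitution ignores case, like findstr /i.

for %%s in (*) do (
    set "name=%%~ns"
    set "musicvideoscount=0"
    set "conspiraciescount=0"
    for /f "usebackq tokens=1,2* delims=;" %%a in ("%~dp0keywords.txt") do (
        if not "!name:%%c=!"=="!name!" set /a %%acount+=%%b
    )
    echo "%%~nxs": music videos !musicvideoscount!p, conspiracies !conspiraciescount!p
)
```

The rules file can then grow without the script itself changing, and it can be sorted or edited like a small table.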
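If you want to do this in PowerShell, you first need to write some basic code yourself; if you run into problems, ask specific questions about what isn't working. From what I can see, the main problem with the existing batch code is performance, right? – gravity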
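I see. Yes, performance. I suspect this is a prime example of doing things the wrong way. The only actual problem I have run into is special characters. – bricktop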