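Removing special characters from files on Linux
I have a lot of *.java and *.xml files, but someone wrote some of the comments and strings using Spanish characters. I have been searching the web for a way to remove them.
I tried find . -type f -exec sed 's/[áíéóúñ]//g' DefaultAuthoritiesPopulator.java
That is just one example. How can I remove these characters from the many other files in subfolders?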
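If that is really what you want, you can use find, almost exactly as you are already using it.
find -type f \( -iname '*.java' -or -iname '*.xml' \) -execdir sed -i 's/[áíéóúñ]//g' '{}' ';'
The differences:
. is implicit if no path is supplied (this is GNU find behaviour).
-execdir is safer than -exec (read the man page).
-i tells sed to modify the file argument in place. Read the man page to see how to use it to make backups.
{} stands for the path argument, which find substitutes in.
; is part of the find syntax for -exec/-execdir.

You're almost there :)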
find . -type f -exec sed -i 's/[áíéóúñ]//g' {} \;
The two changes are the added -i and the {} placeholder.
From sed(1):
-i[SUFFIX], --in-place[=SUFFIX]
edit files in place (makes backup if extension supplied)
From find(1):
-exec command ;
Execute command; true if 0 status is returned. All
following arguments to find are taken to be arguments to
the command until an argument consisting of `;' is
encountered. The string `{}' is replaced by the current
file name being processed everywhere it occurs in the
arguments to the command, not just in arguments where it
is alone, as in some versions of find. Both of these
constructions might need to be escaped (with a `\') or
quoted to protect them from expansion by the shell. See
the EXAMPLES section for examples of the use of the -exec
option. The specified command is run once for each
matched file. The command is executed in the starting
directory. There are unavoidable security problems
surrounding use of the -exec action; you should use the
-execdir option instead.
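As a minimal sketch of the advice above, the following runs the substitution on a scratch directory, restricts it to *.java and *.xml, and keeps a backup of each file through the -i suffix. The directory layout and file names are hypothetical, and {} + is used here instead of \; so that sed is started once for a batch of files; the character list is the one from the question.

```shell
# Scratch directory so the sketch never touches real sources (hypothetical layout).
workdir=$(mktemp -d)
mkdir -p "$workdir/src/sub"
printf 'String s = "código"; // año\n' > "$workdir/src/sub/Demo.java"
printf '<!-- configuración -->\n' > "$workdir/src/conf.xml"
printf 'año\n' > "$workdir/src/notes.txt"

# Only *.java and *.xml are edited; -i.bak leaves the original next to each file.
find "$workdir/src" -type f \( -iname '*.java' -o -iname '*.xml' \) \
    -exec sed -i.bak 's/[áíéóúñ]//g' {} +

# Show one edited file and the list of backups that were created.
cat "$workdir/src/sub/Demo.java"
find "$workdir/src" -name '*.bak'
```

Once the result looks right, the *.bak files can be deleted; notes.txt is left alone because it does not match either name pattern.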
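tr is the right tool for the job: piping the input through tr -d áíéóúñ will probably do what you want. Note that GNU tr works on bytes rather than multibyte characters, so on UTF-8 files it is worth checking the result.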
NAME
tr - translate or delete characters
SYNOPSIS
tr [OPTION]... SET1 [SET2]
DESCRIPTION
Translate, squeeze, and/or delete characters from standard input, writing to standard out‐
put.
-c, -C, --complement
use the complement of SET1
-d, --delete
delete characters in SET1, do not translate
-s, --squeeze-repeats
replace each input sequence of a repeated character that is listed in SET1 with a
single occurrence of that character
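Because tr only filters standard input to standard output, applying it to files needs a temporary file per input. The sketch below is one way to do that, not the answerer's exact command: it uses the -c and -d options quoted above to delete every byte outside the ASCII range (octal 000 to 177) instead of listing the accented letters, and it assumes a scratch directory and a hypothetical Demo.java.

```shell
# Scratch directory with one sample file (hypothetical name and content).
workdir=$(mktemp -d)
printf 'String s = "código"; // año\n' > "$workdir/Demo.java"

# -c complements the set, -d deletes: every byte that is NOT in 000-177 is dropped.
# LC_ALL=C makes tr treat the input as plain bytes.
find "$workdir" -type f -name '*.java' | while IFS= read -r f; do
    LC_ALL=C tr -cd '\000-\177' < "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done

cat "$workdir/Demo.java"
```

The read loop assumes file names without embedded newlines, which is usually true for source trees.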
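Why are you trying to delete only the characters with diacritics? If you are sure your files should not contain any high (non-ASCII) bytes, delete every character whose code is outside the 0-127 range; the sed expression would then be s/[\x80-\xFF]//g (GNU sed syntax, run in the C locale, for example with LC_ALL=C, so that it matches raw bytes).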
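A rough sketch of that byte-range approach across a directory tree follows. It assumes GNU sed (for the \x escapes and -i), a scratch directory, and a hypothetical conf.xml; LC_ALL=C is set so the bracket expression is interpreted byte by byte.

```shell
# Scratch directory with one sample file (hypothetical name and content).
workdir=$(mktemp -d)
printf '<!-- ¿qué año? -->\n' > "$workdir/conf.xml"

# Delete every byte in the range 0x80-0xFF; -i.bak keeps the original as conf.xml.bak.
LC_ALL=C find "$workdir" -type f -name '*.xml' \
    -exec sed -i.bak 's/[\x80-\xFF]//g' {} +

cat "$workdir/conf.xml"
```

Unlike the explicit character list, this also removes characters nobody listed, such as the inverted question mark in the sample.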
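I bet some form of iconv would get you going, but I don't know exactly what you need, so I'll stick with a comment rather than an answer – 2011-04-06 22:54:10
Why would you want to do such an evil thing? – tchrist 2011-04-06 23:23:22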