2011-04-06 42 views
0

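Removing special characters from files on Linux

I have a lot of *.java and *.xml files, but someone wrote some of the comments and strings in Spanish. I have been searching the web for how to remove those characters.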

I tried find . -type f -exec sed 's/[áíéóúñ]//g' DefaultAuthoritiesPopulator.java, but that file is just one example. How can I remove these characters from the many other files in subfolders?

+0

I'd bet some form of iconv will get you there, but I don't know exactly what you need, so I'll stick to a comment rather than an answer – 2011-04-06 22:54:10

+1

Why would you want to do such an evil thing? – tchrist 2011-04-06 23:23:22

Answers

0

If that is what you really want, you can use find, almost the way you were already using it.

find -type f \( -iname '*.java' -or -iname '*.xml' \) -execdir sed -i 's/[áíéóúñ]//g' '{}' ';'

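Differences:

  • The path . is implicit if no path is given (GNU find).
  • The command only operates on *.java and *.xml files.
  • -execdir is safer than -exec (read the man page).
  • -i tells sed to modify the file argument in place. Read the man page to see how to use it to make backups.
  • {} stands for the path argument; find substitutes the current file's path for it.
  • ; is part of find's -exec/-execdir syntax and marks the end of the command.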
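A minimal sketch of the backup behaviour mentioned in the list, assuming GNU find and sed: giving -i a suffix makes sed keep the original next to the edited file. The scratch directory, file names and sample text are made up for illustration and are not from the question.

```shell
#!/bin/sh
# Scratch tree with hypothetical sample files.
demo=$(mktemp -d)
mkdir -p "$demo/src/sub"
printf 'String s = "señal número";\n' > "$demo/src/sub/Demo.java"
printf '<msg>canción</msg>\n' > "$demo/src/conf.xml"
printf 'señal\n' > "$demo/src/notes.txt"   # neither *.java nor *.xml, so it is left alone

# Same idea as the command above, but -i.bak keeps Demo.java.bak and conf.xml.bak.
# -o is the portable spelling of -or.
find "$demo" -type f \( -iname '*.java' -o -iname '*.xml' \) \
    -execdir sed -i.bak 's/[áíéóúñ]//g' '{}' ';'

# Capture the results before cleaning up.
java_after=$(cat "$demo/src/sub/Demo.java")
java_backup=$(cat "$demo/src/sub/Demo.java.bak")
xml_after=$(cat "$demo/src/conf.xml")
txt_after=$(cat "$demo/src/notes.txt")
rm -rf "$demo"
```

Once the edited files look right, the *.bak copies can be removed with a second find.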
0

Almost there :)

find . -type f -exec sed -i 's/[áíéóúñ]//g' {} \;
                         ^^                 ^^

sed(1)

-i[SUFFIX], --in-place[=SUFFIX] 
      edit files in place (makes backup if extension supplied) 

find(1)

-exec command ; 
      Execute command; true if 0 status is returned. All 
      following arguments to find are taken to be arguments to 
      the command until an argument consisting of `;' is 
      encountered. The string `{}' is replaced by the current 
      file name being processed everywhere it occurs in the 
      arguments to the command, not just in arguments where it 
      is alone, as in some versions of find. Both of these 
      constructions might need to be escaped (with a `\') or 
      quoted to protect them from expansion by the shell. See 
      the EXAMPLES section for examples of the use of the -exec 
      option. The specified command is run once for each 
      matched file. The command is executed in the starting 
      directory. There are unavoidable security problems 
      surrounding use of the -exec action; you should use the 
      -execdir option instead. 
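Since -i rewrites files with no confirmation, it may help to preview first. The following is only a sketch with made-up file names and contents: grep -l lists the files that contain any of the characters, and sed -n with the p flag prints just the lines the substitution would change, without writing anything.

```shell
#!/bin/sh
# Hypothetical sample files in a scratch directory.
demo=$(mktemp -d)
printf 'int x = 1;\n// número de intentos\n' > "$demo/A.java"
printf 'int y = 2;\n' > "$demo/B.java"

# -l prints only the names of files containing at least one listed character.
hits=$(find "$demo" -type f -exec grep -l '[áíéóúñ]' {} +)

# No -i here: -n plus the p flag prints only the lines that would be modified.
preview=$(sed -n 's/[áíéóúñ]//gp' "$demo/A.java")

# The file itself is still untouched.
unchanged=$(cat "$demo/A.java")
rm -rf "$demo"
```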
0

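tr is the right tool for the job: piping the input through tr -d áíéóúñ will probably do what you want. Note that GNU tr works byte by byte, so this is only reliable when the files use a single-byte encoding such as ISO-8859-1.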

NAME 
     tr - translate or delete characters 

SYNOPSIS 
     tr [OPTION]... SET1 [SET2] 

DESCRIPTION 
     Translate, squeeze, and/or delete characters from standard input, writing to standard out‐ 
     put. 

     -c, -C, --complement 
       use the complement of SET1 

     -d, --delete 
       delete characters in SET1, do not translate 

     -s, --squeeze-repeats 
       replace each input sequence of a repeated character that is listed in SET1 with a 
       single occurrence of that character 
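tr only filters standard input to standard output, so using it on many files needs a temporary file per input. The sketch below shows one way to do that. It swaps the literal character list for the byte range \200-\377 (every byte outside ASCII), because GNU tr is byte-oriented and a list of UTF-8 letters could leave half-deleted characters behind. The file name and contents are made up.

```shell
#!/bin/sh
# Hypothetical sample file in a scratch directory.
demo=$(mktemp -d)
printf '// señal número ¡Ü!\n' > "$demo/Demo.java"

# tr has no in-place mode: write to a temporary file, then replace the original.
# LC_ALL=C makes tr treat the input as plain bytes.
find "$demo" -type f -name '*.java' -exec sh -c '
    for f in "$@"; do
        LC_ALL=C tr -d "\200-\377" < "$f" > "$f.tmp" && mv "$f.tmp" "$f"
    done' sh {} +

tr_result=$(cat "$demo/Demo.java")
rm -rf "$demo"
```

Because mv replaces the original with a newly created file, permissions and ownership follow the new file rather than the old one.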

0

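Why are you trying to delete only the characters with diacritics? If you are sure your files should not contain any high ASCII, then delete every character whose code is outside the 0-127 range. With GNU sed run in the C locale, the substitution would be s/[\x80-\xFF]//g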
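A rough sketch of that byte-range deletion applied across files, assuming GNU sed (the \xHH escapes are a GNU extension) and forcing the C locale so the range is interpreted as raw bytes. The file name and contents are hypothetical.

```shell
#!/bin/sh
# Hypothetical sample file in a scratch directory.
demo=$(mktemp -d)
printf '<msg>canción ¿sí?</msg>\n' > "$demo/conf.xml"

# LC_ALL=C is inherited by the sed processes that find starts, so each byte
# from 0x80 to 0xFF is matched individually and removed.
LC_ALL=C find "$demo" -type f -name '*.xml' -exec sed -i 's/[\x80-\xFF]//g' {} +

xml_after=$(cat "$demo/conf.xml")
rm -rf "$demo"
```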