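Regex to convert obfuscated email addresses in Perl
Preface: This is a school assignment. I am not trying to harvest email addresses for malicious purposes.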
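I need to identify, extract, and convert email addresses from a given file (passed as a command-line argument). For obfuscated addresses, I need to convert them back to the regular email address format (account-name@domain-name).
These are the obfuscation techniques I need to account for: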
1. No obfuscation. An email address may be enclosed in a pair of <>. For example, <[email protected]> or [email protected]
2. A space MAY be added before or after (or both) the @ sign.
3. The @ sign is written as AT or at, with a space before and after AT or at.
4. The . sign in the domain name is written as DOT or dot, with a space before and after DOT or dot.
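As a rough illustration of how techniques 2 to 4 might be folded back into the regular format, here is a minimal sketch. The names `$AT`, `$DOT`, `$ADDRESS`, and `deobfuscate` are hypothetical, and the sketch assumes that account names and domain labels consist only of `\w` characters and that a domain has at least two labels. Requiring a dot separator after the first domain label is what keeps phrases such as "test at 9:00PM" or "room 705 @ another time." from being treated as addresses.

```perl
use strict;
use warnings;

# Separator for "@": the literal sign with optional spaces around it,
# or the word AT/at surrounded by whitespace (techniques 2 and 3).
my $AT  = qr/\s*\@\s*|\s+(?:AT|at)\s+/;

# Separator for ".": a literal dot, or the word DOT/dot surrounded
# by whitespace (technique 4).
my $DOT = qr/\.|\s+(?:DOT|dot)\s+/;

# Assumption: an address is an account name, an "@" separator, and a
# domain made of at least two labels joined by "." separators.
my $ADDRESS = qr/(\w+)$AT(\w+(?:$DOT\w+)+)/;

# Hypothetical helper: returns every address found in $line,
# rewritten as account@domain.
sub deobfuscate {
    my ($line) = @_;
    my @found;
    while ($line =~ /$ADDRESS/g) {
        my ($account, $domain) = ($1, $2);
        $domain =~ s/\s+(?:DOT|dot)\s+/./g;   # spelled-out dots become "."
        push @found, "$account\@$domain";
    }
    return @found;
}

# prints anonym5@efs.new.edu
print "$_\n" for deobfuscate('Email: anonym5 @ efs dot new dot edu what if we continue');
```

Because the two separators are compiled with `qr//`, each one is interpolated into `$ADDRESS` as its own group, so the alternations inside them do not leak into the surrounding pattern.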
For now I am only trying to handle the first technique: "1. <[email protected]> or [email protected]"
This is what I have so far:
Edit: updated with help from @ikegami
#!/usr/bin/perl -w
use warnings;
use strict;

my @addrs;
my $re;

open my $INFILE, '<', $ARGV[0] or die $!;
while (my $line = <$INFILE>) {
    push @addrs, $line =~ /(\w+\@(?:\w+\.)*\w+)/g;
    foreach $re (@addrs) {
        if ($re =~ (/$line/)) {
            print $re;
        }
    }
}
close $INFILE;
With that help I no longer get an error, but I am not getting any output either.
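A likely culprit for the missing output is the inner `if`: it involves the whole input line being used as a pattern, which a short extracted address is unlikely to satisfy. The test also looks unnecessary, because the captures returned by the `/g` match are already the addresses. A minimal sketch for the first technique only, under these assumptions: `plain_addresses` is a hypothetical helper name, the sample addresses `user1` and `user2` are made up for illustration, and the domain is required to contain at least one dot (the posted pattern uses `*` there, which also accepts dot-less domains).

```perl
use strict;
use warnings;

# Hypothetical helper: returns every plain address on a line,
# whether or not it is wrapped in <>.
sub plain_addresses {
    my ($line) = @_;
    # In list context, a /g match returns the captures of every match.
    my @found = $line =~ /<?(\w+\@(?:\w+\.)+\w+)>?/g;
    return @found;
}

# Made-up sample line; prints one address per line.
print "$_\n" for plain_addresses('Email: <user1@efs.new.edu> email: user2@efs.new.edu');
```

Printing each address with a trailing newline inside the `while` loop, instead of accumulating everything in one array that is never cleared, also avoids re-examining the addresses from earlier lines on every iteration.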
Sample input:
Email: <[email protected]> email: [email protected] [email protected]
Email: anonym3 AT efs.new.edu E-mail: anonym4 at efs.new.edu test at 9:00PM
We will have a test in room 705 @ another time.
Email: anonym5 @ efs dot new dot edu what if we continue
Another test anonym6 at efs dot new dot edu
If you type a dot, it means you have finished typing all contents.
Email:anonym7 AT new DOT efs DOT edu
We can, at 10:00PM, go to library DOT or .
My gmail address is [email protected] DOT com
The output should be:
[email protected]
[email protected]
[email protected]
anonym3[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
Any help or a pointer in the right direction would be greatly appreciated!
Has it occurred to you that maybe these people obfuscated their email addresses for a reason? – xbug 2014-11-21 19:38:57
The first 'syntax' error is because of '$str =~ s\w+@\w+\.\w+(.\w+)*;'. The '=~' regex operator needs delimiters: '=~ /regex/;'. Apart from that, your foreach loop overwrites $str with nothing in it. – sln 2014-11-21 19:43:53
Oh, I see, @sln. I have updated the post with my current solution and its error output. Still having problems. – chomp 2014-11-21 19:53:34