2016-10-04 167 views
-1

Regex to check URL format

I want to check that my URL has the correct format. It contains some AWS access keys, etc.:

/https://bucket.s3.amazonaws.com/path/file.txt?AWSAccessKeyId=[.+]&Expires=[.+]&Signature=[.+]/.match(url) 

^ Something like this. Could you please help?

+0

Search: [url regex](http://stackoverflow.com/search?q=url+regex) – Tushar

Answers

0

We need a URL to work with:

url = "/https://bucket.s3.amazonaws.com/path/file.txt?AWSAccessKeyId=somestuff&Expires=somemorestuff&Signature=evenmorestuff" 

We also need to escape a bunch of characters and use some non-greedy matching (+?):

/https:\/\/bucket\.s3\.amazonaws\.com\/path\/file\.txt\?AWSAccessKeyId=.+?&Expires=.+?&Signature=.+/.match(url) 

=> #<MatchData "https://bucket.s3.amazonaws.com/path/file.txt?AWSAccessKeyId=somestuff&Expires=somemorestuff&Signature=evenmorestuff"> 
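
The pattern above is unanchored, so it also matches when the expected URL is only a substring of the input. A minimal sketch of an anchored variant follows, assuming the three parameters always appear in this fixed order. The constant `PRESIGNED_URL`, the helper `presigned_url?`, and the `[^&]+` value pattern are illustrative choices, not part of the original answer.

```ruby
# Hypothetical constant and helper names, used for illustration only.
# \A and \z anchor the pattern to the whole string.
# [^&]+ keeps each value from running into the next parameter.
PRESIGNED_URL = %r{
  \Ahttps://bucket\.s3\.amazonaws\.com/path/file\.txt
  \?AWSAccessKeyId=[^&]+
  &Expires=[^&]+
  &Signature=[^&]+\z
}x

def presigned_url?(url)
  PRESIGNED_URL.match?(url)
end

good = "https://bucket.s3.amazonaws.com/path/file.txt?AWSAccessKeyId=a&Expires=b&Signature=c"
puts presigned_url?(good)
puts presigned_url?("http://other.example/?next=#{good}")
```

Using `%r{...}` avoids escaping every forward slash, and the `x` flag allows the pattern to be split across lines.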
1

The URI RFC (RFC 3986, Appendix B) specifies this regular expression for parsing URLs and URIs:

^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))? 
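
As a rough sketch of how that expression could be applied in Ruby: per the RFC, groups 2, 4, 5, 7 and 9 hold the scheme, authority, path, query and fragment. The constant name `RFC3986_SPLIT` and the sample URL are assumptions made for illustration. Note that this expression splits a string into components; it matches almost any input, so it does not validate anything by itself.

```ruby
# The RFC expression, anchored with \A and \z for Ruby.
# RFC3986_SPLIT is a hypothetical name chosen for this example.
RFC3986_SPLIT = %r{\A(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?\z}

url = "https://bucket.s3.amazonaws.com/path/file.txt?AWSAccessKeyId=abc&Expires=1#frag"
m = RFC3986_SPLIT.match(url)

# Group numbers follow the mapping given in the RFC.
parts = {
  scheme:    m[2],
  authority: m[4],
  path:      m[5],
  query:     m[7],
  fragment:  m[9]
}

parts.each { |name, value| puts "#{name}: #{value}" }
```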

You can also use the URI module from the Ruby standard library:

require 'uri' 
# \A and \z anchor the whole string; in Ruby, ^ and $ only anchor a single line 
if url =~ /\A#{URI::regexp(%w(http https))}\z/ 
    puts "it's a URL alright" 
else 
    puts "that's no URL, that's a spaceship" 
end 
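
An alternative that avoids building a regex at all is to let `URI.parse` do the work and inspect the result. This is a minimal sketch, not the answer's method; the helper name `http_url?` is hypothetical. It relies on `URI::HTTPS` being a subclass of `URI::HTTP` and on `URI.parse` raising `URI::InvalidURIError` for strings it cannot parse.

```ruby
require 'uri'

# Hypothetical helper: true only for parseable http/https URLs with a host.
def http_url?(string)
  uri = URI.parse(string)
  # URI::HTTPS inherits from URI::HTTP, so this covers both schemes.
  uri.is_a?(URI::HTTP) && !uri.host.to_s.empty?
rescue URI::InvalidURIError
  false
end

puts http_url?("https://bucket.s3.amazonaws.com/path/file.txt?Expires=1")
puts http_url?("ftp://example.com/file.txt")
puts http_url?("not a url")
```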

To check for the presence of "some AWS access keys etc.", you can do this:

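require 'uri' 
uri = URI.parse(url) 
# uri.query is nil when the URL has no query string, so fall back to '' 
params = URI.decode_www_form(uri.query.to_s).to_h 
if params.has_key?('AWSAccessKeyId') 
    # this pattern assumes a 32-character lowercase hex key; adjust it to the format you expect 
    unless params['AWSAccessKeyId'] =~ /\A[a-f0-9]{32}\z/ 
        abort 'AWSAccessKeyId not valid' 
    end 
else 
    abort 'AWSAccessKeyId required' 
end 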
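
The same idea can be extended to all three parameters from the question. The sketch below assumes that "valid" simply means present and non-empty; `REQUIRED_PARAMS` and `missing_presign_params` are hypothetical names. It returns the missing names instead of calling `abort`, so the caller decides how to react.

```ruby
require 'uri'

# Parameter names taken from the URL in the question.
REQUIRED_PARAMS = %w[AWSAccessKeyId Expires Signature].freeze

# Hypothetical helper: returns the required parameters that are
# absent or empty, regardless of their order in the query string.
def missing_presign_params(url)
  query = URI.parse(url).query.to_s # nil.to_s is "" when there is no query
  params = URI.decode_www_form(query).to_h
  REQUIRED_PARAMS.select { |name| params[name].to_s.empty? }
end

url = "https://bucket.s3.amazonaws.com/path/file.txt?Expires=1&AWSAccessKeyId=abc"
p missing_presign_params(url)
```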

Of course you can parse them directly with a regular expression, but it gets ugly because the order of the parameters may vary:

>> url = "https://bucket.s3.amazonaws.com/path/file.txt?AWSAccessKeyId=abcd12345&Expires=12345678&Signature=abcd" 
>> matchdata = url.match(
/
    \A 
     (?<scheme>http(?:s)?):\/\/ 
     (?<host>[^\/]+) 
     (?<path>\/.+)\? 
     (?=.*(?:[\?\&]|\b)AWSAccessKeyId\=(?<aws_access_key_id>[a-f0-9]{1,32})) 
     (?=.*(?:[\?\&]|\b)Expires=(?<expires>[0-9]+)) 
    /x 
) 
=> #<MatchData "https://bucket.s3.amazonaws.com/path/file.txt?" 
    scheme:"https" 
    host:"bucket.s3.amazonaws.com" 
    path:"/path/file.txt" 
    aws_access_key_id:"abcd12345" 
    expires:"12345678"> 

>> matchdata[:aws_access_key_id] 
# => "abcd12345" 

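It uses:

  1. Regex positive lookahead (?=...) to ignore the order of the parameters
  2. Ruby's named regex captures (?<param_name>.*) to identify the params in the match data
  3. Non-capturing groups (?:abcd|efgh)
  4. The alternation (?:[\&\?]|\b) to handle Expires=..., ?Expires=... and &Expires=...
  5. Finally, the /x free-spacing modifier to allow nicer formatting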
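
To see the order independence in isolation, here is a reduced sketch that applies two lookaheads to a bare query string. The constant `BOTH_PARAMS` and the sample query strings are made up for illustration. Each lookahead scans from the same starting position, so neither cares where the other parameter sits.

```ruby
# Hypothetical reduced pattern: two lookaheads, each with a named capture.
BOTH_PARAMS = /
  \A
  (?=.*\bExpires=(?<expires>[0-9]+))
  (?=.*\bAWSAccessKeyId=(?<key>[a-f0-9]+))
/x

q1 = "AWSAccessKeyId=abcd12345&Expires=12345678"
q2 = "Expires=12345678&AWSAccessKeyId=abcd12345"

[q1, q2].each do |query|
  m = BOTH_PARAMS.match(query)
  puts "#{m[:key]} #{m[:expires]}"
end
```

Both query strings yield the same captures. If either parameter is missing, `match` returns nil.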
+0

How on earth does this check that "it has some AWS access keys etc."? – mudasobwa

+0

@mudasobwa Now it does. –