2017-06-16

I can't get this script to accept, for example, https://youtu.be/HPP0yB-_blA, although https://www.youtube.com/watch?v=HPP0yB-_blA works. The first example just results in the error: invalid command name ""

# URL title parse script for Eggdrop. 
# 
# Based on https://github.com/teeli/urltitle by teel. 
# 
# Version log: 
# 0.11 Minor site specific tweaks. 
# 0.1  First version. 
# 
# Usage: 
# .chanset #channelname +urltitle ;# Enable script. 

namespace eval urltitle { 
    # Configuration variables. 
    set delay 1 ;# Minimum number of seconds to wait between uses. 
    set length 5 ;# Minimum character length of URL to trigger usage. 
    set timeout 5000 ;# Geturl timeout in milliseconds (1/1000ths of a second). 

    # Internal variables. 
    set ignoredSites {apina.biz} ;# Sites to ignore when parsing URLs. 
    set last 1 ;# Stores time of last usage. 
    set scriptVersion 0.11 ;# Script version number. 

    # Binds/Hooks. 
    bind pubm - "*://*" urltitle::handler 
    setudef flag urltitle ;# Channel flag to enable script. 

    # Required packages. 
    package require http 
    package require tdom 
    package require tls 

    proc socket {args} { 
     set opts [lrange $args 0 end-2] 
     set host [lindex $args end-1] 
     set port [lindex $args end] 

     ::tls::socket -autoservername true {*}$opts $host $port 
    } 

    proc handler {nick host user chan text} { 
     set time [clock seconds] 
     variable delay 
     variable ignoredSites 
     variable last 
     variable length 

     if {[channel get $chan urltitle] && ($time - $delay) > $last} { 
      foreach word [split $text] { 
       if {[string length $word] >= $length && [regexp {^(f|ht)tp(s|)://} $word] && \ 
        ![regexp {://([^/:]*:([^/]*@|\d+(/|$))|.*/\.)} $word]} { 
        foreach site $ignoredSites { 
         if {![string match *$site* $word]} { 
          set last $time 

          # Enable HTTPS support. 
          ::http::register https 443 [list urltitle::socket] 
          set title [urltitle::parse $word] 

          # Disable HTTPS support. 
          ::http::unregister https 

          # Sends text to the server, like 'putserv', but it uses a different queue intended for sending messages to channels or people. 
          puthelp "PRIVMSG $chan :$title" 

          break 
         } 
        } 
       } 
      } 
     } 
     return 1 
    } 

    proc parse {url} { 
     set title "" 
     variable timeout 

     if {[info exists url] && [string length $url]} { 
      if {[catch {set http [::http::geturl $url -timeout $timeout]} results]} { 
       putlog "Connection to $url failed" 
      } else { 
       if {[::http::status $http] == "ok" } { 
        set data [::http::data $http] 

        if {[catch {set doc [dom parse -html -simple $data]} results]} { 
         # Remove HTML comments. 
         regsub -all {<!--.*?-->} $data {} data 

         # Remove everything except <head></head> content. 
         regexp -nocase {<head>.*?</head>} $data match 
         #regsub -nocase {.*?<head>} $data {} data 
         #regsub -nocase {</head>.*?} $data {} data 

         regexp -nocase {<title>(.*?)</title>} $data match title 
         #set title [regsub -all -nocase {\s+} $title " "] 
         set title [string trim $title] 
        } else { 
         set root [$doc documentElement] 
         set title [string trim [[$root selectNodes {//head/title[1]/text()}] data]] 
         $doc delete 
        } 
       } else { 
        putlog "Connection to $url failed" 
       } 
       http::cleanup $http 
      } 
     } 
     return $title 
    } 
    putlog "URL title parser v$scriptVersion" 
} 

Does anyone know why this happens? I think the problem is in set title [urltitle::parse $word], but I can't pin it down.

Answer

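Since your pattern correctly matches both URLs, the problem is in urltitle::parse, the code you had not yet shown. A good way to check whether this is true is to run a small piece of the code in an interactive shell.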
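As a quick way to confirm the first point in tclsh, the sketch below copies the two regular expressions from the handler into a small proc and runs both URLs from the question through it. The proc name looksLikeUrl is made up for this illustration and is not part of the script.

```tcl
# Hypothetical helper (not in the original script): the same URL test that
# urltitle::handler applies to each word, isolated so it can be run in tclsh.
proc looksLikeUrl {word {length 5}} {
    expr {[string length $word] >= $length
        && [regexp {^(f|ht)tp(s|)://} $word]
        && ![regexp {://([^/:]*:([^/]*@|\d+(/|$))|.*/\.)} $word]}
}

foreach url {
    https://youtu.be/HPP0yB-_blA
    https://www.youtube.com/watch?v=HPP0yB-_blA
} {
    # Both URLs pass the check, so the handler does reach urltitle::parse.
    puts "$url -> [looksLikeUrl $url]"
}
```

Both lines report 1, which suggests the matching code is not where the short URL fails.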

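My guess is that the actual problem is that the youtu.be URL produces an HTTP redirect to the other URL (or something very much like it). Tcl's http library does not handle redirects for you; that is the job of a higher layer on top of it (and, judging by the urltitle source, it does not do so), and the result is that something chokes in a nasty way.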
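One plausible place for that choke, though this is an assumption rather than something confirmed above, is the line in parse that calls data on the result of selectNodes. If the response body has no title element, the node list is empty, and using an empty string as a command name produces exactly the reported message. The sketch below reproduces the message in plain Tcl and shows a guard; firstNodeText is a hypothetical helper name.

```tcl
# An empty node list used as a command name raises the reported error.
set nodes {}
catch {$nodes data} message
puts $message

# Hypothetical helper: return the trimmed text of the first node,
# or an empty string when the XPath query matched nothing.
proc firstNodeText {nodes} {
    if {[llength $nodes] == 0} {
        return ""
    }
    return [string trim [[lindex $nodes 0] data]]
}
```

In parse, the selectNodes result could be passed through such a helper instead of being invoked directly. That would avoid the error but would still leave the title empty for redirected URLs.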

If you only want to support these youtu.be URLs, you can rewrite the URL yourself with regsub immediately before passing it to urltitle::parse:

... 
    regsub {^https?://youtu\.be/([^?/]*)$} $word {https://www.youtube.com/watch?v=\1} word
    set title [urltitle::parse $word] 
    ... 
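The rewrite can be tried on its own in tclsh. The sketch below wraps it in a proc named expandYoutuBe (a name made up for this illustration) and assumes the long form is https://www.youtube.com/watch?v= followed by the video ID, as in the question.

```tcl
# Hypothetical helper: rewrite a youtu.be short link to the long form.
# Any URL that does not match the anchored pattern is returned unchanged.
proc expandYoutuBe {url} {
    regsub {^https?://youtu\.be/([^?/]*)$} $url \
        {https://www.youtube.com/watch?v=\1} url
    return $url
}

puts [expandYoutuBe https://youtu.be/HPP0yB-_blA]
puts [expandYoutuBe https://example.com/youtu.be/x]
```

The first call prints https://www.youtube.com/watch?v=HPP0yB-_blA; the second URL is printed unchanged because the pattern is anchored at both ends.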

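The regsub is carefully guarded so that it will not change anything it shouldn't, but this approach does not scale; you can't introduce your own rewrite rule for every website! Instead, the code needs to handle the various redirects properly for you. This is an actual bug in the urltitle code.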
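A minimal sketch of what handling redirects on top of the http package might look like is given below. It is not the fix that went into urltitle and not the wiki code mentioned in the comments; the proc names geturlFollowingRedirects and resolveLocation, the default limit of 5 hops, and the set of status codes are all assumptions made for illustration. Relative redirect targets other than absolute paths and scheme-relative URLs are deliberately rejected to keep it short.

```tcl
package require http

# Hypothetical helper: turn a Location header value into an absolute URL.
proc resolveLocation {base location} {
    # Already absolute (scheme://...): use as is.
    if {[regexp {^[a-zA-Z][a-zA-Z0-9+.-]*://} $location]} {
        return $location
    }
    # Scheme-relative (//host/path): borrow the scheme from the base URL.
    if {[string match //* $location]
            && [regexp {^([a-zA-Z][a-zA-Z0-9+.-]*:)//} $base -> scheme]} {
        return $scheme$location
    }
    # Absolute path (/path): borrow scheme and host from the base URL.
    if {[string index $location 0] eq "/"
            && [regexp {^([a-zA-Z][a-zA-Z0-9+.-]*://[^/?#]+)} $base -> origin]} {
        return $origin$location
    }
    error "unsupported redirect target: $location"
}

# Hypothetical helper: like ::http::geturl, but follows up to $limit redirects.
# Extra arguments (for example -timeout) are passed through to ::http::geturl.
proc geturlFollowingRedirects {url {limit 5} args} {
    for {set hops 0} {$hops <= $limit} {incr hops} {
        set token [::http::geturl $url {*}$args]
        if {[::http::ncode $token] ni {301 302 303 307 308}} {
            return $token
        }
        # Header names are case-insensitive, so compare without case.
        set location ""
        foreach {name value} [::http::meta $token] {
            if {[string equal -nocase $name "Location"]} {
                set location $value
            }
        }
        ::http::cleanup $token
        set url [resolveLocation $url $location]
    }
    error "too many redirects"
}
```

In parse, the call ::http::geturl $url -timeout $timeout could then be replaced by geturlFollowingRedirects $url 5 -timeout $timeout. The existing catch around it would also cover the errors raised here.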

This looks promising; I'll take a look tomorrow.

I've made a PR with the fix that is needed.

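Yes, you're right about introducing rewrite rules for every website. I started looking into Tcl HTTP redirects and found some code at http://wiki.tcl.tk/11831 that was originally written by you. Now it works great. Thanks for the help!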