2015-05-04 74 views
-1

How to scrape data using Nokogiri: scraping the table data from the following link displays nothing

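I wrote the code below, but it prints nothing. I want the table data (last updated, weather, temperature) from this link. Please help me scrape the data from the site.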

url = "http://w1.weather.gov/xml/current_obs/KM89.xml" 

docs = Nokogiri::HTML(open(url)) 

puts docs.css("table") 
+0

I see your problem: there isn't much on that page for CSS selectors to match. I suggest looking into XPath. – SirLenz0rlot

Answers

2

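Go to that page, open your developer tools, and look at the response to the KM89.xml request under the Network tab. You will see that it returns XML rather than HTML, like this: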

<?xml version="1.0" encoding="ISO-8859-1"?> 
<?xml-stylesheet href="latest_ob.xsl" type="text/xsl"?> 
<current_observation version="1.0" 
    xmlns:xsd="http://www.w3.org/2001/XMLSchema" 
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
    xsi:noNamespaceSchemaLocation="http://www.weather.gov/view/current_observation.xsd"> 
    <credit>NOAA's National Weather Service</credit> 
    <credit_URL>http://weather.gov/</credit_URL> 
    <image> 
    <url>http://weather.gov/images/xml_logo.gif</url> 
    <title>NOAA's National Weather Service</title> 
    <link>http://weather.gov</link> 
    </image> 
    <suggested_pickup>15 minutes after the hour</suggested_pickup> 
    <suggested_pickup_period>60</suggested_pickup_period> 
    <location>Dexter B Florence Memorial Field Airport, AR</location> 
    <station_id>KM89</station_id> 
    <latitude>34.1</latitude> 
    <longitude>-93.07</longitude> 
    <observation_time>Last Updated on Nov 23 2012, 7:56 am CST</observation_time> 
     <observation_time_rfc822>Fri, 23 Nov 2012 07:56:00 -0600</observation_time_rfc822> 
    <weather>Light Rain</weather> 
    <temperature_string>57.0 F (13.8 C)</temperature_string> 
    <temp_f>57.0</temp_f> 
    <temp_c>13.8</temp_c> 
    <relative_humidity>87</relative_humidity> 
    <wind_string>Northeast at 8.1 MPH (7 KT)</wind_string> 
    <wind_dir>Northeast</wind_dir> 
    <wind_degrees>30</wind_degrees> 
    <wind_mph>8.1</wind_mph> 
    <wind_kt>7</wind_kt> 
    <pressure_string>1027.5 mb</pressure_string> 
    <pressure_mb>1027.5</pressure_mb> 
    <pressure_in>30.30</pressure_in> 
    <dewpoint_string>52.9 F (11.6 C)</dewpoint_string> 
    <dewpoint_f>52.9</dewpoint_f> 
    <dewpoint_c>11.6</dewpoint_c> 
    <windchill_string>55 F (13 C)</windchill_string> 
     <windchill_f>55</windchill_f> 
     <windchill_c>13</windchill_c> 
    <visibility_mi>10.00</visibility_mi> 
    <icon_url_base>http://forecast.weather.gov/images/wtf/small/</icon_url_base> 
    <two_day_history_url>http://www.weather.gov/data/obhistory/KM89.html</two_day_history_url> 
    <icon_url_name>ra1.png</icon_url_name> 
    <ob_url>http://www.weather.gov/data/METAR/KM89.1.txt</ob_url> 
    <disclaimer_url>http://weather.gov/disclaimer.html</disclaimer_url> 
    <copyright_url>http://weather.gov/disclaimer.html</copyright_url> 
    <privacy_policy_url>http://weather.gov/notice.html</privacy_policy_url> 
</current_observation> 

So you can scrape it like this:

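require 'open-uri' 
require 'nokogiri' 

url = 'http://w1.weather.gov/xml/current_obs/KM89.xml' 
# The feed is XML, so parse it with Nokogiri::XML rather than Nokogiri::HTML.
# URI.open comes from open-uri; a bare open(url) no longer fetches URLs on Ruby 3.0+.
doc = Nokogiri::XML(URI.open(url)) 

# Print the text of the <station_id> element.
p doc.at_css('station_id').text 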
+0

But how do I get the labels? –

+0

You can get them from here: http://w1.weather.gov/xml/current_obs/latest_ob.xsl –

+0

Thank you, Almir. –