2016-11-21 88 views
<div id="ext-gen392" class="x-panel-body"> 
    <div class="identify multiline"> 
     <div class="item"> 
      <span class="larger-text">1566 GREENE AVENUE, Brooklyn 11237</span> 
     </div> 
     <div class="item" style="display:none"> 
      <span class="label">Alternate address from NYC Dept of City Planning:</span> 
      <br>1566 GREENE AVENUE 
     </div> 
     <div class="item"> 
      <span style="background-color:#FFE094;" class="legend-color"></span><span class="label" style="font-style: italic;">&nbsp;Residential: Multi-Family Walk-up</span> 
     </div> 
     <div class="item" style="clear:both;"> 
      <span class="label">Owner:</span> BAEZ, IGNACIO 
     </div> 
     <div class="item"> 
      <span class="label">Block:</span> 3303 <span class="label">Lot:</span> 22 
     </div> 
     <div class="item"> 
      <span class="label">Property Characteristics:</span> 
      <ul style="list-style-type: none; padding-left: 0;"> 
       <li><span class="label">Lot Area:</span> 1,950 sq ft (19.5' x 100')</li> 
       <li><span class="label"># of Buildings:</span> 1 <span class="label">Year 
        built:</span> 1920 (Year built is an estimate)</li> 
       <li><span class="label">Building frontage:</span> 19.5' <span class="faded-text">(Building frontage along the street measured in feet.)</span></li> 
       <li><span class="label"># of floors:</span> 3 <span class="label">Building 
        Area:</span> 3,303 sq ft</li> 
       <li><span class="label">Total Units:</span> 3 <span class="label"> 
        Residential Units:</span> 3</li> 
       <li><span class="label">Primary zoning:</span> R6 <span class="label">Commercial Overlay:</span> 
        None</li> 
       <li><span class="label">Floor Area Ratio:</span> 1.69 
        <br> 
        <span class="label">Max. Allowable Residential FAR:</span> 2.43 
        <br> 
        <span class="label">Max. Allowable Commercial FAR:</span> 0 
        <br> 
        <span class="label">Max. Allowable Facility FAR:</span> 4.8 
        <!--REMOVED MAX FAR UNTIL WE FIGURE OUT HOW TO ADD DIFFT FAR VARS FROM PLUTO13--> 
        <!--<span class="label">Max. FAR:</span> 0 --> 
        <span class="faded-text"> 
         <br> 
         The Maximum Allowable Floor Area Ratios are exclusive of bonuses for plazas, plaza-connected open areas, arcades or other amenities. 
         <br> 
         FAR may depend on street widths or other characteristics. Contact <a href="http://www1.nyc.gov/site/planning/zoning/about-zoning.page" target="_blank">City Planning Dept.</a> for latest information.</span></li> 
      </ul> 
     </div> 
     <div class="item"> 
      <span class="label">MORE INFO:</span> 
      <ul> 
       <li><span class="label">Zoning Map#:</span> <a href="http://www1.nyc.gov/assets/planning/download/pdf/zoning/zoning-maps/map13b.pdf" target="_blank"> 
        13b</a> (<a href="http://www1.nyc.gov/site/planning/zoning/zoning-maps.page" target="_blank">how to read</a> NYC zoning maps)</li> 
       <li><span class="label">Historical Zoning Maps:</span> <a href="http://www1.nyc.gov/assets/planning/download/pdf/zoning/zoning-maps/historical-zoning-maps/maps13b.pdf" target="_blank"> 
        13b</a></li> 

       <li><a href="http://a810-bisweb.nyc.gov/bisweb/PropertyProfileOverviewServlet?boro=3&amp;block=3303&amp;lot=22" target="_blank">NYC Dept. of Buildings</a></li> 


       <li><a href="http://a836-acris.nyc.gov/bblsearch/bblsearch.asp?borough=3&amp;block=3303&amp;lot=22" target="_blank">Property transaction records</a> (<b>NB:</b> buildings w/condos may not show transaction results)</li> 

       <li><a href="http://webapps.nyc.gov:8084/CICS/fin1/find001i?FFUNC=C&amp;FBORO=3&amp;FBLOCK=3303&amp;FLOT=22" target="_blank">NYC Dept. of Finance Assessment Roll</a></li> 
       <li><a href="https://hpdonline.hpdnyc.org/HPDonline/provide_address.aspx" target="_blank">NYC HPD data</a></li><!--?p1=3&p2=street number =&p3=street name--> 
       <li><a href="http://gis.nyc.gov/doitt/nycitymap/template?z=8&amp;p=1008264,195724&amp;a=ZOLA&amp;c=ZOLA&amp;s=l:Brooklyn,3303,22,PLUTO" target="_blank">NYC Planning's ZoLa application</a></li> <!--http://gis.nyc.gov/doitt/nycitymap/template?z=8&p=988783,211983&a=ZOLA&c=ZOLA&s=a:365,FIFTH+AVENUE,MANHATTAN--> 
       <li><a href="http://maps.nyc.gov/taxmap/map.htm?searchType=BblSearch&amp;featureTypeName=EVERY_BBL&amp;featureName=3033030022" target="_blank">NYC Digital Tax Map</a></li> 
<!--    <li><a href="http://a810-bisweb.nyc.gov/bisweb/PropertyProfileOverviewServlet?boro=3&block=3303&lot=22" target="_blank">NYC Dept. of Buildings</a></li> 
       <li><a href="http://a836-acris.nyc.gov/bblsearch/bblsearch.asp?borough=3&block=3303&lot=22" target="_blank">Property transaction records</a></li> 
       <li><a href="http://webapps.nyc.gov:8084/CICS/fin1/find001i?FFUNC=C&FBORO=3&FBLOCK=3303&FLOT=22" target="_blank">NYC Dept. of Finance Assessment Roll</a></li> 
       <li><a href="http://gis.nyc.gov/taxmap/map.htm?searchType=FeatureSearch&featureTypeName=TAX_LOT_POLYGON&featureName=3033030022" target="_blank">NYC Digital Tax Map</a></li>--> 
       <li><a href="http://www.nyc.gov/html/dcp/html/subcats/zoning.shtml" target="_blank"> 
        NYC zoning guide</a></li> 
       <li><a href="http://www.oasisnyc.net/watershed/watershed.aspx" target="_blank">NYC 
        Watershed Resources</a></li> 
      </ul> 
     </div> 
     <div class="item"> 
      <span class="label">OASIS shortcut to this property:</span> 
      <br> 
      <a href="http://www.oasisnyc.net/map.aspx?zoomto=lot:3033030022">http://www.oasisnyc.net/map.aspx?zoomto=lot:3033030022</a> 
     </div> 
     <div class="item"> 
      <span class="faded-text">Source: MapPLUTO Tax 
       Block &amp; Tax Lot files from the New York City Department of City Planning, 
       2016 (ver. 16v1).</span> 
     </div> 
<!--  <div class="item" style="width: 95%; margin: 10px 0 5px 4px;"> 
      <span style="display:block;padding: 1px; color: #000066; background-color: #dddddd; border-bottom: solid 1px #aabbdd;"> 
       NYC Department of City Planning Census Factfinder 
      </span> 
      Find all census tracts within 
      <select id="selTaxLotRadius" style="font-size:1.1em" > 
       <option>0.25</option> 
       <option>0.5</option> 
       <option>1</option> 
      </select> 
      mile(s) 
      <input type="button" value="Go" style="font-size:1.1em;font-weight:bold;" onclick="var sel=document.getElementById('selTaxLotRadius');CUR.IdentifyLotTemplate.goToNycFF('1566 GREENE AVENUE','3', sel.options[sel.selectedIndex].value);" /> 
     </div>--> 
<!--  <div class="item"> 
      <div style="width: 95%; margin: 10px 0 5px 4px;"> 
       <div style="padding: 1px; color: #000066; background-color: #dddddd; border-bottom: solid 1px #aabbdd;"> 
        <a href="http://local.yahoo.com/" style="text-decoration: none;" 
         target="newWin"><span style="color: #ff0000; font-weight: bold;">YAHOO!</span> <span style="color: #000066;"> 
          Local</span></a> search results for this 
        address:</div> 
       <div style="padding-left: 4px;"> 

        <div style="margin-top: 4px; color: #888888; font-style: italic;"> 
         &nbsp;Know of something that's missing? <a href="http://listings.local.yahoo.com/csubmit/index.php" 
          target="newWin">Add it to YAHOO!</a></div> 
       </div> 
      </div> 
     </div>--> 
    </div> 
</div> 

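I am scraping a website to collect some data about properties. I am trying to get the owner name and, eventually, all the other text values that follow a <span class="label">. The query expression is normalize-space(//span[(@class='label') and contains(., 'Owner:')]/following-sibling::text()). I evaluated the expression with FirePath and it returned the correct string, but in Google Sheets the returned value is empty. Any suggestions?

Scraping data from a website using IMPORTXML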
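For reference, here is a minimal Python sketch of what that XPath expression selects. It is not part of the original question: it uses only the standard library, so instead of `following-sibling::text()` it reads the `tail` of the matching span, and `FRAGMENT` is a trimmed, well-formed copy of two items from the HTML above (the full page is not valid XML). The helper name `text_after_label` is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of two "item" blocks from the page (assumption:
# the real page would need an HTML parser, since it is not valid XML).
FRAGMENT = """
<div class="identify multiline">
  <div class="item"><span class="label">Owner:</span> BAEZ, IGNACIO </div>
  <div class="item"><span class="label">Block:</span> 3303 <span class="label">Lot:</span> 22 </div>
</div>
""".strip()


def text_after_label(root, label):
    """Return the whitespace-normalized text right after the labelled span."""
    for span in root.iter("span"):
        if span.get("class") == "label" and label in "".join(span.itertext()):
            # span.tail is the text node immediately following the element,
            # i.e. the first node that following-sibling::text() would return.
            return " ".join((span.tail or "").split())
    return None


root = ET.fromstring(FRAGMENT)
owner = text_after_label(root, "Owner:")
block = text_after_label(root, "Block:")
lot = text_after_label(root, "Lot:")
print(owner, block, lot)
```

The same helper works for any of the other labels, which is the "eventually all the other values" part of the question.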


The XPath query looks correct, and I can validate it with xmllint in HTML mode. But the page is not valid XML... Can Google Sheets handle HTML that is not XML? – Markus


Yes, Google Sheets can handle HTML with the IMPORTXML function.


Can you post the URL of the web page? – Markus

Answer

1

You can do this by changing your URL and your query just a little. For example, I found that the raw data endpoint you want looks like this: http://www.oasisnyc.net/service.svc/lot/3033030022?layerstoselect=

So you can use this formula to turn your original URL (in A1) into the correct endpoint:

="http://www.oasisnyc.net/service.svc/lot/"&REGEXEXTRACT(A1,"lot:(\d+)")&"?layerstoselect="
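For anyone checking the rewrite outside of Sheets, the same step can be sketched in Python. This is only an illustration of the formula above, not something from the original answer; the function name `to_endpoint` is made up, and the endpoint pattern is copied from the URL quoted in the answer.

```python
import re

# Endpoint pattern taken from the answer; {bbl} is the 10-digit lot id.
ENDPOINT = "http://www.oasisnyc.net/service.svc/lot/{bbl}?layerstoselect="


def to_endpoint(page_url):
    """Mirror REGEXEXTRACT(A1, "lot:(\\d+)") and rebuild the data URL."""
    match = re.search(r"lot:(\d+)", page_url)
    if match is None:
        return None  # no "lot:<digits>" part in the URL
    return ENDPOINT.format(bbl=match.group(1))


# The OASIS shortcut URL shown in the question's HTML.
shortcut = "http://www.oasisnyc.net/map.aspx?zoomto=lot:3033030022"
endpoint = to_endpoint(shortcut)
print(endpoint)
```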

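If you pull the data in with =transpose(IMPORTDATA(B1)), with the endpoint URL in B1, you will see a single column containing all of the fields. Depending on how you want to arrange the data, you can then use ARRAYFORMULA and similar functions to clean and convert it piece by piece, say to put the headers and the data in separate columns. For example, you could enter: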

=arrayformula(regexreplace({iferror(ARRAYFORMULA(REGEXEXTRACT(transpose(IMPORTDATA(B1)),"(\w+):"))),iferror(ARRAYFORMULA(REGEXEXTRACT(transpose(IMPORTDATA(B1)),":""?(.*)""?")))},"""","")) 


If you want the result laid out in rows instead, wrap the whole thing in TRANSPOSE:

=transpose(arrayformula(regexreplace({iferror(ARRAYFORMULA(REGEXEXTRACT(transpose(IMPORTDATA(B1)),"(\w+):"))),iferror(ARRAYFORMULA(REGEXEXTRACT(transpose(IMPORTDATA(B1)),":""?(.*)""?")))},"""",""))) 
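The two formulas above can be sketched in Python as well, which makes the regex steps easier to test. This is a rough equivalent under stated assumptions: `SAMPLE` is a hypothetical comma-separated response (the real output of the endpoint is not shown in this thread), and `split_field` is a made-up helper that mirrors the two REGEXEXTRACT calls and the quote-stripping REGEXREPLACE.

```python
import re

# Hypothetical sample only; the actual response format is an assumption.
SAMPLE = 'Address:"1566 GREENE AVENUE",Block:3303,Lot:22'


def split_field(field):
    """Split one field into (header, value), with quotes removed."""
    key = re.search(r"(\w+):", field)        # REGEXEXTRACT(..., "(\w+):")
    value = re.search(r':"?(.*)"?', field)   # REGEXEXTRACT(..., ":""?(.*)""?")
    name = key.group(1) if key else ""       # IFERROR(...) leaves an empty cell
    data = value.group(1) if value else ""
    # REGEXREPLACE(..., """", "") removes every remaining double quote.
    return name.replace('"', ""), data.replace('"', "")


fields = SAMPLE.split(",")                  # IMPORTDATA splits on commas
rows = [split_field(f) for f in fields]     # one (header, value) pair per field
columns = list(zip(*rows))                  # the TRANSPOSE variant
print(rows)
print(columns)
```

Note that the plain comma split in this sketch would cut a value that itself contains a comma.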



Nice detective work. Interestingly, the owner name does not seem to be complete. – Markus