2011-04-07

I only want the text from the "description" tag when programming for iPhone in Objective-C. How can I ignore the extra HTML tags while parsing RSS XML in Objective-C/Xcode?

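Neither the government nor private sector in Nepal has off-site backup of data and applications at a distance that can be safe after a disaster at one location. Office of the Controller of Certification chief Rajan Raj Panta warned that as the ...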

<description> 
<table border="0" cellpadding="2" cellspacing="7" style="vertical-align:top;"> 
<tr> 
<td width="80" align="center" valign="top"> 
<font style="font-size:85%;font-family:arial,sans-serif"></font></td> 
<td valign="top" class="j"> 
<font style="font-size:85%;font-family:arial,sans-serif"> 
<br /> 
<div style="padding-top:0.8em;"> 
<img alt="" height="1" width="1" /></div> 
<div class="lh"> 
<a href="http://news.google.com/news/url?sa=t&amp;fd=R&amp;usg=AFQjCNG5gNh3aGY3uxIlUjnsJ_C4ugrnrg&amp;url=http://www.thehimalayantimes.com/fullNews.php?headline%3DJapan%2Bquake%2Ba%2Bwake-up%2Bcall%2Bfor%2BNepal%2BIT%2Bsector%26NewsID%3D280789"> 
<b>Japan quake a wake-up call for 
<b>Nepal</b> IT sector</b></a> 
<br /> 
<font size="-1"> 
<b> 
<font color="#6f6f6f">Himalayan Times</font></b></font> 
<br /> 
<font size="-1">Neither the government nor private sector in 
<b>Nepal</b> has off-site backup of data and applications at a distance that can be safe after a disaster at one 
<b>location</b>. Office of the Controller of Certification chief Rajan Raj Panta warned that as the 
<b>...</b></font> 
<br /> 
<font size="-1" class="p"></font> 
<br /> 
<font class="p" size="-1"> 
<a class="p" href="http://news.google.com/news/more?pz=1&amp;ned=uk&amp;ncl=dxKbHaltcQfMZ4M"> 
<nobr> 
<b></b></nobr></a></font></div></font></td></tr></table> 
</description> 

Please help: how can I ignore all of those unwanted HTML tags and their text?

I am actually using the Google News search RSS feed, like this: http://news.google.com/news?q=location:london&output=rss. Is there any other way to get location-based RSS news?


How are you parsing it? – 2011-04-07 12:45:19


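I am using NSXMLParser and retrieving the content by tag (title, description). But I don't know how to avoid the HTML tags inside the description tag. I am using the Google News search RSS feed to get the news, for example this feed: http://news.google.com/news?q=location:london&output=rss – Himalay 2011-04-07 12:54:37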

Answer


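So, you've done one parse of the original XML, which gives you everything inside the description tag as text (it is escaped in the original, so the first parse doesn't look any deeper), but they're sending HTML-formatted content in the RSS feed and you want plain text? Would it be acceptable to, say, extract all the text inside font tags with a size of -1? If so, something like this is probably enough, run as a second parse over the description text: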

// relevant class members are:
BOOL acceptText;            // YES while text should be collected
NSMutableString *totalText; // must be allocated before the parse starts

// when a new element starts, check if it's a 'font' tag, and if so,
// decide whether to accept subsequent text based on its size
- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qualifiedName attributes:(NSDictionary *)attributeDict
{
    if ([elementName isEqualToString:@"font"])
    {
        // a missing 'size' attribute gives nil, whose intValue is 0, so
        // font tags without a size switch text collection off
        acceptText = [[attributeDict objectForKey:@"size"] intValue] == -1;
    }
}

// upon receiving new characters, copy them into the string only if
// that's what we're doing right now
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    if (acceptText)
        [totalText appendString:string];
}

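This is a bit of a dirty fix, to be considered screen scraping at best. All it takes is for them to change their HTML layout and your scraper will break.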
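
To make the second parse concrete, here is a minimal sketch of how the delegate methods above might be wired together. The class name DescriptionTextExtractor and the method plainTextFromHTML: are hypothetical names chosen for illustration, not part of the original answer. The sketch assumes ARC, assumes the description fragment is well-formed XML (an undefined entity such as &nbsp; would make the parse fail), and adds a didEndElement reset plus whitespace collapsing that the original answer does not include.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original answer): runs a second
// NSXMLParser pass over the HTML fragment taken from <description>.
@interface DescriptionTextExtractor : NSObject <NSXMLParserDelegate>
- (NSString *)plainTextFromHTML:(NSString *)html;
@end

@implementation DescriptionTextExtractor {
    BOOL acceptText;
    NSMutableString *totalText;
}

- (NSString *)plainTextFromHTML:(NSString *)html
{
    acceptText = NO;
    totalText = [NSMutableString string];

    // Wrap the fragment in one root element so it forms a single XML document.
    NSString *wrapped = [NSString stringWithFormat:@"<root>%@</root>", html];
    NSXMLParser *parser =
        [[NSXMLParser alloc] initWithData:[wrapped dataUsingEncoding:NSUTF8StringEncoding]];
    parser.delegate = self;

    // parse returns NO when the fragment is not well-formed XML.
    if (![parser parse]) {
        return nil;
    }

    // Collapse runs of spaces and newlines left over from the markup layout.
    NSArray *words = [totalText componentsSeparatedByCharactersInSet:
                         [NSCharacterSet whitespaceAndNewlineCharacterSet]];
    words = [words filteredArrayUsingPredicate:
                [NSPredicate predicateWithFormat:@"length > 0"]];
    return [words componentsJoinedByString:@" "];
}

// Same rule as the answer: accept text only inside <font size="-1">.
- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qualifiedName
    attributes:(NSDictionary *)attributeDict
{
    if ([elementName isEqualToString:@"font"]) {
        acceptText = [[attributeDict objectForKey:@"size"] intValue] == -1;
    }
}

// Assumption beyond the original answer: stop collecting when a font
// element closes, so text between font elements is not picked up.
- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qualifiedName
{
    if ([elementName isEqualToString:@"font"]) {
        acceptText = NO;
    }
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    if (acceptText) {
        [totalText appendString:string];
    }
}

@end
```

It would be called with the unescaped string collected for the description element during the first parse (descriptionHTML is a placeholder for that string):

    DescriptionTextExtractor *extractor = [[DescriptionTextExtractor alloc] init];
    NSString *summary = [extractor plainTextFromHTML:descriptionHTML];

Because the inner font tag around the source name has no size attribute, it switches collection off, so only the summary paragraph is kept. The same caveat applies as above: this depends entirely on the feed's current markup.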