2012-08-04 84 views
0

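Android - Jsoup HTML scraping

I have searched this site for the past two days looking for a reason why this won't work. I'm trying to get the workout of the day from crossfit.com. When I run the program, it shows a white screen for a while and then crashes. Please tell me what is wrong here!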

public class MainActivity extends Activity { 
/** Called when the activity is first created. */ 
@Override 
public void onCreate(Bundle savedInstanceState) { 
    super.onCreate(savedInstanceState); 
    setContentView(R.layout.activity_main); 

    TextView tv = (TextView) findViewById(R.id.textView1); 

    Document doc = null; 
    try { 
     doc = Jsoup.connect("http://www.crossfit.com").get(); 
    } catch (IOException e) { 
     // TODO Auto-generated catch block 
     e.printStackTrace(); 
    } 
    Element content = doc.getElementsByClass("blogbody").first(); 
    System.out.println(content.text()); 
    tv.setText(content.text()); 
} 
} 

Logcat:

08-04 00:14:08.105: E/AndroidRuntime(339): FATAL EXCEPTION: main 
08-04 00:14:08.105: E/AndroidRuntime(339): java.lang.OutOfMemoryError 
08-04 00:14:08.105: E/AndroidRuntime(339): at java.util.ArrayList.add(ArrayList.java:123) 
08-04 00:14:08.105: E/AndroidRuntime(339): at org.jsoup.nodes.Node.addChildren(Node.java:411) 
08-04 00:14:08.105: E/AndroidRuntime(339): at org.jsoup.nodes.Element.appendChild(Element.java:267) 
08-04 00:14:08.105: E/AndroidRuntime(339): at org.jsoup.parser.HtmlTreeBuilder.insertNode(HtmlTreeBuilder.java:204) 
08-04 00:14:08.105: E/AndroidRuntime(339): at org.jsoup.parser.HtmlTreeBuilder.insertEmpty(HtmlTreeBuilder.java:173) 
08-04 00:14:08.105: E/AndroidRuntime(339): at org.jsoup.parser.HtmlTreeBuilderState$7.process(HtmlTreeBuilderState.java:443) 
08-04 00:14:08.105: E/AndroidRuntime(339): at org.jsoup.parser.HtmlTreeBuilder.process(HtmlTreeBuilder.java:89) 
08-04 00:14:08.105: E/AndroidRuntime(339): at org.jsoup.parser.HtmlTreeBuilderState$15.anythingElse(HtmlTreeBuilderState.java:1197) 
08-04 00:14:08.105: E/AndroidRuntime(339): at org.jsoup.parser.HtmlTreeBuilderState$15.process(HtmlTreeBuilderState.java:1191) 
08-04 00:14:08.105: E/AndroidRuntime(339): at org.jsoup.parser.HtmlTreeBuilder.process(HtmlTreeBuilder.java:84) 
08-04 00:14:08.105: E/AndroidRuntime(339): at org.jsoup.parser.TreeBuilder.runParser(TreeBuilder.java:48) 
08-04 00:14:08.105: E/AndroidRuntime(339): at org.jsoup.parser.TreeBuilder.parse(TreeBuilder.java:41) 
08-04 00:14:08.105: E/AndroidRuntime(339): at org.jsoup.parser.HtmlTreeBuilder.parse(HtmlTreeBuilder.java:37) 
08-04 00:14:08.105: E/AndroidRuntime(339): at org.jsoup.parser.Parser.parseInput(Parser.java:30) 
08-04 00:14:08.105: E/AndroidRuntime(339): at org.jsoup.helper.DataUtil.parseByteData(DataUtil.java:101) 
08-04 00:14:08.105: E/AndroidRuntime(339): at org.jsoup.helper.HttpConnection$Response.parse(HttpConnection.java:469) 
08-04 00:14:08.105: E/AndroidRuntime(339): at org.jsoup.helper.HttpConnection.get(HttpConnection.java:147) 
08-04 00:14:08.105: E/AndroidRuntime(339): at com.example.lookingfor.MainActivity.onCreate(MainActivity.java:36) 
08-04 00:14:08.105: E/AndroidRuntime(339): at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1047) 
08-04 00:14:08.105: E/AndroidRuntime(339): at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:1611) 
08-04 00:14:08.105: E/AndroidRuntime(339): at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:1663) 
08-04 00:14:08.105: E/AndroidRuntime(339): at android.app.ActivityThread.access$1500(ActivityThread.java:117) 
08-04 00:14:08.105: E/AndroidRuntime(339): at android.app.ActivityThread$H.handleMessage(ActivityThread.java:931) 
08-04 00:14:08.105: E/AndroidRuntime(339): at android.os.Handler.dispatchMessage(Handler.java:99) 
08-04 00:14:08.105: E/AndroidRuntime(339): at android.os.Looper.loop(Looper.java:123) 
08-04 00:14:08.105: E/AndroidRuntime(339): at android.app.ActivityThread.main(ActivityThread.java:3683) 
08-04 00:14:08.105: E/AndroidRuntime(339): at java.lang.reflect.Method.invokeNative(Native Method) 
08-04 00:14:08.105: E/AndroidRuntime(339): at java.lang.reflect.Method.invoke(Method.java:507) 
08-04 00:14:08.105: E/AndroidRuntime(339): at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:839) 
08-04 00:14:08.105: E/AndroidRuntime(339): at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:597) 
08-04 00:14:08.105: E/AndroidRuntime(339): at dalvik.system.NativeStart.main(Native Method) 

Answers

2

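Well, Jsoup isn't bulletproof. It is tempting to let it handle all of the downloading, but sometimes you are better off downloading the stream yourself and handing Jsoup the resulting string.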

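// Download the page manually instead of using Jsoup.connect(...).get().
URL url = new URL("http://www.crossfit.com"); 
InputStream is = url.openStream(); 
ByteArrayOutputStream os = new ByteArrayOutputStream(); 
try { 
    byte[] b = new byte[8192]; 
    int count; 
    // Copy the response into memory in 8 KB chunks.
    while ((count = is.read(b)) != -1) { 
        os.write(b, 0, count); 
    } 
} finally { 
    // Close the stream even if reading fails.
    is.close(); 
} 
// Parse the downloaded HTML; this assumes the page is UTF-8 encoded.
doc = Jsoup.parse(os.toString("UTF-8")); 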

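It's a really huge page. Serving about 70K of HTML per request is not going to sit well with a DOM-style model on any device with a small memory footprint. Loading it yourself may work on most devices. Otherwise, look at using TagSoup instead for a streaming solution.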
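To make the streaming idea concrete, here is a minimal sketch of a SAX handler that keeps only the text of the first element whose class attribute is exactly "blogbody" and never builds a full DOM. The class name `BlogBodyHandler` and its helper `extract` are hypothetical names made up for this illustration. The sketch runs against the JDK's built-in SAX parser, which only accepts well-formed markup. For real-world HTML the assumption is that TagSoup's `org.ccil.cowan.tagsoup.Parser`, which implements `XMLReader`, would take the parser's place. That swap is not shown or tested here.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text of the first element whose
// class attribute is exactly "blogbody", without building a DOM.
class BlogBodyHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private int depth = 0;        // nesting depth inside the target element
    private boolean done = false; // true once the first match has closed

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (done) {
            return;
        }
        if (depth > 0) {
            depth++; // a child element nested inside the target
        } else if ("blogbody".equals(atts.getValue("class"))) {
            depth = 1; // entered the target element
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (depth > 0 && --depth == 0) {
            done = true; // left the target element; ignore the rest
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (depth > 0) {
            text.append(ch, start, length);
        }
    }

    String getText() {
        return text.toString().trim();
    }

    // Parses well-formed markup with the JDK's SAX parser.
    // Assumption: for messy HTML, a TagSoup XMLReader would be used instead.
    static String extract(InputSource source) throws Exception {
        BlogBodyHandler handler = new BlogBodyHandler();
        SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
        return handler.getText();
    }

    public static void main(String[] args) throws Exception {
        String page = "<html><body><div class=\"nav\">Home</div>"
                + "<div class=\"blogbody\"><p>Run 5K</p></div></body></html>";
        System.out.println(extract(new InputSource(new StringReader(page))));
    }
}
```

Because the handler only buffers the text of the one matching element, memory use depends on the size of that element rather than on the size of the whole page. What you give up is Jsoup's CSS-style selectors.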

+0

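Thanks for the input. I have a version that loads the entire page in a WebView, but I really only want the workout of the day, which is why I thought Jsoup would be a good route to take. I'll give your suggestion a shot though. Thanks! – user1561757 2012-08-04 00:55:33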

+0

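This route also consumes a fair amount of memory, though less than doing a Jsoup connect, and it should load on a reasonable 2.3 device. Otherwise, in your case, I would use TagSoup to do the scraping. It lacks the nice CSS-style selectors, but it is far less memory-hungry. – Jens 2012-08-04 00:58:21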

+0

Haha, thank you! I'll check out that route then. – user1561757 2012-08-04 00:59:23