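1. Problem: Using Jsoup to fetch data from several websites worked fine at first, but a request to one particular site (某浪) made Jsoup throw an exception. The cause is the Content-Type of the response: the server returns application/json, which Jsoup does not accept by default.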
Error message:
Exception in thread "main" org.jsoup.UnsupportedMimeTypeException: Unhandled content type. Must be text/*, application/xml, or application/xhtml+xml. Mimetype=application/json; charset=utf-8, URL=...
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:547)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:493)
    at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:205)
    at org.jsoup.helper.HttpConnection.get(HttpConnection.java:194)
    at com.Interface.test.JsoupUtil.httpGet(JsoupUtil.java:30)
    at com.Interface.test.test.main(test.java:23)
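To make the message concrete, here is a minimal sketch of the rule it describes. It only approximates the check from the wording of the exception text and is not Jsoup's actual source; the class name `MimeCheckSketch`, the method `acceptedByDefault`, and the handling of a missing header are all assumptions made for illustration.

```java
import java.util.Locale;

public class MimeCheckSketch {
    // Approximates the rule quoted in the exception message:
    // text/*, application/xml, or application/xhtml+xml are accepted.
    // Hypothetical helper, not part of the Jsoup API.
    static boolean acceptedByDefault(String contentType) {
        if (contentType == null) {
            return true; // assumption: a missing Content-Type header is not rejected
        }
        String ct = contentType.trim().toLowerCase(Locale.ROOT);
        return ct.startsWith("text/")
                || ct.startsWith("application/xml")
                || ct.startsWith("application/xhtml+xml");
    }

    public static void main(String[] args) {
        System.out.println(acceptedByDefault("text/html; charset=utf-8"));        // true
        System.out.println(acceptedByDefault("application/json; charset=utf-8")); // false
    }
}
```

The second call uses the Mimetype from the error above, which is why the request fails even though the headers sent by the client look reasonable.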
Request method:
import java.io.IOException;
import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public static String httpGet(String url, String cookie) throws IOException {
    // Get the request connection
    Connection con = Jsoup.connect(url);
    // Set the request headers, the cookie in particular
    con.header("Accept", "text/html, application/xhtml+xml, */*");
    con.header("Content-Type", "application/x-www-form-urlencoded");
    con.header("User-Agent", "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)");
    con.header("Cookie", cookie);
    // Send the GET request and parse the response
    Document doc = con.get();
    // Print the page title
    System.out.println(doc.title());
    return doc.toString();
}
2. Solution: add ignoreContentType(true) to the Connection con = Jsoup.connect(url); call. ignoreContentType(true) tells Jsoup to skip its check of the response Content-Type.
After the change:
// Get the request connection, skipping the Content-Type check
Connection con = Jsoup.connect(url).ignoreContentType(true);
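One thing to watch for: with the check disabled, con.get() still parses the JSON body as if it were HTML, so doc.title() comes back empty and doc.toString() wraps the payload in html/body tags. If the raw JSON text is what you need, a possible variant is to call execute() and read the response body directly. The sketch below assumes the jsoup library is on the classpath; the class name `JsonFetchSketch`, the helper names, and the header values are illustrative and not from the original code.

```java
import java.io.IOException;
import org.jsoup.Connection;
import org.jsoup.Jsoup;

public class JsonFetchSketch {
    // Builds the connection with the Content-Type check disabled.
    // No request is sent at this point.
    static Connection prepare(String url, String cookie) {
        return Jsoup.connect(url)
                .ignoreContentType(true)
                .header("Accept", "application/json, text/plain, */*")
                .header("Cookie", cookie)
                .userAgent("Mozilla/5.0");
    }

    // Sends a GET request (the default method) and returns the
    // unparsed response body, for example the raw JSON text.
    static String fetchBody(String url, String cookie) throws IOException {
        Connection.Response res = prepare(url, cookie).execute();
        return res.body();
    }
}
```

The returned string can then be handed to whichever JSON parser the project already uses.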