Python has many libraries to choose from for web crawling. When I first learned to crawl, I used the urllib library; it is a good tool for getting started and helps you understand the basic concepts and overall workflow of a crawler. Once past the basics, though, we need more advanced tools to make crawling easier, so this post gives a brief introduction to the basic usage of the requests library.
requests is a very practical Python HTTP client library, frequently used in crawlers and when testing server responses. It is a third-party Python library dedicated to sending HTTP requests, and it is much more concise than urllib. Here we will briefly show how to make requests through a proxy.
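Before getting to proxies, here is a minimal sketch of an ordinary GET request with requests (no proxy), just to show how little code is needed compared with urllib; httpbin.org/ip is used purely as a test endpoint that echoes back the caller's IP:

import requests

# A plain GET request; httpbin.org/ip simply returns the IP the server sees
resp = requests.get("http://httpbin.org/ip")

print(resp.status_code)   # HTTP status code, e.g. 200
print(resp.text)          # Raw response body as text
print(resp.json())        # Parsed JSON, since httpbin returns a JSON body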
# -*- coding: utf-8 -*-
import requests
import random

# Target page to visit
targetUrl = "http://httpbin.org/ip"
# To visit the HTTPS page instead:
# targetUrl = "https://httpbin.org/ip"

# Proxy server (the provider's site: www.16yun.cn)
proxyHost = "t.16yun.cn"
proxyPort = "31111"

# Proxy authentication credentials
proxyUser = "username"
proxyPass = "password"

proxyMeta = "http://%(user)s:%(pass)s@%(host)s:%(port)s" % {
    "host": proxyHost,
    "port": proxyPort,
    "user": proxyUser,
    "pass": proxyPass,
}

# Route both http and https traffic through the HTTP proxy
proxies = {
    "http": proxyMeta,
    "https": proxyMeta,
}

# Set the IP-switching header: a random tunnel value per run
tunnel = random.randint(1, 10000)
headers = {"Proxy-Tunnel": str(tunnel)}

resp = requests.get(targetUrl, proxies=proxies, headers=headers)

print(resp.status_code)
print(resp.text)
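Because httpbin.org/ip returns the origin IP the server sees, running this script with and without the proxies argument is a quick way to confirm that traffic really goes through the proxy. Note that the Proxy-Tunnel header is specific to this proxy provider (it tells the gateway which tunnel, and therefore which exit IP, to use); it is not a standard feature of the requests library itself.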
Here I have mainly shared an example of using requests with a proxy. There are many other basic uses, such as the various request methods, adding header information, obtaining cookies, and exception handling, which we can cover next time. Anyone interested is welcome to discuss.