I believe that many people will encounter websites that need simulated Login in the process of crawling , Use at this time selenium Simulating the browser has become the preferred solution , But some websites will also be right selenium This plug-in detects , Once discovered, various anti creep mechanisms will appear ( Access denied 、 Verification Code 、 Can't load out 、 Man machine judgment, etc ). Today, let's introduce a scheme that can perfectly bypass this kind of detection , Is the use of Selenium Control the browser that is already open .
1、 Find the locally installed browser startup path , for example Chrome
# windows
C:\Program Files (x86)\Google\Chrome\Application\chrome.exe
# mac
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome
2、 Start... From the command line ChromeDbug Pattern
# windows
$ C:\Program Files (x86)\Google\Chrome\Application>chrome.exe --remote-debugging-port=9222
# mac
$ /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome -remote-debugging-port=9222
# notes :
1. Launch the browser dbug In mode, you need to close all the processes opened by the browser .
2. 9222 Is the default port , It can be modified at will . But don't use ports that are already occupied .
3、 Connection commissioning switch on chrome
options = webdriver.ChromeOptions()
options.debugger_address = "127.0.0.1:9222"
driver = webdriver.Chrome(options=options)
If the above is helpful to you , Please give me a compliment , thank you !