Whenever you browse the web on your computer, you use a browser, and the browser records your browsing information as you open pages. This record can be a source of information leakage. Let's see how to read the browsing history with Python.
Taking the Chrome browser as an example:
The data is generally stored in a fixed location, usually the following:
C:\Users\Administrator\AppData\Local\Google\Chrome\User Data\Default
If you can't find it there, open Chrome and enter the following in the address bar:
chrome://version
The path shown after "Profile Path" is where the data is stored.
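The default location can also be built in code rather than typed by hand. This is only a minimal sketch, assuming a Windows machine where the LOCALAPPDATA environment variable points at the AppData\Local folder seen in the path above. The function name chrome_profile_dir is made up for this illustration, and checking chrome://version remains the reliable way to confirm the real path.

```python
import ntpath
import os


def chrome_profile_dir(env=None):
    """Return the assumed default Chrome profile directory on Windows.

    `env` is a mapping of environment variables (defaults to os.environ).
    Returns None when LOCALAPPDATA is not set, e.g. on a non-Windows system.
    """
    env = os.environ if env is None else env
    local_app_data = env.get("LOCALAPPDATA")
    if not local_app_data:
        return None
    # ntpath always joins with backslashes, whatever platform runs this code.
    return ntpath.join(local_app_data, "Google", "Chrome", "User Data", "Default")
```

Calling chrome_profile_dir() with no argument reads the real environment; passing a dictionary makes the function easy to try out on any machine.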
Open that folder and find these files in it:
Bookmarks: stores your saved bookmarks
History: stores your browsing history
Bookmarks holds the saved bookmark information. The data is stored in JSON format, so to get the bookmarks you can simply read the file with Python:
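import json

# Path to the Bookmarks file in the profile folder shown above
bookmarks = r'C:\Users\Administrator\AppData\Local\Google\Chrome\User Data\Default\Bookmarks'

def getBookmarks():
    with open(bookmarks, 'r', encoding='utf-8') as f:
        marks = json.load(f)
    for name, item in marks['roots'].items():
        if not isinstance(item, dict):
            continue  # skip any root entry that is not a bookmark folder
        print('Bookmark root:', name)
        for c in item.get('children', []):
            print(c['type'])  # folder or url
            print(c['name'])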
The roots in the bookmark file are:
bookmark_bar: the bookmarks bar
other: other bookmarks
synced: mobile bookmarks
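Bookmark folders can contain other folders, so reading only the first level of children misses nested bookmarks. Here is a rough sketch of a recursive walk over the three roots. It assumes that url entries carry a "url" key next to "type" and "name"; the sample data and the function names are invented for the illustration.

```python
def iter_bookmark_urls(node, path=()):
    """Yield (folder_path, name, url) for every bookmark found under `node`."""
    if not isinstance(node, dict):
        return  # ignore anything that is not a bookmark node
    if node.get("type") == "url":
        yield "/".join(path), node.get("name", ""), node.get("url", "")
    # Folders carry their entries in "children"; url nodes have none.
    for child in node.get("children", []):
        yield from iter_bookmark_urls(child, path + (node.get("name", ""),))


def all_bookmarks(marks):
    """Collect the bookmarks of every root (bookmark_bar, other, synced)."""
    found = []
    for root in marks["roots"].values():
        found.extend(iter_bookmark_urls(root))
    return found


# Invented sample with the same shape as the parsed Bookmarks file.
sample = {
    "roots": {
        "bookmark_bar": {
            "type": "folder",
            "name": "Bookmarks bar",
            "children": [
                {"type": "url", "name": "Python", "url": "https://www.python.org/"},
                {
                    "type": "folder",
                    "name": "Docs",
                    "children": [
                        {"type": "url", "name": "SQLite", "url": "https://www.sqlite.org/"}
                    ],
                },
            ],
        },
        "other": {"type": "folder", "name": "Other bookmarks", "children": []},
        "synced": {"type": "folder", "name": "Mobile bookmarks", "children": []},
    }
}

for folder, name, url in all_bookmarks(sample):
    print(folder, name, url)
```

With a real file, pass the dictionary returned by json.load() to all_bookmarks() instead of the sample.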
The History file is a SQLite 3 database, so reading it requires the sqlite3 library.
Open the database with a database viewer and find the urls table in it; all of your browsing history is stored in this table.
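If no database viewer is at hand, the table names can also be listed from Python. A small sketch using the sqlite_master catalog that every SQLite database has; the function name list_tables is made up here, and db_path is whatever path your History file has.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in the SQLite database at `db_path`."""
    conn = sqlite3.connect(db_path)
    try:
        cursor = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
        return [name for (name,) in cursor]
    finally:
        conn.close()  # always release the file, even if the query fails
```

Run against the History file, the returned list should include urls among the other tables.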
To connect to the database and fetch the data from Python, use Python's sqlite3 module.
sqlite3 ships with Python, so no additional modules need to be installed, and it is easy to use.
First define the database path and pass it to sqlite3.connect() to connect to the database.
Then get a cursor and run a SQL query to fetch the data from the table.
The code looks like this:
import sqlite3
import time

history = r'C:\Users\Administrator\AppData\Local\Google\Chrome\User Data\Default\History'

def get_history():
    conn = sqlite3.connect(history)  # connect to the database
    cursor = conn.cursor()  # get a cursor
    cursor.execute("SELECT id, url, title, visit_count, last_visit_time "
                   "FROM urls ORDER BY last_visit_time DESC")
    rows = []
    for _id, url, title, visit_count, last_visit_time in cursor:
        row = {}
        row['id'] = _id
        row['url'] = url
        row['title'] = title
        row['visit_count'] = visit_count
        if last_visit_time > 0:
            # microseconds since 1601 -> seconds since 1970
            seconds = last_visit_time / 1000000 - 11644473600
            row['last_visit_time'] = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(seconds))
        else:
            row['last_visit_time'] = 0
        rows.append(row)
    conn.close()
    return rows
Two things to note:
One:
last_visit_time is the time the URL was last visited. The unit is microseconds, so divide by 10^6 to convert to seconds.
last_visit_time counts from 00:00:00 on January 1, 1601, while ordinary computer time counts from second 0 of 1970, so subtract 11644473600 to get a number of seconds that can be converted into a time.
In practice this field may be 0, so a conditional check is needed.
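The conversion can be checked on its own with a couple of known values. The sketch below does the same arithmetic with datetime instead of time.strftime, and it reports the result in UTC rather than local time, which is a choice made for this example so that the output is the same on every machine. The function name chrome_time_to_datetime is made up.

```python
from datetime import datetime, timedelta, timezone

# Start of the count described above: 1601-01-01 00:00:00, taken as UTC here.
CHROME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def chrome_time_to_datetime(last_visit_time):
    """Convert microseconds since 1601-01-01 into a UTC datetime.

    Returns None when the value is 0 (or negative), since the field may be 0.
    """
    if not last_visit_time or last_visit_time <= 0:
        return None
    return CHROME_EPOCH + timedelta(microseconds=last_visit_time)


# 11644473600 seconds after 1601-01-01 is exactly the start of 1970.
print(chrome_time_to_datetime(11644473600 * 1000000))
```

The last line is a quick sanity check of the constant: the offset alone should land on January 1, 1970.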
Two:
While the browser is running, the database is locked and cannot be opened, so the browser process has to be killed before the script runs.
The command to kill the Chrome process is: taskkill /f /t /im chrome.exe
So run the kill command before fetching the history:
import os

kill_cmd = 'taskkill /f /t /im chrome.exe'
os.system(kill_cmd)  # kill the Chrome process
result = get_history()
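Killing the browser is fairly drastic. A different approach, which is not part of the walkthrough above, is to copy the database file to a temporary folder and query the copy. Treat the sketch below as an assumption to verify on your own system: whether the copy succeeds while Chrome is running can depend on the platform, and a copy taken mid-write may miss the latest visits. The function name query_copy is made up.

```python
import os
import shutil
import sqlite3
import tempfile


def query_copy(db_path, sql):
    """Copy the SQLite file at `db_path` to a temp folder and run `sql` on the copy."""
    tmp_dir = tempfile.mkdtemp()
    try:
        copy_path = os.path.join(tmp_dir, "History_copy")
        shutil.copy2(db_path, copy_path)
        conn = sqlite3.connect(copy_path)
        try:
            return conn.execute(sql).fetchall()
        finally:
            conn.close()
    finally:
        # Remove the temporary copy once the rows have been read.
        shutil.rmtree(tmp_dir, ignore_errors=True)
```

For example, query_copy(history, "SELECT url, title FROM urls") would return a list of (url, title) tuples without touching the original file.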
Python also has an excellent library that can fetch the browsing history:
browserhistory
There are several ways to install it; pip is recommended.
pip is the package installer for Python. It is not part of the Python Standard Library itself, but recent Python installers bundle it, and it is used to install and manage third-party packages such as those published on PyPI. pip is a command-line program: once it is installed, a pip command is added to the system and can be run from the command prompt.
Install pip (recent Python installers already include it, so this is only needed if the pip command is missing):
Install Python first; this is required.
Download pip:
Official site: https://pypi.org/project/pip/#downloads (unzip the archive after downloading).
Open a command-line window, change into the unzipped pip directory, and run:
python3 setup.py install
This installs pip.
After installation, add pip to the system environment variables.
Verify:
Open a command-line window and enter pip list or pip3 list.
Install browserhistory:
Open a command-line window, type the following command, and press Enter:
pip install browserhistory
Wait for the message that the installation succeeded.
Once browserhistory is installed, just import it, and the functionality above takes very little code:
import browserhistory
dc = browserhistory.get_browserhistory()
print(dc.keys())  # browsers, e.g. chrome, firefox
print(dc['chrome'][0])
Four lines of code do the job.
Getting the records with browserhistory is simple, and the library's entire source code is under 200 lines. It gives easy access to the browsing history of three browsers (Chrome, Firefox, and Safari) and supports three platforms: macOS, Linux, and Windows. Reading and understanding the source is a good way to sharpen your skills.
Reading the source, you'll find that the library not only reads the history but also provides a function to save it to disk. That needs the pandas library, yet another wheel; have a look if you need it.
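Saving the records to disk does not take much code either. The sketch below is not the library's implementation. It is a standalone illustration using Python's built-in csv module, assuming rows shaped like the dictionaries returned by get_history() above; the name save_history_csv is made up for this example.

```python
import csv

# Column order, matching the keys of the dictionaries built in get_history().
FIELDS = ["id", "url", "title", "visit_count", "last_visit_time"]


def save_history_csv(rows, csv_path):
    """Write a list of row dictionaries to `csv_path` and return the row count."""
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

A call such as save_history_csv(get_history(), "history.csv") would write one line per visited URL, with a header line on top.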