
Get browser favorites and browsing history using Python

Editor: Python

Every time you open a web page on your computer, the browser records what you visited. Those records can become a source of information leakage. This article shows how to read bookmarks and browsing history with Python.

Chrome is used as the example:

Find the browser data storage location

Chrome usually stores its data in a fixed location. On Windows it is typically:

C:\Users\Administrator\AppData\Local\Google\Chrome\User Data\Default

If the folder is not there, open Chrome and enter the following in the address bar:

chrome://version

The value shown after "Profile Path" on that page is the data storage location.

Open that folder and look for two files:

Bookmarks stores your saved bookmarks
History stores your browsing history
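As a rough sketch of the lookup described above, the two file paths can be built from the Windows LOCALAPPDATA folder. The helper names `chrome_profile_dir` and `chrome_data_files` are made up for this illustration, and the layout assumes the default Windows install with the "Default" profile; chrome://version remains the authoritative source.

```python
import os
from pathlib import PureWindowsPath


def chrome_profile_dir(local_appdata, profile="Default"):
    """Build the profile folder path from the LOCALAPPDATA directory."""
    return PureWindowsPath(local_appdata) / "Google" / "Chrome" / "User Data" / profile


def chrome_data_files(local_appdata, profile="Default"):
    """Return the paths of the two files discussed in this article."""
    base = chrome_profile_dir(local_appdata, profile)
    return {"bookmarks": base / "Bookmarks", "history": base / "History"}


# Fall back to the path used in this article when LOCALAPPDATA is not set
# (for example when the sketch is run on a non-Windows machine).
local = os.environ.get("LOCALAPPDATA", r"C:\Users\Administrator\AppData\Local")
for name, path in chrome_data_files(local).items():
    print(name, path)
```

PureWindowsPath is used so that the path is formatted the Windows way even when the sketch is run elsewhere.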

Get bookmarks

The Bookmarks file holds the saved bookmark information in JSON format, so Python can read it directly with the json module:

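    import json

    # Path to the Bookmarks file; adjust it to your own profile folder
    bookmarks = r'C:\Users\Administrator\AppData\Local\Google\Chrome\User Data\Default\Bookmarks'

    def getBookmarks():
        # Bookmarks is plain JSON, so json.load can parse it directly
        with open(bookmarks, 'r', encoding='utf-8') as f:
            marks = json.load(f)
        for name, item in marks['roots'].items():
            if not isinstance(item, dict):
                continue  # skip any non-folder entry that may sit under "roots"
            print('Bookmark root:', name)
            for c in item.get('children', []):
                print(c['type'])  # folder or url
                print(c['name'])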

The "roots" object in the bookmark file contains these entries:

bookmark_bar the bookmarks bar
other other bookmarks
synced mobile device bookmarks
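Folders can contain other folders, so a one-level loop misses nested bookmarks. The following is a minimal sketch of a recursive walk, assuming each node is a dict with "type", "name" and either "url" or "children" keys as described above. The function names and the SAMPLE data are invented for the illustration.

```python
def iter_bookmark_urls(node, path=()):
    """Yield (folder_path, name, url) for every url entry below a node."""
    if not isinstance(node, dict):
        return
    if node.get("type") == "url":
        yield "/".join(path), node.get("name", ""), node.get("url", "")
    # Folders carry their entries in "children"; url nodes have none.
    for child in node.get("children", []):
        yield from iter_bookmark_urls(child, path + (node.get("name", ""),))


def flatten_bookmarks(marks):
    """Collect the url entries of all roots into one flat list."""
    found = []
    for root in marks.get("roots", {}).values():
        found.extend(iter_bookmark_urls(root))
    return found


# Hand-written stand-in for the parsed Bookmarks file.
SAMPLE = {
    "roots": {
        "bookmark_bar": {
            "type": "folder",
            "name": "Bookmarks bar",
            "children": [
                {"type": "url", "name": "Python", "url": "https://www.python.org/"},
                {
                    "type": "folder",
                    "name": "Docs",
                    "children": [
                        {"type": "url", "name": "SQLite", "url": "https://www.sqlite.org/"}
                    ],
                },
            ],
        },
        "other": {"type": "folder", "name": "Other bookmarks", "children": []},
        "synced": {"type": "folder", "name": "Mobile bookmarks", "children": []},
    }
}

for folder, name, url in flatten_bookmarks(SAMPLE):
    print(folder, "|", name, "|", url)
```

With a real file, the dict returned by json.load would be passed to flatten_bookmarks in place of SAMPLE.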

Get history

The History file is an SQLite 3 database, so reading it requires the sqlite3 library.

If you open the database in a database viewer, you will find a table named urls. Your browsing history is stored in this table.
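If no database viewer is at hand, the same inspection can be done from Python. This is a small sketch that lists the tables and the columns of one table; `list_tables` and `table_columns` are hypothetical helper names, and the in-memory database only stands in for the real History file, using the columns queried later in this article.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in the connected database."""
    query = "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    return [name for (name,) in conn.execute(query)]


def table_columns(conn, table):
    """Return the column names of one table."""
    # Table names cannot be bound as SQL parameters, so accept only
    # names that really exist in this database.
    if table not in list_tables(conn):
        raise ValueError(f"unknown table: {table}")
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    return [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]


# Stand-in database; with a real profile, connect to the History file instead.
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE urls (id INTEGER PRIMARY KEY, url TEXT, title TEXT, "
    "visit_count INTEGER, last_visit_time INTEGER)"
)
print(list_tables(demo))
print(table_columns(demo, "urls"))
```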

To connect to the database and read the data from Python, use the sqlite3 module.
sqlite3 ships with Python, so no additional package is required, and it is easy to use.

First define the database path and pass it to sqlite3.connect() to open a connection.
Then obtain a cursor and run an SQL query to fetch the rows of the table.

The code can look like this:

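    import sqlite3
    import time

    # Path to the History file; adjust it to your own profile folder
    history = r'C:\Users\Administrator\AppData\Local\Google\Chrome\User Data\Default\History'

    def get_history():
        conn = sqlite3.connect(history)  # connect to the database
        try:
            # conn.execute() creates the cursor and runs the query
            cursor = conn.execute(
                "SELECT id, url, title, visit_count, last_visit_time "
                "FROM urls ORDER BY last_visit_time DESC")
            rows = []
            for _id, url, title, visit_count, last_visit_time in cursor:
                row = {}
                row['id'] = _id
                row['url'] = url
                row['title'] = title
                row['visit_count'] = visit_count
                # microseconds since 1601-01-01 -> seconds since 1970-01-01
                row['last_visit_time'] = time.strftime(
                    "%Y-%m-%d %H:%M:%S",
                    time.localtime(last_visit_time / 1000000 - 11644473600)
                ) if last_visit_time > 0 else 0
                rows.append(row)
            return rows
        finally:
            conn.close()  # release the file when done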

Two points need attention.
One:

last_visit_time is the time the URL was last visited. The unit is microseconds, so divide by 10^6 to convert to seconds.
last_visit_time counts from January 1, 1601, 00:00:00, whereas ordinary computer time counts from 1970. Subtracting 11644473600 (the number of seconds between the two starting points) therefore gives a number of seconds that can be converted to a date and time.
In practice this field can be 0, so a conditional check is needed.
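The conversion can also be written with datetime arithmetic, which avoids the manual subtraction. This is a minimal sketch that assumes the stored value is in UTC; the name `chrome_time_to_datetime` is invented for the illustration.

```python
from datetime import datetime, timedelta, timezone

# Starting point of last_visit_time: January 1, 1601 (assumed to be UTC).
CHROME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def chrome_time_to_datetime(microseconds):
    """Convert a last_visit_time value to an aware datetime, or None for 0."""
    if not microseconds:
        return None  # the field can be 0, as noted above
    return CHROME_EPOCH + timedelta(microseconds=microseconds)


# 11644473600 seconds after 1601-01-01 is the start of 1970.
print(chrome_time_to_datetime(11644473600 * 10**6))
```

Calling .astimezone() on the result converts it to the local time zone if that is preferred for display.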

Two:

While the browser is running, the database is locked and cannot be opened, so the browser process has to be killed before the script runs.
The command that kills the Chrome process is taskkill /f /t /im chrome.exe

So run the kill command before fetching the history:

    import os

    kill_cmd = 'taskkill /f /t /im chrome.exe'
    os.system(kill_cmd)  # kill the Chrome process
    result = get_history()
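A gentler alternative, which is not the approach used in this article, is to copy the History file to a temporary folder and query the copy. Whether the copy succeeds while Chrome is running can depend on the system, so treat the following as a sketch under that assumption; `read_history_copy` is a hypothetical helper name.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path


def read_history_copy(history_path, limit=10):
    """Copy the History file to a temp folder and query the copy."""
    with tempfile.TemporaryDirectory() as tmp:
        copy_path = Path(tmp) / "History.copy"
        shutil.copy2(history_path, copy_path)
        conn = sqlite3.connect(copy_path)
        try:
            query = (
                "SELECT url, title, visit_count, last_visit_time "
                "FROM urls ORDER BY last_visit_time DESC LIMIT ?"
            )
            return conn.execute(query, (limit,)).fetchall()
        finally:
            # Close before the temp folder is removed; Windows cannot
            # delete a file that is still open.
            conn.close()
```

The function returns a list of tuples, newest visit first, and leaves the original file and the running browser untouched.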

Existing wheels

Python also has a ready-made library that retrieves browsing history:

browserhistory

install browserhistory

There are several ways to install it; pip is the recommended one.

pip is the package installer for Python. It is not part of the Python Standard Library itself, but recent Python installers bundle it, and it is used to install and manage third-party packages. pip is a command-line program: once it is installed, a pip command is available and can be run from the command prompt.

install pip:

  • Install Python; this is a prerequisite.

  • Download pip:

    Official address: https://pypi.org/project/pip/#downloads
    After downloading, unpack the archive.

  • Open a command-line window, change into the unpacked pip directory, and run:

    python3 setup.py install

    After the installation finishes, add pip to the system environment variables (PATH). If your Python installer already included pip, this manual step can be skipped; python -m ensurepip --upgrade can also bootstrap it.

  • Verify:
    Open a command-line window and run pip list or pip3 list.

  • Install browserhistory:
    Open a command-line window, type the following command, and press Enter:

    pip install browserhistory

    Wait for the message that the installation succeeded.
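To confirm from Python that the installation worked, one option is to ask the import system whether the module can be found. This is a small sketch; `is_installed` is a name made up for the illustration.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


print("browserhistory installed:", is_installed("browserhistory"))
```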

Code implementation:

Once browserhistory is installed, import it, and the functionality above takes very little code:

    import browserhistory

    dc = browserhistory.get_browserhistory()
    print(dc.keys())  # browsers found, e.g. chrome, firefox
    print(dc['chrome'][0])

Four lines of code do the job.

Getting the records with browserhistory is simple, and the whole source of the library is less than 200 lines. It reads the browsing history of three browsers (Chrome, Firefox and Safari) and supports three platforms (macOS, Linux and Windows). Reading through the source code is a good way to improve your own skills.

The source also shows that the library does more than read the history: it can save the history to disk as well. That part uses the pandas library, another ready-made wheel. Have a look if you need it.
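For comparison, history rows can also be saved with nothing but the standard csv module. The sketch below is not the library's own export function and does not use pandas; it assumes rows shaped like the dicts built by get_history earlier in this article, and `write_history_csv` is a hypothetical name.

```python
import csv
import io


def write_history_csv(rows, fileobj,
                      fields=("url", "title", "visit_count", "last_visit_time")):
    """Write a list of history dicts as CSV and return the row count."""
    # extrasaction="ignore" drops keys such as "id" that are not in fields.
    writer = csv.DictWriter(fileobj, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return len(rows)


# Made-up row for the demonstration.
example_rows = [
    {"id": 1, "url": "https://example.com/", "title": "Example",
     "visit_count": 3, "last_visit_time": "2024-01-01 00:00:00"},
]
buffer = io.StringIO()
write_history_csv(example_rows, buffer)
print(buffer.getvalue())
```

To write to disk, pass a file opened with open(path, "w", newline="", encoding="utf-8") instead of the StringIO buffer.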

