Security testing in a project may include log security testing, that is, checking whether the system logs printed by each component contain sensitive information such as ID card numbers, phone numbers, or plaintext passwords. When there are few components and the log volume is small, you can download the logs to your machine and search them with Ctrl+F. Once there are many components and the log files are large, however, this approach is time-consuming and it is easy to miss things.
To address this, this article uses Python and PyQt5 to develop a batch log keyword scanning tool. In the tool's interface you enter the log file path, the keywords to search for, and the path for the search results; after the tool runs, the results are generated under that path. The rest of the article walks through the whole development process:
1. Environment setup
I use PyCharm as the Python IDE; installing it is not covered here. Because the log search tool needs a graphical interface, I chose PyQt5, a commonly used UI package for Python. Installing it is simple and can be done directly in PyCharm. The packages to install are PyQt5, PyQt5-tools, PyQt5-plugins, and PyQt5Designer, as shown in the figure below:
Note: by default the packages are downloaded from the official Python source, which is sometimes slow or fails. In that case you can add another package source and retry the download. Here the Tsinghua mirror is added: https://pypi.tuna.tsinghua.edu.cn/simple/
To add a source: click the Manage Repositories button in the lower-left corner of the figure above to open the repository management page, click the green add button, paste the Tsinghua mirror link, and save. As shown in the figure below:
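After installation, it can be useful to confirm that the interpreter used by the project actually sees the package. The following is a minimal sketch using only the standard library; the helper name check_modules is made up for this illustration, and "PyQt5" is assumed to be the import name of the package installed above.

```python
import importlib.util


def check_modules(names):
    """Return {module_name: True/False} depending on whether each top-level module can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Report whether PyQt5 is visible to the interpreter running this script.
    for name, found in check_modules(["PyQt5"]).items():
        print(f"{name}: {'installed' if found else 'missing'}")
```

If the result is "missing" inside PyCharm, the package was probably installed into a different interpreter than the one the project uses.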
2. Writing the logic code
The logic code in this article performs the keyword search over the log files. The main idea is as follows:
(1) Define a log keyword search function that first reads every log file in the specified folder;
(2) Define a string variable log_string to hold all the lines of each log file;
(3) Use open() to open the log file read-only and readlines() to read all of its lines, then append what was read to log_string to form one long string;
(4) Use find() to look up each given keyword in log_string and return its index in the long string;
(5) Based on that index, write the keyword and the few characters that follow it in log_string (the number can be chosen) to a txt file as the search result.
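Steps (4) and (5) can be tried in isolation on an in-memory string. The sketch below is only an illustration of the find() loop, not the tool's own code; the function name find_keyword_contexts, the tail parameter, and the sample string are made up for this example.

```python
def find_keyword_contexts(text, keyword, tail=10):
    """Return 'keyword:following characters' for every occurrence of keyword in text."""
    results = []
    if not keyword:
        # An empty keyword would match everywhere; treat it as "no hits".
        return results
    index = text.find(keyword)
    while index != -1:
        end = index + len(keyword)
        results.append(f"{keyword}:{text[end:end + tail]}")
        # Resume the search just after this match so it is not found again.
        index = text.find(keyword, end)
    return results


sample = "user=alice password=abc123 phone=13800000000 password=xyz"
print(find_keyword_contexts(sample, "password", tail=7))
# prints ['password:=abc123', 'password:=xyz']
```

The important detail is that each call to find() starts after the previous match; otherwise the same first occurrence would be returned every time.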
The complete logic code is shown below, wrapped in a Log_scanning class.
import os

'''Define a log retrieval class'''
class Log_scanning:
    # Initialise the class attributes
    def __init__(self, logs_path, keywords, scan_result_path):
        self.logs_path = logs_path
        self.keywords = keywords
        self.scan_result_path = scan_result_path

    # Read the log files and search them for the keywords
    def log_scan(self, logs_path, keywords, scan_result_path):
        global end_msg_status
        # Global flag for whether to print the end message;
        # it is set to 0 when an exception occurs so that the end message is not printed
        end_msg_status = 1
        try:
            dirs = os.listdir(logs_path)
            # keywords = ["phone", "key", "password"]
            # scan_result_path = "E:/python_pycharm/scan_result_path/"
            os.makedirs(scan_result_path)
            # Loop over every log file in the log folder
            for file in dirs:
                file_path = logs_path + "/" + file
                # Path of the result file for this log file
                filename = scan_result_path + "/" + file + "_scan_result.txt"
                # Holds all lines of the current log file; reset for every file
                # so that the next file is joined into a fresh long string
                log_string = ""
                # Open the log file read-only and read all of its lines into logs
                with open(file_path, "r", encoding="utf-8") as f:
                    logs = f.readlines()
                for log in logs:
                    log_string += log.strip()  # Join the log into one long string
                # Keyword search: find each occurrence of the keyword and get its index
                for keyword in keywords:
                    start_index = -1
                    len_keyword = len(keyword)
                    number_keyword = log_string.count(keyword)
                    for i in range(number_keyword):
                        keyword_index = log_string.find(keyword, start_index + 1, len(log_string))
                        # Continue after this match; without this update every
                        # iteration would return the first occurrence again
                        start_index = keyword_index + len_keyword - 1
                        with open(filename, "a", encoding="utf-8") as f:
                            f.write(
                                f"{log_string[keyword_index:keyword_index + len_keyword]}:" +
                                f"{log_string[keyword_index + len_keyword:keyword_index + len_keyword + 10]}\n")
        except FileExistsError:
            print("The scan result folder already exists. Please enter a folder that does not exist yet!")
            my_newsignal = Newsignal()  # Instantiate the custom signal class (defined outside this listing)
            my_newsignal.signal_connect()
            end_msg_status = 0  # An exception occurred, set the flag to 0
        except FileNotFoundError:
            print("The log folder path is incorrect. Please check it and run again!")
            my_newsignal = Newsignal()  # Instantiate the custom signal class (defined outside this listing)
            my_newsignal.signal_connect_2()
            end_msg_status = 0  # An exception occurred, set the flag to 0
3. UI design
This article uses PyQt5 to design the GUI of the log keyword search tool. As a third-party library, PyQt5 offers a rich set of interface widgets, a designer that shows the layout as you edit it, and separation of logic code from GUI code, among other advantages, and it is widely used by Python users for GUI development.
After the packages from step 1 (environment setup) are installed, two tools appear under Tools > External Tools in the PyCharm menu bar (figure 1 below): Qt Designer and PyUIC. Click Qt Designer to open the Qt design page (figure 2 below) and edit the GUI (adding widgets and laying them out). When the design is finished, click Save; an XXX.ui file is generated in the same directory as the program. Select that file, then choose PyUIC under Tools > External Tools to generate an XXX.py file with the same name as the .ui file (figure 3 below). This file is the code for the GUI. The steps are shown in order in the figures below:
4. Exception handling
Users of the log search tool will sometimes enter an incorrect log file path or an invalid result folder. If the program runs in that state, it raises an exception and the tool exits abnormally. To solve this, the article uses Python's exception handling mechanism, try-except: exceptions raised by the program are caught in the except branch and a friendly message is shown, so the user knows which input is wrong and the tool does not exit abnormally. As shown in the figure below:
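The same idea can be shown without the GUI. The sketch below is an assumption-laden illustration rather than the tool's code: the function name prepare_scan and the message texts are invented here, and the function simply returns a message where the tool would show a dialog.

```python
import os


def prepare_scan(logs_path, scan_result_path):
    """Validate both paths; return (ok, message) instead of letting the exception escape."""
    try:
        log_files = os.listdir(logs_path)   # raises FileNotFoundError if the folder is missing
        os.makedirs(scan_result_path)       # raises FileExistsError if the folder already exists
    except FileNotFoundError:
        return False, "The log folder does not exist. Please check the path and try again."
    except FileExistsError:
        return False, "The result folder already exists. Please enter a folder that does not exist yet."
    return True, f"Found {len(log_files)} file(s) to scan."
```

Returning a status and a message keeps the caller in control: a GUI can put the text into a message box, while a script can just print it.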
5. Packaging
So that the program can run without a Python IDE, the code needs to be packaged into an exe executable that can run on any other Windows computer. This article uses PyInstaller for packaging. PyInstaller must be installed first, and there are two main ways to do so: run pip install Pyinstaller in a local Windows CMD window, or install it as a package in PyCharm. Once it is installed, open CMD, switch to the directory that contains the code, and package the program with:
pyinstaller --paths D:/python_pycharm/venv/Lib/site-packages/PyQt5/Qt5/bin -F -w log_scan_v0.4.py
--paths: when the program depends on other modules or libraries, their installation path needs to be added. This project depends on PyQt5, so its installation path is passed here. If the program being packaged does not depend on other packages, the --paths parameter is not needed;
-F: package the program into a single exe file;
-w: the packaged program runs without a command-line window, that is, no cmd window appears when it runs;
log_scan_v0.4.py: the py program to package.
6. Results
7. Problems encountered during development
(1) After packaging, double-clicking the generated exe produced an error: failed to execute script xxx. The details of the message indicated that the PyQt5 module was missing. The cause was that the PyQt5 library files existed only in PyCharm's virtual environment, so PyQt5 also had to be installed on the local machine. Open a cmd window, run pip install PyQt5 to install it locally, add the installation path to the PATH environment variable, and package the program again. After repackaging, the exe ran normally.
(2) If the logic code is placed in its own module and the GUI code imports it with from xxx import xxx, note that a .py file imported as a module must not have a name that starts with a digit or contains a decimal point (for example, from log_scan_v0.1 import log_scan raises an error). The name may start with an underscore or a letter.
(3) To print messages to a textEdit widget in real time while the program is running, call QtGui.QGuiApplication.processEvents() to refresh the interface, so that the printed content appears in the window immediately. This is meant for time-consuming work: the program sometimes has to do things that are unrelated to the interface but take a long time, and because they run in the same thread as the interface, the interface stops responding and appears frozen.
(4) To use the same variable in another method of a class, you can declare it as a global variable with global in the first method, for example global end_msg_status; the other method can then use it. In this article, end_msg_status is the status flag for whether to print the end message (the message in the end_msg method). When an exception occurs the value is set to 0 and the end message is not printed (the end_msg method is not called).
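The naming rule from item (2) can be checked before a file is ever imported. The helper below is a small sketch built on the standard library; the function name is_importable_module_name and the sample file names other than log_scan_v0.1 are made up for illustration.

```python
import keyword


def is_importable_module_name(filename):
    """Check whether a .py file name can be used in an ordinary import statement."""
    stem = filename[:-3] if filename.endswith(".py") else filename
    # A module name must be a valid identifier: no dots, no leading digit, not a keyword.
    return stem.isidentifier() and not keyword.iskeyword(stem)


for name in ["log_scan_v0.1.py", "1_log_scan.py", "log_scan_v0_1.py"]:
    print(name, is_importable_module_name(name))
# prints:
# log_scan_v0.1.py False
# 1_log_scan.py False
# log_scan_v0_1.py True
```

A dot in the name fails because the import statement treats it as a package separator, so from log_scan_v0.1 import log_scan looks for a module named 1 inside a package named log_scan_v0.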
8. Notes
(1) In testing, the tool takes longer the more log files it searches and the larger they are. The code reads all lines of a log file and joins them into one long string before searching for keywords and returning their indexes, which affects search performance to some extent. A possible later optimization is to search each line of the log file directly, return the index, and write the result to the txt file. This avoids joining the log file into one long string and should improve overall search performance somewhat.
(2) In addition, the number of characters written to the txt file after each keyword is currently hard-coded to 10. A later improvement could let the user enter how many characters after the keyword are written to the scan result file.
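Both ideas can be sketched together. The generator below is one possible shape for such an optimization, not the tool's implementation: the name scan_lines and the tail parameter are hypothetical, and the per-line search has not been benchmarked against the current code.

```python
def scan_lines(lines, keywords, tail=10):
    """Yield (line_number, keyword, following_text) for each keyword hit, one line at a time."""
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        for keyword in keywords:
            if not keyword:
                continue  # an empty keyword would match everywhere
            index = line.find(keyword)
            while index != -1:
                end = index + len(keyword)
                # tail controls how many characters after the keyword are kept
                yield line_number, keyword, line[end:end + tail]
                index = line.find(keyword, end)
```

Because an open file object yields its lines lazily, passing it directly (for example, with open(path, encoding="utf-8") as f: hits = list(scan_lines(f, keywords, tail=20))) avoids building one long string. One trade-off compared with the current approach is that the text after a keyword stops at the end of its line, and a keyword split across two lines is no longer matched.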