Today I will recommend several Python modules that are very useful for office automation. They can help you work more efficiently and avoid repetitive, mechanical tasks.
When it comes to file system operations, I believe many people are still using the os module in Python. In comparison, the pathlib module has many advantages. Let's look at a few simple cases.
For example, we can create and delete directories. The code to create one is as follows:
from pathlib import Path

currentPath = Path.cwd()
makePath = currentPath / 'pythonPractice'
makePath.mkdir()  # raises FileExistsError if the directory already exists
Similarly, the code to delete the directory (which must be empty) is:
currentPath = Path.cwd()
delPath = currentPath / 'pythonPractice'
delPath.rmdir()  # only removes an empty directory
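Running the two snippets above more than once can fail, because mkdir() raises an error when the directory already exists. A minimal sketch of a more forgiving version follows. The helper name make_and_remove is made up for illustration, and it works in a temporary directory so nothing in the current folder is touched.

```python
import tempfile
from pathlib import Path


def make_and_remove(base: Path, name: str = "pythonPractice") -> bool:
    """Create base/name (tolerating reruns), then remove it again.

    Returns True if the directory was created and is gone afterwards.
    """
    target = base / name
    # parents=True creates missing parent directories;
    # exist_ok=True avoids FileExistsError if the directory is already there.
    target.mkdir(parents=True, exist_ok=True)
    created = target.is_dir()
    # rmdir() only succeeds on an empty directory.
    target.rmdir()
    return created and not target.exists()


# Run inside a throwaway directory so the working tree stays untouched.
with tempfile.TemporaryDirectory() as tmp:
    result = make_and_remove(Path(tmp))
    print(result)
```

For a directory that still contains files, rmdir() is not enough; shutil.rmtree() from the standard library is the usual tool, and it deletes everything below the path.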
To get the path of the current directory, the code is as follows:
currentPath = Path.cwd()
print(currentPath)
And to get the current user's home directory:
homePath = Path.home()
print(homePath)
To build the absolute path of the desktop, the code is as follows:
Path(Path.home(), "Desktop")
It can also be written as:
Path.home().joinpath("Desktop")
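pathlib also overloads the / operator for joining, as the mkdir example earlier did. The short sketch below checks that the spellings produce the same path object. The variable names are chosen only for illustration.

```python
from pathlib import Path

home = Path.home()

# Three spellings that build the same path object.
via_constructor = Path(home, "Desktop")
via_joinpath = home.joinpath("Desktop")
via_operator = home / "Desktop"

print(via_constructor == via_joinpath == via_operator)
# Joining never touches the disk, so the folder may or may not exist.
print(via_operator.name)
```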
For a given path, we can check whether it is a folder or a file. The code is as follows:
input_path = r"specified path"
if Path(input_path).exists():
    if Path(input_path).is_file():
        print("It's a file!")
    elif Path(input_path).is_dir():
        print("It's a folder!")
else:
    print("The path is wrong!")
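The same check can be wrapped in a small function so it is easy to reuse. This is only a sketch: the name describe_path and its return values are invented for illustration. It relies on is_file() and is_dir() returning False for paths that do not exist, so no separate exists() call is needed.

```python
import tempfile
from pathlib import Path


def describe_path(raw: str) -> str:
    """Return 'file', 'folder', or 'missing' for the given path string."""
    p = Path(raw)
    if p.is_file():
        return "file"
    if p.is_dir():
        return "folder"
    # Neither a file nor a directory: treat the path as wrong or missing.
    return "missing"


# Demonstrate on a throwaway directory containing one file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "note.txt"
    sample.write_text("hello")
    kinds = [
        describe_path(str(sample)),
        describe_path(tmp),
        describe_path(str(Path(tmp) / "nope")),
    ]
    print(kinds)
```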
The glob module in Python is mainly used to find directories and files that match specific rules, and it returns the search results as a list. Because this module supports shell-style wildcards (not regular expressions) for searching, it is very convenient to use. Let's take a look at a simple case:
import glob

path1 = r".\[0-9].jpg"
glob.glob(path1)
Output:
['.\\1.jpg', '.\\2.jpg', '.\\3.jpg', ......]
The wildcards that are often used are:
*: matches 0 or more characters
**: matches all files, directories, subdirectories and files in subdirectories (when recursive=True is passed to glob.glob)
[]: matches a single character within the specified range, for example [1-9] matches one character from 1 to 9
[!]: matches a single character not in the specified range
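To see the wildcards listed above in action, here is a small self-contained sketch. It creates a few empty files in a temporary folder and runs one pattern per wildcard. The helper match_names and the file names are made up for this illustration.

```python
import glob
import os
import tempfile


def match_names(folder: str, pattern: str, recursive: bool = False) -> list:
    """Return sorted paths (relative to folder) that match a glob pattern."""
    # glob.escape() protects any special characters in the folder name itself.
    full_pattern = os.path.join(glob.escape(folder), pattern)
    hits = glob.glob(full_pattern, recursive=recursive)
    return sorted(os.path.relpath(h, folder).replace(os.sep, "/") for h in hits)


with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    for name in ("1.jpg", "a.jpg", "10.jpg", os.path.join("sub", "2.jpg")):
        open(os.path.join(tmp, name), "w").close()

    star = match_names(tmp, "*.jpg")             # any name ending in .jpg
    digit = match_names(tmp, "[0-9].jpg")        # exactly one digit
    not_digit = match_names(tmp, "[!0-9].jpg")   # exactly one non-digit
    deep = match_names(tmp, "**/*.jpg", recursive=True)  # subfolders too

print(star)
print(digit)
print(not_digit)
print(deep)
```

Note that [0-9] stands for exactly one character, which is why 10.jpg is matched by *.jpg but not by [0-9].jpg.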
Let's look at a few more cases. The code is as follows:
for fname in glob.glob("./*.py"):
    print(fname)
The code above prints all files with the .py suffix in the current directory. Let's look at another case:
for fname in glob.glob("./file[!0-9].py"):
    print(fname)
The code above prints the .py files whose names consist of "file" followed by exactly one non-digit character.
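To make it clearer which names a pattern like file[!0-9].py accepts, the sketch below tests it against a hand-written list of names instead of the disk. It uses fnmatch from the standard library, which implements the same shell-style matching rules. The sample names are invented for illustration.

```python
import fnmatch

pattern = "file[!0-9].py"
candidates = ["filea.py", "file1.py", "file_.py", "fileab.py", "file.py"]

# fnmatchcase() compares case-sensitively on every platform.
matched = [name for name in candidates if fnmatch.fnmatchcase(name, pattern)]
print(matched)
```

Only filea.py and file_.py pass: file1.py has a digit in that position, while fileab.py and file.py have too many or too few characters.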
Finally, let's talk about how to convert PDF documents into Word documents. The module used is pdf2docx. We first install it with the pip command:
pip install pdf2docx
Let's try it out. The code is as follows:
from pdf2docx import Converter

cv = Converter(r"the specific path of the pdf document")
# start=0, end=None converts every page
cv.convert("test.docx", start=0, end=None)
cv.close()
If the document's page elements are relatively simple, the pdf2docx module is sufficient to process it. But when individual pages in the PDF have an elaborate layout, the converted Word document can look a bit messy.
Finally, we can also convert only specified pages, for example only the odd-numbered pages in the document. Note that pdf2docx counts pages from 0 by default, so the first page has index 0. The code is as follows:
from pdf2docx import Converter

cv = Converter(r"the specific path of the pdf document")
# Indexes 0, 2, 4, 6 select the 1st, 3rd, 5th and 7th pages
cv.convert("test.docx", pages=[0, 2, 4, 6])
cv.close()
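For longer documents, typing the page list by hand gets tedious. Below is a rough sketch that generates it instead. It assumes pdf2docx's default zero-based page indexing, under which index 0 is the first page. The helper odd_page_indexes is hypothetical and not part of pdf2docx, and the page count is something you need to supply yourself.

```python
def odd_page_indexes(page_count: int) -> list:
    """Zero-based indexes of the 1st, 3rd, 5th, ... pages of a document."""
    if page_count < 0:
        raise ValueError("page_count must not be negative")
    # Every second index starting from 0 (assumes zero-based page indexes).
    return list(range(0, page_count, 2))


pages = odd_page_indexes(8)
print(pages)

# Hypothetical usage with the converter from the text (not executed here):
# cv.convert("test.docx", pages=pages)
```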