Microsoft Word is one of the most widely used programs in everyday office work, and Python has a library dedicated to working with it: python-docx. With it, Word files can be processed automatically from Python.
python-docx is a third-party Python library for reading and writing Word documents. It creates and updates Microsoft Word (.docx) files, provides a fairly complete set of Word operations, and is one of the most commonly used Word tools for Python.
python-docx supports only .docx files. A .doc file must be converted first; on Windows, the conversion can be automated with the SaveAs method exposed through the win32com library.
python-docx is an open-source library; its source code is hosted on GitHub.
python-docx has official documentation. The latest official tutorials are available at https://python-docx.readthedocs.io/en/latest/.
Install python-docx
Installing with the pip package manager is recommended, because it is the most convenient option:
pip install python-docx
Import python-docx
The package is installed under the name python-docx, but it is imported under a different name, docx:
import docx
Basic concepts in python-docx:
Document: a Word file object. Each Word file you open gets its own Document object, and the objects do not affect one another.
Paragraph: a paragraph. A Word document consists of multiple paragraphs. Pressing Enter in the document starts a new paragraph; pressing Shift + Enter inserts a line break without starting a new paragraph.
Run: a run of text. Each paragraph consists of one or more runs. Contiguous text with the same style within a paragraph forms one run, so a Paragraph object has a list of Run objects.
Note: text that differs in color, font, weight (bold), or italics belongs to different runs.
Writing a Word file with docx:
1. Create a new blank document
doc = docx.Document()
2. Add a heading
doc.add_heading('This is a heading; its level is set by the level argument', level=2)
3. Add a paragraph
p = doc.add_paragraph('This is a paragraph; it can be long or short')
4. Add a run of text:
p.add_run('\n-- This run starts on a new line but still belongs to the same paragraph')
5. Save the file
doc.save('H://pytest.docx')
Complete code example
import docx
from docx.enum.text import WD_BREAK


def create():
    '''Create a Word document.'''
    doc = docx.Document()  # Create a new blank document
    doc.add_heading('This is a heading; its level is set by the level argument', level=2)  # Add a heading
    p = doc.add_paragraph('This is a paragraph; it can be long or short')
    p.insert_paragraph_before('A paragraph inserted before the first paragraph')
    p.add_run('\n-- This run starts on a new line but still belongs to the same paragraph')
    p.add_run('== Bold text ').bold = True  # Make this run bold
    p.add_run('-- Italic text ').italic = True  # Make this run italic
    doc.add_page_break()  # Insert a page break
    np = doc.add_paragraph('New paragraph')
    np.runs[-1].add_break(WD_BREAK.PAGE)  # Add a page break after the last run of the paragraph
    doc.save('H://pytest.docx')  # Save the file
Besides regular text, you can also add other kinds of content, such as tables.
Reading a Word file:
Reading a document is straightforward: load the file, then get its paragraphs, tables, and other content.
The sample code is as follows:
import docx


def read():
    '''Read a Word document.'''
    doc = docx.Document('H://pytest.docx')  # Open an existing document
    for paragraph in doc.paragraphs:
        print(f'paragraph.text = {paragraph.text}')
        for run in paragraph.runs:
            print(f'\trun.text = {run.text}')
    for table in doc.tables:
        print(f'table ======{table}')
        for i in range(len(table.rows)):
            for j in range(len(table.columns)):
                print(f'row {i}, column {j}: data: {table.cell(i, j).text}')