您现在的位置：程式師世界 >> 編程語言 > >> 更多編程語言 >> Python

Python office automation Excel

編輯：Python

String splicing method

f-string Method
format() function
- Operator

pathlib modular

pathlib Official documents

Primary experience

from pathlib2 import Path
# Get current directory
current_path = Path.cwd()
print(current_path)
# Output is as follows ：
# /Users/Anders/Documents/
# obtain Home Catalog
home_path = Path.home()
print(home_path)
# Output is as follows ：
# /Users/Anders

Parent operation

from pathlib2 import Path
# Get current directory
current_path = Path.cwd()
# Get parent directory
print(current_path.parent)
# Get the upper parent directory
print(current_path.parent.parent)
# Get the upper parent directory
print(current_path.parent.parent.parent)
# Get the upper parent directory
print(current_path.parent.parent.parent.parent)
# Get the upper parent directory
print(current_path.parent.parent.parent.parent.parent)
# Output is as follows ：
# /Users/Anders/Documents/Jupyter
# /Users/Anders/Documents
# /Users/Anders
# /Users
# /

Traverse parent directory

# Get current directory
from pathlib2 import Path
current_path = Path.cwd()
for p in current_path.parents:
print(p)
# Output is as follows ：
# /Users/Anders/Documents/Jupyter
# /Users/Anders/Documents
# /Users/Anders
# /Users
# /

File name operation

:::info

name file name

suffix Extension of the file

suffixes Returns a list of multiple extensions

stem The main name of the file （ Does not contain extension ）

with_name(name) Replace File name and return a new path

with_suffix(suffix) Replace extension , Back to the new path , The extension remains the same

:::

Splicing and decomposition of paths

from pathlib2 import Path
# Pass a complete string directly
example_path1 = Path('/Users/Anders/Documents/powershell-2.jpg')
# You can also pass in multiple strings
example_path2 = Path('/', 'Users', 'dongh', 'Documents', 'python_learn', 'file1.txt')
# You can also use Path.joinpath()
example_path3 = Path('/Users/Anders/Documents/').joinpath('python_learn')
# utilize / You can create sub paths
example_path4 = Path('/Users/Anders/Documents')
example_path5 = example_path4 / 'python_learn/pic-2.jpg'

Traversal folder

from pathlib2 import Path
# Returns the extension of the last part in the directory
example_path = Path('/Users/Anders/Documents')
[path for path in example_path.iterdir()]
# Output is as follows ：
# [PosixPath('/Users/Anders/Documents/abc.jpg'),
# PosixPath('/Users/Anders/Documents/book-master'),
# PosixPath('/Users/Anders/Documents/Database'),
# PosixPath('/Users/Anders/Documents/Git'),
# PosixPath('/Users/Anders/Documents/AppProjects')]

File operations

The operation syntax is ： open(mode=‘r’, bufferiong=-1, encoding=None, errors=None, newline=None)

from pathlib2 import Path
example_path = Path('/Users/Anders/Documents/information/JH.txt')
with example_path.open(encoding = 'GB2312') as f:
print(f.read())
# or
example_path = Path('/Users/Anders/Documents/information/JH.txt')
example_path.read_text(encoding='GB2312')

For simple file reading and writing , stay pathlib There are several simple methods in the Library ：

.read_text(): Open the path in text mode and return the content as a string .

.read_bytes(): In binary / Byte mode opens the path and returns the content as a byte string .

.write_text(): Open the path and write string data to it .

.write_bytes(): In binary / Byte mode opens the path and writes data to it .

Create and delete folders

About creating file directories here mkdir Method accepts two parameters ：

parents： If the parent directory does not exist , Create parent directory or not .
exist_ok： Create a directory only if it does not exist , No exception will be thrown when the directory already exists .

from pathlib2 import Path
example_path = Path('/Users/Anders/Documents/test1/test2/test3')
# Create file directory , In this case, because it doesn't exist test1,test2,test3, because parents by True, So it will be created .
example_path.mkdir(parents = True, exist_ok = True)
# Delete the path object directory , If the folder to be deleted contains files, an error will be reported
example_path.rmdir()

Determine whether the file and folder objects exist

There are many related attributes about the judgment of documents

is_dir() Is it a directory

is_file() Is it a common document

is_symlink() Is it a soft link

is_socket() Whether it is socket file

is_block_device() Whether it's a piece of equipment

is_char_device() Is it a character device

is_absolute() Is it an absolute path

resolve() Return to a new path , This new path is the present Path The absolute path of the object , If it is a soft link, it will be parsed directly

absolute() You can also get the absolute path , But the recommended resolve()

exists() Whether the path points to an existing directory or file ：

from pathlib2 import Path
example_path = Path('/Users/Anders/Documents/pic-2.jpg')
# Determines whether an object exists
print(example_path.exists())
# Output is as follows ：
# True
# Determine whether the object is a directory
print(example_path.is_dir())
# Output is as follows ：
# False
# Judge whether the object is a file
print(example_path.is_file())
# Output is as follows ：
# True

File information

Just go through **.stat()** Method can return the file information of the specified path

from pathlib2 import Path
example_path = Path('/Users/Anders/Documents/pic.jpg')
print(example_path.stat())
# Output is as follows ：
# os.stat_result(st_mode=33188, st_ino=8598206944, st_dev=16777220, st_nlink=1, st_uid=501, st_gid=20, st_size=38054, st_atime=1549547190, st_mtime=1521009880, st_ctime=1521009883)
# file size Last access time Last modified Creation time
print(example_path.stat().st_size)
# Output is as follows ：
# 38054

Working with the workbooks

Move and rename workbook

from pathlib import Path # pathlib The module is python Built-in module
old_file_path = Path('F:\\python\\ The first 2 Chapter \\ Employee profile .xlsx') # The original path
new_file_path = Path('F:\\table\\ Employee information form .xlsx') # New path
old_file_path.rename(new_file_path) # rename rename Only on the same disk partition

Resolve the path information of the workbook

from pathlib import Path
file_path = Path('F:\\python\\ The first 2 Chapter \\ Delivery list .xlsx')
path = file_path.parent # File path
file_name = file_path.name # file name
stem_name = file_path.stem # File primary name
suf_name = file_path.suffix # File extension
print(path)
print(file_name)
print(stem_name)
print(suf_name)

Extract the file names of all workbooks in the folder

from pathlib import Path
folder_path = Path('F:\\python\\ The first 2 Chapter \\ Worksheet \\')
file_list = folder_path.glob('*.xls*') # glob Used to find files or folders that meet the specified rules
lists = []
for i in file_list:
file_name = i.name
lists.append(file_name)
print(lists)
#[' Supplier information sheet .xlsx', ' Delivery list .xlsx', ' Year on year growth table .xls', ' Employee file form .xlsx', ' An inventory statement .xlsx']

Create a new workbook

import xlwings as xw
app = xw.App(visible=False, add_book=False) # start-up Excel Program
workbook = app.books.add() # New workbook
workbook.save('F:\\test\\1 Monthly sales table .xlsx') # Save the new workbook
workbook.close() # Close workbook
app.quit() # sign out Excel Program

Cycle to create multiple tables

import xlwings as xw
app = xw.App(visible=False, add_book=False)
for i in range(1, 6):
workbook = app.books.add()
workbook.save(f'F:\\test\\ Sales list {
i}.xlsx') # f-string Replace {} Contents
workbook.close()
app.quit()

Open an existing workbook

import xlwings as xw
app = xw.App(visible=True, add_book=False)
file_path = ' Employee information form .xlsx'
app.books.open(file_path)

Open all workbooks under the folder

from pathlib import Path
import xlwings as xw
app = xw.App(visible=True, add_book=False)
folder_path = Path('F:\\python\\ The first 2 Chapter \\ Worksheet \\')
file_list = folder_path.glob('*.xls*')
for i in file_list:
app.books.open(i)

Batch rename multiple workbooks

from pathlib import Path
folder_path = Path('F:\\python\\ The first 2 Chapter \\table\\')
file_list = folder_path.glob('* month .xlsx')
for i in file_list:
old_file_name = i.name
new_file_name = old_file_name.replace(' month ', ' Monthly sales table ')
# with_name yes pathlib Path object function , Used to replace the original path file name
new_file_path = i.with_name(new_file_name) # Construct a new file path with a new file name
i.rename(new_file_path) # Perform the rename operation

Batch conversion of workbook file formats

from pathlib import Path
import xlwings as xw
app = xw.App(visible=False, add_book=False)
folder_path = Path('F:\\python\\ The first 2 Chapter \\ Worksheet \\')
file_list = folder_path.glob('*.xlsx')
for i in file_list:
new_file_path = str(i.with_suffix('.xls')) # SaveAs The path is not recognized
workbook = app.books.open(i) # Open the workbook to convert the file format
workbook.api.SaveAs(new_file_path, FileFormat=56) # 56 representative .xls. 51 representative .xlsx
workbook.close()
app.quit()

Split a workbook into multiple workbooks

import xlwings as xw
app = xw.App(visible=False, add_book=False)
file_path = 'F:\\python\\ The first 2 Chapter \\ Record information of new energy vehicles .xlsx'
workbook = app.books.open(file_path)
worksheet = workbook.sheets
for i in worksheet:
new_workbook = app.books.add()
new_worksheet = new_workbook.sheets[0]
i.copy(before=new_worksheet) # Copy the current worksheet of the source workbook to the first worksheet of the new workbook
new_workbook.save('F:\\python\\ The first 2 Chapter \\ Automobile filing information \\{}.xlsx'.format(i.name))
new_workbook.close()
app.quit()

Merge multiple workbooks into one workbook

from pathlib import Path
import pandas as pd
folder_path = Path('F:\\python\\ The first 2 Chapter \\ Sales statistics in the first half of the year \\')
file_list = folder_path.glob('*.xls*')
with pd.ExcelWriter('F:\\python\\ The first 2 Chapter \\ Total table .xlsx') as workbook:
for i in file_list:
stem_name = i.stem
data = pd.read_excel(i, sheet_name=0)
data.to_excel(workbook, sheet_name=stem_name, index=False)

Classify workbooks by extension

from pathlib import Path
folder_path = Path(' The first 2 Chapter \\ Working papers \\') # Give the folder path to be classified
file_list = folder_path.glob('*.xls*')
for i in file_list:
suf_name = i.suffix
new_folder_path = folder_path / suf_name # Construct the full path of the folder named after the extension
if not new_folder_path.exists():
new_folder_path.mkdir()
i.replace(new_folder_path / i.name) # Move the workbook to a file with an extension
# replace Used to overwrite the original path with the new path

Sort workbooks by date

from time import localtime
from pathlib import Path
folder_path = Path('F:\\python\\ The first 2 Chapter \\ Working papers \\')
file_list = folder_path.glob('*.xls*')
for i in file_list:
lm_time = i.stat().st_mtime # Get the last modification time
year = localtime(lm_time).tm_year # Year of extraction
month = localtime(lm_time).tm_mon # Extraction month
new_folder_path = folder_path / str(year) / str(month)
if not new_folder_path.exists():
new_folder_path.mkdir(parents=True)
i.replace(new_folder_path / i.name)

Find workbooks accurately

from pathlib import Path
folder_path = input(' Please enter the search path （ Such as C:\\）：')
file_name = input(' Please enter the name of the workbook you want to find ：')
folder_path = Path(folder_path)
file_list = folder_path.rglob(file_name) # rglob Used to find files or folders whose names meet the specified rules in the specified folder and its subfolders
for i in file_list:
print(i)
# Please enter the search path （ Such as C:\）：I:\
# Please enter the name of the workbook you want to find ： Delivery list .xlsx
# I:\Projects\jupyter\Excel\ The first 2 Chapter \ Delivery list .xlsx
# I:\Projects\jupyter\Excel\ The first 2 Chapter \table\ Delivery list .xlsx
# I:\Projects\jupyter\Excel\ The first 2 Chapter \ Worksheet \ Delivery list .xlsx

Find workbook by keyword

from pathlib import Path
folder_path = input(' Please enter the search path （ Such as C:\\）：')
keyword = input(' Please enter keywords ：')
folder_path = Path(folder_path)
file_list = folder_path.rglob(f'*{
keyword}*.xls*')
for i in file_list:
print(i)
# Please enter the search path （ Such as C:\）：I:\
# Please enter keywords ： supplier
# I:\Projects\jupyter\Excel\ The first 2 Chapter \ Worksheet \ Supplier information sheet .xlsx

Protect the structure of a workbook

import xlwings as xw
app = xw.App(visible=False, add_book=False)
workbook = app.books.open('F:\\python\\ The first 2 Chapter \\ Purchase form of office supplies .xlsx')
workbook.api.Protect(Password='123', Structure=True, Windows=True) # password The workbook structure is not modified The window is not modified
workbook.save()
workbook.close()
app.quit()

Encrypt and protect a workbook

import xlwings as xw
app = xw.App(visible=False, add_book=False)
workbook = app.books.open('F:\\python\\ The first 2 Chapter \\ Purchase form of office supplies .xlsx')
workbook.api.Password = '123' # Set workbook opening password
workbook.save()
workbook.close()
app.quit()

Encryption protects multiple workbooks

from pathlib import Path
import xlwings as xw
app = xw.App(visible=False, add_book=False)
folder_path = Path('F:\\python\\ The first 2 Chapter \\ Worksheet \\')
file_list = folder_path.glob('*.xls*')
for i in file_list:
workbook = app.books.open(i)
workbook.api.Password = '123'
workbook.save()
workbook.close()
app.quit()

Operation sheet

Extract the names of all worksheets in a workbook （ Method 1 ）

import xlwings as xw
app = xw.App(visible=False, add_book=False)
workbook = app.books.open('F:\\python\\ The first 3 Chapter \\ Record information of new energy vehicles .xlsx')
worksheet = workbook.sheets
lists = []
for i in worksheet:
sheet_name = i.name
lists.append(sheet_name)
print(lists)
workbook.close()
app.quit()

import pandas as pd
file_path = 'F:\\python\\ The first 3 Chapter \\ Record information of new energy vehicles .xlsx'
data = pd.read_excel(file_path, sheet_name=None)
worksheet_name = list(data.keys())
print(worksheet_name)
# The third line of code reads the data in all worksheets in the workbook , Generate a dictionary , The key of the dictionary is the name of the worksheet , The value is the corresponding data

Add a worksheet to a workbook

import xlwings as xw
app = xw.App(visible=False, add_book=False)
workbook = app.books.open('F:\\python\\ The first 3 Chapter \\ Record information of new energy vehicles .xlsx')
worksheet = workbook.sheets
new_sheet_name = ' Product information sheet '
lists = []
for i in worksheet:
sheet_name = i.name
lists.append(sheet_name)
if new_sheet_name not in lists:
worksheet.add(name=new_sheet_name) # Add a worksheet
workbook.save()
workbook.close()
app.quit()

Delete a worksheet in a workbook

import xlwings as xw
app = xw.App(visible=False, add_book=False)
workbook = app.books.open('F:\\python\\ The first 3 Chapter \\ Record information of new energy vehicles .xlsx')
worksheet = workbook.sheets
del_sheet_name = ' Automobile filing information '
for i in worksheet:
sheet_name = i.name
if sheet_name == del_sheet_name:
i.delete() # Delete the current sheet
break
workbook.save()
workbook.close()
app.quit()

Batch add worksheets to multiple workbooks

from pathlib import Path
import xlwings as xw
app = xw.App(visible=False, add_book=False)
folder_path = Path('F:\\python\\ The first 3 Chapter \\ Sales statistics in the first half of the year \\')
file_list = folder_path.glob('*.xls*')
new_sheet_name = ' Product information sheet '
for i in file_list:
workbook = app.books.open(i)
worksheet = workbook.sheets
lists = []
for j in worksheet:
sheet_name = j.name
lists.append(sheet_name)
if new_sheet_name not in lists:
worksheet.add(name=new_sheet_name)
workbook.save()
workbook.close()
app.quit()

Batch delete worksheets from multiple workbooks

from pathlib import Path
import xlwings as xw
app = xw.App(visible=False, add_book=False)
folder_path = Path('F:\\python\\ The first 3 Chapter \\ Car information \\')
file_list = folder_path.glob('*.xls*')
del_sheet_name = 'Sheet1'
for i in file_list:
workbook = app.books.open(i)
worksheet = workbook.sheets
for j in worksheet:
sheet_name = j.name
if sheet_name == del_sheet_name:
j.delete()
break
workbook.save()
workbook.close()
app.quit()

Rename a worksheet in a workbook

import xlwings as xw
app = xw.App(visible=False, add_book=False)
workbook = app.books.open('F:\\python\\ The first 3 Chapter \\ Record information of new energy vehicles .xlsx')
worksheet = workbook.sheets
for i in worksheet:
if i.name == ' Automobile filing information ':
i.name = ' Car information '
break
workbook.save()
workbook.close()
app.quit()

Rename all worksheets in a workbook

import xlwings as xw
app = xw.App(visible=False, add_book=False)
workbook = app.books.open('F:\\python\\ The first 3 Chapter \\ Sales statistics in the first half of the year .xlsx')
worksheet = workbook.sheets
for i in worksheet:
i.name = i.name.replace(' Sales list ', '')
workbook.save()
workbook.close()
app.quit()

Rename worksheets with the same name in multiple workbooks

from pathlib import Path
import xlwings as xw
app = xw.App(visible=False, add_book=False)
folder_path = Path('F:\\python\\ The first 3 Chapter \\ Sales statistics \\')
file_list = folder_path.glob('*.xls*')
for i in file_list:
workbook = app.books.open(i)
worksheet = workbook.sheets
for j in worksheet:
if j.name == ' Product information ':
j.name = ' Accessory information '
break
workbook.save()
workbook.close()
app.quit()

Copy a worksheet to another workbook

import xlwings as xw
app = xw.App(visible=False, add_book=False)
workbook1 = app.books.open('F:\\python\\ The first 3 Chapter \\ Product information sheet .xlsx')
workbook2 = app.books.open('F:\\python\\ The first 3 Chapter \\1 Monthly sales table .xlsx')
worksheet1 = workbook1.sheets[' Accessory information ']
worksheet2 = workbook2.sheets[0] # sheets[0] Represents the first worksheet
worksheet1.copy(before=worksheet2)
workbook2.save()
app.quit()

Batch copy a worksheet to multiple workbooks

from pathlib import Path
import xlwings as xw
app = xw.App(visible=False, add_book=False)
workbook1 = app.books.open('F:\\python\\ The first 3 Chapter \\ Product information sheet .xlsx')
worksheet1 = workbook1.sheets[' Accessory information ']
folder_path = Path('F:\\python\\ The first 3 Chapter \\ Sales statistics in the first half of the year \\')
file_list = folder_path.glob('*.xls*')
for i in file_list:
workbook2 = app.books.open(i)
worksheet2 = workbook2.sheets[0]
worksheet1.copy(before=worksheet2)
workbook2.save()
app.quit()

Split a worksheet into multiple workbooks by criteria

import pandas as pd
file_path = 'F:\\python\\ The first 3 Chapter \\ Sales list .xlsx'
data = pd.read_excel(file_path, sheet_name=' Total table ')
pro_data = data.groupby(' The product name ')
for i, j in pro_data: # Group name data
new_file_path = 'F:\\python\\ The first 3 Chapter \\ Split \\' + i + '.xlsx'
j.to_excel(new_file_path, sheet_name=i, index=False)

Split a worksheet into multiple worksheets by criteria

import pandas as pd
file_path = 'F:\\python\\ The first 3 Chapter \\ Sales list .xlsx'
data = pd.read_excel(file_path, sheet_name=' Total table ')
pro_data = data.groupby(' The product name ')
with pd.ExcelWriter('F:\\python\\ The first 3 Chapter \\ Sales table of each product .xlsx') as workbook:
for i, j in pro_data:
j.to_excel(workbook, sheet_name=i, index=False)

Split a worksheet horizontally into multiple worksheets

import pandas as pd
file_path = 'F:\\python\\ The first 3 Chapter \\ Sales quantity statistics .xlsx'
data = pd.read_excel(file_path, sheet_name=' Total table ')
head_col = list(data.columns)
same_col = data[[' Accessory No ', ' Name of accessories ']]
with pd.ExcelWriter('F:\\python\\ The first 3 Chapter \\ Sales table of each product 1.xlsx') as workbook:
for i in head_col[2:]:
dif_col = data[i]
sheet_data = pd.concat([same_col, dif_col], axis=1)
sheet_data.to_excel(workbook, sheet_name=i, index=False)

Vertically merge multiple worksheets into one worksheet

import pandas as pd
file_path = 'F:\\python\\ The first 3 Chapter \\ Sales statistics in the first half of the year .xlsx'
data = pd.read_excel(file_path, sheet_name=None)
all_data = pd.concat(data, ignore_index=True)
new_file_path = 'F:\\python\\ The first 3 Chapter \\ Sales statistics .xlsx'
all_data.to_excel(new_file_path, sheet_name=' Total table ', index=False)

Horizontally merge multiple worksheets into one worksheet

import pandas as pd
file_path = 'F:\\python\\ The first 3 Chapter \\ Monthly sales quantity table of products .xlsx'
data = pd.read_excel(file_path, sheet_name=None)
all_data = data['1 month '][[' Accessory No ', ' Name of accessories ']]
for i in data:
col = data[i].iloc[:, [2]]
all_data = pd.concat([all_data, col], axis=1)
new_file_path = 'F:\\python\\ The first 3 Chapter \\ Merge tables .xlsx'
all_data.to_excel(new_file_path, sheet_name=' Total table ', index=False)

Set the label color of the worksheet

import xlwings as xw
app = xw.App(visible=False, add_book=False)
workbook = app.books.open('F:\\python\\ The first 3 Chapter \\ Record information of new energy vehicles .xlsx')
worksheet = workbook.sheets
for i in worksheet:
if i.name == ' Automobile filing information ':
i.api.Tab.Color = 255
workbook.save()
workbook.close()
app.quit()

Hide a worksheet in a workbook

import xlwings as xw
app = xw.App(visible=False, add_book=False)
workbook = app.books.open('F:\\python\\ The first 3 Chapter \\ Record information of new energy vehicles .xlsx')
worksheet = workbook.sheets
for i in worksheet:
if i.name == ' Automobile filing information ':
i.visible = False
workbook.save()
workbook.close()
app.quit()

Hide a worksheet with the same name in multiple workbooks

from pathlib import Path
import xlwings as xw
app = xw.App(visible=False, add_book=False)
folder_path = Path('F:\\python\\ The first 3 Chapter \\ Regional sales statistics \\')
file_list = folder_path.glob('*.xls*')
for i in file_list:
workbook = app.books.open(i)
worksheet = workbook.sheets
for j in worksheet:
if j.name == ' Supplier information ':
j.visible = False
workbook.save()
workbook.close()
app.quit()

Hide multiple worksheets with the same name in multiple workbooks

from pathlib import Path
import xlwings as xw
app = xw.App(visible=False, add_book=False)
folder_path = Path('F:\\python\\ The first 3 Chapter \\ Regional sales statistics \\')
file_list = folder_path.glob('*.xls*')
lists = [' Accessory information ', ' Supplier information ']
for i in file_list:
workbook = app.books.open(i)
worksheet = workbook.sheets
for j in worksheet:
if j.name in lists:
j.visible = False
workbook.save()
workbook.close()
app.quit()

Protect a worksheet

import xlwings as xw
app = xw.App(visible=False, add_book=False)
workbook = app.books.open('F:\\python\\ The first 3 Chapter \\ Product information sheet .xlsx')
worksheet = workbook.sheets[' Accessory information ']
worksheet.api.Protect(Password='123', Contents=True)
workbook.save()
workbook.close()
app.quit()

That's ok / Column operation

Automatically adjust the row height and column width of a worksheet according to the data content

import xlwings as xw
app = xw.App(visible=False, add_book=False)
workbook= app.books.open(' Record information of new energy vehicles .xlsx')
worksheet = workbook.sheets[0]
worksheet.autofit()
workbook.save()
workbook.close()
app.quit()

Accurately adjust the row height and column width of a worksheet

import xlwings as xw
app = xw.App(visible=False, add_book=False)
workbook = app.books.open(' Product information sheet .xlsx')
worksheet = workbook.sheets[0]
area = worksheet.range('A1').expand('table')
area.column_width = 15
area.row_height = 20
workbook.save()
workbook.close()
app.quit()

Adjust the row height and column width of all worksheets in a workbook

import xlwings as xw
app = xw.App(visible=False, add_book=False)
workbook = app.books.open(' Record information of new energy vehicles .xlsx')
worksheet = workbook.sheets
for i in worksheet:
i.autofit()
workbook.save()
workbook.close()
app.quit()

Adjust row height and column width of multiple workbooks

from pathlib import Path
import xlwings as xw
app = xw.App(visible=False, add_book=False)
folder_path = Path('F:\\python\\ The first 4 Chapter \\ Regional sales statistics \\')
file_list = folder_path.glob('*.xls*')
for i in file_list:
workbook = app.books.open(i)
worksheet = workbook.sheets
for j in worksheet:
j.autofit()
workbook.save()
workbook.close()
app.quit()

Insert blank rows into a worksheet

from openpyxl import load_workbook # openpyxl Modules can be used to .xlsx Format reading, writing and modification
workbook = load_workbook(' payroll .xlsx')
worksheet = workbook[' payroll ']
worksheet.insert_rows(6, 1)
workbook.save(' payroll 1.xlsx')

Insert blank rows every other row in a worksheet

from openpyxl import load_workbook
workbook = load_workbook(' payroll 1.xlsx')
worksheet = workbook[' payroll ']
num = 2 # Set the number of blank lines inserted
last_num = worksheet.max_row # Get the number of rows in the worksheet data area
for i in range(0, last_num):
worksheet.insert_rows(i * (num + 1) + 3, num) # Insert blank line
workbook.save(' payroll 2.xlsx')

Insert blank columns into a worksheet

from openpyxl import load_workbook
workbook = load_workbook(' Record information of new energy vehicles .xlsx')
worksheet = workbook[' Automobile filing information ']
worksheet.insert_cols(5, 1)
workbook.save(' Record information of new energy vehicles 1.xlsx')

Delete rows in a worksheet

from openpyxl import load_workbook
workbook = load_workbook(' Record information of new energy vehicles .xlsx')
worksheet = workbook[' Automobile filing information ']
worksheet.delete_rows(5, 2)
workbook.save(' Record information of new energy vehicles 1.xlsx')

Delete columns from a worksheet （ Method 1 ）

from openpyxl import load_workbook
workbook = load_workbook(' Record information of new energy vehicles .xlsx')
worksheet = workbook[' Automobile filing information ']
worksheet.delete_cols(5, 2)
workbook.save(' Record information of new energy vehicles 1.xlsx')

import pandas as pd
data = pd.read_excel(' Sales list .xlsx', sheet_name=0)
data.drop(columns=[' Cost price ', ' Product cost '], inplace=True) # Delete the specified column
data.to_excel(' Sales list 1.xlsx', sheet_name=' Total table ', index=False)

Append row data to a worksheet

import xlwings as xw
app = xw.App(visible=False, add_book=False)
new_data = [['8', ' Chongqing ** Automobile Co., Ltd ', ' Grandchildren **', '187****2245'], ['9', ' sichuan ** Automobile Co., Ltd ', ' Shaw **', '177****2245']]
workbook = app.books.open(' Product information sheet .xlsx')
worksheet = workbook.sheets[' Supplier information ']
data = worksheet.range('A1').expand('table')
num = data.shape[0]
worksheet.range(num + 1, 1).value = new_data
workbook.save()
workbook.close()
app.quit()

Append row data to worksheets with the same name in multiple workbooks

from pathlib import Path
import xlwings as xw
app = xw.App(visible=False, add_book=False)
folder_path = Path('F:\\python\\ The first 4 Chapter \\ Regional sales statistics \\')
file_list = folder_path.glob('*.xls*')
new_data = [['8', ' Chongqing ** Automobile Co., Ltd ', ' Grandchildren **', '187****2245'], ['9', ' sichuan ** Automobile Co., Ltd ', ' Shaw **', '177****2245']]
for i in file_list:
workbook = app.books.open(i)
worksheet = workbook.sheets[' Supplier information ']
data = worksheet.range('A1').expand('table')
num = data.shape[0]
worksheet.range(num + 1, 1).value = new_data
workbook.save()
workbook.close()
app.quit()

Add column data to a worksheet （ tagging ）

import pandas as pd
data = pd.read_excel(' Sales list .xlsx', sheet_name=0)
max_data = data[' profits '].max()
level = [0, 5000, 10000, max_data]
level_names = [' Bad ', ' good ', ' optimal ']
data[' Grade '] = pd.cut(data[' profits '], level, labels=level_names)
data.to_excel(' Sales list 1.xlsx', sheet_name=' Total table ', index=False)

Extract row data and column data of a worksheet

import pandas as pd
data = pd.read_excel(' Sales list .xlsx', sheet_name=' Total table ')
row_data = data.iloc[0:10] # Before extraction 10 That's ok
col_data = data[[' Odd Numbers ', ' Sales date ', ' The product name ', ' profits ']]
range_data = data.iloc[0:5][[' Odd Numbers ', ' Sales date ', ' The product name ', ' profits ']]
row_data.to_excel(' Extract row data .xlsx', sheet_name=' front 10 Row data ', index=False)
col_data.to_excel(' Extract column data .xlsx', sheet_name=' Income statement ', index=False)
range_data.to_excel(' Extract the data .xlsx', sheet_name='Sheet1', index=False)

Extract row data of all worksheets in a workbook

import pandas as pd
data = pd.read_excel(' Purchase form of office supplies .xlsx', sheet_name=None)
with pd.ExcelWriter(' Extraction table .xlsx') as workbook:
for i, j in data.items(): # items() Key value pairs used to return dictionaries
row_data = j.iloc[0:5]
row_data.to_excel(workbook, sheet_name=i, index=False)

Replace the data of a worksheet

import pandas as pd
data = pd.read_excel(' Sales list .xlsx', sheet_name=0)
data = data.replace(' clutch ', ' Brake pads ')
data.to_excel(' Sales list 1.xlsx', sheet_name=' Total table ', index=False)

Replace the data of all worksheets in a workbook

import pandas as pd
data = pd.read_excel(' Purchase form of office supplies .xlsx', sheet_name=None)
with pd.ExcelWriter(' Purchase form of office supplies 1.xlsx') as workbook:
for i, j in data.items():
data = j.replace(' Solid glue ', ' scotch tape ')
data.to_excel(workbook, sheet_name=i, index=False)

Replace the column data of a worksheet

import xlwings as xw
app = xw.App(visible=False, add_book=False)
workbook = app.books.open(' Product information sheet 1.xlsx')
worksheet = workbook.sheets[' Accessory information ']
data = worksheet.range('A2').expand('table').value
for i, j in enumerate(data):
data[i][3] = float(j[3]) * (1 + 0.1)
worksheet.range('A2').expand('table').value = data
workbook.save()
workbook.close()
app.quit()

Replace the column data corresponding to the specified column data of a worksheet

import xlwings as xw
app = xw.App(visible=False, add_book=False)
workbook = app.books.open(' Sales list .xlsx')
worksheet = workbook.sheets[' Total table ']
data = worksheet.range('A1').expand('table').value
for i, j in enumerate(data):
if (j[2] == ' Odometer ') and (j[3] == 850):
data[i][3] = 900
worksheet.range('A1').expand('table').value = data
workbook.save()
workbook.close()
app.quit()

Transpose the rows and columns of a worksheet

import xlwings as xw
app = xw.App(visible=False, add_book=False)
workbook = app.books.open(' Product analysis table .xlsx')
worksheet = workbook.sheets[0]
data = worksheet.range('A1').expand('table').options(transpose=True).value
worksheet.clear() # Clear the contents and formatting of the worksheet
worksheet.range('A1').expand().value = data
workbook.save(' Product analysis table 1.xlsx')
workbook.close()
app.quit()

Freeze the pane of a worksheet from the specified row and column

from openpyxl import load_workbook
workbook = load_workbook(' Sales list .xlsx')
worksheet = workbook[' Total table ']
worksheet.freeze_panes = 'B2'
workbook.save(' Sales list 1.xlsx')

Split a column of a worksheet into multiple columns

import pandas as pd
data = pd.read_excel(' Product specification sheet .xlsx', sheet_name=0)
data_col = data[' Product specifications '].str.split('*', expand=True)
data[' Long (cm)'] = data_col[0]
data[' wide (cm)'] = data_col[1]
data[' high (cm)'] = data_col[2]
data.drop(columns=[' Product specifications '], inplace=True)
data.to_excel(' Product specification sheet 1.xlsx', sheet_name=' Specification sheet ', index=False)

Merge multiple columns of a worksheet into one column

import pandas as pd
data = pd.read_excel(' Product specification sheet 1.xlsx', sheet_name=' Specification sheet ')
data[' Product specifications '] = data[' Long (cm)'].astype(str) + '*' + data[' wide (cm)'].astype(str) + '*' + data[' high (cm)'].astype(str)
data.drop(columns=[' Long (cm)', ' wide (cm)', ' high (cm)'], inplace=True)
data.to_excel(' Product specification sheet 2.xlsx', sheet_name='Sheet1', index=False)

Hide row data in a worksheet

from openpyxl import load_workbook
workbook = load_workbook(' Record information of new energy vehicles .xlsx')
worksheet = workbook[' Automobile filing information ']
worksheet.row_dimensions.group(2, 10, hidden=True)
workbook.save(' Record information of new energy vehicles 1.xlsx')

Hide column data in a worksheet

from openpyxl import load_workbook
workbook = load_workbook(' Record information of new energy vehicles .xlsx')
worksheet = workbook[' Automobile filing information ']
worksheet.column_dimensions.group('A', 'D', hidden=True)
workbook.save(' Record information of new energy vehicles 1.xlsx')

Cell manipulation

Enter something in the cell

import xlwings as xw
app = xw.App(visible=False, add_book=False)
workbook = app.books.add()
worksheet = workbook.sheets.add(name=' Sales ')
worksheet.range('A1').value = [[' The product name ', ' sales volumes ', ' Unit sales price ', ' sales '], [' overcoat ', 15, 400, 6000], [' Down Jackets ', 20, 500, 10000]]
workbook.save(' Product list .xlsx')
workbook.close()
app.quit()

Set the font format of cell data

import xlwings as xw
app = xw.App(visible=False, add_book=False)
workbook = app.books.open(' The order sheet .xlsx')
worksheet = workbook.sheets[0]
header = worksheet.range('A1:I1')
header.font.name = ' Microsoft YaHei '
header.font.size = 10
header.font.bold = True
header.font.color = (255, 255, 255)
header.color = (0, 0, 0) # Cell fill color
data = worksheet.range('A2').expand('table') # Select the cell range where the data row is located
data.font.name = ' Microsoft YaHei '
data.font.size = 10
workbook.save(' The order sheet 1.xlsx')
workbook.close()
app.quit()

Set the alignment of cell data

import xlwings as xw
app = xw.App(visible=False, add_book=False)
workbook = app.books.open(' The order sheet 1.xlsx')
worksheet = workbook.sheets[0]
header = worksheet.range('A1:I1')
header.api.HorizontalAlignment = -4108
header.api.VerticalAlignment = -4108
data = worksheet.range('A2').expand('table')
data.api.HorizontalAlignment = -4152 # Set the horizontal alignment of data rows
data.api.VerticalAlignment = -4108 # Set the vertical alignment of data rows
workbook.save(' The order sheet 2.xlsx')
workbook.close()
app.quit()

HorizontalAlignment Set horizontal alignment

Alignment mode Property value routine 1 Keep right -4152 Keep to the left -4131 In the middle -4108 fill 5 full-justified -4130

Set the border style of cells

import xlwings as xw
app = xw.App(visible=False, add_book=False)
workbook = app.books.open(' The order sheet 2.xlsx')
worksheet = workbook.sheets[0]
area = worksheet.range('A1').expand('table')
for i in area:
for j in range(7, 11):
i.api.Borders(j).LineStyle = 1
i.api.Borders(j).Weight = 2
i.api.Borders(j).Color = xw.utils.rgb_to_int((255, 0, 0)) # Set the color of the border
workbook.save(' The order sheet 3.xlsx')
workbook.close()
app.quit()

Modify the number format of cells

import xlwings as xw
app = xw.App(visible=False, add_book=False)
workbook = app.books.open(' The order sheet 3.xlsx')
worksheet = workbook.sheets[0]
row_num = worksheet.range('A1').expand('table').last_cell.row
worksheet.range(f'B2:B{
row_num}').number_format = 'yyyy year m month d Japan '
worksheet.range(f'D2:D{
row_num}').number_format = '￥#,##0'
worksheet.range(f'E2:E{
row_num}').number_format = '￥#,##0'
worksheet.range(f'G2:G{
row_num}').number_format = '￥#,##0.00'
worksheet.range(f'H2:H{
row_num}').number_format = '￥#,##0.00'
worksheet.range(f'I2:I{
row_num}').number_format = '￥#,##0.00'
workbook.save(' The order sheet 4.xlsx')
workbook.close()
app.quit()

Merge cells to make table titles （ Method 1 ）

import xlwings as xw
app = xw.App(visible=False, add_book=False)
workbook = app.books.open(' The order sheet 5.xlsx')
worksheet = workbook.sheets[0]
title = worksheet.range('A1:I1') # Specify the range of cells to merge
title.merge() # merge cell
title.font.name = ' Microsoft YaHei '
title.font.size = 18
title.font.bold = True
title.api.HorizontalAlignment = -4108
title.api.VerticalAlignment = -4108
title.row_height = 30
workbook.save(' The order sheet 6.xlsx')
workbook.close()
app.quit()

Merge cells to make table titles （ Method 2 ）

from openpyxl import load_workbook
from openpyxl.styles import Font, Alignment
workbook = load_workbook(' The order sheet 5.xlsx')
worksheet = workbook[' Total table ']
worksheet.merge_cells('A1:I1') # Specify the range of cells to merge
worksheet['A1'].font = Font(name=' Microsoft YaHei ', size=18, bold=True)
worksheet['A1'].alignment = Alignment(horizontal='center', vertical='center')
worksheet.row_dimensions[1].height = 30
workbook.save(' The order sheet 6.xlsx')

Merge consecutive cells with the same content

from openpyxl import load_workbook
workbook = load_workbook(' Order amount table .xlsx')
worksheet = workbook['Sheet1']
lists = []
num = 2 # Start with the second line
while True: # Construct a permanent cycle
datas = worksheet.cell(num, 1).value
if datas:
lists.append(datas)
else: # If the data read is empty
break # Then force the end of the cycle
num += 1
s = 0
e = 0
data = lists[0]
for m in range(len(lists)):
if lists[m] != data:
data = lists[m]
e = m - 1
if e >= s:
worksheet.merge_cells(f'A{
s + 2}:A{
e + 2}') # Merge A Column cells with the same content
s = e + 1
if m == len(lists) - 1:
e = m
worksheet.merge_cells(f'A{
s + 2}:A{
e + 2}') # Merge A Column cells with the same content
workbook.save(' Order amount table 1.xlsx')

Fill blank cells with data

import pandas as pd
data = pd.read_excel(' Sales list .xlsx', sheet_name=' Total table ')
data[' Sales amount '].fillna(0, inplace=True)
data[' profits '].fillna(0, inplace=True)
data.to_excel(' Sales list 1.xlsx', sheet_name=' Total table ', index=False)

Delete duplicate rows in the worksheet

import pandas as pd
data = pd.read_excel(' Sales list 1.xlsx', sheet_name=' Total table ')
data = data.drop_duplicates() # Delete duplicate lines
data.to_excel(' Sales list 2.xlsx', sheet_name=' Total table ', index=False)

Convert formulas in cells to numeric values

import xlwings as xw
app = xw.App(visible=False, add_book=False)
workbook = app.books.open(' Sales list 2.xlsx')
worksheet = workbook.sheets[0]
data = worksheet.range('A1').expand('table').value
worksheet.range('A1').expand('table').value = data
workbook.save(' Sales list 3.xlsx')
workbook.close()
app.quit()

Data processing and analysis operations

Sort the data in a worksheet （ Method 1 ）

import pandas as pd
data = pd.read_excel(' Sales list .xlsx', sheet_name=' Total table ')
data = data.sort_values(by=' profits ', ascending=False) # Descending
data.to_excel(' Sales list 1.xlsx', sheet_name=' Total table ', index=False)

Sort the data in a worksheet （ Method 2 ）

import xlwings as xw
import pandas as pd
app = xw.App(visible=False, add_book=False)
workbook = app.books.open(' Sales list .xlsx')
worksheet = workbook.sheets[' Total table ']
data = worksheet.range('A1').expand('table').options(pd.DataFrame).value
result = data.sort_values(by=' profits ', ascending=False)
worksheet.range('A1').value = result
workbook.save(' Sales list 1.xlsx')
workbook.close()
app.quit()

Sort the data of all worksheets in a workbook

import xlwings as xw
import pandas as pd
app = xw.App(visible=False, add_book=False)
workbook = app.books.open(' Monthly sales quantity table .xlsx')
worksheet = workbook.sheets
for i in worksheet:
data = i.range('A1').expand('table').options(pd.DataFrame).value # Translate data into DataFrame Format
result = data.sort_values(by=' sales volumes ', ascending=False)
i.range('A1').value = result
workbook.save(' Monthly sales quantity table 1.xlsx')
workbook.close()
app.quit()

Sort the data of worksheets with the same name in multiple workbooks

from pathlib import Path
import xlwings as xw
import pandas as pd
app = xw.App(visible=False, add_book=False)
folder_path = Path(' Sales volume by Region ')
file_list = folder_path.glob('*.xls*')
for i in file_list:
workbook = app.books.open(i)
worksheet = workbook.sheets[' sales volumes ']
data = worksheet.range('A1').expand('table').options(pd.DataFrame).value
result = data.sort_values(by=' sales volumes ', ascending=False)
worksheet.range('A1').value = result
workbook.save()
workbook.close()
app.quit()

Filter the data in a worksheet according to a single condition

import pandas as pd
data = pd.read_excel(' Sales list .xlsx', sheet_name=' Total table ')
pro_data = data[data[' The product name '] == ' clutch ']
num_data = data[data[' sales volumes '] >= 100]
pro_data.to_excel(' clutch .xlsx', sheet_name=' clutch ', index=False)
num_data.to_excel(' The sales quantity is greater than or equal to 100 The record of .xlsx', sheet_name=' The sales quantity is greater than or equal to 100 The record of ', index=False)

Filter the data in a worksheet according to multiple criteria

import pandas as pd
data = pd.read_excel(' Sales list .xlsx', sheet_name=' Total table ')
condition1 = (data[' The product name '] == ' Tachometer ') & (data[' sales volumes '] >= 50)
condition2 = (data[' The product name '] == ' Tachometer ') | (data[' sales volumes '] >= 50)
data1 = data[condition1]
data2 = data[condition2]
data1.to_excel(' Sales list 1.xlsx', sheet_name=' And conditional filtering ', index=False)
data2.to_excel(' Sales list 2.xlsx', sheet_name=' Or conditional filtering ', index=False)

Filter the data of all worksheets in a workbook

import pandas as pd
all_data = pd.read_excel(' Purchase form of office supplies .xlsx', sheet_name=None)
with pd.ExcelWriter(' Filter table .xlsx') as workbook:
for i in all_data:
data = all_data[i]
filter_data = data[data[' Purchase items '] == ' desk ']
filter_data.to_excel(workbook, sheet_name=i, index=False)

Filter the data of all worksheets in a workbook and summarize

import pandas as pd
all_data = pd.read_excel(' Purchase form of office supplies .xlsx', sheet_name=None)
datas = pd.DataFrame()
for i in all_data:
data = all_data[i]
filter_data = data[data[' Purchase items '] == ' desk ']
datas = pd.concat([datas, filter_data], axis=0)
datas.to_excel(' desk .xlsx', sheet_name=' desk ', index=False)

Summarize a worksheet

import xlwings as xw
import pandas as pd
app = xw.App(visible=False, add_book=False)
workbook = app.books.open(' Sales list .xlsx')
worksheet = workbook.sheets[' Total table ']
data = worksheet.range('A1').expand('table').options(pd.DataFrame, dtype=float).value
result = data.groupby(' The product name ').sum()
worksheet1 = workbook.sheets.add(name=' Subtotal ')
worksheet1.range('A1').value = result[[' sales volumes ', ' Sales amount ']]
workbook.save(' Subtotal .xlsx')
workbook.close()
app.quit()

Sum a worksheet

import xlwings as xw
import pandas as pd
app = xw.App(visible=False, add_book=False)
workbook = app.books.open(' Purchase form of office supplies .xlsx')
worksheet = workbook.sheets['1 month ']
data = worksheet.range('A1').expand('table').options(pd.DataFrame).value
result = data[' Purchase amount '].sum()
worksheet.range('B15').value = ' total '
worksheet.range('C15').value = result
workbook.save(' Summation table .xlsx')
workbook.close()
app.quit()

Sum all worksheets of a workbook separately

import xlwings as xw
import pandas as pd
app = xw.App(visible=False, add_book=False)
workbook = app.books.open(' Purchase form of office supplies .xlsx')
worksheet = workbook.sheets
for i in worksheet:
data = i.range('A1').expand('table').options(pd.DataFrame).value
result = data[' Purchase amount '].sum()
column = i.range('A1').expand('table').value[0].index(' Purchase amount ') + 1
row = i.range('A1').expand('table').shape[0]
i.range(row + 1, column - 1).value = ' total '
i.range(row + 1, column).value = result
workbook.save(' Summation table .xlsx')
workbook.close()
app.quit()

Make a PivotTable report in a worksheet

import xlwings as xw
import pandas as pd
app = xw.App(visible=False, add_book=False)
workbook = app.books.open(' Sales list .xlsx')
worksheet = workbook.sheets[' Total table ']
data = worksheet.range('A1').expand('table').options(pd.DataFrame, dtype=float).value
pivot = pd.pivot_table(data, values=[' sales volumes ', ' Sales amount '], index=[' The product name '], aggfunc={
' sales volumes ': 'sum', ' Sales amount ': 'sum'}, fill_value=0, margins=True, margins_name=' total ')
worksheet1 = workbook.sheets.add(name=' PivotTable ')
worksheet1.range('A1').value = pivot
workbook.save(' PivotTable .xlsx')
workbook.close()
app.quit()

Use the correlation coefficient to judge the correlation of the data

import pandas as pd
data = pd.read_excel(' Sales statistics .xlsx', sheet_name=0, index_col=' Serial number ')
result = data.corr() # Calculate the correlation coefficient between any two variables
print(result)

Use descriptive statistics and histograms to set goals

import pandas as pd
import matplotlib.pyplot as plt
import xlwings as xw
data = pd.read_excel(' Employee sales performance table .xlsx', sheet_name=0)
data_describe = data[' sales （ Ten thousand yuan ）'].astype(float).describe()
data_cut = pd.cut(data[' sales （ Ten thousand yuan ）'], 6)
data1 = pd.DataFrame()
data1[' Count '] = data[' sales （ Ten thousand yuan ）'].groupby(data_cut).count()
data2 = data1.reset_index()
data2[' sales （ Ten thousand yuan ）'] = data2[' sales （ Ten thousand yuan ）'].apply(lambda x:str(x))
figure = plt.figure()
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
n, bins, patches = plt.hist(data[' sales （ Ten thousand yuan ）'], bins=6, edgecolor='black', linewidth=1)
plt.xticks(bins)
plt.title(' Employee sales performance frequency analysis ')
plt.xlabel(' sales （ Ten thousand yuan ）')
plt.ylabel(' frequency ')
app = xw.App(visible=False, add_book=False)
workbook = app.books.open(' Employee sales performance table .xlsx')
worksheet = workbook.sheets[0]
worksheet.range('E1').value = data_describe
worksheet.range('H1').value = data2
worksheet.pictures.add(figure, name=' picture 1', update=True, left=400, top=200)
worksheet.autofit()
workbook.save(' Describe statistics .xlsx')
workbook.close()
app.quit()

Fit the regression equation and judge the degree of fitting

import pandas as pd
from sklearn import linear_model
df = pd.read_excel(' Monthly sales and advertising expenses table .xlsx', sheet_name=0)
x = df[[' Video portal advertising expenses （ Ten thousand yuan ）', ' TV advertising expenses （ Ten thousand yuan ）']]
y = df[' sales （ Ten thousand yuan ）']
model = linear_model.LinearRegression()
model.fit(x, y)
R2 = model.score(x, y)
print(R2)

Use regression equations to predict future values

import pandas as pd
from sklearn import linear_model
df = pd.read_excel(' Monthly sales and advertising expenses table .xlsx', sheet_name=0)
x = df[[' Video portal advertising expenses （ Ten thousand yuan ）', ' TV advertising expenses （ Ten thousand yuan ）']]
y = df[' sales （ Ten thousand yuan ）']
model = linear_model.LinearRegression()
model.fit(x, y)
coef = model.coef_
model_intercept = model.intercept_
equation = f'y={
coef[0]}*x1+{
coef[1]}*x2{
model_intercept:+}'
print(equation)
x1 = 40
x2 = 30
y = coef[0] * x1 + coef[1] * x2 + model_intercept
print(y)