There were some automated tasks in the company before , Mainly from several excel Find data in , Then record the new excel in . I really learned a lot , sigh python A powerful , For each field python It really saves a lot of things .
I wanted to record it before , Keep busy and forget . Now recall here and consolidate the following previous knowledge , Hope it can also help new friends get started faster .OK, Let's get started !
Handle excel There are still many libraries , I'll list it here :
import openpyxl
# load excel file
wb = openpyxl.load_workbook(excel_path)
# 1. obtain sheet
sheets = wb.worksheets # Return to one sheet The object list , Then according to the index, you can get sheet
sheet = wb['sheet1'] # according to sheet Get the worksheet by name
# 2. Get the cell
a1 = sheet['A1'].value # Get by slicing A1 The value of the cell
a1 = sheet.cell(row=1, column=1).value # Specify row and column to get A1 The value of the cell
# 3. Read cells in range
cells = sheet['A1':'B5'] # obtain A1 To B5 Cells in range
# 4. Traversing cells
# When there are multiple receive traversals , The number shall be consistent with the number of column differences , Otherwise, an error will be reported . for example :A2-C9 There are three columns , So you need three variables to receive . In this case, assign values by line
for c1, c2, c3 in sheet['A2':'C9']:
print(f'{c1.value}, {c2.value}, {c3.value}')
# Read cells one by one , It should be noted that the outer loop is in behavioral units , Need to cycle again to get the value , Or take values through the index .
for cells in sheet['A2':'C9']:
for cell in cells:
print(cell.value)
# 5. Get the maximum row number and column number containing data , But there are still some shortcomings , If the cell has a style, it is not very accurate .
max_row = sheet.max_row
max_column = sheet.max_column
# 6. Cell assignment
sheet['A1'].value = 'A1' # take A1 The value of the cell is set to A1
sheet.cell(row=2, column=1).value = 'A1'
# 7. Batch assignment , In the form of tuples or nested lists for Cyclic addition . Append to the last row with data
rows = (
('name', 'age', 'number'),
(' Zhang San ', 24, '001'),
(' Li Si ', 22, '002'),
(' The king 2 ', 23, '003'),
)
for row in rows:
sheet.append(row)
# 8. Filtering and sorting
data = [
['Item', 'Colour'],
['pen', 'brown'],
['book', 'black'],
['plate', 'white'],
['chair', 'brown'],
['coin', 'gold'],
['bed', 'brown'],
['notebook', 'white'],
]
for r in data:
sheet.append(r)
sheet.auto_filter.ref = 'A1:B8' # Set up A1-B8 Filter columns for
sheet.auto_filter.add_filter_column(1, ['brown', 'white']) # Set the first column to filter data ,'brown', 'white'
sheet.auto_filter.add_sort_condition('B2:B8') # Press B2-B8 Sort automatically
# 9. merge cell
sheet.merge_cells('A1:B2') # Merge A1-B2 Cell
# 10. Freeze the panes
sheet.freeze_panes = 'B2' # With B2 Cell based freeze sheet
more openpyxl You can refer to this article for the operation of , The summary is particularly comprehensive .https://geek-docs.com/python/python-tutorial/python-openpyxl.html.
I mainly use the query function of this library , You can use it when you need to query the row number of a value in the worksheet . At present, I only use this function .
Here's an example , It feels a little complicated to use , You need to try more by yourself to find the rules .
import pandas as pd
# sheet_name Appoint excel Of sheet name ,header Specify which row to use as the header row
inner_excel = pd.read_excel(inner_path, sheet_name='インシデント Standing book ', header=1)
# After specifying the title line above , You can use the following method to find the row number of the value in the column of a title
rows = inner_excel[inner_excel[" Case No "] == "20290208-001"].index.tolist() # lookup 20290208-001 In the line number list where the case number column is located
infos = []
if len(rows) == 0:
indexs = excel_event[excel_event[" Case No "] == case_num].index.tolist()
if len(indexs) > 0:
infos.append(excel_event[' Classification '].values[indexs[0]])
infos.append(' Hino Ho ')
date64 = excel_event['クローズ\n Daily payment '].values[indexs[0]]
infos.append(str(date64.astype("datetime64[D]")).replace('-', '/'))
return infos
return []
infos.append(inner_excel[' Classification '].values[rows[0]]) # Get the value of the specified row number of the classification column
infos.append(inner_excel[' Person in charge '].values[rows[0]])
date64 = inner_excel[' The end of the day '].values[rows[0]]
infos.append(str(date64.astype("datetime64[D]")).replace('-', '/')) # Date processing , This is also a big pit . Special attention required .
Only part of the content is listed here , Indeed, I think it is quite complicated , You come on ! If there is anything you can understand, please leave a message to discuss ! If there is something wrong, please correct it .