Chapter 1: Introduction to Python
Chapter 2: Basic Python Concepts
Chapter 3: Sequences
Chapter 4: Control Statements
Chapter 5: Functions
Chapter 6: Object-Oriented Fundamentals
Chapter 7: Object-Oriented Programming in Depth
Chapter 8: Exception Handling
Chapter 9: File Operations
This chapter introduces the APIs used for file operations.
First we look at what file operations are, how files are classified, and the character encodings commonly encountered in I/O.
Then we walk through the file operation workflow: create -> write -> close.
Finally we cover extensions to basic file handling: the serialization module pickle, the csv module, the system-call modules os and os.path, and the file copy and compression module shutil.
A complete program generally both stores and reads data. The data in the programs we have written so far is never actually stored, so it disappears once the Python interpreter finishes executing.
In development we often need to read data from external storage media (hard disk, optical disc, USB drive, etc.), or store the data a program produces in a file so that it is persisted.
Based on how the data in a file is organized, we divide files into text files and binary files:
Text files
A text file stores ordinary character text. Python strings use the Unicode character set by default, and a text file can be opened with a program such as Notepad.
Binary files
A binary file stores its content as raw bytes. It cannot be read in Notepad; dedicated software is needed to decode it.
Common examples: MP4 video files, MP3 audio files, JPG images, doc documents, and so on.
When working with text files we often handle Chinese text, and that is when garbled characters tend to appear.
To solve the garbled Chinese problem, you need to understand how the various encodings relate to one another.
The common encodings are described below:
ASCII: its full name is American Standard Code for Information Interchange.
It is the earliest and most widely used single-byte encoding system in the world, mainly used to display modern English and other Western European languages.
Note: ASCII defines only 2^7 = 128 characters, so 7 bits are enough to encode all of them.
GBK: the Chinese character internal code extension specification; its English name is Chinese Internal Code Specification.
The GBK standard is compatible with GB2312. It contains 21003 Chinese characters and 883 symbols, provides 1894 user-defined character code points, and places simplified and traditional characters in a single set.
GBK uses a double-byte representation. The overall code range is 8140-FEFE: the first byte is between 81 and FE, and the second byte is between 40 and FE.
Unicode: the encoding was originally designed with a fixed width of two bytes, so every character is represented with 16 bits (2^16 = 65536 code values), including English characters that previously needed only 8 bits, which wastes space.
Unicode was a complete redesign; its two-byte form is not byte-compatible with ISO 8859-1 or with any other encoding.
Because even an English letter takes two bytes in this form, it is inconvenient for transmission and storage.
This led to the UTF encodings. UTF-8 stands for 8-bit Unicode Transformation Format. It is variable-length: ASCII characters take one byte, and most Chinese characters take three.
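To make the byte counts concrete, the following sketch encodes one English letter and one Chinese character in three encodings. The helper name encoded_sizes is hypothetical, and utf-16-le is used here only as a stand-in for the fixed two-byte form described above.

```python
def encoded_sizes(text):
    """Return the number of bytes `text` occupies in several encodings."""
    # utf-16-le is an illustrative choice for the two-byte form (no byte-order mark)
    return {name: len(text.encode(name)) for name in ("gbk", "utf-16-le", "utf-8")}


print(encoded_sizes("A"))   # one English letter
print(encoded_sizes("中"))  # one Chinese character
```

The letter takes one byte in GBK and UTF-8 but two in the fixed-width form, which is the wasted space the text refers to.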
The open() function creates a file object. The basic syntax is: open(filename[, mode])
Notes:
If only a file name is given, it refers to a file in the current directory. A full path can also be given, for example D:\a\b.txt.
A raw string such as r"d:\b.txt" avoids having to escape each backslash, so the call can be written as f = open(r"d:\b.txt", "w").
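The mode argument takes the following values (used very often):
r: read (the default)
w: write; creates the file if it does not exist, otherwise overwrites its content
a: append; creates the file if it does not exist, otherwise writes at the end
b: binary mode (combined with the others, for example rb or wb)
+: read and write (combined with the others, for example r+)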
Creating text file objects and binary file objects
If the mode includes b, a binary file object is created, and the basic unit of processing is the byte.
If the mode does not include b, a text file object is created by default, and the basic unit of processing is the character.
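As a rough illustration of the modes, the sketch below writes, appends, and then reads the same file in text mode and in binary mode. The function name demo_modes and the file name are hypothetical, and a temporary directory is used so that nothing is left on disk.

```python
import os
import tempfile


def demo_modes(path):
    """Write, append, then read the same file in text mode and binary mode."""
    with open(path, "w", encoding="utf-8") as f:   # "w": create or overwrite
        f.write("first\n")
    with open(path, "a", encoding="utf-8") as f:   # "a": append at the end
        f.write("second\n")
    with open(path, "r", encoding="utf-8") as f:   # "r": text mode returns str
        text = f.read()
    with open(path, "rb") as f:                    # "rb": binary mode returns bytes
        raw = f.read()
    return text, raw


with tempfile.TemporaryDirectory() as tmp:
    text, raw = demo_modes(os.path.join(tmp, "modes_demo.txt"))

print(type(text).__name__, type(raw).__name__)
```

Text mode hands back characters and binary mode hands back bytes, which matches the distinction drawn above.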
Writing a text file generally takes three steps: create the file object, write the data, and close the file object.
Example code
# 1. Using open()
f = open(r"d:\a.txt", "a")
s = "TimePause\nTime is still\n"
f.write(s)
f.close()
The default encoding on a Chinese-language Windows system is GBK, while on Linux it is UTF-8.
When we call open() without an encoding argument, Python uses the operating system's default encoding, which in this case is GBK.
Because we usually configure our editor to treat every file as UTF-8, the file written above appears garbled when it is opened there.
Solution:
Set the editor to read the file as GBK.
Note:
We can also solve the garbled Chinese problem by specifying the encoding explicitly. Because PyCharm is set to read and write text as UTF-8, specifying encoding="utf-8" when writing (instead of relying on the GBK default) means the file displays correctly when it is read, as in the following code.
Example code
# [Example] Avoid garbled Chinese by specifying the file encoding
f = open(r"d:\bb.txt", "w", encoding="utf-8")
f.write("A small station with warmth\nTime stillness is not a brief history")
f.close()
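The effect of a mismatched encoding can be reproduced without relying on the platform default. The sketch below writes a file as GBK and then tries to read it back as GBK and as UTF-8; the helper try_read and the file name are hypothetical.

```python
import os
import tempfile


def try_read(path, encoding):
    """Return the file's text, or None if it cannot be decoded with `encoding`."""
    try:
        with open(path, "r", encoding=encoding) as f:
            return f.read()
    except UnicodeDecodeError:
        return None


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "gbk_demo.txt")  # hypothetical file name
    with open(path, "w", encoding="gbk") as f:
        f.write("中文")
    as_gbk = try_read(path, "gbk")     # same encoding as the writer
    as_utf8 = try_read(path, "utf-8")  # mismatched encoding

print(as_gbk, as_utf8)
```

Reading with the writer's encoding succeeds, while the mismatched read fails to decode, so passing the same encoding on both sides is the safer habit.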
Problem description
With all character encodings in PyCharm set to UTF-8, a network request sometimes returns garbled text.
Problem analysis
All character encodings in PyCharm are set to UTF-8, but the network request returns GBK-encoded text; decoding that text as UTF-8 produces garbled characters.
Solution
Either set the project encoding to GBK, or decode the received data as GBK when reading it.
Alternatively, declare the encoding as UTF-8 explicitly when writing.
write(a): writes the string a to the file.
writelines(b): writes a list of strings to the file without adding line breaks.
Example code
# [Exercise] Write a list of strings to a file
f = open(r"d:\bb.txt", "w", encoding="utf-8")
s = ["What the hell?\n"] * 3  # the \n adds the line break manually
f.writelines(s)
f.close()
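Since writelines() adds no separators, the difference is easy to see by writing the same list with and without explicit line breaks. This is a minimal sketch; the helper write_lines and the file name are hypothetical.

```python
import os
import tempfile


def write_lines(path, items):
    """Write each item on its own line; writelines() itself adds no separators."""
    with open(path, "w", encoding="utf-8") as f:
        f.writelines(item + "\n" for item in items)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "lines_demo.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.writelines(["a", "b", "c"])           # no line breaks are added
    with open(path, encoding="utf-8") as f:
        joined = f.read()
    write_lines(path, ["a", "b", "c"])          # line breaks added by hand
    with open(path, encoding="utf-8") as f:
        separated = f.read()

print(repr(joined), repr(separated))
```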
Because files are ultimately managed by the operating system, a file object we open must be closed explicitly by calling its close() method.
When close() is called, the buffered data is first written to the file (this can also be forced directly with flush()), then the file is closed and the file object is released.
Note:
close() is generally used together with the finally clause of the exception mechanism.
Example code
# [Exercise] Use finally to make sure the file object is closed
# "a" opens the file in append mode
f = None
try:
    f = open(r"d:\c.txt", "a")
    s = "From The Abyss"
    f.write(s)
except OSError as e:
    print(e)
finally:
    if f is not None:  # open() may have failed before f was assigned
        f.close()
The with keyword (a context manager) manages resources automatically: however the with block is exited, the file is guaranteed to be closed correctly,
and the state that existed on entering the block is restored once the block finishes.
Example code
# [Exercise] Use with to manage file writes
s = ["Zigfei"] * 3
with open(r"d:\cc.txt", "w") as f:
    f.writelines(s)
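The guarantee that the file is closed however the block is exited can be checked by raising an exception inside the with block. The sketch below is only an illustration; the function write_and_fail and the simulated error are hypothetical.

```python
import os
import tempfile


def write_and_fail(path):
    """Write inside a with block, raise on purpose, and report whether f was closed."""
    try:
        with open(path, "w", encoding="utf-8") as f:
            f.write("partial")
            raise ValueError("simulated failure")  # leave the block abnormally
    except ValueError:
        pass
    return f.closed


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "with_demo.txt")  # hypothetical file name
    closed = write_and_fail(path)
    with open(path, encoding="utf-8") as f:
        content = f.read()

print(closed, content)
```

The file is closed and the data written before the error is flushed to disk, with no explicit close() or finally clause.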
File reading steps: open the file object, read the data, and close the file object.
The following three methods are generally used to read files:
read([size]): reads size characters from the file and returns them.
If size is omitted, the whole file is read. At the end of the file, an empty string is returned.
readline(): reads one line and returns it.
At the end of the file, an empty string is returned.
readlines(): for a text file, stores each line as a string in a list and returns the list.
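The three methods share one file position, which the following sketch shows by calling them in turn on a small three-line file. The file name is hypothetical and a temporary directory is used.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "read_demo.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write("one\ntwo\nthree\n")
    with open(path, "r", encoding="utf-8") as f:
        first_four = f.read(4)    # the first 4 characters
        next_line = f.readline()  # the next full line
        rest = f.readlines()      # all remaining lines as a list
        at_end = f.read()         # nothing left to read

print(repr(first_four), repr(next_line), rest, repr(at_end))
```

Each call continues from where the previous one stopped, and a read at the end of the file returns an empty string.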
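Code format
with open(r"d:\a.txt", "r"[, encoding="utf-8"]) as f:
    f.read(4)
Note:
The encoding used for reading must match the one the file was written with. Passing encoding="utf-8" for a file that was not saved as UTF-8 raises an error such as
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbc in position 13: invalid start byte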
Example code
# [Exercise] Read the first 4 characters of a file
with open(r"d:\a.txt", "r") as f:
    print(f.read(4))
# [Exercise] For a small file, read the whole content into the program at once
with open(r"d:\aa.txt", "r") as f:
    print(f.read())
# [Exercise] Read a file line by line
with open(r"d:\b.txt") as f:
    while True:
        line = f.readline()
        if not line:  # an empty string is falsy, so this detects the end of the file
            break
        print(line, end="")
    print()
# [Exercise] Read a text file with an iterator (one line per iteration)
# The encoding used to read must match the one used to write
with open(r"d:\bb.txt", "r", encoding="utf-8") as f:
    for a in f:
        print(a, end="")
# [Exercise] Append a line number to the end of each line of a text file
with open(r"d:\c.txt", "r") as f:
    lines = f.readlines()
lines2 = [line.rstrip() + " # " + str(index) + "\n" for index, line in enumerate(lines, start=1)]
with open(r"d:\c.txt", "w") as ff:
    ff.writelines(lines2)
Binary files are processed in the same way as text files: first create a file object;
once the binary file object exists, write() and read() are still used to write and read the file.
When creating the file object, specify a binary mode to obtain a binary file object. For example:
f = open(r"d:\a.txt", 'wb')  # writable binary file object, overwrite mode
f = open(r"d:\a.txt", 'ab')  # writable binary file object, append mode
f = open(r"d:\a.txt", 'rb')  # readable binary file object
Example code
# Read and write binary files (this is equivalent to copying the file)
# f = open(r"d:\a.txt", 'wb')  # writable binary file object, overwrite mode
# f = open(r"d:\a.txt", 'ab')  # writable binary file object, append mode
# f = open(r"d:\a.txt", 'rb')  # readable binary file object
with open(r"d:\aaa.png", "rb") as src_file, open(r"d:\bbb.png", "wb") as dest_file:
    for chunk in src_file:
        dest_file.write(chunk)
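Iterating over a binary file splits it wherever a newline byte happens to occur, so the pieces can be very uneven. One possible variant, sketched here, reads fixed-size chunks instead; the function name copy_binary and the 64 KiB chunk size are assumptions for illustration, not part of the original example.

```python
import os
import tempfile


def copy_binary(src, dst, chunk_size=64 * 1024):
    """Copy src to dst in fixed-size chunks so memory use stays bounded."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:  # an empty bytes object means end of file
                break
            fout.write(chunk)


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "source.bin")  # hypothetical file names
    dst = os.path.join(tmp, "copy.bin")
    payload = bytes(range(256)) * 1000
    with open(src, "wb") as f:
        f.write(payload)
    copy_binary(src, dst)
    with open(dst, "rb") as f:
        copied = f.read()

print(copied == payload)
```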
Properties of the file object
Open mode of file object
Common methods of file objects
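A few of the file object's attributes can be inspected directly. The sketch below records name, mode, encoding, and closed while the file is open, then checks closed again afterwards; the file name is hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "attrs_demo.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        info = {
            "name": f.name,          # the path passed to open()
            "mode": f.mode,          # the mode passed to open()
            "encoding": f.encoding,  # text files only
            "closed": f.closed,      # still open inside the with block
        }
        f.write("hello")
        f.flush()                    # push buffered data to the file now
    closed_after = f.closed

print(info["mode"], info["encoding"], info["closed"], closed_after)
```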
seek() moves the file pointer to a specified byte position.
In GBK a Chinese character occupies two bytes and an English letter occupies one; in UTF-8 most Chinese characters occupy three bytes.
Example code
print("================= Operating at any position in a file ======================")
# [Example] Moving the file pointer with seek()
with open(r"d:\cc.txt", "r") as f:
    print("The file name is {0}".format(f.name))  # The file name is d:\cc.txt
    print(f.tell())  # 0
    print("Content read:", str(f.readline()))  # the whole first line
    print(f.tell())  # 18 for a line of nine Chinese characters stored as GBK (2 bytes each)
    f.seek(4, 0)  # with GBK Chinese text the offset must be a multiple of 2 to land on a character boundary
    print("Content read:", str(f.readline()))  # the line from the third character onward
    print(f.tell())  # 18 again: the pointer is back at the end of the line
Serialization means converting an object into a serialized (byte stream) form that can be stored on disk or sent elsewhere over a network.
Deserialization is the reverse process: converting serialized data that has been read back into an object.
The functions in the pickle module implement serialization and deserialization.
For serialization we use:
pickle.dump(obj, file): obj is the object to serialize, and file is the file to store it in.
For deserialization we use:
pickle.load(file): reads data from file and deserializes it into an object.
Example code
import pickle
print("================= Serialization with pickle =======================")
# [Exercise] Serialize an object to a file
with open("student.info", "wb") as f:
    name = "Time is still"
    age = 18
    score = [90, 80, 70]
    resume = {"name": name, "age": age, "score": score}
    pickle.dump(resume, f)
# [Exercise] Deserialize the stored data back into an object
with open("student.info", "rb") as f:
    resume = pickle.load(f)
    print(resume)
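The same round trip can be tried in memory, which makes it easy to confirm that the restored object is an equal copy rather than the same object. This is a minimal sketch; the helper round_trip is hypothetical and an io.BytesIO buffer stands in for the file.

```python
import io
import pickle


def round_trip(obj):
    """Serialize obj into an in-memory buffer and load it back."""
    buffer = io.BytesIO()
    pickle.dump(obj, buffer)
    buffer.seek(0)  # rewind before reading
    return pickle.load(buffer)


resume = {"name": "Time is still", "age": 18, "score": [90, 80, 70]}
restored = round_trip(resume)
print(restored == resume, restored is resume)
```

Note that pickle.load() can execute arbitrary code while loading, so it should only be used on data from a trusted source.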
CSV is a comma-separated text format, commonly used for data exchange and for importing and exporting Excel files and database data.
Unlike an Excel file, a CSV file is plain text: values have no type (every value is read back as a string) and there is no cell formatting.
The csv module in the Python standard library provides objects for reading and writing CSV files.
We create a simple table in Excel and save it as CSV (comma-separated); the code below opens it and shows the contents of the file.
Example code
import csv
with open(r"d:\workBook.csv", newline="") as a:
    o_csv = csv.reader(a)  # create the csv reader; iterating it yields each row as a list of strings
    headers = next(o_csv)  # the first row: the header line
    print(headers)
    for row in o_csv:  # loop over the remaining rows
        print(row)
Example code
# [Exercise] Write a csv file with a csv.writer object
headers = ['Name', 'Age', 'Job', 'Address']
rows = [('JOJO', '18', 'Masseur', 'Britain'), ('Dio', '19', 'Boss', 'Egypt'), ('Giorno Giovanna', '20', 'Gangster', 'Italy')]
with open(r"d:\workBook3.csv", "w", newline="") as b:  # newline="" avoids blank lines between rows on Windows
    b_csv = csv.writer(b)  # create the csv writer
    b_csv.writerow(headers)  # write one row (the header)
    b_csv.writerows(rows)  # write several rows (the data)
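A quick way to see what the csv module does to values is to write a row containing a number and read it back. The sketch below uses an in-memory io.StringIO buffer in place of a file; the sample row is made up for illustration.

```python
import csv
import io

buffer = io.StringIO(newline="")  # in-memory stand-in for a file opened with newline=""
writer = csv.writer(buffer)
writer.writerow(["name", "age"])
writer.writerow(["JOJO", 18])     # age is written as an int

buffer.seek(0)                    # rewind before reading
rows = list(csv.reader(buffer))
print(rows)
```

The integer comes back as a string, so any numeric columns need to be converted by the program after reading.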
The os module lets us work with the operating system directly.
We can invoke the operating system's executables and commands, and operate directly on files and directories.
The os module is an essential foundation for system operations and maintenance.
Example code
import os
# [Example] Use os.system to launch the Windows Notepad program
os.system("notepad.exe")
# [Example] Use os.system to run the Windows ping command
# If the output is garbled, see the console output settings in the garbled Chinese section above
os.system("ping www.baidu.com")
# [Example] Launch an installed copy of WeChat (os.startfile is available on Windows only)
os.startfile(r"C:\Program Files (x86)\Tencent\WeChat\WeChat.exe")
The file object described above lets you read and write file contents.
For other operations on files and directories, use the os and os.path modules.
Example code
import os
# [Example] os module: create and delete directories, get file information, etc.
print("System name:", os.name)  # windows-->nt linux-->posix
print("Path separator used by the current operating system:", os.sep)  # windows-->\ linux-->/
print("Line separator:", repr(os.linesep))  # windows-->\r\n linux-->\n
print("Current directory:", os.curdir)
a = "3"
print(a)
# repr() returns the canonical string representation of the object
print(repr(a))
# Get information about files and folders
print(os.stat("MyPy08-FileRead.py"))
# Working directory operations
print(os.getcwd())  # get the current working directory
os.chdir("D:")  # change the current working directory
os.mkdir("Complete learning materials")  # create a directory
os.rmdir("Complete learning materials")  # delete a directory
# os.makedirs("race/The yellow race/Chinese")  # create nested directories; calling it again after it succeeds raises an error
# os.rename("race", "Asian")  # can only be called once, because "race" no longer exists afterwards
print(os.listdir("Asian"))  # entries in the "Asian" directory
Note
If calling os.rename() raises PermissionError: [WinError 5] Access denied, you need to grant the user permissions on the folder being renamed. Once the permissions are changed, the rename succeeds.
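The "can only be called once" behaviour of makedirs() and rename() can be made repeatable. The sketch below, which works inside a temporary directory with hypothetical folder names, passes exist_ok=True to os.makedirs() and checks for the source folder before renaming.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    nested = os.path.join(tmp, "race", "group", "leaf")  # hypothetical names
    os.makedirs(nested, exist_ok=True)
    os.makedirs(nested, exist_ok=True)  # a second call no longer raises

    old = os.path.join(tmp, "race")
    new = os.path.join(tmp, "renamed")
    if os.path.exists(old) and not os.path.exists(new):
        os.rename(old, new)             # skipped if it has already been done
    listing = os.listdir(tmp)
    leaf_exists = os.path.isdir(os.path.join(new, "group", "leaf"))

print(listing, leaf_exists)
```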
The os.path module provides path-related operations (checking paths, splitting paths, joining paths, traversing folders).
Example code
# [Example] Try out common functions in os.path
print("Is it an absolute path:", os.path.isabs("d:/a.txt"))
print("Is it a directory:", os.path.isdir(r"d:\a.txt"))
print("Does the file exist:", os.path.exists("a.txt"))
print("File size:", os.path.getsize("a.txt"))
print("Absolute path:", os.path.abspath("a.txt"))
print("Directory part:", os.path.dirname("d:/a.txt"))
# Get the creation time, last access time, and last modification time
print("Creation time:", os.path.getctime("a.txt"))
print("Last access time:", os.path.getatime("a.txt"))
print("Last modification time:", os.path.getmtime("a.txt"))
# Splitting and joining paths
path = os.path.abspath("a.txt")  # return the absolute path
print("Tuple of (directory, file):", os.path.split(path))
print("Tuple of (path, extension):", os.path.splitext(path))
print("Joined path (aa\\bb\\cc on Windows):", os.path.join("aa", "bb", "cc"))
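The splitting and joining functions work purely on strings, so they can be tried on a path that does not exist. In this sketch the path aa/bb/report.tar.gz is made up for illustration.

```python
import os.path

path = os.path.join("aa", "bb", "report.tar.gz")  # hypothetical path
directory, filename = os.path.split(path)         # (directory part, last component)
stem, extension = os.path.splitext(filename)      # splits at the last dot only

print(directory, filename, stem, extension)
```

Only the final extension is split off, so a name such as report.tar.gz keeps .tar in its stem.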
List all .py files in the specified directory and print their names
import os
path = os.getcwd()
file_list = os.listdir(path)
for filename in file_list:
    if filename.endswith(".py"):  # also avoids matching a file that is simply named "py"
        print(filename, end="\t")
print()
os.walk() is a simple, easy-to-use traverser for files and directories that helps us process them efficiently.
Its signature is: os.walk(top, topdown=True, onerror=None, followlinks=False)
Example code
# [Example] Use walk() to recursively traverse all files and directories
path = os.path.dirname(os.getcwd())  # start from the parent directory so that more files are listed
file_list = os.walk(path, topdown=False)
for root, dirs, files in file_list:
    for name in files:
        print(os.path.join(root, name))
    for name in dirs:
        print(os.path.join(root, name))  # join the directory name onto its parent path
Example code
# [Example] Traverse all files in a directory with a recursive function
def my_print_file(path, level):
    child_files = os.listdir(path)
    for file in child_files:
        file_path = os.path.join(path, file)
        print("\t" * level + file)  # indent one tab per nesting level
        if os.path.isdir(file_path):
            my_print_file(file_path, level + 1)
my_print_file(path, 0)
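Both traversal styles can be exercised on a small throwaway tree. The sketch below builds one in a temporary directory and collects the relative path of every file with os.walk(); the helper list_files and the file names are hypothetical.

```python
import os
import tempfile


def list_files(top):
    """Return the path of every file under top, relative to top, sorted."""
    found = []
    for root, dirs, files in os.walk(top):
        for name in files:
            found.append(os.path.relpath(os.path.join(root, name), top))
    return sorted(found)


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.py")):  # hypothetical files
        with open(os.path.join(tmp, rel), "w") as f:
            f.write("x")
    files = list_files(tmp)

print(files)
```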
The shutil module is part of the Python standard library. It is mainly used to copy, move, and delete files and folders;
it can also compress and decompress files and folders. The os module provides the general operations on directories and files.
shutil complements it with higher-level operations such as moving, copying, compressing, and decompressing, which os does not provide directly.
Example code: copying
import os
import shutil
# [Example] Copy a file
os.chdir("D:")  # change the current working directory
shutil.copyfile("a.txt", "a_copy.txt")
# [Example] Recursively copy the contents of a folder (using shutil)
shutil.copytree("Asian/The yellow race", "race", ignore=shutil.ignore_patterns("*.html", "*.htm"))  # the destination folder "race" must not already exist
Example code: compressing and decompressing
import shutil
import zipfile
# [Example] Compress the entire contents of a folder (using shutil)
# Compress everything in the "Asian/The yellow race" folder into race.zip under the "Biological data" folder
shutil.make_archive("Biological data/race", "zip", "Asian/The yellow race")
# Compress several specified files into one zip file (using zipfile)
with zipfile.ZipFile("a.zip", "w") as z:
    z.write("a.txt")
    z.write("b.txt")
# [Example] Extract an archive into a specified folder (using zipfile)
with zipfile.ZipFile("a.zip", "r") as z2:
    z2.extractall("d:/Biological data")
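shutil can also handle the extraction side with shutil.unpack_archive(). The following sketch archives a folder and unpacks it again entirely inside a temporary directory; all folder and file names are hypothetical.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src")  # hypothetical folder to archive
    os.mkdir(src)
    with open(os.path.join(src, "a.txt"), "w", encoding="utf-8") as f:
        f.write("hello")

    # make_archive appends the ".zip" suffix and returns the archive path
    archive = shutil.make_archive(os.path.join(tmp, "backup"), "zip", src)
    archive_name = os.path.basename(archive)

    out = os.path.join(tmp, "out")
    os.makedirs(out)
    shutil.unpack_archive(archive, out)  # format is inferred from the extension
    extracted = sorted(os.listdir(out))
    with open(os.path.join(out, "a.txt"), encoding="utf-8") as f:
        restored_text = f.read()

print(archive_name, extracted, restored_text)
```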