Python has several built-in modules and methods for handling files. They are spread across modules such as os, os.path, shutil, and pathlib. This article covers the most common operations and methods for working with files in Python: listing directory contents, getting file attributes, creating directories, matching filename patterns, traversing directory trees, creating temporary files and directories, deleting, copying, moving, and renaming files and directories, creating and extracting ZIP and TAR archives, and reading multiple files with the fileinput module.

Reading and writing files in Python is very easy, but you must first open the file in the appropriate mode. Here is an example of how to open a text file and read its contents:
with open('data.txt', 'r') as f:
    data = f.read()
print('context: {}'.format(data))
open() takes a filename and a mode as its parameters. 'r' opens the file in read-only mode. To write data to a file instead, pass 'w' as the mode:
with open('data.txt', 'w') as f:
    data = 'some data to be written to the file'
    f.write(data)
In the examples above, open() opens a file for reading or writing and returns a file handle (f in this case) that provides methods for reading or writing the file's data. Read Working With File I/O in Python to learn more about how to read and write files.

Suppose your current working directory has a subdirectory called my_directory with the following contents:
.
├── file1.py
├── file2.csv
├── file3.txt
├── sub_dir
│   ├── bar.py
│   └── foo.py
├── sub_dir_b
│   └── file4.txt
└── sub_dir_c
    ├── config.py
    └── file5.txt
Python's built-in os module has a number of useful methods for listing directory contents and filtering the results. To get a list of all the files and folders in a particular directory in the file system, you can use os.listdir(), or, in Python 3.x, os.scandir(). os.scandir() is the preferred method if you also want file and directory attributes such as file size and modification date.
import os

entries = os.listdir('my_directory')
os.listdir() returns a Python list containing the names of the files and subdirectories of the directory given by the path argument:

['file1.py', 'file2.csv', 'file3.txt', 'sub_dir', 'sub_dir_b', 'sub_dir_c']

A directory listing like this isn't easy to read. Printing the result of the os.listdir() call in a loop helps:
for entry in entries:
    print(entry)

"""
file1.py
file2.csv
file3.txt
sub_dir
sub_dir_b
sub_dir_c
"""
In modern versions of Python, you can use os.scandir() and pathlib.Path in place of os.listdir().

os.scandir() was introduced in Python 3.5 and is documented in PEP 471. Calling os.scandir() returns an iterator instead of a list:
import os

entries = os.scandir('my_directory')
print(entries)
# <posix.ScandirIterator at 0x105b4d4b0>
The ScandirIterator points to all the entries in the current directory. You can loop over the contents of the iterator and print the file names:
import os

with os.scandir('my_directory') as entries:
    for entry in entries:
        print(entry.name)
Here, os.scandir() is used together with the with statement because it supports the context management protocol. Using a context manager closes the iterator and frees up the acquired resources automatically after the iterator has been exhausted. Printing the file names in my_directory gives the same result as in the os.listdir() example:

file1.py
file2.csv
file3.txt
sub_dir
sub_dir_b
sub_dir_c

Another way to get a directory listing is to use the pathlib module:
from pathlib import Path

entries = Path('my_directory')
for entry in entries.iterdir():
    print(entry.name)
pathlib.Path() returns either a PosixPath or a WindowsPath object, depending on the operating system. pathlib.Path() objects have an .iterdir() method for creating an iterator over all the files and directories in a directory. Each entry yielded by .iterdir() contains information about the file or directory, such as its name and file attributes. pathlib was first introduced in Python 3.4 and is a great addition to Python that provides an object-oriented interface to the file system.

In the example above, you call pathlib.Path() and pass in a path argument. You then call .iterdir() to get a list of all the files and directories under my_directory.

pathlib offers a set of classes featuring most of the common operations on paths in an easy, object-oriented way. Using pathlib is at least as efficient as using the functions in os. Another benefit of using pathlib over os is that it reduces the number of packages or modules you need to import to manipulate file system paths. For more information, read Python 3's pathlib Module: Taming the File System.
Running the code above produces the following:

file1.py
file2.csv
file3.txt
sub_dir
sub_dir_b
sub_dir_c

Using pathlib.Path() or os.scandir() instead of os.listdir() is the preferred way of getting a directory listing, especially when you need file type and file attribute information. pathlib.Path() offers much of the file- and path-handling functionality found in os and shutil, and its methods are more efficient than some of those modules'. We will discuss how to get file properties shortly.

Function                      Description
os.listdir()                  Returns a list of all files and folders in a directory
os.scandir()                  Returns an iterator of all the objects in a directory, including file attribute information
pathlib.Path().iterdir()      Returns an iterator of all the objects in a directory, including file attribute information

These functions return a list of everything in the directory, including subdirectories. That might not always be what you want; the next section shows how to filter the results of a directory listing.
This section shows how to print the names of files in a directory using os.listdir(), os.scandir(), and pathlib.Path(). To filter out directories and list only the files in a directory listing produced by os.listdir(), use os.path:
import os

basepath = 'my_directory'
for entry in os.listdir(basepath):
    # Use os.path.isfile to check whether the path is a file
    if os.path.isfile(os.path.join(basepath, entry)):
        print(entry)
Here, the call to os.listdir() returns a list of everything in the specified path, which is then filtered with os.path.isfile() so that only files, not directories, are printed. Running the code produces the following:

file1.py
file2.csv
file3.txt

An easier way to list the files in a directory is to use os.scandir() or pathlib.Path():
import os

basepath = 'my_directory'
with os.scandir(basepath) as entries:
    for entry in entries:
        if entry.is_file():
            print(entry.name)
Using os.scandir() looks cleaner and is easier to understand than os.listdir(). entry.is_file() is called on each item returned by the ScandirIterator; if it returns True, the item is a file. The output of the code above is:

file1.py
file3.txt
file2.csv

Next, here is how to list the files in a directory with pathlib.Path():
from pathlib import Path

basepath = Path('my_directory')
for entry in basepath.iterdir():
    if entry.is_file():
        print(entry.name)
Here, .is_file() is called on each entry yielded by .iterdir(). The output is the same as above:

file1.py
file3.txt
file2.csv

The code can be made more concise by combining the for loop and the if statement into a single generator expression. For more on generator expressions, this article by Dan Bader is recommended. The modified version looks like this:
from pathlib import Path

basepath = Path('my_directory')
files_in_basepath = (entry for entry in basepath.iterdir() if entry.is_file())
for item in files_in_basepath:
    print(item.name)
This produces exactly the same result as before. This section showed that filtering files or directories with os.scandir() and pathlib.Path() is more intuitive, and yields cleaner code, than doing it with os.listdir() and os.path.

If you want to list subdirectories instead of files, use one of the methods below. Here is how to do it with os.listdir() and os.path():
import os

basepath = 'my_directory'
for entry in os.listdir(basepath):
    if os.path.isdir(os.path.join(basepath, entry)):
        print(entry)
Manipulating file system paths this way quickly becomes cumbersome when you have to call os.path.join() repeatedly. Running this code on my computer produces the following output:

sub_dir
sub_dir_b
sub_dir_c

Here is how to do the same thing with os.scandir():
import os

basepath = 'my_directory'
with os.scandir(basepath) as entries:
    for entry in entries:
        if entry.is_dir():
            print(entry.name)
As in the file listing example, .is_dir() is called on each entry returned by os.scandir(). If the entry is a directory, is_dir() returns True and the directory's name is printed. The output is the same as above:

sub_dir_c
sub_dir_b
sub_dir

And here is how to do it with pathlib.Path():
from pathlib import Path

basepath = Path('my_directory')
for entry in basepath.iterdir():
    if entry.is_dir():
        print(entry.name)
is_dir() is called on each entry yielded by the .iterdir() iterator to check whether it is a file or a directory. If the entry is a directory, its name is printed, and the output is the same as in the previous example:

sub_dir_c
sub_dir_b
sub_dir

Python makes it easy to retrieve file attributes such as file size and modification time. They can be obtained through os.stat(), os.scandir(), or pathlib.Path.

os.scandir() and pathlib.Path() give you a directory listing that already includes file attribute information. This can be more efficient than using os.listdir() to list the files and then retrieving the attributes of each file separately.

The following example shows how to get the last modified time of the files in my_directory. The output is in timestamps:
import os

with os.scandir('my_directory') as entries:
    for entry in entries:
        info = entry.stat()
        print(info.st_mtime)

"""
1548163662.3952665
1548163689.1982062
1548163697.9175904
1548163721.1841028
1548163740.765162
1548163769.4702623
"""
os.scandir() returns a ScandirIterator object. Each entry in a ScandirIterator object has a .stat() method that retrieves information about the file or directory it points to. .stat() provides information such as file size and the time of last modification. In the example above, the code prints the st_mtime attribute, which is the time the file contents were last modified.

The pathlib module has corresponding methods for retrieving the same file information:
from pathlib import Path

basepath = Path('my_directory')
for entry in basepath.iterdir():
    info = entry.stat()
    print(info.st_mtime)

"""
1548163662.3952665
1548163689.1982062
1548163697.9175904
1548163721.1841028
1548163740.765162
1548163769.4702623
"""
In the example above, the code loops through the iterator returned by .iterdir() and retrieves file attributes through a .stat() call on each entry. The st_mtime attribute is a floating-point value representing a timestamp. To make the values returned by st_mtime easier to read, you can write a helper function that converts them into datetime objects:
import datetime
from pathlib import Path

def timestamp2datetime(timestamp, convert_to_local=True, utc=8, is_remove_ms=True):
    """
    Convert a UNIX timestamp into a datetime object
    :param timestamp: the timestamp
    :param convert_to_local: whether to convert to local time
    :param utc: time zone offset in hours; China is UTC+8
    :param is_remove_ms: whether to drop the milliseconds
    :return: datetime object
    """
    if is_remove_ms:
        timestamp = int(timestamp)
    dt = datetime.datetime.utcfromtimestamp(timestamp)
    if convert_to_local:
        dt = dt + datetime.timedelta(hours=utc)
    return dt

def convert_date(timestamp, format='%Y-%m-%d %H:%M:%S'):
    dt = timestamp2datetime(timestamp)
    return dt.strftime(format)

basepath = Path('my_directory')
for entry in basepath.iterdir():
    if entry.is_file():
        info = entry.stat()
        print('{} last modified on {}'.format(entry.name, convert_date(info.st_mtime)))
This first gets a listing of the files in my_directory along with their attributes and then calls convert_date() to convert each file's last modified time into a human-readable form. convert_date() uses .strftime() to convert the datetime object into a string.

The output of the code above is:

file3.txt last modified on 2019-01-24 09:04:39
file2.csv last modified on 2019-01-24 09:04:39
file1.py last modified on 2019-01-24 09:04:39

The syntax for converting dates and times into strings can be quite confusing, so it is worth keeping the format directives close at hand.
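For instance, here is a small illustration of a few common strftime directives; the datetime value is arbitrary and the codes shown are standard datetime directives:

from datetime import datetime

dt = datetime(2019, 1, 24, 9, 4, 39)
print(dt.strftime('%Y-%m-%d %H:%M:%S'))  # 2019-01-24 09:04:39
print(dt.strftime('%d %b %Y'))           # 24 Jan 2019 (%b is the abbreviated month name)
print(dt.strftime('%A %I:%M %p'))        # Thursday 09:04 AM (%A weekday name, %I 12-hour clock, %p AM/PM)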
Sooner or later, the programs you write will need to create directories in order to store data in them. os and pathlib include functions for creating directories. We will consider the following:

Method                   Description
os.mkdir()               Creates a single subdirectory
os.makedirs()            Creates multiple directories, including intermediate directories
pathlib.Path.mkdir()     Creates single or multiple directories

To create a single directory, pass the directory path as a parameter to os.mkdir():
import os

os.mkdir('example_directory')
If the directory already exists, os.mkdir() raises a FileExistsError exception. Alternatively, you can create a directory using pathlib:
from pathlib import Path

p = Path('example_directory')
p.mkdir()
If the path already exists, mkdir() raises a FileExistsError exception:

FileExistsError: [Errno 17] File exists: 'example_directory'

To avoid errors like this, catch the error when it happens and let your user know:
from pathlib import Path

p = Path('example_directory')
try:
    p.mkdir()
except FileExistsError as e:
    print(e)
Alternatively, you can ignore the FileExistsError exception by passing the exist_ok=True argument to .mkdir():
from pathlib import Path

p = Path('example_directory')
p.mkdir(exist_ok=True)
No error is raised if the directory already exists.

os.makedirs() is similar to os.mkdir(). The difference between the two is that os.makedirs() can create not only a single directory but also a whole directory tree recursively. In other words, it creates any intermediate folders necessary to make the full path exist.

os.makedirs() is similar to running mkdir -p in bash. For example, to create a group of directories like 2018/10/05, you can do the following:
import os

os.makedirs('2018/10/05', mode=0o770)
The code above creates the 2018/10/05 directory structure and gives the owner and group users read, write, and execute permissions. The default mode is 0o777, which adds permissions for other user groups.

Run the tree command to confirm the permissions that were applied:
$ tree -p -i .
.
[drwxrwx---]  2018
[drwxrwx---]  10
[drwxrwx---]  05
This prints out a directory tree of the current directory. tree is normally used to list the contents of a directory in a tree-like format. Passing the -p and -i arguments prints the directory names and their file permission information in a vertical list: -p prints the file permissions and -i makes tree produce a vertical list without indentation lines.

As you can see, all of the directories have 770 permissions. An alternative way to create multiple directories is to use .mkdir() of pathlib.Path:
from pathlib import Path

p = Path('2018/10/05')
p.mkdir(parents=True, exist_ok=True)
Passing parents=True to Path.mkdir() makes it create the directory 05 and any parent directories necessary to make the path valid.

By default, os.makedirs() and pathlib.Path.mkdir() raise an OSError if the target directory already exists. This behavior can be overridden (as of Python 3.2) by passing exist_ok=True as a keyword argument.

Running the code above results in a structure like this:

└── 2018
    └── 10
        └── 05
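The same override works for os.makedirs(); here is a minimal sketch using the directory name from above, purely for illustration:

import os

# Does not raise even if 2018/10/05 already exists (Python 3.2+)
os.makedirs('2018/10/05', exist_ok=True)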
I prefer using pathlib when creating directories because I can use the same function to create either a single directory or multiple directories.

After getting a list of files in a directory using one of the methods above, you will most likely also want to search for files that match a particular pattern. These are the methods and functions available for that:

endswith() and startswith() string methods
fnmatch.fnmatch()
glob.glob()
pathlib.Path.glob()

These methods and functions are discussed below. The examples in this section use a directory called some_directory with the following structure:
.
├── admin.py
├── data_01_backup.txt
├── data_01.txt
├── data_02_backup.txt
├── data_02.txt
├── data_03_backup.txt
├── data_03.txt
├── sub_dir
│   ├── file1.py
│   └── file2.py
└── tests.py
If you are using a Bash shell, you can create the directory structure above with the following commands:
mkdir some_directory
cd some_directory
mkdir sub_dir
touch sub_dir/file1.py sub_dir/file2.py
touch data_{01..03}.txt data_{01..03}_backup.txt admin.py tests.py
This creates the some_directory directory and moves into it, then creates sub_dir. The next line creates file1.py and file2.py in sub_dir, and the last line creates all the other files using brace expansion.

Python has several built-in methods for modifying and manipulating strings. Two of them, .startswith() and .endswith(), are very useful when matching file names. To use them, first get a directory listing and then iterate over it:
import os

for f_name in os.listdir('some_directory'):
    if f_name.endswith('.txt'):
        print(f_name)
The code above finds all the files in some_directory, iterates over them, and uses .endswith() to print the names of all files that have a .txt extension. Running the code on my computer produces the following output:

data_01.txt
data_01_backup.txt
data_02.txt
data_02_backup.txt
data_03.txt
data_03_backup.txt

The string methods are limited in their matching abilities. For simple filename pattern matching, the fnmatch module offers more advanced functions and methods. We will consider fnmatch.fnmatch(), a function that supports the use of wildcards such as * and ?. For example, to find all .txt files in a directory using fnmatch, you can do the following:
import os
import fnmatch

for f_name in os.listdir('some_directory'):
    if fnmatch.fnmatch(f_name, '*.txt'):
        print(f_name)
This iterates over the list of files in some_directory and uses .fnmatch() to perform a wildcard search for files with the .txt extension.

Suppose you want to find .txt files that meet certain criteria. For example, you might only be interested in .txt files that contain the word data, a number between a set of underscores, and the word backup in their filename, something like data_01_backup, data_02_backup, or data_03_backup.

You can use fnmatch.fnmatch() like this:
import os
import fnmatch

for f_name in os.listdir('some_directory'):
    if fnmatch.fnmatch(f_name, 'data_*_backup.txt'):
        print(f_name)
Here, only the names of files that match the data_*_backup.txt pattern are printed. The * in the pattern matches any characters, so running this code finds all text files whose names start with data and end with backup.txt, as shown in the output below:

data_01_backup.txt
data_02_backup.txt
data_03_backup.txt
Another useful module for filename pattern matching is glob. .glob() in the glob module works just like fnmatch.fnmatch(), but unlike fnmatch.fnmatch(), it treats files beginning with a period (.) as special.

UNIX and related systems translate wildcards such as ? and * in file listings into full pattern matches. For example, running mv *.py python_files in a UNIX shell moves all files with the .py extension from the current directory to python_files. Here * is a wildcard standing for any number of characters, and *.py is the glob pattern. This shell capability is not available in the Windows operating system, but the glob module adds it to Python, which lets Windows programs use this feature as well.

Here is an example that uses the glob module to query all Python code files in the current directory:
import glob

print(glob.glob('*.py'))
glob.glob('*.py') searches the current directory for files with the .py extension and returns them as a list. glob also supports shell-style wildcard patterns:
import glob

for name in glob.glob('*[0-9]*.txt'):
    print(name)
This finds all text files (.txt) that contain digits in their filename:

data_01.txt
data_01_backup.txt
data_02.txt
data_02_backup.txt
data_03.txt
data_03_backup.txt

glob also makes it easy to search for files recursively in subdirectories:
import glob

for name in glob.iglob('**/*.py', recursive=True):
    print(name)
This example uses glob.iglob() to search for .py files in the current directory and its subdirectories. Passing recursive=True as an argument to .iglob() makes it search the current directory and any subdirectories for .py files. The difference between glob.glob() and glob.iglob() is that iglob() returns an iterator instead of a list.

Running the code above produces the following:

admin.py
tests.py
sub_dir/file1.py
sub_dir/file2.py

pathlib also contains similar methods for making flexible file listings. The example below shows how you can use .Path.glob() to list files whose extension starts with the letter p:
from pathlib import Path

p = Path('.')
for name in p.glob('*.p*'):
    print(name)
Calling p.glob('*.p*') returns a generator object pointing to all files in the current directory whose extension starts with the letter p.

Path.glob() is similar to the glob.glob() discussed above. As you can see, pathlib mixes many of the best features of the os, os.path, and glob modules into a single module, which makes it a joy to use.

To recap, here is the table of functions covered in this section:

Function                              Description
startswith()                          Tests whether a string starts with a particular pattern; returns True or False
endswith()                            Tests whether a string ends with a particular pattern; returns True or False
fnmatch.fnmatch(filename, pattern)    Tests whether the filename matches the pattern; returns True or False
glob.glob()                           Returns a list of filenames that match a pattern
pathlib.Path.glob()                   Returns a generator object of filenames that match a pattern

A common programming task is walking a directory tree and processing the files in it. Let's explore how to do that with the built-in Python function os.walk(). os.walk() generates the file names in a directory tree by walking the tree either top-down or bottom-up. For the purposes of this section, we want to work with the following tree:
├── folder_1
│   ├── file1.py
│   ├── file2.py
│   └── file3.py
├── folder_2
│   ├── file4.py
│   ├── file5.py
│   └── file6.py
├── test1.txt
└── test2.txt
The following example shows how to list all the files and directories in the tree using os.walk(). By default, os.walk() traverses the directory tree top-down:
import os

for dirpath, dirnames, files in os.walk('.'):
    print(f'Found directory: {dirpath}')
    for file_name in files:
        print(file_name)
os.walk() returns three values on each iteration of the loop: the name of the current directory, a list of the subdirectories in the current directory, and a list of the files in the current directory. On each iteration, it prints out the names of the subdirectories and files it finds:
Found directory: .
test1.txt
test2.txt
Found directory: ./folder_1
file1.py
file3.py
file2.py
Found directory: ./folder_2
file4.py
file5.py
file6.py
To traverse the tree from the bottom up, pass the topdown=False keyword argument to os.walk():
for dirpath, dirnames, files in os.walk('.', topdown=False):
    print(f'Found directory: {dirpath}')
    for file_name in files:
        print(file_name)
Passing topdown=False makes os.walk() print out the files it finds in the subdirectories first:
Found directory: ./folder_1
file1.py
file3.py
file2.py
Found directory: ./folder_2
file4.py
file5.py
file6.py
Found directory: .
test1.txt
test2.txt
As you can see, the program lists the contents of the subdirectories before the contents of the root directory. This is useful when you want to recursively delete files and directories; you will learn how to do that in the sections below. By default, os.walk() does not walk into directories that are reached through symbolic links. That behavior can be overridden by passing the followlinks=True argument.
Python provides the tempfile module for easily creating temporary files and directories.

tempfile can open and store temporary data in a file or directory while your program is running, and it handles deleting these temporary files when your program is done with them.

Now, let's see how to create a temporary file:
from tempfile import TemporaryFile

# Create a temporary file and write some data to it
fp = TemporaryFile('w+t')
fp.write('Hello World!')

# Go back to the beginning and read data from the file
fp.seek(0)
data = fp.read()
print(data)

# Close the file, after which it will be deleted
fp.close()
The first step is to import TemporaryFile from the tempfile module. Next, create a file-like object using TemporaryFile() and the mode you want to open it in. This creates and opens a file that can be used as a temporary storage area.

In the example above, the mode is 'w+t', which makes tempfile create a temporary text file in write mode. There is no need to give the temporary file a name, because it will be destroyed after the script is done running.

After writing to the file, you can read from it and close it when you're done processing it. Once the file is closed, it is deleted from the file system. If you need a name for the temporary files produced with tempfile, use tempfile.NamedTemporaryFile().
The temporary files and directories created with tempfile are stored in a special system directory used for storing temporary files. Python searches a standard list of directories to find one in which the user can create files.

On Windows, the directories are C:\TEMP, C:\TMP, \TEMP, and \TMP, in that order. On all other platforms, the directories are /tmp, /var/tmp, and /usr/tmp, in that order. If none of those directories exist, tempfile stores the temporary files and directories in the current directory.

.TemporaryFile() is also a context manager, so it can be used together with the with statement. Using the context manager automatically closes and deletes the file after it has been read:
with TemporaryFile('w+t') as fp:
    fp.write('Hello universe!')
    fp.seek(0)
    fp.read()
# The temporary file is now closed and deleted
This creates a temporary file and reads data from it. As soon as the file's contents have been read, the temporary file is closed and deleted from the file system.

tempfile can also be used to create temporary directories. Let's look at how to do this with tempfile.TemporaryDirectory():
import tempfile
import os

tmp = ''
with tempfile.TemporaryDirectory() as tmpdir:
    print('Created temporary directory ', tmpdir)
    tmp = tmpdir
    print(os.path.exists(tmpdir))

print(tmp)
print(os.path.exists(tmp))
Calling tempfile.TemporaryDirectory() creates a temporary directory in the file system and returns an object representing that directory. In the example above, the directory is created using a context manager, and its name is stored in the tmpdir variable. The third line prints the name of the temporary directory, and os.path.exists(tmpdir) confirms that the directory was actually created in the file system.

After the context manager goes out of context, the temporary directory is deleted and a call to os.path.exists(tmpdir) returns False, which means the directory was successfully removed.

You can delete single files, directories, and entire directory trees using the methods found in the os, shutil, and pathlib modules. The following sections describe how to delete files and directories that you no longer need.

To delete a single file, use pathlib.Path.unlink(), os.remove(), or os.unlink().

os.remove() and os.unlink() are semantically identical. To delete a file using os.remove(), do the following:
import os

data_file = 'C:\\Users\\vuyisile\\Desktop\\Test\\data.txt'
os.remove(data_file)
Deleting a file using os.unlink() is similar to how you do it with os.remove():
import os

data_file = 'C:\\Users\\vuyisile\\Desktop\\Test\\data.txt'
os.unlink(data_file)
Calling .unlink() or .remove() on a file deletes it from the file system. Both functions raise an OSError if the path passed to them points to a directory instead of a file. To avoid this, you can either check that what you're trying to delete is actually a file and only delete it if it is, or use exception handling to handle the OSError:
import os

data_file = 'home/data.txt'

# Delete only if it is a file
if os.path.isfile(data_file):
    os.remove(data_file)
else:
    print(f'Error: {data_file} not a valid filename')
os.path.isfile() checks whether data_file is actually a file. If it is, it is deleted by the call to os.remove(). If data_file points to a folder, an error message is printed to the console.

The following example shows how to use exception handling to handle errors when deleting files:
import os

data_file = 'home/data.txt'

# Use exception handling
try:
    os.remove(data_file)
except OSError as e:
    print(f'Error: {data_file} : {e.strerror}')
The code above attempts to delete the file before checking its type. If data_file isn't actually a file, the OSError that is raised is handled in the except clause and an error message is printed to the console. The error message is formatted using Python f-strings.

Finally, you can also use pathlib.Path.unlink() to delete files:
from pathlib import Path

data_file = Path('home/data.txt')

try:
    data_file.unlink()
except IsADirectoryError as e:
    print(f'Error: {data_file} : {e.strerror}')
This creates a Path object named data_file that points to a file. Calling .unlink() on data_file deletes home/data.txt. If data_file points to a directory, an IsADirectoryError is raised. It is worth noting that the Python program above runs with the same permissions as the user executing it; if the user does not have permission to delete the file, a PermissionError is raised.

The standard library offers the following functions for deleting directories: os.rmdir(), pathlib.Path.rmdir(), and shutil.rmtree().

To delete a single directory or folder, use os.rmdir() or pathlib.Path.rmdir(). These two functions only work if the directory you're trying to delete is empty; if it isn't, an OSError is raised. Here is how to delete a folder:
import os

trash_dir = 'my_documents/bad_dir'

try:
    os.rmdir(trash_dir)
except OSError as e:
    print(f'Error: {trash_dir} : {e.strerror}')
Here, the trash_dir directory is deleted by passing its path to os.rmdir(). If the directory isn't empty, an error message is printed to the screen:

Traceback (most recent call last):
  File '<stdin>', line 1, in <module>
OSError: [Errno 39] Directory not empty: 'my_documents/bad_dir'

Alternatively, you can use pathlib to delete directories:
from pathlib import Path

trash_dir = Path('my_documents/bad_dir')

try:
    trash_dir.rmdir()
except OSError as e:
    print(f'Error: {trash_dir} : {e.strerror}')
Here we create a Path object that points to the directory to be deleted. If the directory is empty, calling the Path object's .rmdir() method deletes it.

To delete non-empty directories and entire directory trees, Python offers shutil.rmtree():
import shutil

trash_dir = 'my_documents/bad_dir'

try:
    shutil.rmtree(trash_dir)
except OSError as e:
    print(f'Error: {trash_dir} : {e.strerror}')
Everything in trash_dir is deleted when shutil.rmtree() is called on it. In some cases, you may want to delete empty folders recursively. You can do this by combining one of the methods discussed above with os.walk():
import os

for dirpath, dirnames, files in os.walk('.', topdown=False):
    try:
        os.rmdir(dirpath)
    except OSError as ex:
        pass
This walks down the directory tree and tries to delete each directory it finds. If a directory isn't empty, an OSError is raised and that directory is skipped. The table below lists the functions covered in this section:

Function                  Description
os.remove()               Deletes a single file; cannot delete directories
os.unlink()               Same as os.remove(); deletes a single file
pathlib.Path.unlink()     Deletes a single file; cannot delete directories
os.rmdir()                Deletes an empty directory
pathlib.Path.rmdir()      Deletes an empty directory
shutil.rmtree()           Deletes an entire directory tree; can be used to delete non-empty directories

Python ships with the shutil module. shutil is short for shell utilities. It provides a number of high-level operations on files to support copying, archiving, and removing files and directories. In this section, you will learn how to move and copy files and directories.

shutil offers a couple of functions for copying files. The most commonly used are shutil.copy() and shutil.copy2(). To copy a file from one location to another using shutil.copy(), do the following:
import shutil

src = 'path/to/file.txt'
dst = 'path/to/dest_dir'
shutil.copy(src, dst)
shutil.copy() is comparable to the cp command on UNIX-based systems. shutil.copy(src, dst) copies the file src to the location specified by dst. If dst is a file, the contents of that file are replaced with the contents of src. If dst is a directory, src is copied into that directory. shutil.copy() only copies the file's contents and its permissions. Other metadata, such as the file's creation and modification times, is not preserved.

To preserve all file metadata when copying, use shutil.copy2() instead:
import shutil

src = 'path/to/file.txt'
dst = 'path/to/dest_dir'
shutil.copy2(src, dst)
Using .copy2() preserves details about the file, such as the last access time, permission bits, last modification time, and flags.
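A quick way to see the difference is to compare the modification times of the two kinds of copy. This sketch reuses the illustrative paths from above; the copy names are made up for the example:

import os
import shutil

src = 'path/to/file.txt'
dst1 = shutil.copy(src, 'path/to/copy_plain.txt')
dst2 = shutil.copy2(src, 'path/to/copy_with_metadata.txt')

# copy2() preserves the original modification time, copy() does not
print(os.stat(src).st_mtime)
print(os.stat(dst1).st_mtime)   # time of the copy operation
print(os.stat(dst2).st_mtime)   # same as the source file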
While shutil.copy() only copies a single file, shutil.copytree() copies an entire directory and everything contained in it. shutil.copytree(src, dest) takes two arguments: a source directory and the destination directory into which the files and folders will be copied.

Here is an example of how to copy the contents of one folder to a different location:
import shutil

dst = shutil.copytree('data_1', 'data1_backup')
print(dst)  # data1_backup
In this example, .copytree() copies the contents of data_1 to a new location data1_backup and returns the destination directory. The destination directory must not already exist; it will be created, along with any missing parent directories. shutil.copytree() is a good way to back up your files.

To move a file or directory to another location, use shutil.move(src, dst). src is the file or directory to be moved and dst is the destination:
import shutil

dst = shutil.move('dir_1/', 'backup/')
print(dst)  # 'backup'
shutil.move('dir_1/', 'backup/') moves dir_1/ into backup/ if backup/ exists. If backup/ does not exist, dir_1/ is renamed to backup.

To rename files and directories, Python includes os.rename(src, dst):
import os

os.rename('first.zip', 'first_01.zip')
The line above renames first.zip to first_01.zip. If the destination path points to a directory, an OSError is raised.

Another way to rename a file or directory is to use rename() from the pathlib module:
from pathlib import Path

data_file = Path('data_01.txt')
data_file.rename('data.txt')
To rename a file with pathlib, you first create a pathlib.Path() object containing the path of the file you want to replace. The next step is to call rename() on that path object and pass in the new name for the file or directory you are renaming.

Archives are a convenient way to package several files into one. The two most common archive types are ZIP and TAR. The Python programs you write can create archives and read and extract data from them. In this section, you will learn how to read and write both compression formats.

The zipfile module is a low-level module that is part of the Python standard library. zipfile has functions that make it easy to open and extract ZIP files. To read the contents of a ZIP file, the first thing to do is to create a ZipFile object. ZipFile objects are similar to the file objects created with open(). ZipFile is also a context manager and therefore supports the with statement:
import zipfile

with zipfile.ZipFile('data.zip', 'r') as zipobj:
    pass
Here you create a ZipFile object, passing in the name of the ZIP file and opening it in read mode. After opening a ZIP file, information about the archive can be accessed through the functions provided by the zipfile module. The data.zip archive in the example above was created from a directory named data that contains a total of 5 files and 1 subdirectory:
.
|
├── sub_dir/
|   ├── bar.py
|   └── foo.py
|
├── file1.py
├── file2.py
└── file3.py
To get a list of the files in the archive, call namelist() on the ZipFile object:
import zipfile

with zipfile.ZipFile('data.zip', 'r') as zipobj:
    zipobj.namelist()
This produces a list of files:

['file1.py', 'file2.py', 'file3.py', 'sub_dir/', 'sub_dir/bar.py', 'sub_dir/foo.py']

.namelist() returns a list of the names of the files and directories in the archive. To retrieve information about the files in the archive, use .getinfo():
import zipfile

with zipfile.ZipFile('data.zip', 'r') as zipobj:
    bar_info = zipobj.getinfo('sub_dir/bar.py')
    print(bar_info.file_size)
This outputs:

15277

.getinfo() returns a ZipInfo object, which stores information about a single member of the archive. To get information about a file in the archive, you pass its path as an argument to .getinfo(). With getinfo() you can retrieve information about archive members, such as the date the file was last modified, its compressed size, and its full filename. Accessing .file_size retrieves the file's original size, in bytes.

The following example shows how to retrieve more details about archived files in a Python REPL. Assume that the zipfile module has been imported and that bar_info is the same object you created in the previous example:
>>> bar_info.date_time
(2018, 10, 7, 23, 30, 10)
>>> bar_info.compress_size
2856
>>> bar_info.filename
'sub_dir/bar.py'
bar_info contains details about bar.py, such as its compressed size and its full path. The first line shows how to retrieve a file's last modified date. The next line shows how to get the size of the file after compression. The last line shows the full path of bar.py in the archive.

ZipFile supports the context manager protocol, which is why you can use it with the with statement; the ZipFile object is closed automatically after the operation is finished. Trying to open or extract files from a closed ZipFile object results in an error.

The zipfile module allows you to extract one or more files from ZIP archives through .extract() and .extractall().

By default, these methods extract files to the current directory. They both take an optional path parameter that lets you specify a different directory to extract the files to. If the directory does not exist, it is created automatically. To extract files from the archive, do the following:
>>> import zipfile
>>> import os

>>> os.listdir('.')
['data.zip']

>>> data_zip = zipfile.ZipFile('data.zip', 'r')

>>> # Extract a single file to the current directory
>>> data_zip.extract('file1.py')
'/home/test/dir1/zip_extract/file1.py'

>>> os.listdir('.')
['file1.py', 'data.zip']

>>> # Extract all files into a different directory
>>> data_zip.extractall(path='extract_dir/')

>>> os.listdir('.')
['file1.py', 'extract_dir', 'data.zip']

>>> os.listdir('extract_dir')
['file1.py', 'file3.py', 'file2.py', 'sub_dir']

>>> data_zip.close()
The third line of code is a call to os.listdir(), which shows that the current directory contains only one file, data.zip.

Next, data.zip is opened in read mode and .extract() is called to extract file1.py from it. .extract() returns the full file path of the extracted file. Since no path is specified, .extract() extracts file1.py into the current directory.

The next line prints a directory listing showing that the current directory now includes the extracted file in addition to the original archive. The lines after that show how to extract the entire archive into a specified directory: .extractall() creates the extract_dir directory and extracts the contents of data.zip into it. The last line closes the ZIP archive.

zipfile also supports extracting password-protected ZIPs. To extract a password-protected ZIP file, pass the password to the .extract() or .extractall() method as an argument:
>>> import zipfile

>>> with zipfile.ZipFile('secret.zip', 'r') as pwd_zip:
...     # Extract data from the encrypted archive; the password must be a bytes object
...     pwd_zip.extractall(path='extract_dir', pwd=b'[email protected]')
This opens the secret.zip archive in read mode. The password is supplied to .extractall(), and the contents of the archive are extracted to extract_dir. Thanks to the with statement, the archive is closed automatically after the extraction is complete.

To create a new ZIP archive, open a ZipFile object in write mode (w) and add the files you want to archive:
>>> import zipfile

>>> file_list = ['file1.py', 'sub_dir/', 'sub_dir/bar.py', 'sub_dir/foo.py']
>>> with zipfile.ZipFile('new.zip', 'w') as new_zip:
...     for name in file_list:
...         new_zip.write(name)
In this example, new_zip is opened in write mode and each file in file_list is added to the archive. When the with statement ends, new_zip is closed. Opening a ZIP file in write mode erases the contents of the archive and creates a new one.

To add files to an existing archive, open the ZipFile object in append mode and then add the files:
>>> with zipfile.ZipFile('new.zip', 'a') as new_zip:
...     new_zip.write('data.txt')
...     new_zip.write('latin.txt')
Here you open the new.zip archive created in the previous example. Opening the ZipFile object in append mode allows you to add new files to the ZIP file without deleting its current contents. After the files are added, the with statement goes out of context and closes the ZIP file.

TAR files are uncompressed file archives like ZIP. They can be compressed using the gzip, bzip2, and lzma compression methods. The TarFile class allows reading and writing TAR archives.

Here is how to read from an archive:
import tarfile

with tarfile.open('example.tar', 'r') as tar_file:
    print(tar_file.getnames())
tarfile objects open like most file-like objects: they have an open() function that takes a mode determining how the file is to be opened.

Use the 'r', 'w', or 'a' modes to open an uncompressed TAR file for reading, writing, and appending, respectively. To open a compressed TAR file, pass a mode argument to tarfile.open() in the form filemode[:compression]. The table below lists the possible modes TAR files can be opened in:

Mode     Behavior
r        Opens the archive in uncompressed read mode
r:gz     Opens the archive in gzip-compressed read mode
r:bz2    Opens the archive in bzip2-compressed read mode
w        Opens the archive in uncompressed write mode
w:gz     Opens the archive in gzip-compressed write mode
w:xz     Opens the archive in lzma-compressed write mode
a        Opens the archive in uncompressed append mode

.open() defaults to 'r' mode. To read an uncompressed TAR file and retrieve the names of the files in it, use .getnames():
>>> import tarfile
>>> tar = tarfile.open('example.tar', mode='r')
>>> tar.getnames()
['CONTRIBUTING.rst', 'README.md', 'app.py']
This returns a list with the names of the archive's contents.

Note: to show you how to use the different tarfile object methods, the TAR file in these examples is opened and closed manually in an interactive REPL session. Interacting with the TAR file this way lets you see the output of running each command. Normally, you would want to use a context manager to open file-like objects.

In addition, the metadata of each entry in the archive can be accessed through special attributes:
>>> import time
>>> for entry in tar.getmembers():
...     print(entry.name)
...     print(' Modified:', time.ctime(entry.mtime))
...     print(' Size    :', entry.size, 'bytes')
...     print()
CONTRIBUTING.rst
 Modified: Sat Nov  1 09:09:51 2018
 Size    : 402 bytes

README.md
 Modified: Sat Nov  3 07:29:40 2018
 Size    : 5426 bytes

app.py
 Modified: Sat Nov  3 07:29:13 2018
 Size    : 6218 bytes
In this example, the code loops through the list of members returned by .getmembers() and prints each file's attributes. The objects returned by .getmembers() have attributes that can be accessed programmatically, such as the name, size, and last modified time of each file in the archive. After reading from or writing to the archive, it must be closed to free up system resources.

In this section, you'll learn how to extract files from TAR archives using the following methods:

.extract()
.extractfile()
.extractall()

To extract a single file from a TAR archive, use extract() and pass in the filename:
>>> tar.extract('README.md')
>>> os.listdir('.')
['README.md', 'example.tar']
The README.md file is extracted from the archive into the file system. Calling os.listdir() confirms that README.md was successfully extracted into the current directory. To unpack or extract everything from the archive, use .extractall():
>>> tar.extractall(path="extracted/")
.extractall() takes an optional path argument to specify where the extracted files should go. Here, the archive is unpacked into the extracted directory. The following commands show that the archive was successfully extracted:
$ ls
example.tar  extracted  README.md

$ tree
.
├── example.tar
├── extracted
|   ├── app.py
|   ├── CONTRIBUTING.rst
|   └── README.md
└── README.md

1 directory, 5 files

$ ls extracted/
app.py  CONTRIBUTING.rst  README.md
To extract a file object for reading or writing, use .extractfile(), which takes a filename or a TarInfo object as an argument. .extractfile() returns a file-like object that can be read and used:
>>> f = tar.extractfile('app.py')
>>> f.read()
>>> tar.close()
Opened archives should always be closed after they have been read or written. To close an archive, call .close() on the archive file handle, or use the with statement when creating tarfile objects so that the archive is closed automatically when you're done with it. This frees up system resources and writes any changes you made to the archive to the file system.

A new TAR archive can be created like this:
>>> import tarfile

>>> file_list = ['app.py', 'config.py', 'CONTRIBUTORS.md', 'tests.py']
>>> with tarfile.open('packages.tar', mode='w') as tar:
...     for file in file_list:
...         tar.add(file)

>>> # Read the contents of the newly created archive
>>> with tarfile.open('packages.tar', mode='r') as t:
...     for member in t.getmembers():
...         print(member.name)
app.py
config.py
CONTRIBUTORS.md
tests.py
First, you create the list of files to be added to the archive so that you don't have to add each file manually.

The next line uses the with context manager to open a new archive called packages.tar in write mode. Opening an archive in write mode ('w') lets you write new files to the archive; any existing files in the archive are deleted and a new archive is created.

After the archive is created and populated, the with context manager automatically closes it and saves it to the file system. The last three lines open the archive you just created and print out the names of the files contained in it.

To add new files to an existing archive, open the archive in append mode ('a'):
>>> with tarfile.open('packages.tar', mode='a') as tar:
...     tar.add('foo.bar')

>>> with tarfile.open('packages.tar', mode='r') as tar:
...     for member in tar.getmembers():
...         print(member.name)
app.py
config.py
CONTRIBUTORS.md
tests.py
foo.bar
Opening an archive in append mode allows you to add new files to it without deleting the ones already in it.

tarfile can read and write TAR archives compressed with gzip, bzip2, and lzma. To read or write to a compressed archive, use tarfile.open() and pass in the appropriate mode for the compression type.

For example, to read or write data to a TAR archive compressed with gzip, use the 'r:gz' or 'w:gz' modes respectively:
>>> files = ['app.py', 'config.py', 'tests.py']
>>> with tarfile.open('packages.tar.gz', mode='w:gz') as tar:
...     tar.add('app.py')
...     tar.add('config.py')
...     tar.add('tests.py')

>>> with tarfile.open('packages.tar.gz', mode='r:gz') as t:
...     for member in t.getmembers():
...         print(member.name)
app.py
config.py
tests.py
The 'w:gz' mode opens the archive for gzip-compressed writing, and 'r:gz' opens it for gzip-compressed reading. A compressed archive cannot be opened in append mode; to add files to a compressed archive, you have to create a new one.

The Python standard library also supports creating TAR and ZIP archives using the high-level methods in the shutil module. The archiving utilities in shutil allow you to create, read, and extract ZIP and TAR archives. These utilities rely on the lower-level tarfile and zipfile modules.

shutil.make_archive() takes at least two arguments: the name of the archive and an archive format.

By default, it compresses all the files in the current directory into the archive format specified by the format argument. You can pass in an optional root_dir argument to compress files in a different directory instead. .make_archive() supports the zip, tar, bztar, and gztar archive formats.

This is how to create a TAR archive using shutil:
import shutil

# shutil.make_archive(base_name, format, root_dir)
shutil.make_archive('data/backup', 'tar', 'data/')
This copies everything in data/, creates an archive called backup.tar in the file system, and returns its name. To extract the archive, call .unpack_archive():

shutil.unpack_archive('backup.tar', 'extract_dir/')

Calling .unpack_archive() and passing in the archive name and a destination directory extracts the contents of backup.tar into extract_dir/. ZIP archives can be created and extracted in the same way.
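For example, here is a minimal sketch of the same two calls using the zip format; the archive is written to the current directory so the paths stay consistent:

import shutil

# Create backup.zip from everything under data/
shutil.make_archive('backup', 'zip', 'data/')

# Unpack it into extract_dir/ (the format is inferred from the .zip extension)
shutil.unpack_archive('backup.zip', 'extract_dir/')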
Python supports reading data from multiple input streams or from a list of files through the fileinput module. This module lets you loop over the contents of one or more text files quickly and easily. Here is the typical way fileinput is used:
import fileinput

for line in fileinput.input():
    process(line)
By default, fileinput gets its input from the command-line arguments passed to sys.argv.

Let's use fileinput to build a crude version of the common UNIX utility cat. The cat tool reads files sequentially and writes them to standard output. When given more than one file in its command-line arguments, cat concatenates the text files and displays the result in the terminal:
# File: fileinput-example.py
import fileinput
import sys

files = fileinput.input()
for line in files:
    if fileinput.isfirstline():
        print(f'\n--- Reading {fileinput.filename()} ---')
    print(' -> ' + line, end='')
print()
With two text files in the current directory, running this command produces the following output:
$ python3 fileinput-example.py bacon.txt cupcake.txt

--- Reading bacon.txt ---
 -> Spicy jalapeno bacon ipsum dolor amet in in aute est qui enim aliquip,
 -> irure cillum drumstick elit.
 -> Doner jowl shank ea exercitation landjaeger incididunt ut porchetta.
 -> Tenderloin bacon aliquip cupidatat chicken chuck quis anim et swine.
 -> Tri-tip doner kevin cillum ham veniam cow hamburger.
 -> Turkey pork loin cupidatat filet mignon capicola brisket cupim ad in.
 -> Ball tip dolor do magna laboris nisi pancetta nostrud doner.

--- Reading cupcake.txt ---
 -> Cupcake ipsum dolor sit amet candy I love cheesecake fruitcake.
 -> Topping muffin cotton candy.
 -> Gummies macaroon jujubes jelly beans marzipan.
fileinput also lets you retrieve more information about each line, such as whether or not it is the first line of a file (.isfirstline()), the line number (.lineno()), and the filename (.filename()).
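As a small sketch, the line-number helpers can be used like this; the file names are taken from the command line, as in the cat example above:

import fileinput

for line in fileinput.input():
    # lineno() counts across all files, filelineno() restarts for every file
    prefix = f'{fileinput.filename()}:{fileinput.filelineno()} (overall {fileinput.lineno()})'
    print(prefix, line, end='')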
You now know how to use Python to perform the most common operations on files and groups of files. You have learned how to use the different built-in modules to read, find, and manipulate them.