The first part of this article is reproduced from:
"Python is single-threaded, so does multithreading make sense?"
The second half is my own.
I often meet people who say that Python is single-threaded and that there is no point in using multithreading when writing code. Today I would like to share my understanding of single-threading, multithreading, and multiprocessing in Python.
First of all, the claim that Python is single-threaded is not true.
Here is the key concept: Python's global interpreter lock (GIL).
To be clear, the GIL is not a feature of the Python language. It is a concept introduced by one implementation of the Python interpreter, CPython. Compare C++: it is a language (syntax) standard, but different compilers, such as GCC, Intel C++, and Visual C++, can compile it into executable code. Python is the same: the same code can be run by different Python execution environments such as CPython, PyPy, and Jython, and Jython, for example, has no GIL. But because CPython is the default Python execution environment in most settings, many people equate CPython with Python and take it for granted that the GIL is a flaw of the Python language. So let's be clear: the GIL is not a feature of Python, and Python does not have to depend on the GIL.
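Since the GIL belongs to the interpreter rather than the language, it helps to know which implementation is actually running your code. A minimal sketch using only the standard library (the helper name `describe_interpreter` is made up for illustration):

```python
import platform
import sys


def describe_interpreter():
    """Return the name of the Python implementation running this code."""
    # Typical values include 'CPython', 'PyPy', and 'Jython'.
    return platform.python_implementation()


if __name__ == "__main__":
    print("Implementation:", describe_interpreter())
    print("sys.implementation.name:", sys.implementation.name)
```

If this prints CPython, the timings in the rest of this article should apply to your setup in spirit, though the exact numbers will differ from machine to machine.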
Consider the following CPU-bound benchmark:
import threading
import time

def test1():
    # Pure CPU work: nothing here blocks on I/O.
    for i in range(100000000):
        a = 100 - i

def test2():
    # Run test1 in four threads and wait for all of them to finish.
    threads = [threading.Thread(target=test1) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

if __name__ == '__main__':
    t1 = time.time()
    print('Start time of process one:', time.time())
    # The times printed below are cumulative since t1.
    test1()
    print('Single thread once:', time.time() - t1)  # Single thread once: 3.872997760772705
    test1()
    print('Single thread twice:', time.time() - t1)  # Single thread twice: 7.738230466842651
    test1()
    print('Single thread three times:', time.time() - t1)  # Single thread three times: 11.609771013259888
    test1()
    print('Single thread four times:', time.time() - t1)  # Single thread four times: 15.493367433547974
    t2 = time.time()
    test2()
    print('Process 1 multithreading four times:', time.time() - t2)  # Multithreading four times: 15.55045747756958
    print('End time of process one:', time.time())
After running this code, you will find that 4 threads running at the same time take almost exactly as long as one thread calling the function four times in a row (15.55 seconds versus 15.49 seconds). Python multithreading really does nothing to improve efficiency here. As mentioned above, because of the GIL, it is only as efficient as single-threaded processing.
In essence, the 4 threads execute alternately: one runs for a while, then another, then another… Only one thread executes Python code at any given moment, so together they get about one core's worth of work done.
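How often the threads take turns can be inspected. In CPython 3.2 and later, `sys.getswitchinterval()` returns the interval after which the interpreter asks the running thread to give up the GIL. A small sketch, assuming a CPython interpreter:

```python
import sys

# Interval (in seconds) after which the interpreter requests that the
# currently running thread release the GIL so another thread can run.
interval = sys.getswitchinterval()
print("GIL switch interval:", interval, "seconds")
```

This value only controls how often threads alternate. It does not let two threads run Python code at the same time.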
Because of the GIL, Python multithreading is better suited to I/O-intensive applications (the GIL is released during I/O, which allows more concurrency) than to compute-intensive ones. For the latter, better parallelism requires multiple processes, so that the other CPU cores can be used.
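To see the I/O case concretely, here is a minimal sketch in which `time.sleep` stands in for a blocking I/O call (sleeping releases the GIL, as real blocking I/O does). The function names and the 0.2-second delay are assumptions chosen for illustration:

```python
import threading
import time


def fake_io(seconds=0.2):
    """Stand-in for a blocking I/O call; time.sleep releases the GIL."""
    time.sleep(seconds)


def timed_sequential(n=4):
    """Run n fake I/O calls one after another and return the elapsed time."""
    start = time.perf_counter()
    for _ in range(n):
        fake_io()
    return time.perf_counter() - start


def timed_threaded(n=4):
    """Run n fake I/O calls in n threads and return the elapsed time."""
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print("sequential:", timed_sequential())
    print("threaded:  ", timed_threaded())
```

The threaded version should finish in roughly the time of a single call, because the threads wait concurrently instead of competing for the GIL.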
How do we start multiple processes, and how do we verify that they are automatically distributed across multiple CPU cores?
We modify the code above as follows and make 4 copies, named:
threadTest1.py, threadTest2.py, threadTest3.py, threadTest4.py.
The code is as follows:
import threading
import time

def test1():
    # Pure CPU work: nothing here blocks on I/O.
    for i in range(100000000):
        a = 100 - i

def test2():
    # Run test1 in four threads and wait for all of them to finish.
    threads = [threading.Thread(target=test1) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

if __name__ == '__main__':
    t1 = time.time()
    print('Start time of process one:', time.time())
    test2()
    print('Process 1 multithreading four times:', time.time() - t1)
    print('End time of process one:', time.time())
Then start the 4 programs in quick succession and look at the run times. My old computer has a 4-core CPU, and when the 4 programs run at the same time, all four cores are busy, which means the 4 processes are automatically assigned to the 4 cores. So we have achieved parallelism.
The results are as follows. A single program run on its own takes 32.36 seconds. With all 4 programs running, the total is 71.65 seconds. Subtracting about 6 seconds of delay from starting them by hand, each of the 4 programs, running almost simultaneously, takes roughly 64 seconds. In other words, running 4 programs together takes about 2 times as long as running a single program, not the 4 times it would take to run them one after another. Overall throughput has improved. Why only twice? That is explained later.
Start time of process one: 1613957587.375495
Process 1 multithreading four times: 64.36468148231506
End time of process one: 1613957651.7401764
Start time of process four: 1613957595.347951
Process 4 multithreading four times: 63.67764210700989
End time of process four: 1613957659.025593
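Starting the copies by hand adds a few seconds of skew between them. As an alternative (not what was done for the measurements above), the copies can be launched from one script with `subprocess`. This is a sketch under assumptions: `WORKLOAD` is a much smaller stand-in for the real script, and `run_copies` is a hypothetical helper name:

```python
import subprocess
import sys
import time

# Hypothetical stand-in for one copy of the script: a small CPU-bound loop.
WORKLOAD = "for i in range(2_000_000): a = 100 - i"


def run_copies(n=4):
    """Start n separate interpreter processes at once and wait for all of them."""
    start = time.perf_counter()
    procs = [subprocess.Popen([sys.executable, "-c", WORKLOAD]) for _ in range(n)]
    return_codes = [p.wait() for p in procs]
    return return_codes, time.perf_counter() - start


if __name__ == "__main__":
    codes, elapsed = run_copies()
    print("return codes:", codes)
    print("wall time:", elapsed)
```

To launch the actual files instead, the command list could be `[sys.executable, "threadTest1.py"]` and so on, one per copy.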
For comparison, let a single program call test2() four times in a for loop and look at the CPU usage. Only a small change is needed:
if __name__ == '__main__':
    t1 = time.time()
    print('Start time of process one:', time.time())
    for i in range(4):
        test2()
    print('Process 1 multithreading runs four times:', time.time() - t1)
    print('End time of process one:', time.time())
CPU usage during the run:
You can see that CPU usage is only 25%, so only one CPU core is in use. This confirms that our 4 programs correspond to 4 processes, which are automatically assigned to the 4 cores, one program per core.
Results:
Start time of process one: 1613959221.2839491
Process 1 multithreading runs four times: 128.84336948394775
End time of process one: 1613959350.1273186
This is basically 4 times 32 seconds, because a single run of test2() takes about 32 seconds, as follows:
Start time of process one: 1613959931.0745466
Process 1 multithreading runs four times: 31.243787050247192
End time of process one: 1613959962.3183336
This is in line with our expectations.
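Putting the measured numbers side by side makes the gain explicit. The sketch below only does arithmetic on timings already reported in this article. The variable names are my own:

```python
# Timings copied from the runs reported in this article (seconds).
sequential_four_runs = 128.84336948394775  # one program calling test2() four times
parallel_per_process = 64.36468148231506   # process one, with four programs started together

# How much faster the same amount of work finishes when split across processes.
speedup = sequential_four_runs / parallel_per_process
# Speedup relative to the 4 logical cores that were busy.
efficiency = speedup / 4

print(f"speedup: {speedup:.2f}x, efficiency per logical core: {efficiency:.0%}")
```

The speedup comes out at about 2x, or roughly 50% efficiency across 4 logical cores, which matches the "twice, not four times" observation above.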
So the conclusion:
For scenarios with heavy I/O and little computation, multithreading works fine. For compute-intensive scenarios, multiple processes can make full use of multiple CPU cores in parallel. The simplest approach is to partition the data sensibly, make several copies of the program, and run them together.
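Copying the script is one way to get several processes. The standard library can also do the partitioning and process management from a single file. The following is a minimal sketch, not the method used for the measurements in this article: `count_chunk` and `split_range` are hypothetical helpers, and the workload is far smaller than the benchmark above.

```python
from concurrent.futures import ProcessPoolExecutor


def count_chunk(bounds):
    """CPU-bound work on one slice of the range; returns a partial sum."""
    start, stop = bounds
    total = 0
    for i in range(start, stop):
        total += 100 - i
    return total


def split_range(n, parts):
    """Split range(n) into `parts` contiguous (start, stop) slices."""
    step, extra = divmod(n, parts)
    bounds, start = [], 0
    for k in range(parts):
        # The first `extra` slices get one additional element each.
        stop = start + step + (1 if k < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


if __name__ == "__main__":
    chunks = split_range(1_000_000, 4)
    # Each chunk is handled by a separate worker process with its own GIL.
    with ProcessPoolExecutor(max_workers=4) as pool:
        partial_sums = list(pool.map(count_chunk, chunks))
    print("total:", sum(partial_sums))
```

The `if __name__ == "__main__":` guard matters here: on platforms that start workers by re-importing the main module, leaving it out would make each worker try to create its own pool.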
In addition, I suspect that performance actually degrades when the CPU is at 100%, so let's run just 3 processes and see how long a single process takes.
Start time of process two: 1613960078.199962
Process 2 multithreading four times: 50.52488970756531
End time of process two: 1613960128.7248516
CPU usage:
As you can see, a single process now needs only 50.52 seconds. In theory there should still be some contention for CPU resources, because my CPU has 2 physical cores presented as 4 virtual cores. What about running 2 processes?
CPU usage:
The results:
Start time of process one: 1613960226.8174622
Process 1 multithreading runs four times: 40.730329751968384
End time of process one: 1613960267.547792
This is close to the 32 seconds of a single process running alone, so the 2 physical cores are fully used and CPU contention is lower.
So the conclusion: keep the number of processes equal to the number of physical CPU cores, and each process will run relatively efficiently.
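One practical catch: `os.cpu_count()` reports logical cores, not physical ones, and the standard library has no portable call for the physical count. The sketch below therefore assumes two hardware threads per physical core, as on the machine used here. Both `suggested_process_count` and that default are assumptions for illustration:

```python
import os


def suggested_process_count(logical_cores, threads_per_core=2):
    """Estimate physical cores from logical ones; never return less than 1."""
    # threads_per_core=2 is an assumption; it is not detected automatically.
    return max(1, logical_cores // threads_per_core)


if __name__ == "__main__":
    logical = os.cpu_count() or 1  # os.cpu_count() may return None
    print("logical cores:", logical)
    print("suggested processes:", suggested_process_count(logical))
```

On a machine with 4 logical cores this suggests 2 processes. If the third-party `psutil` package is available, `psutil.cpu_count(logical=False)` can report the physical core count directly instead of relying on this guess.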