IO multiplexing: to explain the term, let's first understand the concept of multiplexing. Multiplexing means sharing, but that understanding is still a bit abstract, so consider how multiplexing is used in the field of communication. There, to make full use of the physical medium of a network link, time-division multiplexing (TDM) or frequency-division multiplexing (FDM) is often used to carry multiple signals over the same link. This gives us the basic meaning of multiplexing: sharing some "medium" to do as many things of the same nature as possible. So what is the "medium" that IO multiplexing reuses? To answer that, let's first look at the server programming model.
In the classic model, the server spawns a process to serve each client request: every time a request arrives, a new process is created for it. But processes cannot be spawned without limit, so to cope with a large number of clients, IO multiplexing was introduced: a single process serves multiple client requests at the same time. In other words, the "medium" that IO multiplexing reuses is a process (more precisely, select and poll, since the process relies on calling select or poll to achieve this). One process (via select or poll) is reused to serve many IOs. Although the clients' IO requests arrive concurrently, the data each IO needs to read or write is usually not yet ready, so a single function (select or poll) can listen on the state of the data all the IOs are waiting for; as soon as some IO has data ready to read or write, the process turns to serve that IO.
Having understood IO multiplexing, let's look at the differences and connections between the three APIs that implement it: select, poll, and epoll.
select, poll, and epoll are all I/O multiplexing mechanisms. I/O multiplexing is a mechanism for monitoring multiple descriptors: once some descriptor becomes ready (generally read-ready or write-ready), the application is notified to perform the corresponding read or write. Note that select, poll, and epoll are all synchronous I/O in essence, because the application is still responsible for doing the read or write itself once the readiness event fires; that is, the read/write step blocks the process. With asynchronous I/O, by contrast, the application does not perform the read or write itself: the asynchronous I/O implementation is responsible for copying the data from the kernel into user space. The prototypes of the three are shown below:
int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);
int poll(struct pollfd *fds, nfds_t nfds, int timeout);
int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);
1. select: The first parameter, nfds, is the highest descriptor value in any of the fd_sets plus 1. An fd_set is a bit set whose size is limited to __FD_SETSIZE (1024); each bit indicates whether the corresponding descriptor should be checked. The second, third, and fourth parameters are the bit sets of file descriptors to watch for read, write, and error events respectively. These parameters are both input and output: the kernel may modify them to indicate which descriptors had events of interest, so the fd_sets must be re-initialized before every call to select. The timeout parameter specifies the timeout; this structure is also modified by the kernel, and on return it holds the remaining time.
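To make the parameter handling concrete, below is a minimal sketch of a select-based wait loop in C (an illustration, not from the original; listen_fd stands for an assumed, already-created listening socket). Note how the fd_set and the timeout are rebuilt on every iteration, precisely because the kernel overwrites both:

#include <stdio.h>
#include <sys/select.h>

/* listen_fd is an assumed, already-listening socket descriptor. */
void select_loop(int listen_fd)
{
    for (;;) {
        fd_set readfds;
        struct timeval tv;
        /* re-initialize before EVERY call: the kernel modifies both */
        FD_ZERO(&readfds);
        FD_SET(listen_fd, &readfds);
        tv.tv_sec = 5;
        tv.tv_usec = 0;
        /* nfds = highest descriptor in any set, plus 1 */
        int ready = select(listen_fd + 1, &readfds, NULL, NULL, &tv);
        if (ready < 0) { perror("select"); break; }
        if (ready == 0) continue;               /* timed out */
        if (FD_ISSET(listen_fd, &readfds)) {
            /* listen_fd is readable: accept() will not block here */
        }
    }
}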
Inside the kernel, a select call proceeds as follows:
(1) Copy the fd_sets from user space into kernel space using copy_from_user.
(2) Register the callback function __pollwait.
(3) Traverse all the fds, calling each one's corresponding poll method (for a socket this poll method is sock_poll, and sock_poll calls tcp_poll, udp_poll, or datagram_poll depending on the socket type).
(4) Taking tcp_poll as an example, its core implementation is __pollwait, the callback function registered above.
(5) The main job of __pollwait is to hang current (the current process) on the device's wait queue. Different devices have different wait queues; for tcp_poll the wait queue is sk->sk_sleep (note that hanging a process on a wait queue does not put it to sleep). After the device receives a message (network device) or fills in file data (disk device), it wakes up the processes sleeping on the wait queue, and at that point current is woken.
(6) The poll method returns a mask describing whether the read and write operations are ready, and the fd_set is assigned according to this mask.
(7) If, after traversing all the fds, no readable/writable mask has been returned, schedule_timeout is called to put the process that called select (i.e. current) to sleep. When the device driver finds its own resources readable or writable, it wakes up the processes sleeping on the wait queue. If nothing wakes it within the time limit (specified to schedule_timeout), the process that called select is woken anyway, gets the CPU, and traverses the fds once more to check whether any fd has become ready.
(8) Copy the fd_sets from kernel space back to user space.
To sum up, select has several shortcomings:
(1) Every call to select copies the fd set from user space into kernel space, which is expensive when there are many fds.
(2) Every call to select also traverses all the fds passed in inside the kernel, which is likewise expensive when there are many fds.
(3) select supports too few file descriptors; the default limit is 1024.
2. poll: unlike select, poll passes the events of interest to the kernel through an array of pollfd structures, so there is no limit on the number of descriptors. The events and revents fields of a pollfd mark the events of interest and the events that actually occurred, respectively, so the pollfd array only needs to be initialized once.
poll's implementation mechanism is similar to select's; it corresponds to sys_poll in the kernel. The difference is that poll passes the kernel a pollfd array and then polls each descriptor in that array, which is more efficient than handling an fd_set. After poll returns, the revents value of each element in the pollfd array must be checked to see whether its event occurred.
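For comparison, a minimal poll loop in C, under the same assumption of a pre-existing listen_fd. Because the kernel reports results in revents and leaves events untouched, the array is built once, outside the loop:

#include <poll.h>
#include <stdio.h>

/* listen_fd is an assumed, already-listening socket descriptor. */
void poll_loop(int listen_fd)
{
    struct pollfd fds[1];
    fds[0].fd = listen_fd;          /* initialized once, reused every call */
    fds[0].events = POLLIN;
    for (;;) {
        int ready = poll(fds, 1, 5000);         /* timeout in ms */
        if (ready < 0) { perror("poll"); break; }
        if (ready == 0) continue;               /* timed out */
        if (fds[0].revents & POLLIN) {
            /* listen_fd is readable: accept() will not block here */
        }
    }
}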
3. Not until Linux 2.6 was there an implementation directly supported by the kernel: epoll, widely recognized as the best-performing multiplexed I/O readiness notification method on Linux 2.6.
epoll supports both level trigger and edge trigger (Edge Triggered: the kernel tells the process only which file descriptors have just become ready, and it says so only once; if we take no action, it will not tell us again). In theory edge-triggered performance is higher, but the code is considerably more complex to implement correctly.
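Part of that complexity: under edge trigger a readable descriptor must be drained until read() returns EAGAIN, because the leftover data will never generate another notification. A small sketch (conn_fd stands for an assumed non-blocking connected socket that an edge-triggered epoll just reported readable):

#include <errno.h>
#include <unistd.h>

/* conn_fd is an assumed non-blocking socket reported readable by an
 * edge-triggered epoll. */
void drain_edge_triggered(int conn_fd)
{
    char buf[4096];
    for (;;) {
        ssize_t n = read(conn_fd, buf, sizeof(buf));
        if (n > 0) continue;      /* got data; keep reading until drained */
        if (n == 0) break;        /* peer closed the connection */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            break;                /* drained: only NEW data triggers again */
        break;                    /* some other error */
    }
}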
epoll likewise reports only the file descriptors that are ready. When we call epoll_wait() to fetch them, the return value is not the descriptors themselves but the number of ready descriptors; we simply take that many entries from the array that epoll fills in. Because the kernel maintains the ready list internally, each epoll_wait call copies out only the ready events rather than scanning and copying the entire descriptor set, which eliminates most of the per-call copying cost. Another essential improvement is that epoll uses event-based readiness notification. With select/poll, the kernel scans all the monitored file descriptors only after the process calls the method, whereas epoll registers each file descriptor in advance through epoll_ctl(). Once a file descriptor becomes ready, the kernel uses a callback-like mechanism to activate it quickly, and the process is notified when it calls epoll_wait().
Since epoll is an improvement on select and poll, it should avoid the three shortcomings above. So how does epoll solve them? Before that, let's look at the calling interfaces: select and poll each provide a single function (select or poll), while epoll provides three: epoll_create, epoll_ctl, and epoll_wait. epoll_create creates an epoll handle; epoll_ctl registers the type of event to listen for; epoll_wait waits for events to occur.
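Putting the three calls together, here is a minimal epoll skeleton in C, again a sketch under the listen_fd assumption, with error handling abbreviated:

#include <sys/epoll.h>

#define MAX_EVENTS 64

/* listen_fd is an assumed, already-listening socket descriptor. */
void epoll_loop(int listen_fd)
{
    int epfd = epoll_create(1);   /* 1. create the epoll handle (the size
                                     argument is only a hint on modern
                                     kernels) */
    struct epoll_event ev;
    ev.events = EPOLLIN;          /* add EPOLLET here for edge trigger */
    ev.data.fd = listen_fd;
    /* 2. register once: the fd is copied into the kernel here,
       not on every wait */
    epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);

    struct epoll_event events[MAX_EVENTS];
    for (;;) {
        /* 3. wait: returns only the NUMBER of ready descriptors,
           filled into events[0..n-1] */
        int n = epoll_wait(epfd, events, MAX_EVENTS, 5000);
        for (int i = 0; i < n; i++) {
            if (events[i].data.fd == listen_fd) {
                /* accept() the new connection, then EPOLL_CTL_ADD it */
            } else {
                /* serve the ready connection */
            }
        }
    }
}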
For the first shortcoming, epoll's solution is in the epoll_ctl function. Each time a new event is registered into the epoll handle (with EPOLL_CTL_ADD specified in epoll_ctl), the fd is copied into the kernel then and there, rather than being copied again on every epoll_wait. epoll thus guarantees that each fd is copied only once over its whole lifetime.
For the second shortcoming, epoll does not hang current on every fd's device wait queue in turn on each call, as select and poll do. It hangs current only once, during epoll_ctl (this one time is unavoidable), and specifies a callback function for each fd. When a device becomes ready and wakes up the waiters on its wait queue, this callback function is invoked, and it adds the ready fd to a ready list. The job of epoll_wait is then just to check whether there are ready fds on that ready list (it uses schedule_timeout() to sleep for a while and check again, similar to step (7) in the select implementation).
For the third shortcoming, epoll has no such limit. The upper bound on the number of FDs it supports is the maximum number of files that can be opened, which is generally far larger than 2048; on a machine with 1GB of memory it is around 100,000. The exact figure can be checked with cat /proc/sys/fs/file-max; in general it depends heavily on how much memory the system has.
Summary:
(1) select and poll have to poll the entire fd set repeatedly until a device becomes ready, possibly sleeping and waking several times in between. epoll's epoll_wait also polls, repeatedly checking the ready list, and may likewise alternate between sleeping and waking; the difference is that when a device becomes ready, epoll calls the callback function, which puts the ready fd on the ready list and wakes the process sleeping in epoll_wait. So although both keep sleeping and waking, select and poll traverse the whole fd set while "awake", whereas epoll while "awake" only checks whether the ready list is empty. This saves a great deal of CPU time, and it is the performance gain brought by the callback mechanism.
(2) select and poll copy the fd set from user space into kernel space once on every call, and hang current on the device wait queues once per call; epoll copies only once, and hangs current on a wait queue only once (at the start of epoll_wait; note that this wait queue is not a device wait queue but one defined internally by epoll). This also saves a lot of overhead.
These three IO multiplexing models are supported differently on different platforms; epoll, for instance, is not supported on Windows. Fortunately we have the selectors module, which by default picks the most suitable mechanism for the current platform:
# Server side
from socket import *
import selectors

sel=selectors.DefaultSelector()

def accept(server_fileobj,mask):
    conn,addr=server_fileobj.accept()
    sel.register(conn,selectors.EVENT_READ,read)

def read(conn,mask):
    try:
        data=conn.recv(1024)
        if not data:
            print('closing',conn)
            sel.unregister(conn)
            conn.close()
            return
        conn.send(data.upper()+b'_SB')
    except Exception:
        print('closing', conn)
        sel.unregister(conn)
        conn.close()

server_fileobj=socket(AF_INET,SOCK_STREAM)
server_fileobj.setsockopt(SOL_SOCKET,SO_REUSEADDR,1)
server_fileobj.bind(('127.0.0.1',8088))
server_fileobj.listen(5)
server_fileobj.setblocking(False)  # set the socket to non-blocking
# equivalent to appending the file handle server_fileobj to a select read
# list, with accept bound to it as the callback function
sel.register(server_fileobj,selectors.EVENT_READ,accept)

while True:
    events=sel.select()  # check all registered fileobjs: has any finished waiting for data?
    for sel_obj,mask in events:
        callback=sel_obj.data  # callback = accept (or read)
        callback(sel_obj.fileobj,mask)  # e.g. accept(server_fileobj,1)
# client
from socket import *
c=socket(AF_INET,SOCK_STREAM)
c.connect(('127.0.0.1',8088))
while True:
msg=input('>>: ')
if not msg:continue
c.send(msg.encode('utf-8'))
data=c.recv(1024)
print(data.decode('utf-8'))
For a concurrent FTP server built on the selectors module, see the reference below.
Reference: https://pan.baidu.com/s/1qYPrHCg (password: 9is4)