The previous article, Python-gRPC practice (1)--gRPC brief introduction, briefly described how gRPC adopts HTTP2 as its transport protocol and how gRPC transmits data over HTTP2. This article focuses on the serialization protocol gRPC uses: Protocol Buffer.
Protobuf (Google Protocol Buffers) is a cross-language, cross-platform, extensible data transfer protocol developed by Google for serializing structured data. It is now widely used for data transmission between servers and clients. To use gRPC well in a project, you first need a clear understanding of how Protocol Buffer is used and of its syntax.
NOTE: Like JSON, Protobuf can also be used on its own and is not limited to the gRPC scenario. We can build our own data serialization/deserialization on top of Protobuf.
In its early days gRPC supported only Protobuf. Recent versions have started to support JSON as well, but few people use it. Why did gRPC choose Protobuf in the first place? One important reason is that Protobuf is also Google's own product, so when gRPC gains new features Protobuf can iterate in step. Protobuf has now reached version 3, but only versions 2 and 3 are publicly available, because the first version was used only inside Google. The more important reason, however, is that in common scenarios Protobuf is more efficient than the JSON we normally use. Why is Protobuf so efficient? There is no free lunch, and every gain has a cost. Before looking at Protobuf, consider a piece of JSON data:
{
  "project": "Test",
  "timestamp": 1600000000,
  "status": true,
  "data": [
    {
      "demo_key": "fake_value"
    },
    {
      "demo_key": "fake_value"
    },
    {
      "demo_key": "fake_value"
    }
  ]
}
This JSON data is a piece of text, which is the first source of JSON's inefficiency: inefficient encoding. For example, the value true of the field status needs only 1 byte in memory but takes 4 bytes in this data. Likewise, the value of the field timestamp is an int, which takes little space in memory, but JSON renders it as text (10 characters here), which takes more. On the other hand, we can see at a glance what this data contains. That is an advantage of JSON, but it brings another disadvantage: redundant information. For example, the field data is an array whose elements all share the same structure, so the same field name is transmitted once per element, n times in total.
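The byte counts above can be checked with a short sketch. It uses only the Python standard library; the struct formats (a 1-byte bool and a 4-byte little-endian int) are an assumption chosen for comparison and are not how Protobuf itself encodes these values.

```python
import json
import struct

payload = {
    "project": "Test",
    "timestamp": 1600000000,
    "status": True,
    "data": [{"demo_key": "fake_value"}] * 3,
}

# JSON renders every value as text.
status_as_text = json.dumps(payload["status"])        # 'true'
timestamp_as_text = json.dumps(payload["timestamp"])  # '1600000000'

# The same values packed as fixed-size binary fields (for comparison only).
status_as_binary = struct.pack("<?", payload["status"])        # 1-byte bool
timestamp_as_binary = struct.pack("<i", payload["timestamp"])  # 4-byte int

print(len(status_as_text), len(status_as_binary))        # 4 1
print(len(timestamp_as_text), len(timestamp_as_binary))  # 10 4

# The field name "demo_key" is repeated once per array element.
encoded = json.dumps(payload)
print(encoded.count('"demo_key"'))  # 3
```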
Protobuf addresses both problems. First, it introduces optimized encoding schemes to solve the encoding inefficiency. For example, it uses varints to encode and decode numbers, which saves space for numbers and, because encoding and decoding use bit operations, is also very fast. The details are covered in the article Detailed explanation of the varint coding principle. The other improvement is to drop field names and use field numbers instead. Only the number is transmitted, which removes the redundancy, but both sides then need a codebook that maps numbers to names so that the real field name can be recovered from the field number, much like communicating in Morse code. In Protobuf the proto file is that codebook: it records the mapping between fields and numbers, as well as the service and method a request belongs to. Take the packet capture from the previous article, Python-gRPC practice (1)--gRPC brief introduction, as an example. The proto file corresponding to that capture is:
syntax = "proto3";
package user;
import "google/protobuf/empty.proto";

// delete user
message DeleteUserRequest {
  string uid = 1;
}

service User {
  rpc delete_user(DeleteUserRequest) returns (google.protobuf.Empty);
}
In the captured request, Field is 1 and the value is 999. After receiving the request, the receiver looks it up in the proto file: from the URL it learns that the request targets the service User and the rpc delete_user, so the request message must be DeleteUserRequest, and therefore Field 1 is actually the field uid.
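To make the idea of "only the number is transmitted" concrete, here is a minimal sketch of how a message with uid set to "999" could be laid out on the wire. It is written by hand from the publicly documented wire format and is not the library's own encoder; the function names encode_varint and encode_string_field are made up for this example.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F          # take the lowest 7 bits
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Encode one string field as tag + length + UTF-8 bytes."""
    wire_type_len = 2  # wire type for length-delimited values such as strings
    data = text.encode("utf-8")
    tag = encode_varint((field_number << 3) | wire_type_len)
    return tag + encode_varint(len(data)) + data


message = encode_string_field(1, "999")
print(message)                   # b'\n\x03999'
print(encode_varint(300).hex())  # ac02
```

The field name uid never appears in the output: the first byte carries the field number 1 and the wire type, the second byte is the length, and the rest is the value. The receiver needs the proto file to turn field 1 back into uid.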
The encoding principles of Protobuf are worth reading about, and there is plenty of material online, so this article skips them and goes straight to how to use Protobuf (in truth, I do not know much about Protobuf's encoding yet).
As the example above shows, gRPC needs the proto file at runtime to recover the real field data. gRPC is also multi-language, so how does the gRPC implementation for each language use the proto file to interpret the data?
When we write a project, we usually generate an OpenAPI document from the interface code, and tools such as Swagger then read that document and render an API reference. A proto file plays a role similar to an OpenAPI document, except that it is not generated from code. Developers write it by hand, then use different tools to generate code in different languages from it and put that code into their projects. So to use gRPC well you need to know how to write proto files (in practice, the calling code used with gRPC is generated from the Protobuf file).
Before introducing the syntax, let's look at what a proto file contains, starting with the proto file shown above:
syntax = "proto3";
package user;
import "google/protobuf/empty.proto";

// delete user
message DeleteUserRequest {
  string uid = 1;
}

service User {
  rpc delete_user(DeleteUserRequest) returns (google.protobuf.Empty);
}
A standard proto file like this sample can be divided into three parts. The first part is the first three lines, the declaration area of the proto file. The first line states that the file uses proto3 syntax (unless otherwise noted, the syntax described in this article is proto3). The second line states that the package name of the file is user, which lets other files import the definitions in this file. The third line imports the file empty.proto, so that this file can use what empty.proto defines.
The second part is the message area: the delete user comment and the message block that follows it. It defines a message named DeleteUserRequest, which has a field named uid of type string with field number 1. In actual development most changes happen in this part, and there are many points to watch out for.
The third part is the service definition area: the service block at the end of the file. It defines a service named User with a method named delete_user, which accepts a DeleteUserRequest message as its request and returns an Empty message as its response. This part can be understood simply as defining a class and some methods on it, where the methods have only signatures and no implementation.
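The "class with signatures but no implementation" analogy can be sketched in plain Python. Everything below is a hand-written stand-in for illustration: the class names UserServicer and InMemoryUserService, and the simplified DeleteUserRequest and Empty classes, are assumptions and not the code that the gRPC tooling actually generates.

```python
class DeleteUserRequest:
    """Hypothetical stand-in for the message DeleteUserRequest."""

    def __init__(self, uid: str = "") -> None:
        self.uid = uid


class Empty:
    """Hypothetical stand-in for google.protobuf.Empty."""


class UserServicer:
    """Mirrors `service User`: the method is declared but has no behaviour."""

    def delete_user(self, request: DeleteUserRequest) -> Empty:
        raise NotImplementedError("delete_user is declared but not implemented")


class InMemoryUserService(UserServicer):
    """The project supplies the real behaviour by overriding the method."""

    def __init__(self) -> None:
        self.users = {"999": "demo user"}

    def delete_user(self, request: DeleteUserRequest) -> Empty:
        self.users.pop(request.uid, None)
        return Empty()


service = InMemoryUserService()
response = service.delete_user(DeleteUserRequest(uid="999"))
print(type(response).__name__, service.users)  # Empty {}
```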
With the structure of the proto file understood, we can turn to Protobuf syntax.
When writing a message, the most important thing is the field number. As described earlier, Protobuf serialization is translated by field number, so each field number must correspond to exactly one field. In general, field numbers should start at 1 and increase, as in the following message:
message DemoRequest {
  string uid = 1;
  string mobile = 2;
  int32 age = 3;
}
Its field numbers increase step by step. When fields are added later, their numbers should also be assigned incrementally, and a field number that was ever used must never be reused, even when a field is redesigned. For example, change the message above to:
// Usually, fields that have been used are not deleted; this is only a demonstration
message DemoRequest {
  string uid = 1;
  string mobile = 2;
  int32 brithday = 4; // Uniformly use timestamp to represent date
}
Although the age field has been replaced by brithday in the changed message, brithday takes the next number, 4, instead of reusing 3. This prevents parsing errors when an old client has not been updated along with the server.
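The effect on an old client can be illustrated with a simplified model. The sketch below represents a message as a list of (field number, value) pairs instead of real wire bytes, and the decode function simply skips numbers it does not know; both are assumptions made for illustration, not how the Protobuf library is implemented.

```python
# The schema an old client was built with: field number -> field name.
OLD_SCHEMA = {1: "uid", 2: "mobile", 3: "age"}


def decode(fields, schema):
    """Translate (field_number, value) pairs with the receiver's schema.

    Field numbers the receiver does not know are skipped in this sketch.
    """
    return {schema[number]: value for number, value in fields if number in schema}


# New server sends the timestamp under the new number 4.
sent_with_new_number = [(1, "999"), (2, "13800000000"), (4, 1600000000)]
# What would happen if the server had reused number 3 instead.
sent_with_reused_number = [(1, "999"), (2, "13800000000"), (3, 1600000000)]

print(decode(sent_with_new_number, OLD_SCHEMA))
# {'uid': '999', 'mobile': '13800000000'}
print(decode(sent_with_reused_number, OLD_SCHEMA))
# {'uid': '999', 'mobile': '13800000000', 'age': 1600000000}
```

With a new number the old client just does not see the new field. With a reused number it silently reads a timestamp as an age.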
Assigning field numbers incrementally lets developers see which numbers have already been used, but it relies on team discipline to work without problems. Protobuf therefore provides the reserved keyword, which blocks field numbers that must not be used again. For example:
message DemoRequest {
  string uid = 1;
  string mobile = 2;
  reserved 3;
  int32 brithday = 4; // Uniformly use timestamp to represent date
  // reserved can also block several field numbers at once, separated by `,`;
  // use `xx to xx` to block a continuous range of field numbers.
  reserved 5, 6, 10 to 15;
}
With this definition, later fields cannot use field number 3. If one does, the Protobuf compiler reports an error, which stops the problem at the source.
NOTE: Field numbers should start at 1 and increase because, when Protobuf encodes a message into binary, a field number from 1 to 15 (together with the field's wire type) takes 1 byte, while 16 to 2047 takes 2 bytes. Preferring 1-15 therefore reduces the amount of data transmitted. If a message has many fields from the start, give the numbers 1-15 to the most frequently used fields. In addition, 19000 to 19999 are reserved for the Protocol Buffers implementation and cannot be used when defining a message; if they are used, the Protobuf compiler reports an error.
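The 1-byte and 2-byte boundaries follow from how the field number is packed into a varint together with a 3-bit wire type. A small sketch, with tag_size as a name made up for this example:

```python
def tag_size(field_number: int, wire_type: int = 0) -> int:
    """Return how many bytes the field tag takes when encoded as a varint."""
    if not 1 <= field_number <= 2**29 - 1:
        raise ValueError("field number out of range")
    if not 0 <= wire_type <= 5:
        raise ValueError("unknown wire type")
    tag = (field_number << 3) | wire_type  # low 3 bits hold the wire type
    size = 1
    while tag >= 0x80:                     # each varint byte carries 7 bits
        tag >>= 7
        size += 1
    return size


for number in (1, 15, 16, 2047, 2048):
    print(number, tag_size(number))
# 1 1
# 15 1
# 16 2
# 2047 2
# 2048 3
```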
In a Protobuf message the type of every field is fixed, because transmitting fixed types reduces the transmission resources used. When defining the fields of a message, choose each field's type according to the business requirements. Each common Protobuf basic field type has a corresponding Python type.
Note that a declared field that is never assigned a value still has a default value, which depends on its type.
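As a rough reference, the type correspondence and the default values can be written down as a small Python lookup table. The entries follow the proto3 language guide as I understand it; SCALAR_TYPES is a name made up for this sketch, and the dictionary is a summary, not something the Protobuf library provides.

```python
# proto3 scalar type -> (Python type, default value when the field is unset)
SCALAR_TYPES = {
    "double": (float, 0.0),
    "float": (float, 0.0),
    "int32": (int, 0),
    "int64": (int, 0),
    "uint32": (int, 0),
    "uint64": (int, 0),
    "sint32": (int, 0),
    "sint64": (int, 0),
    "fixed32": (int, 0),
    "fixed64": (int, 0),
    "sfixed32": (int, 0),
    "sfixed64": (int, 0),
    "bool": (bool, False),
    "string": (str, ""),
    "bytes": (bytes, b""),
}

for proto_type, (python_type, default) in SCALAR_TYPES.items():
    print(f"{proto_type:<9} {python_type.__name__:<6} {default!r}")
```

Repeated and map fields default to empty containers, and an enum field defaults to the constant whose value is 0.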
A defined message is itself a Protobuf type, called Message, and it can be nested inside other Messages. The Protobuf syntax is as follows:
message DemoSubRequest {
  string a = 1;
  int32 b = 2;
}

message DemoRequest {
  // Nest the message defined above
  DemoSubRequest result = 1;
}
A message can also be imported from file a into file b with the import syntax and then used in b. For example, the folder project contains file a and file b, where file a is as follows:
// Declare the package name as demo_a
package demo_a;

// Define a message body
message DemoRequest {
  DemoRequest result = 1;
}
File b references the message from file a. The specific code is as follows:
// Declare the package name as demo_b
package demo_b;
import "project/demo_a.proto";

message DemoRequest {
  // Reference the message from file a by its package name (demo_a), not by its file path
  demo_a.DemoRequest result = 1;
}
Protobuf also supports defining other types. They are used much like their Python equivalents, but with some differences:
Timestamp:
Timestamp is Protobuf's time type. The Protobuf syntax is as follows:
import "google/protobuf/timestamp.proto";

message DemoRequest {
  google.protobuf.Timestamp timestamp = 1;
}
This type is a wrapper around a timestamp. Its default value is timestamp=0 (the date 1970-01-01). In Python code it can be converted to a datetime with the ToDatetime method, and a datetime can be converted to a Protobuf Timestamp with the FromDatetime method:
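from datetime import datetime

from google.protobuf.timestamp_pb2 import Timestamp

# Timestamp -> datetime
Timestamp().ToDatetime()

# datetime -> Timestamp (FromDatetime fills the message in place)
timestamp = Timestamp()
timestamp.FromDatetime(datetime.now())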
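To show what "a wrapper around a timestamp" means, the sketch below models the two numbers a Timestamp carries (seconds and nanoseconds since the Unix epoch) with the standard library only. The helper names to_datetime and from_datetime are made up for this example; it is not the library's implementation, and it assumes timezone-aware UTC datetimes.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_datetime(seconds: int, nanos: int = 0) -> datetime:
    """Convert (seconds, nanos) since the epoch to an aware UTC datetime."""
    # datetime only has microsecond precision, so extra nanoseconds are dropped.
    return EPOCH + timedelta(seconds=seconds, microseconds=nanos // 1000)


def from_datetime(moment: datetime):
    """Convert an aware datetime to a (seconds, nanos) pair."""
    delta = moment - EPOCH
    seconds = delta.days * 86400 + delta.seconds
    return seconds, delta.microseconds * 1000


# The default value (seconds=0, nanos=0) corresponds to 1970-01-01.
print(to_datetime(0).isoformat())  # 1970-01-01T00:00:00+00:00
print(from_datetime(to_datetime(1600000000, 500000000)))  # (1600000000, 500000000)
```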
Repeated:
A repeated field can occur any number of times. It behaves like a Python Sequence object and in practice can be treated as a Python list. The Protobuf syntax for repeated is as follows:
message DemoRequest {
  repeated int32 demo_list = 1;
}
// demo_list value like json
// [1, 2, 3, 4, 5, 6]
This message defines a field demo_list, which is repeated with element type int32. In Python a repeated field is used the same way as a list, but it does not inherit from list, so some libraries, such as pymysql, may need it converted to a list first.
Map:
Although in most cases we define messages with explicit key-value fields, Protobuf also provides Map, which is similar to dict. The Protobuf syntax for Map is as follows:
message DemoRequest {
  map<string, int32> demo_map = 1;
}
// demo_map value like json
// {
//   "aaa": 123,
//   "bbb": 456
// }
This message defines a field demo_map, which is a map with key type string and value type int32. In Python a map field is used the same way as a dict, but it does not inherit from dict, so some libraries, such as pymysql, may need it converted to a dict first.
NOTE:
- A Map field cannot be repeated, because repeated is mutable, just as a list cannot be used as the key of a dict in Python.
- The entries of a Map are unordered.
- If there are duplicate keys, only the last one is kept.
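The conversion mentioned for repeated and map fields can be sketched with a small helper. FakeRepeated and FakeMap below are hypothetical stand-ins that, like the real containers, behave as a sequence and a mapping without inheriting from list or dict. The helper to_builtin is also made up for this example, and whether the real Protobuf containers satisfy the same abstract base classes should be checked against the installed library version.

```python
from collections.abc import Mapping, Sequence


class FakeRepeated(Sequence):
    """Hypothetical stand-in for a repeated field: list-like, but not a list."""

    def __init__(self, items):
        self._items = tuple(items)

    def __getitem__(self, index):
        return self._items[index]

    def __len__(self):
        return len(self._items)


class FakeMap(Mapping):
    """Hypothetical stand-in for a map field: dict-like, but not a dict."""

    def __init__(self, items):
        self._items = dict(items)

    def __getitem__(self, key):
        return self._items[key]

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


def to_builtin(value):
    """Recursively convert list-like and dict-like values to list and dict."""
    if isinstance(value, (str, bytes)):
        return value  # strings are sequences too, but must stay as they are
    if isinstance(value, Mapping):
        return {key: to_builtin(item) for key, item in value.items()}
    if isinstance(value, Sequence):
        return [to_builtin(item) for item in value]
    return value


demo_list = FakeRepeated([1, 2, 3])
demo_map = FakeMap({"aaa": 123, "bbb": 456})
print(isinstance(demo_list, list), to_builtin(demo_list))  # False [1, 2, 3]
print(isinstance(demo_map, dict), to_builtin(demo_map))    # False {'aaa': 123, 'bbb': 456}
```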
Empty:
Empty is Protobuf's empty type, similar to None in Python. It is generally not used inside a message; it marks that an rpc method returns nothing. The Protobuf syntax is as follows:
import "google/protobuf/empty.proto";

service Demo {
  rpc demo (DemoRequest) returns (google.protobuf.Empty);
}
In Python, import the Empty object with from google.protobuf.empty_pb2 import Empty and use it. It is best not to convert Empty to Python's None, though, because Empty only indicates that the response of the request is empty.
Enum:
When defining a message type, you may want a field to take only one of a predefined set of values. This is what enumeration types are for. The Protobuf syntax for Enum is as follows:
message DemoRequest {
  enum Status {
    open = 0;
    half_open = 1;
    close = 2;
  }
  Status status = 1;
}
As the syntax shows, first define an enumeration type named Status, then define a field status of type Status. Note that an enumeration must contain a constant that maps to 0, and it must be the first entry. This is because Protobuf requires one of the enum values to be 0: when a field of this type is not given a value, it defaults to the enum constant whose value is 0.
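The zero-value rule can be mirrored with Python's own enum module. This is an analogy built on the standard library, not the class that the Protobuf tooling generates; default_of is a helper name made up for this sketch.

```python
from enum import IntEnum


class Status(IntEnum):
    """Python mirror of the Status enum defined in the message above."""

    open = 0
    half_open = 1
    close = 2


def default_of(enum_class):
    """Return the member whose value is 0, which an unset field falls back to."""
    return enum_class(0)


print(default_of(Status).name)  # open


class NoZero(IntEnum):
    """An enum without a zero value has nothing to fall back to."""

    first = 1


try:
    default_of(NoZero)
except ValueError as error:
    print("no default:", error)
```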
In practice, the services connected through gRPC are not all written in one programming language: some are written in Python, some in Java, and some in Go. In addition, not every service needs to be updated when we release a feature, because some services only need the old interface. Suppose a server interface has been updated and that server has many clients. Without standardized management of the proto files, every client may have to be upgraded, rather than only the clients that need the new interface. We therefore need to manage proto files according to a specification to reduce the management burden.
At first I chose the simplest scheme, copying files, which is also what most people do when getting started. It is very easy to use, but code reuse is very low. Copying files becomes a burden once there are many projects, and sometimes a diff tool is needed to compare the copies, which is tedious.
Later I began to consider using a version management tool. Because the proto files are a subset of the project, the first option that comes to mind is Git Submodule, but this scheme carries the risk of the submodule being rolled back to the wrong commit, and the corresponding code still has to be generated from the proto files, which is more trouble.
The final scheme is to create a new git repository to store the proto files and use tags to distinguish versions. Using a git repository has another advantage: CI/CD can generate and package the code for each language from the proto files, which removes some manual steps.
First create a Git repository and move the proto files out of the project into it as a separate repository, then update the proto files following the git flow process. Updates to the proto files must, however, follow a set of specifications.
What these specifications have in common is that nothing in the source files is deleted and every change only adds, which ensures that even if the proto files change, old services keep working without being updated.
Once updated, the repository can be used by other projects. For example, if the current version of the library is 1.0.0, we update the proto files following the git flow process, generate the corresponding language code or release package, and finally create the corresponding tag. For Python, the dependency can then be installed or updated like this:
pip install https://gitlab.xxx.com/proto/[email protected]
For a language like Java, the generated code can be packaged as a release version for Maven to use.
We now have a preliminary understanding of gRPC and of how to use Protobuf. Next, a simple project will demonstrate how to use gRPC.