MongoDB is a non-relational database written in C++. It is an open-source database system based on distributed file storage. It stores content in a form similar to JSON objects, and field values can contain other documents, arrays, and arrays of documents, which makes it very flexible. In this section, we look at how to store data in MongoDB from Python 3.
Before we start, make sure that MongoDB is installed and its service is running, and that Python's PyMongo library is installed.
To connect to MongoDB, we use MongoClient from the PyMongo library. Generally, we pass in MongoDB's IP address and port: the first parameter, host, is the address, and the second parameter, port, is the port (27017 by default if it is not passed):
import pymongo
client = pymongo.MongoClient(host='localhost', port=27017)
This creates a connection object for MongoDB.
Alternatively, the first parameter of MongoClient, host, can be given a MongoDB connection string directly. Such a string starts with mongodb, for example:
from pymongo import MongoClient
client = MongoClient('mongodb://localhost:27017/')
This achieves the same connection.
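To show how the two connection styles relate, the following sketch builds a connection string from a host and a port. The helper name build_mongo_uri is hypothetical (it is not part of PyMongo), and it only covers the simple host-and-port form used above, without credentials or options.

```python
def build_mongo_uri(host='localhost', port=27017):
    """Build a basic MongoDB connection string.

    Hypothetical helper for illustration; not a PyMongo API.
    """
    return 'mongodb://{}:{}/'.format(host, port)


uri = build_mongo_uri('localhost', 27017)
print(uri)  # mongodb://localhost:27017/
```

The resulting string can be passed as the first argument of MongoClient in place of separate host and port values.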
A MongoDB instance can hold multiple databases, so next we need to specify which database to operate on. Taking the test database as an example, we specify it in the program as follows:
db = client.test
Accessing the test attribute of client returns the test database. We can also specify it this way:
db = client['test']
These two ways are equivalent .
Each MongoDB database contains many collections, which are similar to tables in a relational database.
The next step is to specify the collection to operate on; here we use a collection named students. As with databases, there are two ways to specify a collection:
collection = db.students
collection = db['students']
This gives us a Collection object.
Next, we can insert data. For the students collection, we create a new student record, represented as a dictionary:
student = {
    'id': '20170101',
    'name': 'Jordan',
    'age': 20,
    'gender': 'male'
}
This specifies the student's ID, name, age, and gender. Next, we call the insert() method of collection directly to insert the data:
result = collection.insert(student)
print(result)
In MongoDB, every document has an _id attribute that uniquely identifies it. If this attribute is not explicitly specified, an _id of type ObjectId is generated automatically. The insert() method returns the _id value after execution.
The output is as follows:
5932a68615c2606814c91f3d
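An ObjectId is not just a random string: its first four bytes encode the creation time as seconds since the Unix epoch. The sketch below decodes that timestamp from the hexadecimal value printed above, using only the standard library. The function name objectid_timestamp is a hypothetical helper for illustration, not a PyMongo API.

```python
from datetime import datetime, timezone


def objectid_timestamp(hex_id):
    """Return the creation time stored in the first 4 bytes of an ObjectId hex string."""
    seconds = int(hex_id[:8], 16)  # first 8 hex characters = 4 bytes
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


created = objectid_timestamp('5932a68615c2606814c91f3d')
print(created.isoformat())
```

Because the timestamp comes first, ObjectId values generated later generally compare as greater than earlier ones.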
We can also insert multiple documents at once by passing them as a list:
student1 = {'id': '20170101', 'name': 'Jordan', 'age': 20, 'gender': 'male'}
student2 = {'id': '20170202', 'name': 'Mike', 'age': 21, 'gender': 'male'}
result = collection.insert([student1, student2])
print(result)
The result returned is the list of corresponding _id values:
[ObjectId('5932a80115c2606a59e8a048'), ObjectId('5932a80115c2606a59e8a049')]
In PyMongo 3.x, insert() is in fact no longer officially recommended. It still works in 3.x, but it was removed in PyMongo 4.0. The officially recommended methods are insert_one() and insert_many(), which insert a single document and multiple documents respectively. For example:
student = {'id': '20170101', 'name': 'Jordan', 'age': 20, 'gender': 'male'}
result = collection.insert_one(student)
print(result)
print(result.inserted_id)
The output is as follows:
<pymongo.results.InsertOneResult object at 0x10d68b558>
5932ab0f15c2606f0c1cf6c5
Unlike insert(), this method returns an InsertOneResult object, whose inserted_id attribute gives the _id.
For insert_many(), we pass the data as a list:
student1 = {'id': '20170101', 'name': 'Jordan', 'age': 20, 'gender': 'male'}
student2 = {'id': '20170202', 'name': 'Mike', 'age': 21, 'gender': 'male'}
result = collection.insert_many([student1, student2])
print(result)
print(result.inserted_ids)
The output is as follows:
<pymongo.results.InsertManyResult object at 0x101dea558>
[ObjectId('5932abf415c2607083d3b2ac'), ObjectId('5932abf415c2607083d3b2ad')]
This method returns an InsertManyResult object, whose inserted_ids attribute gives the list of _id values of the inserted documents.
After inserting data, we can query it with find_one() or find(): find_one() returns a single result, while find() returns a Cursor object that can be iterated over like a generator. For example:
result = collection.find_one({'name': 'Mike'})
print(type(result))
print(result)
Here we query the document whose name is Mike. The result is a dictionary, and the output is as follows:
<class 'dict'>
{'_id': ObjectId('5932a80115c2606a59e8a049'), 'id': '20170202', 'name': 'Mike', 'age': 21, 'gender': 'male'}
Notice the additional _id attribute, which was added automatically during insertion.
We can also query by ObjectId, which requires ObjectId from the objectid module of the bson library:
from bson.objectid import ObjectId
result = collection.find_one({'_id': ObjectId('593278c115c2602667ec6bae')})
print(result)
The result is still a dictionary:
{'_id': ObjectId('593278c115c2602667ec6bae'), 'id': '20170101', 'name': 'Jordan', 'age': 20, 'gender': 'male'}
If no matching document exists, None is returned.
To query multiple documents, we use the find() method. For example, to find the documents whose age is 20:
results = collection.find({'age': 20})
print(results)
for result in results:
    print(result)
The output is as follows:
<pymongo.cursor.Cursor object at 0x1032d5128>
{'_id': ObjectId('593278c115c2602667ec6bae'), 'id': '20170101', 'name': 'Jordan', 'age': 20, 'gender': 'male'}
{'_id': ObjectId('593278c815c2602678bb2b8d'), 'id': '20170102', 'name': 'Kevin', 'age': 20, 'gender': 'male'}
{'_id': ObjectId('593278d815c260269d7645a8'), 'id': '20170103', 'name': 'Harden', 'age': 20, 'gender': 'male'}
The return value is of type Cursor, which behaves like a generator: we iterate over it to get all the results, each of which is a dictionary.
To query the documents whose age is greater than 20, write:
results = collection.find({'age': {'$gt': 20}})
Here the value of the query condition is no longer a plain number but a dictionary whose key is the comparison operator $gt, meaning greater than, and whose value is 20.
The comparison operators are summarized in the following table.
| Symbol | Meaning | Example |
| --- | --- | --- |
| $lt | Less than | {'age': {'$lt': 20}} |
| $gt | Greater than | {'age': {'$gt': 20}} |
| $lte | Less than or equal to | {'age': {'$lte': 20}} |
| $gte | Greater than or equal to | {'age': {'$gte': 20}} |
| $ne | Not equal to | {'age': {'$ne': 20}} |
| $in | In the given list | {'age': {'$in': [20, 23]}} |
| $nin | Not in the given list | {'age': {'$nin': [20, 23]}} |
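To make the meaning of these symbols concrete, here is a minimal pure-Python sketch that applies the same conditions to a list of dictionaries. It is only an illustration of the semantics, not how MongoDB or PyMongo evaluates queries; the matches function and the sample data are made up for this example, and every document is assumed to contain the queried field.

```python
import operator

# Map each comparison symbol to an equivalent Python test.
COMPARATORS = {
    '$lt': operator.lt,
    '$gt': operator.gt,
    '$lte': operator.le,
    '$gte': operator.ge,
    '$ne': operator.ne,
    '$in': lambda value, choices: value in choices,
    '$nin': lambda value, choices: value not in choices,
}


def matches(document, query):
    """Return True if the document satisfies every condition in the query."""
    for field, condition in query.items():
        value = document.get(field)
        if isinstance(condition, dict):
            # Operator form, e.g. {'age': {'$gt': 20}}
            for symbol, operand in condition.items():
                if not COMPARATORS[symbol](value, operand):
                    return False
        elif value != condition:
            # Plain form, e.g. {'age': 20}
            return False
    return True


# Made-up sample data for illustration.
students = [
    {'name': 'Jordan', 'age': 20},
    {'name': 'Mike', 'age': 21},
    {'name': 'Mark', 'age': 23},
]

older = [s['name'] for s in students if matches(s, {'age': {'$gt': 20}})]
listed = [s['name'] for s in students if matches(s, {'age': {'$in': [20, 23]}})]
print(older)   # ['Mike', 'Mark']
print(listed)  # ['Jordan', 'Mark']
```

Note that $in and $nin test membership in a list of values rather than a numeric range: with [20, 23], an age of 21 does not match $in.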
We can also query with regular expressions. For example, to find the students whose name starts with M:
results = collection.find({'name': {'$regex': '^M.*'}})
Here $regex specifies regular-expression matching, and ^M.* is a regular expression that matches strings starting with M.
Some other operators are summarized in the following table.
| Symbol | Meaning | Example | Example meaning |
| --- | --- | --- | --- |
| $regex | Match a regular expression | {'name': {'$regex': '^M.*'}} | name starts with M |
| $exists | Whether the attribute exists | {'name': {'$exists': True}} | The name attribute exists |
| $type | Type check | {'age': {'$type': 'int'}} | The type of age is int |
| $mod | Modulo operation | {'age': {'$mod': [5, 0]}} | age modulo 5 equals 0 |
| $text | Text query | {'$text': {'$search': 'Mike'}} | A text-indexed field contains the string Mike |
| $where | Advanced condition query | {'$where': 'obj.fans_count == obj.follows_count'} | The number of fans equals the number of follows |
More detailed usage of these operators can be found in the official MongoDB documentation: https://docs.mongodb.com/manual/reference/operator/query/.
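As a rough illustration of what $regex, $exists, and $mod test, the sketch below expresses the same three conditions in plain Python over made-up sample data. It mirrors the meaning only: MongoDB uses its own regular-expression engine, so Python's re module may differ in details for more complex patterns.

```python
import re

# Made-up sample data; the last document has no age field.
students = [
    {'name': 'Mike', 'age': 20},
    {'name': 'Jordan', 'age': 21},
    {'name': 'Mark'},
]

# $regex: '^M.*' matches names that begin with M
starts_with_m = [s['name'] for s in students if re.search('^M.*', s['name'])]

# $exists: True keeps documents that contain the field at all
has_age = [s['name'] for s in students if 'age' in s]

# $mod: [5, 0] keeps documents whose value leaves remainder 0 when divided by 5
divisible_by_5 = [s['name'] for s in students if 'age' in s and s['age'] % 5 == 0]

print(starts_with_m)   # ['Mike', 'Mark']
print(has_age)         # ['Mike', 'Jordan']
print(divisible_by_5)  # ['Mike']
```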
To count the number of documents in a query result, call the count() method. For example, to count all documents:
count = collection.find().count()
print(count)
Or to count the documents that meet a certain condition:
count = collection.find({'age': 20}).count()
print(count)
The result is a number, namely the count of matching documents. Note that count() was deprecated in PyMongo 3.7 and removed in 4.0; in newer versions, use collection.count_documents({'age': 20}) instead.
To sort, call the sort() method and pass in the field to sort by and a flag for ascending or descending order. For example:
results = collection.find().sort('name', pymongo.ASCENDING)
print([result['name'] for result in results])
The output is as follows:
['Harden', 'Jordan', 'Kevin', 'Mark', 'Mike']
Here pymongo.ASCENDING specifies ascending order. To sort in descending order, pass in pymongo.DESCENDING.
In some cases we may want only part of the results. The skip() method offsets the results by a number of positions; for example, an offset of 2 skips the first two elements and returns the third and later ones:
results = collection.find().sort('name', pymongo.ASCENDING).skip(2)
print([result['name'] for result in results])
The output is as follows:
['Kevin', 'Mark', 'Mike']
We can also use the limit() method to specify how many results to take:
results = collection.find().sort('name', pymongo.ASCENDING).skip(2).limit(2)
print([result['name'] for result in results])
The output is as follows:
['Kevin', 'Mark']
Without limit(), three results would have been returned; with the limit, only the first two are kept.
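The effect of sort(), skip(), and limit() on the five names above can be pictured with ordinary list operations. This is only an analogy in plain Python, not PyMongo code; in a real query the work is done by the database.

```python
names = ['Mike', 'Jordan', 'Kevin', 'Harden', 'Mark']

ordered = sorted(names)   # like sort('name', pymongo.ASCENDING)
skipped = ordered[2:]     # like skip(2)
page = ordered[2:2 + 2]   # like skip(2).limit(2)

print(ordered)  # ['Harden', 'Jordan', 'Kevin', 'Mark', 'Mike']
print(skipped)  # ['Kevin', 'Mark', 'Mike']
print(page)     # ['Kevin', 'Mark']
```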
Note that when the database holds a very large amount of data, such as tens or hundreds of millions of documents, it is best not to query with large offsets, because this is likely to cause memory overflow. Instead, a query like the following can be used:
from bson.objectid import ObjectId
collection.find({'_id': {'$gt': ObjectId('593278c815c2602678bb2b8d')}})
This requires recording the _id of the last document returned by the previous query.
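The idea of paging by the last seen _id can be sketched without a database. In the example below, fetch_page is a hypothetical stand-in for a query of the form shown above, with sorting by _id and a page-size limit added as assumptions. Integer _id values are used to keep the example short.

```python
def fetch_page(documents, last_id=None, page_size=2):
    """Return the next page of documents whose _id is greater than last_id.

    Stand-in for a query such as:
    collection.find({'_id': {'$gt': last_id}}).sort('_id', 1).limit(page_size)
    """
    ordered = sorted(documents, key=lambda doc: doc['_id'])
    if last_id is not None:
        ordered = [doc for doc in ordered if doc['_id'] > last_id]
    return ordered[:page_size]


# Made-up documents with integer _id values 1..5.
documents = [
    {'_id': i, 'name': name}
    for i, name in enumerate(['Harden', 'Jordan', 'Kevin', 'Mark', 'Mike'], start=1)
]

last_id = None
pages = []
while True:
    page = fetch_page(documents, last_id)
    if not page:
        break
    pages.append([doc['name'] for doc in page])
    last_id = page[-1]['_id']  # remember the last _id for the next query

print(pages)  # [['Harden', 'Jordan'], ['Kevin', 'Mark'], ['Mike']]
```

Each request only looks past the recorded _id instead of skipping over all earlier documents, so the cost of a page does not grow with the offset.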
To update data, we can use the update() method, specifying the update condition and the updated data. For example:
condition = {'name': 'Kevin'}
student = collection.find_one(condition)
student['age'] = 25
result = collection.update(condition, student)
print(result)
Here we update the age of the document whose name is Kevin: we first specify the query condition, then fetch the document, change its age, and call update() with the original condition and the modified document.
The output is as follows:
{'ok': 1, 'nModified': 1, 'n': 1, 'updatedExisting': True}
The result is returned as a dictionary: ok indicates that the operation succeeded, and nModified is the number of documents affected.
We can also update the data with the $set operator:
result = collection.update(condition, {'$set': student})
This updates only the fields that exist in the student dictionary. Any other fields in the stored document are neither updated nor deleted. Without $set, the whole stored document is replaced by the student dictionary, so any other fields it had are deleted.
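The difference between updating with $set and replacing the whole document can be modelled on plain dictionaries. The apply_update function below is a simplified, hypothetical model written for this illustration; it is not how PyMongo or MongoDB implements updates.

```python
def apply_update(stored, update):
    """Return the document that results from applying update to stored.

    Simplified model: an update containing '$set' merges fields;
    any other update is treated as a full replacement (the _id is kept).
    """
    if '$set' in update:
        result = dict(stored)
        result.update(update['$set'])
        return result
    replaced = {'_id': stored['_id']}
    replaced.update(update)
    return replaced


stored = {'_id': 1, 'name': 'Kevin', 'age': 20, 'gender': 'male'}

print(apply_update(stored, {'$set': {'age': 25}}))
# {'_id': 1, 'name': 'Kevin', 'age': 25, 'gender': 'male'}
print(apply_update(stored, {'name': 'Kevin', 'age': 25}))
# {'_id': 1, 'name': 'Kevin', 'age': 25}
```

With $set the gender field survives; with a full replacement it is lost because the new dictionary does not contain it.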
The update() method is also no longer officially recommended, and it was removed in PyMongo 4.0. It is split into update_one() and update_many(), whose usage is stricter: their second parameter must use $-prefixed operators as the keys of the dictionary. For example:
condition = {'name': 'Kevin'}
student = collection.find_one(condition)
student['age'] = 26
result = collection.update_one(condition, {'$set': student})
print(result)
print(result.matched_count, result.modified_count)
Here we call update_one(). The second parameter can no longer be the modified dictionary itself; it must take a form like {'$set': student}. The return value is of type UpdateResult, and its matched_count and modified_count attributes give the number of matched documents and the number of affected documents.
The output is as follows:
<pymongo.results.UpdateResult object at 0x10d17b678>
1 0
A modified_count of 0 means that the matched document already held the values being set.
Here is another example:
condition = {'age': {'$gt': 20}}
result = collection.update_one(condition, {'$inc': {'age': 1}})
print(result)
print(result.matched_count, result.modified_count)
Here the query condition is age greater than 20, and the update is {'$inc': {'age': 1}}, that is, age plus 1. After execution, the age of the first matching document is increased by 1.
The output is as follows:
<pymongo.results.UpdateResult object at 0x10b8874c8>
1 1
The number of matched documents is 1, and the number of affected documents is also 1.
If we call update_many() instead, all matching documents are updated:
condition = {'age': {'$gt': 20}}
result = collection.update_many(condition, {'$inc': {'age': 1}})
print(result)
print(result.matched_count, result.modified_count)
This time the number of matched documents is no longer 1. The output is as follows:
<pymongo.results.UpdateResult object at 0x10c6384c8>
3 3
All the matched documents have been updated.
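The contrast between updating the first match and updating every match can be sketched on a plain list. The functions inc_one and inc_many are hypothetical helpers that mimic the behaviour of $inc with update_one() and update_many() on made-up data; they return a (matched, modified) pair for comparison with the counts above.

```python
def inc_one(documents, predicate, field, amount):
    """Increment field on the first matching document; return (matched, modified)."""
    for doc in documents:
        if predicate(doc):
            doc[field] += amount
            return 1, 1
    return 0, 0


def inc_many(documents, predicate, field, amount):
    """Increment field on every matching document; return (matched, modified)."""
    matched = [doc for doc in documents if predicate(doc)]
    for doc in matched:
        doc[field] += amount
    return len(matched), len(matched)


def older_than_20(doc):
    return doc['age'] > 20


# Made-up sample data for illustration.
students = [
    {'name': 'Jordan', 'age': 20},
    {'name': 'Mike', 'age': 21},
    {'name': 'Kevin', 'age': 25},
    {'name': 'Mark', 'age': 23},
]

print(inc_one(students, older_than_20, 'age', 1))   # (1, 1)
print([s['age'] for s in students])                 # [20, 22, 25, 23]
print(inc_many(students, older_than_20, 'age', 1))  # (3, 3)
print([s['age'] for s in students])                 # [20, 23, 26, 24]
```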
Deletion is relatively simple: call the remove() method with the deletion condition, and all documents that meet the condition are deleted. For example:
result = collection.remove({'name': 'Kevin'})
print(result)
The output is as follows:
{'ok': 1, 'n': 1}
There are also two newer, recommended methods, delete_one() and delete_many(), which replace remove() (removed in PyMongo 4.0). For example:
result = collection.delete_one({'name': 'Kevin'})
print(result)
print(result.deleted_count)
result = collection.delete_many({'age': {'$lt': 25}})
print(result.deleted_count)
The output is as follows:
<pymongo.results.DeleteResult object at 0x10e6ba4c8>
1
4
delete_one() deletes the first matching document, and delete_many() deletes all matching documents. Both return a DeleteResult object, whose deleted_count attribute gives the number of deleted documents.
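The same first-match versus all-matches distinction applies to deletion. The sketch below models it on a plain list with two hypothetical helpers, delete_first and delete_all, which return the number of removed items in the way deleted_count does. The data is made up for illustration.

```python
def delete_first(documents, predicate):
    """Remove the first matching document in place; return the deleted count."""
    for index, doc in enumerate(documents):
        if predicate(doc):
            del documents[index]
            return 1
    return 0


def delete_all(documents, predicate):
    """Remove every matching document in place; return the deleted count."""
    kept = [doc for doc in documents if not predicate(doc)]
    deleted = len(documents) - len(kept)
    documents[:] = kept
    return deleted


students = [
    {'name': 'Kevin', 'age': 26},
    {'name': 'Jordan', 'age': 20},
    {'name': 'Mike', 'age': 21},
    {'name': 'Kevin', 'age': 30},
]

first = delete_first(students, lambda doc: doc['name'] == 'Kevin')
rest = delete_all(students, lambda doc: doc['age'] < 25)
print(first)                              # 1
print(rest)                               # 2
print([doc['name'] for doc in students])  # ['Kevin']
```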
PyMongo also provides some combined methods, such as find_one_and_delete(), find_one_and_replace(), and find_one_and_update(), which find a document and then delete, replace, or update it. Their usage is basically the same as that of the methods above.
We can also operate on indexes; the related methods include create_index(), create_indexes(), and drop_index().
For detailed usage of PyMongo, see the official documentation: http://api.mongodb.com/python/current/api/pymongo/collection.html.
There are also operations on the database and on the collection itself, which are not covered here; see the official documentation: http://api.mongodb.com/python/current/api/pymongo/.
This section explained how to use PyMongo to insert, delete, update, and query data in MongoDB.
This article first appeared on Cui Qingcai's personal blog, Jingmi: Python 3 Web Crawler Development in Practice.
Reproduced from https://juejin.cn/post/6844903597465927694. In case of any infringement, please contact us for removal.