Index: the logical area Elasticsearch uses to store data, similar to the database concept in a relational database. An index can be spread across one or more shards, and each shard may have more than one replica.
Document: the entity data stored in Elasticsearch, similar to a row in a relational table.
A document consists of multiple fields. Fields with the same name in different documents must have the same type. A field can be repeated within a document, meaning one field holds multiple values (multivalued).
Document type: to make querying easier, an index may hold many kinds of documents, i.e. document types. This is similar to the table concept in a relational database. Note, however, that fields with the same name in different document types must have the same type.
Mapping: similar to the schema definition in a relational database. It stores the mapping information for fields; different document types have different mappings.
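To make these terms concrete, the sketch below builds the kind of JSON body that is typically sent when creating an index: the settings carry the shard and replica counts, and the mappings declare the type of each field. The field names (title, tags, created) and the counts are made-up values for illustration, and the layout assumes a recent Elasticsearch version in which mappings are no longer nested under a document type.

```python
import json


def build_index_body(shards, replicas, field_types):
    """Return a create-index body as a plain dict (illustrative sketch only)."""
    return {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        },
        "mappings": {
            # One entry per field: the mapping fixes the field's type.
            "properties": {name: {"type": t} for name, t in field_types.items()},
        },
    }


# Hypothetical fields; a multivalued field such as tags needs no special type.
body = build_index_body(3, 1, {"title": "text", "tags": "keyword", "created": "date"})
print(json.dumps(body, indent=2))
```

Nothing here talks to a cluster; the dict is only meant to show where shards, replicas and field types live in the request.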
The table below compares some Elasticsearch terms with their relational database counterparts:
| Relational database | Elasticsearch |
| --- | --- |
| Database | Index |
| Table | Type |
| Row | Document |
| Column | Field |
| Schema | Mapping |
| Index | Everything is indexed |
| SQL | Query DSL |
| SELECT * FROM table… | GET http://… |
| UPDATE table SET | PUT http://… |
Connect to ES:
import elasticsearch
es = elasticsearch.Elasticsearch([{'host': '127.0.0.1', 'port': 9200}])
First, look at search. q is the content to search for; spaces in q have no effect on the query results. size specifies the number of results, from_ specifies the starting position, and filter_path specifies which data to return; in this example only _id and _type are returned.
res_3 = es.search(index="bank", q="Holmes", size=1, from_=1)
res_4 = es.search(index="bank", q=" 39225 5686 ", size=1000, filter_path=['hits.hits._id', 'hits.hits._type'])
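As a rough illustration of what these keyword arguments turn into, the sketch below assembles the path and query string of the corresponding search request. The helper name search_url is made up for this example, and the real client may order or encode the parameters differently.

```python
from urllib.parse import urlencode


def search_url(index, q, size=None, from_=None, filter_path=None):
    """Build the path and query string for a URI search (illustrative only)."""
    params = {"q": q}
    if size is not None:
        params["size"] = size
    if from_ is not None:
        # from is a reserved word in Python, hence from_ on the Python side.
        params["from"] = from_
    if filter_path:
        params["filter_path"] = ",".join(filter_path)
    return "/{}/_search?{}".format(index, urlencode(params))


url = search_url("bank", "Holmes", size=1, from_=1)
print(url)  # /bank/_search?q=Holmes&size=1&from=1

filtered = search_url("bank", "39225 5686", size=1000,
                      filter_path=["hits.hits._id", "hits.hits._type"])
print(filtered)
```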
Query all data of the specified index:
Here, index specifies the index to search. A string denotes a single index; a list denotes multiple indexes, e.g. index=["bank", "banner", "country"]; a wildcard pattern denotes all indexes that match it, e.g. index=["apple*"] means all indexes whose names start with apple. search can also be restricted to a specific doc-type.
from elasticsearch_dsl import Search
s = Search(using=es, index="index-test")
response = s.execute()  # keep the Search object in s so it can be reused
print(response.to_dict())
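The three forms of index described above can be pictured with a small stand-alone sketch: a list is joined with commas in the request path, and a * pattern selects every existing index whose name matches. The index names are made up, and fnmatchcase is used only as an approximation of the wildcard behaviour (it also understands ? and [...], which is not claimed for Elasticsearch here).

```python
from fnmatch import fnmatchcase


def index_path(index):
    """Join one or more index names the way they appear in a request path."""
    if isinstance(index, str):
        return index
    return ",".join(index)


def expand(pattern, existing):
    """Approximate which existing index names a * pattern would select."""
    return [name for name in existing if fnmatchcase(name, pattern)]


existing = ["apple-2019", "apple-2020", "bank", "banner"]  # made-up index names
joined = index_path(["bank", "banner", "country"])
selected = expand("apple*", existing)
print(joined)    # bank,banner,country
print(selected)  # ['apple-2019', 'apple-2020']
```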
Query by a specific field; multiple query conditions can be chained:
s = Search(using=es, index="index-test").query("match", sip="192.168.1.1")
s = s.query("match", dip="192.168.1.2")
response = s.execute()
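Chained query() calls are combined with AND semantics. The sketch below builds, with plain dicts, the kind of request body this corresponds to; the helper names match and all_of are invented for this example, and the exact dict the library emits may differ in detail.

```python
import json


def match(field, value):
    """A single match clause."""
    return {"match": {field: value}}


def all_of(*clauses):
    """Wrap clauses so that every one of them must match."""
    return {"query": {"bool": {"must": list(clauses)}}}


body = all_of(match("sip", "192.168.1.1"), match("dip", "192.168.1.2"))
print(json.dumps(body, indent=2))
```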
Multi-field query:
from elasticsearch_dsl.query import MultiMatch, Match
multi_match = MultiMatch(query='hello', fields=['title', 'content'])
s = Search(using=es, index="index-test").query(multi_match)
response = s.execute()
print(response.to_dict())
You can also use a Q() object to query multiple fields; fields is a list and query is the value to search for.
from elasticsearch_dsl import Q
q = Q("multi_match", query="hello", fields=['title', 'content'])
response = s.query(q).execute()
print(response.to_dict())
The first argument of Q() is the query type; it can also be bool.
q = Q('bool', must=[Q('match', title='hello'), Q('match', content='world')])
response = s.query(q).execute()
print(response.to_dict())
Q() objects can also be combined with operators, which is another way of writing the bool query above.
# OR: either clause may match
q = Q("match", title='python') | Q("match", title='django')
response = s.query(q).execute()
print(q.to_dict())  # {"bool": {"should": [...]}}
# AND: both clauses must match
q = Q("match", title='python') & Q("match", title='django')
response = s.query(q).execute()
print(q.to_dict())  # {"bool": {"must": [...]}}
# NOT: the clause must not match
q = ~Q("match", title="python")
response = s.query(q).execute()
print(q.to_dict())  # {"bool": {"must_not": [...]}}
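To show how the |, & and ~ operators can map onto bool queries, here is a toy stand-in written from scratch. The Clause class is hypothetical and is not how elasticsearch_dsl implements Q(); it only mirrors the should / must / must_not shapes noted in the comments above.

```python
class Clause:
    """Toy query clause supporting |, & and ~ (not the library's Q class)."""

    def __init__(self, body):
        self.body = body

    def __or__(self, other):
        return Clause({"bool": {"should": [self.body, other.body]}})

    def __and__(self, other):
        return Clause({"bool": {"must": [self.body, other.body]}})

    def __invert__(self):
        return Clause({"bool": {"must_not": [self.body]}})

    def to_dict(self):
        return self.body


def match(field, value):
    return Clause({"match": {field: value}})


python = match("title", "python")
django = match("title", "django")
print((python | django).to_dict())
print((python & django).to_dict())
print((~python).to_dict())
```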
Filters: the example here is a range filter. range is the filter type, timestamp is the name of the field to filter on, gte means greater than or equal to and lt means less than; set them as needed.
On the difference between term and match: term is an exact match, whereas match is fuzzier: it analyzes (tokenizes) the query text and returns a relevance score. (On an analyzed text field, term only hits when you look up a lowercase string; if the string contains uppercase letters it returns nothing, i.e. no hit. match finds the document regardless of case and returns the same result either way.)
import time
# Range filter combined with a match query
s = s.filter("range", timestamp={"gte": 0, "lt": time.time()}).query("match", country="in")
# Terms filter: keep documents whose balance_num is one of the listed values
res_3 = s.filter("terms", balance_num=["39225", "5686"]).execute()
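The term versus match behaviour described above can be reproduced with a toy inverted index. This is only a sketch under simplifying assumptions: the analyzer just splits on whitespace and lowercases, there is no scoring, and the documents are made up.

```python
def analyze(text):
    """Toy analyzer: split on whitespace and lowercase each token."""
    return [token.lower() for token in text.split()]


def build_index(docs):
    """Map each analyzed token to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for token in analyze(text):
            index.setdefault(token, set()).add(doc_id)
    return index


def term_query(index, value):
    """Look the value up exactly as given, without analysis."""
    return sorted(index.get(value, set()))


def match_query(index, text):
    """Analyze the query text first, then collect hits for every token."""
    hits = set()
    for token in analyze(text):
        hits |= index.get(token, set())
    return sorted(hits)


docs = {1: "Sherlock Holmes", 2: "Mycroft Holmes", 3: "John Watson"}  # made-up data
index = build_index(docs)
print(term_query(index, "Holmes"))   # [] because the indexed token is lowercase
print(term_query(index, "holmes"))   # [1, 2]
print(match_query(index, "Holmes"))  # [1, 2]
```

The uppercase term lookup misses because only the lowercased token was indexed, while match lowercases the query before looking it up.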
Other ways to write filters:
# filter() is shorthand for a bool query with a filter clause
s = Search().filter('terms', tags=['search', 'python'])
print(s.to_dict())
# {'query': {'bool': {'filter': [{'terms': {'tags': ['search', 'python']}}]}}}
s = Search().query('bool', filter=[Q('terms', tags=['search', 'python'])])
print(s.to_dict())
# {'query': {'bool': {'filter': [{'terms': {'tags': ['search', 'python']}}]}}}
# exclude() is shorthand for filtering on the negated query
s = Search().exclude('terms', tags=['search', 'python'])
# or, equivalently
s = Search().query('bool', filter=[~Q('terms', tags=['search', 'python'])])
print(s.to_dict())
# {'query': {'bool': {'filter': [{'bool': {'must_not': [{'terms': {'tags': ['search', 'python']}}]}}]}}}
Aggregations can be stacked on top of queries, filters and other operations; they are added through aggs.
bucket means grouping. The first argument is the name of the group, which you choose yourself; the second is the aggregation type; the third specifies the field.
metric works the same way. The available metric types include sum, avg, max, min and so on. Note that two types return all of these values at once: stats and extended_stats; the latter also returns variance and similar values.
from elasticsearch_dsl import A
# Example 1: one bucket with two stats metrics
s.aggs.bucket("per_country", "terms", field="timestamp").metric("sum_click", "stats", field="click").metric("sum_request", "stats", field="request")
# Example 2: bucket on a keyword field with one metric
s.aggs.bucket("per_age", "terms", field="click.keyword").metric("sum_click", "stats", field="click")
# Example 3: a top-level metric without a bucket
s.aggs.metric("sum_age", "extended_stats", field="impression")
# Example 4: a bucket without metrics
s.aggs.bucket("per_age", "terms", field="country.keyword")
# Example 5: this aggregation buckets by interval
a = A("range", field="account_number", ranges=[{"to": 10}, {"from": 11, "to": 21}])
s.aggs.bucket("per_account_range", a)  # attach it under a name of your choice
res = s.execute()
Finally, you still need to call execute(). Note that s.aggs calls modify the search in place, so their result must not be assigned to a variable (for example, res = s.aggs... is wrong). The aggregation results are saved in res, the value returned by execute().
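The nesting of buckets and metrics is easier to see in the request body itself. The sketch below builds such a body with plain dicts; the helper names bucket and metric are invented for this example, and the aggregation names and fields are borrowed from the examples above.

```python
import json


def metric(kind, field):
    """A metric aggregation such as stats or extended_stats on one field."""
    return {kind: {"field": field}}


def bucket(kind, field, metrics=None):
    """A bucket aggregation; metrics nested inside are computed per bucket."""
    agg = {kind: {"field": field}}
    if metrics:
        agg["aggs"] = metrics
    return agg


aggs = {
    # Group by country, then compute stats for two fields inside each group.
    "per_country": bucket("terms", "country.keyword", {
        "sum_click": metric("stats", "click"),
        "sum_request": metric("stats", "request"),
    }),
    # A metric at the top level is computed over all matching documents.
    "sum_age": metric("extended_stats", "impression"),
}
body = {"aggs": aggs}
print(json.dumps(body, indent=2))
```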
Sorting
s = Search().sort(
    'category',
    '-title',
    {"lines": {"order": "asc", "mode": "avg"}}
)
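The three kinds of sort argument can be normalised into a sort clause roughly as follows: a plain name is passed through, a leading minus sign means descending order, and a dict is used as given. The helper sort_clause is a hypothetical stand-in written for this example, not the library's code.

```python
def sort_clause(key):
    """Turn one sort argument into the form used in the request body."""
    if isinstance(key, dict):
        return key  # already a full sort specification
    if key.startswith("-"):
        return {key[1:]: {"order": "desc"}}  # "-title" means title descending
    return key  # plain field name, default order


keys = ["category", "-title", {"lines": {"order": "asc", "mode": "avg"}}]
sort_body = {"sort": [sort_clause(k) for k in keys]}
print(sort_body)
```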
Pagination
s = s[10:20]
# {"from": 10, "size": 10}
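The slice maps onto from and size by simple arithmetic: from is the start of the slice and size is its length. A minimal sketch of that conversion, with a made-up helper name and only non-negative, step-free slices supported:

```python
def slice_to_paging(s):
    """Convert a Python slice into from/size paging parameters."""
    if s.step not in (None, 1):
        raise ValueError("slices with a step are not supported in this sketch")
    start = s.start or 0
    if start < 0 or (s.stop is not None and s.stop < 0):
        raise ValueError("negative indexes are not supported in this sketch")
    paging = {"from": start}
    if s.stop is not None:
        paging["size"] = max(0, s.stop - start)
    return paging


print(slice_to_paging(slice(10, 20)))  # {'from': 10, 'size': 10}
print(slice_to_paging(slice(None, 5)))  # {'from': 0, 'size': 5}
```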
Some additional methods, for those who are interested:
s = Search()
# Set extra properties with the `.extra()` method
s = s.extra(explain=True)
# Set request parameters with `.params()`
s = s.params(search_type="count")
# To restrict the returned fields, use the `source()` method
# only return the selected fields
s = s.source(['title', 'body'])
# don't return any fields, just the metadata
s = s.source(False)
# explicitly include/exclude fields
s = s.source(include=["title"], exclude=["user.*"])
# reset the field selection
s = s.source(None)
# Build a search from a dict
s = Search.from_dict({"query": {"match": {"title": "python"}}})
# Modify an existing search in place
s.update_from_dict({"query": {"match": {"title": "python"}}, "size": 42})