ElasticSearch
Basic Operations
Indexes
Viewing indexes

```
# View all indexes
GET /_cat/indices

# View all indexes, with column headers
GET /_cat/indices?v

# View the mapping of an index
GET /products/_mapping
```
Creating an index

```
# Create an index named products
PUT /products

# number_of_shards   sets the number of primary shards
# number_of_replicas sets the number of replica shards
# Create an index with explicit settings
PUT /orders
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  }
}

# Create the products index with a mapping {id, title, price, created_at, description}
PUT /products
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "properties": {
      "id":          { "type": "integer" },
      "title":       { "type": "keyword" },
      "price":       { "type": "double" },
      "created_at":  { "type": "date" },
      "description": { "type": "text" }
    }
  }
}
```

Deleting an index

```
# Delete the index named products
DELETE /products
```
Documents
Creating documents

```
# Create a document with an explicit _id
POST /products/_doc/1
{
  "id": 1,
  "title": "小浣熊",
  "price": "0.5",
  "created_at": "2023-03-03",
  "description": "小浣熊真好吃"
}

# Create a document with an auto-generated _id (e.g. Pkmop4YBHIQlzHh1b96_)
POST /products/_doc/
{
  "title": "康师傅",
  "price": "2.5",
  "created_at": "2023-03-03",
  "description": "康师傅也不错"
}
```
Viewing documents

```
# Get documents by id
GET /products/_doc/1
GET /products/_doc/Pkmop4YBHIQlzHh1b96_
```
Deleting documents

```
# Delete a document
DELETE /products/_doc/Pkmop4YBHIQlzHh1b96_
```
Updating documents

```
# Full update (note: this deletes the original document and re-indexes it,
# so pass every field you want to keep)
PUT /products/_doc/Pkmop4YBHIQlzHh1b96_
{
  "title": "统一",
  "price": "2.5",
  "created_at": "2023-03-03",
  "description": "统一也不错"
}

# Partial update: only the listed fields are changed
POST /products/_doc/Pkmop4YBHIQlzHh1b96_/_update
{
  "doc": {
    "description": "统一也还行"
  }
}
```
Bulk document operations

```
# Bulk operations (_bulk): each JSON object must sit on a single line
POST /products/_doc/_bulk
{"index": {"_id": 2}}
{"id": 2,"title": "中浣熊","price": "0.5","created_at": "2023-03-03","description": "中浣熊真好吃"}
{"index": {"_id": 3}}
{"id": 3,"title": "大浣熊","price": "0.5","created_at": "2023-03-03","description": "大浣熊真好吃"}

# Bulk create, update, and delete in one request
POST /products/_doc/_bulk
{"index": {"_id": 4}}
{"id": 2,"title": "超大浣熊","price": "0.5","created_at": "2023-03-03","description": "超大浣熊真好吃"}
{"update": {"_id": 3}}
{"doc": {"title": "浣熊"}}
{"delete": {"_id": 2}}
```

Note: the operations in a bulk request run independently of one another; one failed operation does not affect the operations that follow it.
Advanced Queries
Overview
ES provides a powerful way to retrieve data known as the Query DSL. The Query DSL passes a JSON request body to ES over the REST API; its rich query syntax makes ES retrieval both more powerful and more concise.
Syntax

```
# GET /<index>/_doc/_search {JSON request body}
# GET /<index>/_search {JSON request body}
```
```
# Query DSL syntax
# Query everything: match_all
GET /products/_doc/_search
{
  "query": {
    "match_all": {}
  }
}

GET /products/_search
{
  "query": {
    "match_all": {}
  }
}
```

A typical response:

```
{
  "took" : 5,                # time from execution to return, in ms
  "timed_out" : false,       # whether the request timed out
  "_shards" : {              # shard info for this index
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {                 # the result object
    "total" : {
      "value" : 4,           # total number of matching documents
      "relation" : "eq"
    },
    "max_score" : 1.0,       # highest document score in the results
    "hits" : [               # the array of result documents
      {
        "_index" : "products",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "id" : 1,
          "title" : "小浣熊",
          "price" : "0.5",
          "created_at" : "2023-03-03",
          "description" : "小浣熊真好吃"
        }
      },
      {
        "_index" : "products",
        "_type" : "_doc",
        "_id" : "Pkmop4YBHIQlzHh1b96_",
        "_score" : 1.0,
        "_source" : {
          "title" : "统一",
          "price" : "2.5",
          "created_at" : "2023-03-03",
          "description" : "统一也还行"
        }
      },
      {
        "_index" : "products",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "id" : 2,
          "title" : "超大浣熊",
          "price" : "0.5",
          "created_at" : "2023-03-03",
          "description" : "超大浣熊真好吃"
        }
      },
      {
        "_index" : "products",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "id" : 3,
          "title" : "浣熊",
          "price" : "0.5",
          "created_at" : "2023-03-03",
          "description" : "大浣熊真好吃"
        }
      }
    ]
  }
}
```
Common queries
Match all [match_all]
The match_all keyword returns every document in the index.

```
GET /products/_search
{
  "query": {
    "match_all": {}
  }
}
```
Term query [term]
The term keyword queries by exact term.

```
GET /products/_search
{
  "query": {
    "term": {
      "price": {
        "value": 0.5
      }
    }
  }
}
```

NOTE 1: term queries reveal that ES uses the standard analyzer (StandardAnalyzer) by default, which splits English text into words and Chinese text into single characters.
NOTE 2: term queries also reveal that among the mapping types, keyword, date, integer, long, double, boolean, and ip are not analyzed; only text is analyzed.
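A rough illustration of NOTE 1 and NOTE 2 together (this is my simplified sketch, not the actual Lucene implementation): if the standard analyzer indexes CJK text one character per token, a term query for a whole multi-character Chinese word on a text field finds nothing, because term looks the query string up as a single token.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of why `term` on an analyzed text field misses multi-character
// Chinese words under the standard analyzer.
public class StandardAnalyzerSketch {

    // Very rough stand-in for the standard analyzer on CJK input:
    // one token per character (real Lucene also lowercases, splits on
    // punctuation, keeps Latin words whole, etc.).
    public static List<String> analyzeCjk(String text) {
        List<String> tokens = new ArrayList<>();
        text.codePoints().forEach(cp -> tokens.add(new String(Character.toChars(cp))));
        return tokens;
    }

    // A term query matches only if the untouched query string equals
    // one of the indexed tokens.
    public static boolean termMatches(String indexedText, String term) {
        return analyzeCjk(indexedText).contains(term);
    }

    public static void main(String[] args) {
        // "小浣熊真好吃" is indexed as single characters, so the whole word misses
        System.out.println(termMatches("小浣熊真好吃", "小浣熊")); // false
        System.out.println(termMatches("小浣熊真好吃", "熊"));     // true
    }
}
```

This is also why the term example above targets price (a non-analyzed double) rather than description.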
Range query [range]
The range keyword returns documents whose field value falls in the given range.

```
# gte  >=
# lte  <=
GET /products/_search
{
  "query": {
    "range": {
      "price": {
        "gte": 1,
        "lte": 3
      }
    }
  }
}
```
Prefix query [prefix]
The prefix keyword retrieves documents whose terms start with the given prefix.

```
GET /products/_search
{
  "query": {
    "prefix": {
      "title": {
        "value": "小"
      }
    }
  }
}
```
Wildcard query [wildcard]
The wildcard keyword performs pattern matching: `?` matches exactly one arbitrary character, `*` matches zero or more arbitrary characters.

```
GET /products/_search
{
  "query": {
    "wildcard": {
      "description": {
        "value": "go*"
      }
    }
  }
}
```
Multi-id query [ids]
The ids keyword takes an array of ids and fetches the corresponding documents.

```
GET /products/_search
{
  "query": {
    "ids": {
      "values": [1, 3, 4]
    }
  }
}
```
Fuzzy query [fuzzy]
The fuzzy keyword finds documents that approximately contain the given term.

```
GET /products/_search
{
  "query": {
    "fuzzy": {
      "title": {
        "value": "小浣豆"
      }
    }
  }
}
```

Note: for fuzzy queries, the maximum edit distance must be between 0 and 2:
- a search term of length up to 2 allows no edits
- a search term of length 3-5 allows one edit
- a search term longer than 5 allows at most two edits
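These length limits match Elasticsearch's AUTO fuzziness rule. A minimal sketch of the length-to-allowed-edits mapping (the class and method names here are mine, not an ES API):

```java
// ES "AUTO" fuzziness: terms of length 0-2 must match exactly,
// lengths 3-5 allow one edit, longer terms allow two edits.
public class AutoFuzziness {

    public static int allowedEdits(String term) {
        int len = term.codePointCount(0, term.length());
        if (len <= 2) return 0;
        if (len <= 5) return 1;
        return 2;
    }

    public static void main(String[] args) {
        System.out.println(allowedEdits("小浣"));    // 0 edits allowed
        System.out.println(allowedEdits("小浣豆"));  // 1 edit: "小浣豆" can match "小浣熊"
        System.out.println(allowedEdits("raccoon")); // 2 edits allowed
    }
}
```

This explains the example above: "小浣豆" has length 3, so one edit is allowed, and it matches the indexed "小浣熊".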
Boolean query [bool]

```
# must:     like &&, every clause must match
# should:   like ||, at least one clause must match
# must_not: like !, no clause may match
GET /products/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "ids": {
            "values": [1]
          }
        },
        {
          "term": {
            "title": {
              "value": "小熊猫"
            }
          }
        }
      ]
    }
  }
}
```
Multi-field query [multi_match]

```
# If a field is analyzed, the query string is analyzed before querying it;
# if the field is not analyzed, the query string is matched as a whole.
GET /products/_search
{
  "query": {
    "multi_match": {
      "query": "小浣熊",
      "fields": ["title", "description"]
    }
  }
}
```
Default-field analyzed query [query_string]

```
# If the queried field is analyzed, the query string is analyzed;
# if it is not analyzed, the query string is matched as a whole.
GET /products/_search
{
  "query": {
    "query_string": {
      "default_field": "description",
      "query": "浣熊"
    }
  }
}
```
Highlight query [highlight]

```
# "require_field_match": "false"  allows highlighting fields other than the queried one
# size     number of hits to return (default 10)
# from     starting offset; combine with size for pagination
# sort     sort documents on the given field
# _source  which fields to return
GET /products/_search
{
  "query": {
    "query_string": {
      "default_field": "description",
      "query": "浣熊"
    }
  },
  "highlight": {
    "pre_tags": ["<span style='color:red;'>"],
    "post_tags": ["</span>"],
    "require_field_match": "false",
    "fields": {
      "*": {}
    }
  },
  "from": 0,
  "size": 10,
  "sort": [
    {
      "price": {
        "order": "desc"
      }
    }
  ],
  "_source": ["title", "price"]
}
```
How Indexing Works
Inverted index
An inverted index is also called a reverse index; where there is a reverse index there is a forward index. Roughly speaking, a forward index maps keys to values, while an inverted index maps values back to keys. This is what ES uses under the hood when it searches.
Index model
Given the following index and mapping:

```
PUT /test
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "properties": {
      "title":       { "type": "keyword" },
      "price":       { "type": "double" },
      "description": { "type": "text" }
    }
  }
}
```
Insert the following documents, each with three fields: title, price, and description.

| _id | title | price | description |
| --- | --- | --- | --- |
| 1 | 蓝月亮洗衣液 | 19.9 | 蓝月亮洗衣液很高效 |
| 2 | iphone13 | 19.9 | 很不错的手机 |
| 3 | 小浣熊干脆面 | 1.5 | 小浣熊很好吃 |
In ES only text fields are analyzed; all other types are not, so the per-field indexes look like this.

The title field (keyword, not analyzed):

| term | _id (doc id) |
| --- | --- |
| 蓝月亮洗衣液 | 1 |
| iphone13 | 2 |
| 小浣熊干脆面 | 3 |

The price field:

| term | _id (doc id) |
| --- | --- |
| 19.9 | [1, 2] |
| 1.5 | 3 |

The description field (text, analyzed); each posting reads as [1 (doc id) : 1 (occurrences in the field) : 9 (field length)]:
| term | _id | term | _id | term | _id |
| --- | --- | --- | --- | --- | --- |
| 蓝 | 1 | 不 | 2 | 小 | 3 |
| 月 | 1 | 错 | 2 | 浣 | 3 |
| 亮 | 1 | 的 | 2 | 熊 | 3 |
| 洗 | 1 | 手 | 2 | 好 | 3 |
| 衣 | 1 | 机 | 2 | 吃 | 3 |
| 液 | 1 | | | | |
| 很 | [1:1:9, 2:1:5, 3:1:5] | | | | |
| 高 | 1 | | | | |
| 效 | 1 | | | | |
Note: ElasticSearch builds a separate inverted index for each field. At query time it looks the query term up in that field's index, obtains the matching document IDs, and can then fetch the documents quickly.
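The model above can be sketched as a toy inverted index in a few lines (illustrative only; Lucene's on-disk structures track term frequencies, positions, and much more):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy per-field inverted index: term -> sorted set of document ids.
public class InvertedIndex {
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();

    // Index one document's field value, one token per character
    // (mimicking the standard analyzer's behavior on Chinese text).
    public void add(int docId, String fieldValue) {
        fieldValue.codePoints().forEach(cp -> {
            String term = new String(Character.toChars(cp));
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        });
    }

    // Look up the documents containing a term.
    public Set<Integer> search(String term) {
        return postings.getOrDefault(term, Collections.emptySortedSet());
    }

    public static void main(String[] args) {
        InvertedIndex idx = new InvertedIndex();
        idx.add(1, "蓝月亮洗衣液很高效");
        idx.add(2, "很不错的手机");
        idx.add(3, "小浣熊很好吃");
        System.out.println(idx.search("很")); // [1, 2, 3]
        System.out.println(idx.search("熊")); // [3]
    }
}
```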
Using IK
IK segments text at two granularities:
- ik_smart: the coarsest-grained split
- ik_max_word: the finest-grained split

```
# Analyzer test
POST /_analyze
{
  "analyzer": "ik_smart",
  "text": "中华人民共和国国歌"
}
```
Filter Queries
Overview
Strictly speaking, ES offers two kinds of search operations: queries (query) and filters (filter). A query is the query search discussed above; by default it computes a relevance score for every returned document and sorts by that score. A filter only selects the documents that match, computes no scores, and its results can be cached. Purely in performance terms, filtering is therefore faster than querying. Put differently, filters suit coarse, large-scale narrowing of the data, while queries suit precise matching. In practice, filter first to narrow the data set, then query to match within it.
Usage

```
GET /products/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "description": {
              "value": "浣熊"
            }
          }
        }
      ],
      "filter": [
        {
          "term": {
            "description": "好吃"
          }
        }
      ]
    }
  }
}
```
Integrating ElasticSearch with Spring Boot
Add the dependency

```xml
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
```
Configure the client

```java
@Configuration
public class RestClientConfig extends AbstractElasticsearchConfiguration {

    @Value("${myElasticSearch.host}")
    private String host;

    @Bean
    @Override
    public RestHighLevelClient elasticsearchClient() {
        ClientConfiguration clientConfiguration = ClientConfiguration.builder()
                .connectedTo(host)
                .build();
        return RestClients.create(clientConfiguration).rest();
    }
}
```
Client objects
- ElasticsearchOperations
- RestHighLevelClient (recommended)
ElasticsearchOperations
- Characteristics: always works with ES in an object-oriented way
- Index: a collection of similar documents
- Mapping: determines how each field of a document is stored in ES, such as field type and analyzer
- Document: the smallest indexable unit, represented as JSON
Relevant annotations

```java
@Document(indexName = "products", createIndex = true)
public class Product {

    @Id
    private Integer id;

    @Field(type = FieldType.Keyword)
    private String title;

    @Field(type = FieldType.Double)
    private Double price;

    @Field(type = FieldType.Text, analyzer = "ik_max_word")
    private String description;
}
```
Inserting/updating documents

```java
@Test
public void testSave() {
    // save() inserts when the id is new and overwrites when it already exists
    Product product = new Product(1, "小浣熊", 1.0, "小浣熊真好吃");
    Product save = getElasticsearchOperations().save(product);
    LOGGER.info("save result = {}", save);

    // Saving another object with the same id updates the document
    Product product1 = new Product();
    product1.setId(1);
    product1.setPrice(1.5);
    Product update = getElasticsearchOperations().save(product1);
    LOGGER.info("update result = {}", update);
}
```
Deleting documents

```java
@Test
public void testDelete() {
    Product product = new Product();
    product.setId(1);
    String delete = getElasticsearchOperations().delete(product);
    LOGGER.info("delete result = {}", delete);
}
```
Retrieving documents

```java
@Test
public void testGet() {
    Product product = getElasticsearchOperations().get("1", Product.class);
    LOGGER.info("get result = {}", product);
}
```
Deleting all documents

```java
@Test
public void testDeleteAll() {
    ByQueryResponse delete = getElasticsearchOperations().delete(Query.findAll(), Product.class);
    try {
        LOGGER.info("deleteAll result = {}", new ObjectMapper().writeValueAsString(delete));
    } catch (JsonProcessingException e) {
        e.printStackTrace();
    }
}
```
Querying all documents

```java
@Test
public void testFindAll() {
    SearchHits<Product> search = getElasticsearchOperations().search(Query.findAll(), Product.class);
    LOGGER.info("max score = {}", search.getMaxScore());
    LOGGER.info("total hits = {}", search.getTotalHits());
    search.stream()
            .forEach(productSearchHit -> LOGGER.info("findAll result = {}", productSearchHit.getContent()));
}
```
RestHighLevelClient
Index operations
Creating an index with a mapping

```java
@Test
public void testIndexAndMapping() {
    CreateIndexRequest createIndexRequest = new CreateIndexRequest("products");
    createIndexRequest.mapping("{\n" +
            "    \"properties\": {\n" +
            "      \"title\":{\"type\":\"keyword\"},\n" +
            "      \"price\":{\"type\":\"double\"},\n" +
            "      \"created_at\":{\"type\": \"date\"},\n" +
            "      \"description\":{\"type\": \"text\", \"analyzer\": \"ik_max_word\"}\n" +
            "    }\n" +
            "  }", XContentType.JSON);
    try {
        CreateIndexResponse createIndexResponse =
                restHighLevelClient.indices().create(createIndexRequest, RequestOptions.DEFAULT);
        LOGGER.info("response result = {}", createIndexResponse.isAcknowledged());
    } catch (IOException e) {
        e.printStackTrace();
    }
}
```
Deleting an index

```java
@Test
public void testDeleteIndex() {
    try {
        AcknowledgedResponse acknowledgedResponse =
                restHighLevelClient.indices().delete(new DeleteIndexRequest("products"), RequestOptions.DEFAULT);
        LOGGER.info("response result = {}", acknowledgedResponse.isAcknowledged());
    } catch (IOException e) {
        e.printStackTrace();
    }
}
```
Document operations
Creating a document

```java
@Test
public void testCreate() {
    IndexRequest indexRequest = new IndexRequest("products");
    indexRequest.id("3")
            .source("{\"title\": \"香辣木瓜丝\", \"price\": \"2.5\", \"description\": \"香辣木瓜丝真好吃\", \"created_at\": \"2023-03-03\"}",
                    XContentType.JSON);
    try {
        IndexResponse index = getRestHighLevelClient().index(indexRequest, RequestOptions.DEFAULT);
        LOGGER.info("Index result = {}", index.status());
    } catch (IOException e) {
        e.printStackTrace();
    }
}
```
Updating a document

```java
@Test
public void testUpdate() {
    UpdateRequest updateRequest = new UpdateRequest("products", "2");
    updateRequest.doc("{\"title\": \"香辣土豆丝\"}", XContentType.JSON);
    try {
        UpdateResponse updateResponse = restHighLevelClient.update(updateRequest, RequestOptions.DEFAULT);
        LOGGER.info("Update result = {}", updateResponse.status());
    } catch (IOException e) {
        e.printStackTrace();
    }
}
```
Deleting a document

```java
@Test
public void testDelete() {
    try {
        DeleteResponse deleteResponse =
                getRestHighLevelClient().delete(new DeleteRequest("products", "3"), RequestOptions.DEFAULT);
        LOGGER.info("Delete result = {}", deleteResponse.status());
    } catch (IOException e) {
        e.printStackTrace();
    }
}
```
Getting a document by id

```java
@Test
public void testGet() {
    try {
        GetResponse getResponse =
                getRestHighLevelClient().get(new GetRequest("products", "1"), RequestOptions.DEFAULT);
        LOGGER.info("Get result = {}", getResponse.getSourceAsString());
    } catch (IOException e) {
        e.printStackTrace();
    }
}
```
Querying all documents

```java
@Test
public void testGetAll() {
    SearchRequest searchRequest = new SearchRequest("products");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    sourceBuilder.query(QueryBuilders.matchAllQuery());
    searchRequest.source(sourceBuilder);
    try {
        SearchResponse searchResponse = getRestHighLevelClient().search(searchRequest, RequestOptions.DEFAULT);
        LOGGER.info("total hits = {}", searchResponse.getHits().getTotalHits().value);
        LOGGER.info("max score = {}", searchResponse.getHits().getMaxScore());
        SearchHit[] hits = searchResponse.getHits().getHits();
        Arrays.stream(hits)
                .forEach(hit -> LOGGER.info("id = {}, result = {}", hit.getId(), hit.getSourceAsString()));
    } catch (IOException e) {
        e.printStackTrace();
    }
}
```
Querying with different conditions

```java
@Test
public void testQuery() {
    query(QueryBuilders.termQuery("description", "土豆"));                      // term
    query(QueryBuilders.rangeQuery("price").gt(0).lte(3));                      // range
    query(QueryBuilders.prefixQuery("description", "香"));                      // prefix
    query(QueryBuilders.wildcardQuery("description", "香*"));                   // wildcard
    query(QueryBuilders.idsQuery().addIds("1").addIds("2"));                    // ids
    query(QueryBuilders.multiMatchQuery("香辣小浣熊", "description", "title")); // multi_match
}

public void query(QueryBuilder queryBuilder) {
    SearchRequest searchRequest = new SearchRequest("products");
    SearchSourceBuilder builder = new SearchSourceBuilder();
    builder.query(queryBuilder);
    searchRequest.source(builder);
    try {
        SearchResponse searchResponse = getRestHighLevelClient()
                .search(searchRequest, RequestOptions.DEFAULT);
        LOGGER.info("total hits = {}", searchResponse.getHits().getTotalHits().value);
        LOGGER.info("max score = {}", searchResponse.getHits().getMaxScore());
        SearchHit[] hits = searchResponse.getHits().getHits();
        Arrays.stream(hits)
                .forEach(hit -> LOGGER.info("id = {}, result = {}", hit.getId(), hit.getSourceAsString()));
    } catch (IOException e) {
        e.printStackTrace();
    }
}
```
Paginated queries

```java
@Test
public void testSearch() {
    SearchRequest searchRequest = new SearchRequest("products");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    HighlightBuilder highlightBuilder = new HighlightBuilder();
    highlightBuilder.requireFieldMatch(false)
            .field("description")
            .field("title")
            .preTags("<span style='color:red'>")
            .postTags("</span>");
    sourceBuilder.query(QueryBuilders.termQuery("description", "好吃"))
            .from(0)
            .size(3)
            .sort("price", SortOrder.DESC)
            .fetchSource(new String[]{"title", "price"}, new String[]{})
            .highlighter(highlightBuilder);
    searchRequest.source(sourceBuilder);
    try {
        SearchResponse searchResponse = getRestHighLevelClient().search(searchRequest, RequestOptions.DEFAULT);
        LOGGER.info("total hits = {}", searchResponse.getHits().getTotalHits().value);
        LOGGER.info("max score = {}", searchResponse.getHits().getMaxScore());
        SearchHit[] hits = searchResponse.getHits().getHits();
        Arrays.stream(hits)
                .forEach(hit -> LOGGER.info("id = {}, result = {}, highlighter = {}",
                        hit.getId(), hit.getSourceAsString(), hit.getHighlightFields()));
    } catch (IOException e) {
        e.printStackTrace();
    }
}
```
Filter queries

```java
@Test
public void testFilterQuery() {
    SearchRequest searchRequest = new SearchRequest("products");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    sourceBuilder.query(QueryBuilders.matchAllQuery())
            .postFilter(QueryBuilders.termQuery("description", "香"));
    searchRequest.source(sourceBuilder);
    try {
        SearchResponse searchResponse = getRestHighLevelClient().search(searchRequest, RequestOptions.DEFAULT);
        LOGGER.info("total hits = {}", searchResponse.getHits().getTotalHits().value);
        LOGGER.info("max score = {}", searchResponse.getHits().getMaxScore());
        SearchHit[] hits = searchResponse.getHits().getHits();
        Arrays.stream(hits)
                .forEach(hit -> LOGGER.info("id = {}, result = {}", hit.getId(), hit.getSourceAsString()));
    } catch (IOException e) {
        e.printStackTrace();
    }
}
```
Application usage
Entity class

```java
@Data
@NoArgsConstructor
@AllArgsConstructor
@EqualsAndHashCode
public class Product1 {
    private Integer id;
    private String title;
    private Double price;
    private String description;
}
```
Indexing an object into ES

```java
@Test
public void testIndex() {
    Product1 product1 = new Product1();
    product1.setId(4);
    product1.setTitle("红烧肉");
    product1.setPrice(10.5);
    product1.setDescription("红烧肉肥而不腻");
    try {
        IndexRequest indexRequest = new IndexRequest("products");
        indexRequest.id(product1.getId().toString())
                .source(new ObjectMapper().writeValueAsString(product1), XContentType.JSON);
        IndexResponse indexResponse = restHighLevelClient.index(indexRequest, RequestOptions.DEFAULT);
        LOGGER.info("status = {}", indexResponse.status());
    } catch (IOException e) {
        e.printStackTrace();
    }
}
```
Reading data from ES back into objects

```java
@Test
public void testSearch() {
    SearchRequest searchRequest = new SearchRequest("products");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    HighlightBuilder highlightBuilder = new HighlightBuilder();
    highlightBuilder.requireFieldMatch(false)
            .field("description")
            .preTags("<span style='color:red;'>")
            .postTags("</span>");
    sourceBuilder.query(QueryBuilders.termQuery("description", "好吃"))
            .from(0)
            .size(3)
            .highlighter(highlightBuilder);
    searchRequest.source(sourceBuilder);
    try {
        SearchResponse searchResponse = getRestHighLevelClient().search(searchRequest, RequestOptions.DEFAULT);
        LOGGER.info("total hits = {}", searchResponse.getHits().getTotalHits().value);
        LOGGER.info("max score = {}", searchResponse.getHits().getMaxScore());
        SearchHit[] hits = searchResponse.getHits().getHits();
        Arrays.stream(hits)
                .forEach(hit -> LOGGER.info("id = {}, result = {}", hit.getId(), hit.getSourceAsString()));
        // Map each hit back to a Product1, substituting the highlighted description
        List<Product1> product1s = Arrays.stream(hits).map(hit -> {
            Product1 product1 = null;
            try {
                product1 = new ObjectMapper().readValue(hit.getSourceAsString(), Product1.class);
                product1.setId(Integer.valueOf(hit.getId()));
                Map<String, HighlightField> highlightFields = hit.getHighlightFields();
                if (highlightFields.containsKey("description")) {
                    product1.setDescription(highlightFields.get("description").fragments()[0].toString());
                }
            } catch (JsonProcessingException e) {
                e.printStackTrace();
            }
            return product1;
        }).collect(Collectors.toList());
        product1s.forEach(product1 -> LOGGER.info("result = {}", product1));
    } catch (IOException e) {
        e.printStackTrace();
    }
}
```
Aggregations
Introduction
Aggregation (often shortened to aggs) is the statistics and analysis feature ES provides on top of search. Aggregations compute summary data over the documents matched by a query; they are an essential database feature, and ES, as a search engine doubling as a database, offers strong aggregation capabilities: bucketing and computing over data based on query conditions, roughly analogous to SQL's GROUP BY combined with aggregate functions.
Note: text fields do not support aggregation.
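Conceptually, a terms aggregation is just a group-by with a count per bucket. A plain-Java sketch of the idea (illustrative only; this is not how ES computes it internally):

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// A terms aggregation is conceptually group-by + count:
// bucket key = field value, doc_count = number of docs with that value.
public class TermsAggSketch {

    public static Map<Double, Long> priceBuckets(List<Double> prices) {
        return prices.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        // prices of four test documents
        Map<Double, Long> buckets = priceBuckets(List.of(19.9, 29.9, 19.9, 9.9));
        System.out.println(buckets); // 19.9 appears twice, the others once
    }
}
```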
Test data

```
# Create the index and mapping
PUT /fruit
{
  "mappings": {
    "properties": {
      "title":       { "type": "keyword" },
      "price":       { "type": "double" },
      "description": { "type": "text", "analyzer": "ik_max_word" }
    }
  }
}

# Insert data
PUT /fruit/_bulk
{"index": {}}
{"title": "面包", "price": 19.9, "description": "小面包非常好吃"}
{"index": {}}
{"title": "旺仔牛奶", "price": 29.9, "description": "非常好喝"}
{"index": {}}
{"title": "日本豆", "price": 19.9, "description": "日本豆非常好吃"}
{"index": {}}
{"title": "小馒头", "price": 19.9, "description": "小馒头非常好吃"}
{"index": {}}
{"title": "大辣片", "price": 39.9, "description": "大辣片非常好吃"}
{"index": {}}
{"title": "透心凉", "price": 9.9, "description": "透心凉非常好喝"}
{"index": {}}
{"title": "小浣熊", "price": 19.9, "description": "童年的味道"}
{"index": {}}
{"title": "海苔", "price": 19.9, "description": "海的味道"}
```
Basic operations
Grouping by a field

```
# Group by a field and count per bucket
GET /fruit/_search
{
  "query": {
    "term": {
      "description": {
        "value": "好吃"
      }
    }
  },
  "aggs": {
    "price_group": {
      "terms": {
        "field": "price",
        "size": 10
      }
    }
  }
}

GET /fruit/_search
{
  "query": {
    "match_all": {}
  },
  "size": 0,
  "aggs": {
    "price_group": {
      "terms": {
        "field": "price",
        "size": 10
      }
    }
  }
}
```
Maximum

```
# Maximum value
GET /fruit/_search
{
  "query": {
    "match_all": {}
  },
  "size": 0,
  "aggs": {
    "max_price": {
      "max": {
        "field": "price"
      }
    }
  }
}
```
Minimum

```
# Minimum value
GET /fruit/_search
{
  "query": {
    "match_all": {}
  },
  "size": 0,
  "aggs": {
    "min_price": {
      "min": {
        "field": "price"
      }
    }
  }
}
```
Average

```
# Average value
GET /fruit/_search
{
  "query": {
    "match_all": {}
  },
  "size": 0,
  "aggs": {
    "avg_price": {
      "avg": {
        "field": "price"
      }
    }
  }
}
```
Sum

```
# Sum
GET /fruit/_search
{
  "query": {
    "match_all": {}
  },
  "size": 0,
  "aggs": {
    "sum_price": {
      "sum": {
        "field": "price"
      }
    }
  }
}
```
Aggregations from the Java client
terms aggregation

```java
@Test
public void testTermsAggs() {
    SearchRequest searchRequest = new SearchRequest("fruit");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    sourceBuilder.query(QueryBuilders.matchAllQuery())
            .size(0)
            .aggregation(AggregationBuilders.terms("price_group")
                    .field("price"));
    searchRequest.source(sourceBuilder);
    try {
        SearchResponse searchResponse = getRestHighLevelClient().search(searchRequest, RequestOptions.DEFAULT);
        ParsedDoubleTerms priceGroup = searchResponse.getAggregations().get("price_group");
        priceGroup.getBuckets().forEach(bucket ->
                LOGGER.info("key = {}, doc_count = {}", bucket.getKey(), bucket.getDocCount()));
    } catch (IOException e) {
        e.printStackTrace();
    }
}
```
Metric aggregations

```java
@Test
public void testMaxAggs() {
    SearchRequest searchRequest = new SearchRequest("fruit");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    sourceBuilder.query(QueryBuilders.matchAllQuery())
            .size(0)
            // swap in AggregationBuilders.min/avg/sum for the other metrics
            .aggregation(AggregationBuilders.max("max_price").field("price"));
    searchRequest.source(sourceBuilder);
    try {
        SearchResponse searchResponse = getRestHighLevelClient().search(searchRequest, RequestOptions.DEFAULT);
        Aggregations aggregations = searchResponse.getAggregations();
        ParsedMax maxPrice = aggregations.get("max_price");
        LOGGER.info("value = {}", maxPrice.getValue());
    } catch (IOException e) {
        e.printStackTrace();
    }
}
```
Related Concepts
Cluster
A cluster is one or more nodes that together hold all of your data and jointly provide indexing and search. A cluster is identified by a unique name, which defaults to elasticsearch. The name matters: a node can only join a cluster by specifying that cluster's name.
Node
A node is a single server in your cluster; as part of the cluster it stores data and takes part in indexing and search. Like a cluster, a node is identified by a name assigned at startup (older releases defaulted to a random Marvel character name; recent versions default to the hostname).
Index
A collection of similar documents.
Mapping
Defines the structure of the documents stored in an index: fields, field types, and so on.
Document
A single record in an index; the smallest unit that can be indexed.
Shard
Elasticsearch can split an index into multiple pieces called shards; you choose the number of shards when you create the index. Each shard is itself a fully functional, independent "index".
Replica
One or more copies of an index's shards.
Setting up a cluster

```
# 1. Prepare 3 ES nodes (defaults: web 9200, tcp 9300)
# node-1: web 9201, tcp 9301 -> its own elasticsearch.yml
# node-2: web 9202, tcp 9302 -> its own elasticsearch.yml
# node-3: web 9203, tcp 9303 -> its own elasticsearch.yml
```
- Notes
  - Every node must use the same cluster name: cluster.name
  - Every node must have a unique name: node.name
  - Enable remote connections on each node: network.host: 0.0.0.0
  - Specify the IP address used for inter-node communication: network.publish_host
  - Change the web and tcp ports: http.port and transport.tcp.port
  - List all nodes for discovery, identical on node-1, node-2, and node-3: discovery.seed_hosts
  - Master-eligible nodes for cluster bootstrap: cluster.initial_master_nodes: ["node-1", "node-2", "node-3"]
  - Minimum number of nodes present before the cluster starts recovery: gateway.recover_after_nodes: 2
  - Enable cross-origin access on each node: http.cors.enabled: true, http.cors.allow-origin: "*"
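Putting those bullet points together, node-1's elasticsearch.yml might look like this (the cluster name, IP addresses, and ports are example values, assuming all three nodes run on one machine):

```yaml
# node-1/config/elasticsearch.yml -- example values only
cluster.name: es-cluster                 # must be identical on all nodes
node.name: node-1                        # must be unique per node
network.host: 0.0.0.0                    # allow remote connections
network.publish_host: 192.168.1.10      # IP used for inter-node traffic
http.port: 9201                          # web (REST) port
transport.tcp.port: 9301                 # tcp (inter-node) port
discovery.seed_hosts: ["192.168.1.10:9301", "192.168.1.10:9302", "192.168.1.10:9303"]
cluster.initial_master_nodes: ["node-1", "node-2", "node-3"]
gateway.recover_after_nodes: 2           # wait for 2 nodes before recovery
http.cors.enabled: true
http.cors.allow-origin: "*"
```

node-2 and node-3 differ only in node.name and the two port settings.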