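Miscellaneous Elasticsearch Operation Support
This chapter covers additional support for Elasticsearch operations that cannot be accessed directly through the repository interface. It is recommended to add these operations as a custom implementation, as described in Custom Repository Implementations.
Index settings
When creating Elasticsearch indices with Spring Data Elasticsearch, index settings can be defined with the @Setting annotation.
The following arguments are available:
- useServerConfiguration: does not send any settings parameters, so the Elasticsearch server configuration determines them.
- settingPath: refers to a JSON file that defines the settings; it must be resolvable on the classpath.
- shards: the number of shards to use, defaults to 1.
- replicas: the number of replicas, defaults to 1.
- refreshIntervall: the refresh interval, defaults to "1s".
- indexStoreType: the index store type, defaults to "fs".
Index sorting can be defined as well (check the linked Elasticsearch documentation for the possible field types and values):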
@Document(indexName = "entities")
@Setting(
    sortFields = { "secondField", "firstField" },                                  (1)
    sortModes = { Setting.SortMode.max, Setting.SortMode.min },                    (2)
    sortOrders = { Setting.SortOrder.desc, Setting.SortOrder.asc },
    sortMissingValues = { Setting.SortMissing._last, Setting.SortMissing._first })
class Entity {
    @Nullable
    @Id private String id;
    @Nullable
    @Field(name = "first_field", type = FieldType.Keyword)
    private String firstField;
    @Nullable @Field(name = "second_field", type = FieldType.Keyword)
    private String secondField;
    // getter and setter...
}
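1 | When defining sort fields, use the name of the Java property (firstField), not the name that might be defined for Elasticsearch (first_field) |
2 | sortModes, sortOrders and sortMissingValues are optional, but if they are set, the number of entries must match the number of sortFields elements |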
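The length rule for the optional sort attributes can be expressed in a few lines of plain Java. This is only an illustrative sketch: the SortSettingsCheck class and its isConsistent method are hypothetical and not part of Spring Data Elasticsearch, and an empty list stands for an attribute that was not set.

```java
import java.util.List;

// Hypothetical helper for illustration; not part of Spring Data Elasticsearch.
final class SortSettingsCheck {

    private SortSettingsCheck() {}

    /**
     * Returns true when every optional list is either empty (attribute not set)
     * or has exactly as many entries as sortFields.
     */
    static boolean isConsistent(List<String> sortFields, List<?>... optionalLists) {
        for (List<?> optional : optionalLists) {
            if (!optional.isEmpty() && optional.size() != sortFields.size()) {
                return false;
            }
        }
        return true;
    }
}
```

With the two sort fields from the entity above, two sort modes and no sort orders would pass this check, while a single sort mode would not.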
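Index Mapping
When Spring Data Elasticsearch creates the index mapping with the IndexOperations.createMapping() methods, it uses the annotations described in Mapping Annotation Overview, especially the @Field annotation.
In addition, the @Mapping annotation can be added to a class.
This annotation has the following properties:
- mappingPath: a classpath resource in JSON format; if this is not empty, it is used as the mapping and no other mapping processing is done.
- enabled: when set to false, this flag is written to the mapping and no further processing is done.
- dateDetection and numericDetection: set the corresponding properties in the mapping when not set to DEFAULT.
- dynamicDateFormats: when this String array is not empty, it defines the date formats used for automatic date detection.
- runtimeFieldsPath: a classpath resource in JSON format that contains the definition of runtime fields, which is written to the index mappings, for example: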
{
  "day_of_week": {
    "type": "keyword",
    "script": {
      "source": "emit(doc['@timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))"
    }
  }
}
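Filter Builder
Using a filter can improve query speed. The following example combines a match_all query with a term filter on the id field: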
private ElasticsearchOperations operations;
IndexCoordinates index = IndexCoordinates.of("sample-index");
Query query = NativeQuery.builder()
    .withQuery(q -> q
        .matchAll(ma -> ma))
    .withFilter(q -> q
        .bool(b -> b
            .must(m -> m
                .term(t -> t
                    .field("id")
                    .value(documentId))
            )))
    .build();
SearchHits<SampleEntity> sampleEntities = operations.search(query, SampleEntity.class, index);
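Using Scroll For Big Result Sets
Elasticsearch has a scroll API for retrieving big result sets in chunks.
Spring Data Elasticsearch uses it internally to provide the implementation of the <T> SearchHitsIterator<T> SearchOperations.searchForStream(Query query, Class<T> clazz, IndexCoordinates index) method.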
IndexCoordinates index = IndexCoordinates.of("sample-index");
Query searchQuery = NativeQuery.builder()
    .withQuery(q -> q
        .matchAll(ma -> ma))
    .withFields("message")
    .withPageable(PageRequest.of(0, 10))
    .build();
List<SampleEntity> sampleEntities = new ArrayList<>();
// try-with-resources closes the iterator even if the loop throws
try (SearchHitsIterator<SampleEntity> stream = elasticsearchOperations.searchForStream(searchQuery, SampleEntity.class,
        index)) {
    while (stream.hasNext()) {
        // next() returns a SearchHit<SampleEntity>; getContent() unwraps the entity
        sampleEntities.add(stream.next().getContent());
    }
}
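The SearchOperations API has no methods to access the scroll id. If access to it is necessary, the following methods of AbstractElasticsearchTemplate can be used (this is the base implementation for the different ElasticsearchOperations implementations):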
@Autowired ElasticsearchOperations operations;
AbstractElasticsearchTemplate template = (AbstractElasticsearchTemplate) operations;
IndexCoordinates index = IndexCoordinates.of("sample-index");
Query query = NativeQuery.builder()
    .withQuery(q -> q
        .matchAll(ma -> ma))
    .withFields("message")
    .withPageable(PageRequest.of(0, 10))
    .build();
SearchScrollHits<SampleEntity> scroll = template.searchScrollStart(1000, query, SampleEntity.class, index);
String scrollId = scroll.getScrollId();
// getSearchHits() returns SearchHit<SampleEntity> wrappers, so the list collects those
List<SearchHit<SampleEntity>> sampleEntities = new ArrayList<>();
while (scroll.hasSearchHits()) {
    sampleEntities.addAll(scroll.getSearchHits());
    scrollId = scroll.getScrollId();
    scroll = template.searchScrollContinue(scrollId, 1000, SampleEntity.class);
}
template.searchScrollClear(scrollId);
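To use the Scroll API with repository methods, the return type must be defined as Stream in the Elasticsearch repository.
The implementation of the method will then use the scroll methods from the ElasticsearchTemplate.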
interface SampleEntityRepository extends Repository<SampleEntity, String> {
    Stream<SampleEntity> findBy();
}
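Sort options
In addition to the default sort options described in Paging and Sorting, Spring Data Elasticsearch provides the class org.springframework.data.elasticsearch.core.query.Order, which derives from org.springframework.data.domain.Sort.Order.
It offers additional parameters that can be sent to Elasticsearch when specifying the sorting of the result (see www.elastic.co/guide/en/elasticsearch/reference/7.15/sort-search-results.html).
There is also the org.springframework.data.elasticsearch.core.query.GeoDistanceOrder class, which can be used to have the result of a search operation ordered by geographical distance.
If the class to be retrieved has a GeoPoint property named location, the following Sort would sort the results by distance to the given point: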
Sort.by(new GeoDistanceOrder("location", new GeoPoint(48.137154, 11.5761247)))
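To make the resulting order concrete, the sketch below sorts a few points by their great-circle distance to the same reference point in plain Java. It only illustrates the idea of a distance sort: the GeoDistanceSketch class and the Place record are hypothetical, the haversine formula with a mean Earth radius of 6371 km is an assumption of this example, and in a real search Elasticsearch performs the calculation on the server.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration; not part of Spring Data Elasticsearch.
final class GeoDistanceSketch {

    // A named point with latitude and longitude in degrees.
    record Place(String name, double lat, double lon) {}

    // Mean Earth radius in kilometers (an approximation).
    private static final double EARTH_RADIUS_KM = 6371.0;

    // Great-circle distance in kilometers between two points (haversine formula).
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // Returns a new list ordered by ascending distance to the reference point.
    static List<Place> sortByDistance(List<Place> places, double refLat, double refLon) {
        List<Place> sorted = new ArrayList<>(places);
        sorted.sort(Comparator.comparingDouble(
                (Place p) -> distanceKm(refLat, refLon, p.lat(), p.lon())));
        return sorted;
    }
}
```

Calling sortByDistance with the reference point 48.137154, 11.5761247 returns the places nearest to that point first.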
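Runtime Fields
From version 7.12 on, Elasticsearch has the feature of runtime fields (www.elastic.co/guide/en/elasticsearch/reference/7.12/runtime.html). Spring Data Elasticsearch supports this in two ways:
Runtime field definitions in the index mappings
The first way to define runtime fields is to add the definitions to the index mappings (see www.elastic.co/guide/en/elasticsearch/reference/7.12/runtime-mapping-fields.html). To use this approach in Spring Data Elasticsearch, the user must provide a JSON file that contains the corresponding definition, for example: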
{
  "day_of_week": {
    "type": "keyword",
    "script": {
      "source": "emit(doc['@timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))"
    }
  }
}
The path to this JSON file, which must be present on the classpath, must then be set in the @Mapping annotation of the entity:
@Document(indexName = "runtime-fields")
@Mapping(runtimeFieldsPath = "/runtime-fields.json")
public class RuntimeFieldEntity {
    // properties, getter, setter,...
}
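Runtime field definitions set on a Query
The second way to define runtime fields is to add the definitions to a search query (see www.elastic.co/guide/en/elasticsearch/reference/7.12/runtime-search-request.html). The following code example shows how to do this with Spring Data Elasticsearch.
The entity used is a simple object that has a price property: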
@Document(indexName = "some_index_name")
public class SomethingToBuy {
    private @Id @Nullable String id;
    @Nullable @Field(type = FieldType.Text) private String description;
    @Nullable @Field(type = FieldType.Double) private Double price;
    // getter and setter
}
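The following query uses a runtime field that calculates a priceWithTax value by adding 19% to the price, and uses this value in the search query to find all entities where priceWithTax is greater than or equal to a given value: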
RuntimeField runtimeField = new RuntimeField("priceWithTax", "double", "emit(doc['price'].value * 1.19)");
Query query = new CriteriaQuery(new Criteria("priceWithTax").greaterThanEqual(16.5));
query.addRuntimeField(runtimeField);
SearchHits<SomethingToBuy> searchHits = operations.search(query, SomethingToBuy.class);
This works with every implementation of the Query interface.
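What the runtime field script and the criteria compute together can be mirrored in plain Java. The sketch below is only an illustration: the PriceWithTaxSketch class is hypothetical, and in a real search the script runs inside Elasticsearch, not in the application.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration; not part of Spring Data Elasticsearch.
final class PriceWithTaxSketch {

    private PriceWithTaxSketch() {}

    // Same arithmetic as the script "emit(doc['price'].value * 1.19)".
    static double priceWithTax(double price) {
        return price * 1.19;
    }

    // Keeps the prices whose value including tax is at least the threshold,
    // mirroring new Criteria("priceWithTax").greaterThanEqual(threshold).
    static List<Double> matching(List<Double> prices, double threshold) {
        return prices.stream()
                .filter(price -> priceWithTax(price) >= threshold)
                .collect(Collectors.toList());
    }
}
```

For the threshold 16.5 used above, a price of 10.0 is filtered out, while prices of 14.0 and 20.0 are kept.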
Point In Time (PIT) API
ElasticsearchOperations supports the point in time API of Elasticsearch (see www.elastic.co/guide/en/elasticsearch/reference/8.3/point-in-time-api.html).
The following code snippet shows how to use this feature with a fictional Person class:
ElasticsearchOperations operations; // autowired
Duration tenSeconds = Duration.ofSeconds(10);
String pit = operations.openPointInTime(IndexCoordinates.of("person"), tenSeconds); (1)
// create query for the pit
Query query1 = new CriteriaQueryBuilder(Criteria.where("lastName").is("Smith"))
    .withPointInTime(new Query.PointInTime(pit, tenSeconds))                        (2)
    .build();
SearchHits<Person> searchHits1 = operations.search(query1, Person.class);
// do something with the data
// create 2nd query for the pit, use the id returned in the previous result
Query query2 = new CriteriaQueryBuilder(Criteria.where("lastName").is("Miller"))
    .withPointInTime(
        new Query.PointInTime(searchHits1.getPointInTimeId(), tenSeconds))          (3)
    .build();
SearchHits<Person> searchHits2 = operations.search(query2, Person.class);
// do something with the data
operations.closePointInTime(searchHits2.getPointInTimeId());                        (4)
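1 | Create a point in time for an index (can be multiple names) and a keep-alive duration, and retrieve its id |
2 | Pass that id into the query to search together with the next keep-alive value |
3 | For the next query, use the id returned from the previous search |
4 | When done, close the point in time using the last returned id |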
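Search Template support
Use of the search template API is supported.
To use it, a stored script must be created first.
The ElasticsearchOperations interface extends ScriptOperations, which provides the necessary functions.
The example used here assumes that we have a Person entity with a property named firstName.
A search template script can be saved like this: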
import org.springframework.data.elasticsearch.core.ElasticsearchOperations;
import org.springframework.data.elasticsearch.core.script.Script;

operations.putScript(                            (1)
    Script.builder()
        .withId("person-firstname")              (2)
        .withLanguage("mustache")                (3)
        .withSource("""                          (4)
            {
              "query": {
                "bool": {
                  "must": [
                    {
                      "match": {
                        "firstName": "{{firstName}}"   (5)
                      }
                    }
                  ]
                }
              },
              "from": "{{from}}",                      (6)
              "size": "{{size}}"                       (7)
            }
            """)
        .build()
);
1 | Use the putScript() method to store a search template script |
2 | The name / id of the script |
3 | Scripts that are used in search templates must be in the mustache language |
4 | The script source |
5 | The search parameter in the script |
6 | Paging request offset |
7 | Paging request size |
To use a search template in a search query, Spring Data Elasticsearch provides the SearchTemplateQuery, an implementation of the org.springframework.data.elasticsearch.core.query.Query interface.
In the following code, we add a call using a search template query to a custom repository implementation (see Custom Repository Implementations) as an example of how this can be integrated into a repository call.
We first define the custom repository fragment interface:
interface PersonCustomRepository {
    SearchPage<Person> findByFirstNameWithSearchTemplate(String firstName, Pageable pageable);
}
The implementation of this repository fragment looks like this:
public class PersonCustomRepositoryImpl implements PersonCustomRepository {
    private final ElasticsearchOperations operations;
    public PersonCustomRepositoryImpl(ElasticsearchOperations operations) {
        this.operations = operations;
    }
    @Override
    public SearchPage<Person> findByFirstNameWithSearchTemplate(String firstName, Pageable pageable) {
        var query = SearchTemplateQuery.builder()                               (1)
            .withId("person-firstname")                                         (2)
            .withParams(
                Map.of(                                                         (3)
                    "firstName", firstName,
                    "from", pageable.getOffset(),
                    "size", pageable.getPageSize()
                )
            )
            .build();
        SearchHits<Person> searchHits = operations.search(query, Person.class); (4)
        return SearchHitSupport.searchPageFor(searchHits, pageable);
    }
}
1 | Create a SearchTemplateQuery |
2 | Provide the id of the search template |
3 | The parameters are passed in a Map<String,Object> |
4 | Do the search in the same way as with the other query types |
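To see how the parameter map and the stored script fit together, the following plain Java sketch builds the same three parameters from a page number and a page size and fills them into a template string. It is an illustration only: TemplateParamsSketch and its methods are hypothetical, the naive string replacement is a stand-in for the Mustache rendering that Elasticsearch performs on the server (it does no escaping), and the offset formula page * size mirrors what Pageable.getOffset() returns for a PageRequest.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; not part of Spring Data Elasticsearch.
final class TemplateParamsSketch {

    private TemplateParamsSketch() {}

    // Builds the parameters expected by the "person-firstname" template.
    static Map<String, Object> params(String firstName, int page, int size) {
        Map<String, Object> params = new LinkedHashMap<>();
        params.put("firstName", firstName);
        params.put("from", (long) page * size); // offset of the first hit on the page
        params.put("size", size);
        return params;
    }

    // Naive stand-in for Mustache: replaces each {{key}} with the parameter value.
    static String render(String template, Map<String, Object> params) {
        String result = template;
        for (Map.Entry<String, Object> entry : params.entrySet()) {
            result = result.replace("{{" + entry.getKey() + "}}", String.valueOf(entry.getValue()));
        }
        return result;
    }
}
```

For the third page (page index 2) with a page size of 10, the from parameter is 20 and the size parameter is 10.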
Nested sort
Spring Data Elasticsearch supports sorting within nested objects (www.elastic.co/guide/en/elasticsearch/reference/8.9/sort-search-results.html#nested-sorting).
The following example, taken from the org.springframework.data.elasticsearch.core.query.sort.NestedSortIntegrationTests class, shows how to define the nested sort.
var filter = StringQuery.builder("""
    { "term": {"movies.actors.sex": "m"} }
    """).build();
var order = new org.springframework.data.elasticsearch.core.query.Order(Sort.Direction.DESC,
    "movies.actors.yearOfBirth")
    .withNested(
        Nested.builder("movies")
            .withNested(
                Nested.builder("movies.actors")
                    .withFilter(filter)
                    .build())
            .build());
var query = Query.findAll().addSort(Sort.by(order));
About the filter query: it is not possible to use a CriteriaQuery here, as this query would be converted into an Elasticsearch nested query, which does not work in the filter context. So only StringQuery or NativeQuery can be used here. When using one of these, like the term query above, the Elasticsearch field names must be used, so take care when these are redefined with the @Field(name="…") definition.
For the definition of the order path and the nested paths, the Java entity property names should be used.