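A facet is an aspect of a thing, and a facet query in Lucene is a query over those aspects. Take mobile phones as an example. A phone has several facets, such as brand, model, and carrier, and different combinations of facet values pick out a single phone or a set of phones. Brand = Xiaomi with carrier = China Mobile yields every Xiaomi model released by China Mobile (Xiaomi 1, Xiaomi 2, Xiaomi 3, and so on). Brand = Xiaomi with model = Xiaomi 4 yields every carrier edition of the Xiaomi 4 (the China Mobile, China Unicom, and China Telecom editions, among others). We often search this way in practice: fix the brand first, then narrow by model, carrier, and other aspects until only the wanted results remain. The rest of this article shows how to use facets in Lucene.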
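Before turning to Lucene, the idea of narrowing by facets can be sketched in plain Java. This is only an illustration of the concept, not Lucene code: the `Phone` class, the `PHONES` list, and both helper methods are hypothetical names, and the data is made up to mirror the example above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class FacetConceptSketch {
    /** Hypothetical value type: one phone described by three facets. */
    static final class Phone {
        final String brand;
        final String model;
        final String carrier;

        Phone(String brand, String model, String carrier) {
            this.brand = brand;
            this.model = model;
            this.carrier = carrier;
        }
    }

    // Made-up sample data that follows the phone example in the text.
    static final List<Phone> PHONES = Arrays.asList(
            new Phone("Xiaomi", "Xiaomi 1", "China Mobile"),
            new Phone("Xiaomi", "Xiaomi 2", "China Mobile"),
            new Phone("Xiaomi", "Xiaomi 3", "China Mobile"),
            new Phone("Xiaomi", "Xiaomi 4", "China Mobile"),
            new Phone("Xiaomi", "Xiaomi 4", "China Unicom"),
            new Phone("Xiaomi", "Xiaomi 4", "China Telecom"),
            new Phone("Huawei", "Honor 6", "China Mobile"));

    /** Models that remain once the brand and carrier facets are fixed. */
    static List<String> modelsFor(String brand, String carrier) {
        return PHONES.stream()
                .filter(p -> p.brand.equals(brand) && p.carrier.equals(carrier))
                .map(p -> p.model)
                .collect(Collectors.toList());
    }

    /** Carriers that remain once the brand and model facets are fixed. */
    static List<String> carriersFor(String brand, String model) {
        return PHONES.stream()
                .filter(p -> p.brand.equals(brand) && p.model.equals(model))
                .map(p -> p.carrier)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(modelsFor("Xiaomi", "China Mobile"));
        System.out.println(carriersFor("Xiaomi", "Xiaomi 4"));
    }
}
```

Each call fixes two facets and lists the values left in the third, which is the same narrowing that Lucene's drill-down performs over an index.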
1. Add the dependency
<!-- Dependency for facet search -->
<!-- https://mvnrepository.com/artifact/org.apache.lucene/lucene-facet -->
<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-facet</artifactId>
    <version>7.2.1</version>
</dependency>
2. Build the taxonomy index (TaxonomyIndex) alongside the regular index
@Test
public void buildIndex() throws Exception {
    // indexDir and taxoDir are presumably String fields of the test class holding the two index paths
    Directory directory = FSDirectory.open(Paths.get(indexDir));
    IndexWriter writer = new IndexWriter(directory, new IndexWriterConfig(new WhitespaceAnalyzer()));
    // DirectoryTaxonomyWriter writes the taxonomy index that facet search requires
    Directory taxoDirectory = FSDirectory.open(Paths.get(taxoDir));
    DirectoryTaxonomyWriter taxoWriter = new DirectoryTaxonomyWriter(taxoDirectory);
    FacetsConfig config = new FacetsConfig();

    Document doc = new Document();
    doc.add(new TextField("device", "手机", Field.Store.YES));
    doc.add(new TextField("name", "米1", Field.Store.YES));
    doc.add(new FacetField("brand", "小米"));
    doc.add(new FacetField("network", "移动4G"));
    // config.build updates the taxonomy index while the document is added to the main index
    writer.addDocument(config.build(taxoWriter, doc));

    doc = new Document();
    doc.add(new TextField("device", "手机", Field.Store.YES));
    doc.add(new TextField("name", "米4", Field.Store.YES));
    doc.add(new FacetField("brand", "小米"));
    doc.add(new FacetField("network", "联通4G"));
    writer.addDocument(config.build(taxoWriter, doc));

    doc = new Document();
    doc.add(new TextField("device", "手机", Field.Store.YES));
    doc.add(new TextField("name", "荣耀6", Field.Store.YES));
    doc.add(new FacetField("brand", "华为"));
    doc.add(new FacetField("network", "移动4G"));
    writer.addDocument(config.build(taxoWriter, doc));

    // A TV rather than a phone, and with no network facet
    doc = new Document();
    doc.add(new TextField("device", "电视", Field.Store.YES));
    doc.add(new TextField("name", "小米电视2", Field.Store.YES));
    doc.add(new FacetField("brand", "小米"));
    writer.addDocument(config.build(taxoWriter, doc));

    writer.close();
    taxoWriter.close();
}
3. Drill down by dimension and retrieve the per-dimension counts
/**
 * Tests facet search.
 * @throws Exception
 */
@Test
public void testFacetSearch() throws Exception {
    Directory directory = FSDirectory.open(Paths.get(indexDir));
    DirectoryReader indexReader = DirectoryReader.open(directory);
    IndexSearcher searcher = new IndexSearcher(indexReader);
    // A taxonomy reader is needed as well
    Directory taxoDirectory = FSDirectory.open(Paths.get(taxoDir));
    TaxonomyReader taxoReader = new DirectoryTaxonomyReader(taxoDirectory);
    FacetsConfig config = new FacetsConfig();
    // A FacetsCollector is required: it gathers the hits that the facet counts are computed from
    FacetsCollector facetsCollector = new FacetsCollector();

    // 1. Base query on the device field: 手机 (phone)
    System.out.println("---------手机----------");
    TermQuery query = new TermQuery(new Term("device", "手机"));
    TopDocs docs = FacetsCollector.search(searcher, query, 10, facetsCollector);
    // printDocs is a helper not shown here; judging by the output it prints each hit's stored fields
    printDocs(docs, searcher);
    System.out.println("----------facet-----------");
    Facets facets = new FastTaxonomyFacetCounts(taxoReader, config, facetsCollector);
    List<FacetResult> results = facets.getAllDims(10);
    // Print the counts for every dimension
    for (FacetResult tmp : results) {
        System.out.println(tmp);
    }
    System.out.println("=======================");

    // 2. Drill down: select brand = 小米 (Xiaomi)
    System.out.println("-----小米手机-----");
    DrillDownQuery drillDownQuery = new DrillDownQuery(config, query);
    drillDownQuery.add("brand", "小米");
    // Use a new collector; reusing the old one would add to the earlier counts
    FacetsCollector fc1 = new FacetsCollector();
    docs = FacetsCollector.search(searcher, drillDownQuery, 10, fc1);
    printDocs(docs, searcher);
    System.out.println("----------facet-----------");
    facets = new FastTaxonomyFacetCounts(taxoReader, config, fc1);
    results = facets.getAllDims(10);
    // Distribution of Xiaomi phones: 2 in total; network: 移动4G 1, 联通4G 1
    for (FacetResult tmp : results) {
        System.out.println(tmp);
    }
    System.out.println("=======================");

    // 3. Drill down further: with brand = 小米 selected, also select network = 移动4G
    System.out.println("-----移动4G小米手机-----");
    // Note that the same DrillDownQuery is reused
    drillDownQuery.add("network", "移动4G");
    FacetsCollector fc2 = new FacetsCollector();
    docs = FacetsCollector.search(searcher, drillDownQuery, 10, fc2);
    printDocs(docs, searcher);
    System.out.println("----------facet-----------");
    facets = new FastTaxonomyFacetCounts(taxoReader, config, fc2);
    results = facets.getAllDims(10);
    for (FacetResult tmp : results) {
        System.out.println(tmp);
    }
    System.out.println("=======================");

    // Use DrillSideways to also see the counts of the sibling values in the drilled-down dimension
    System.out.println("-----小米手机drill sideways-----");
    DrillSideways ds = new DrillSideways(searcher, config, taxoReader);
    DrillDownQuery drillDownQuery1 = new DrillDownQuery(config, query);
    drillDownQuery1.add("brand", "小米");
    DrillSideways.DrillSidewaysResult result = ds.search(drillDownQuery1, 10);
    docs = result.hits;
    printDocs(docs, searcher);
    System.out.println("----------facet-----------");
    results = result.facets.getAllDims(10);
    for (FacetResult tmp : results) {
        System.out.println(tmp);
    }
    System.out.println("=======================");

    indexReader.close();
    taxoReader.close();
}
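The query output is shown below. Each additional drill-down narrows the result set. In the final drill-sideways block the hits are still the two Xiaomi phones, but the brand dimension also reports the count for 华为.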
---------手机----------
device:手机
name:米1
device:手机
name:米4
device:手机
name:荣耀6
----------facet-----------
dim=brand path=[] value=3 childCount=2
小米 (2)
华为 (1)
dim=network path=[] value=3 childCount=2
移动4G (2)
联通4G (1)
=======================
-----小米手机-----
device:手机
name:米1
device:手机
name:米4
----------facet-----------
dim=brand path=[] value=2 childCount=1
小米 (2)
dim=network path=[] value=2 childCount=2
移动4G (1)
联通4G (1)
=======================
-----移动4G小米手机-----
device:手机
name:米1
----------facet-----------
dim=brand path=[] value=1 childCount=1
小米 (1)
dim=network path=[] value=1 childCount=1
移动4G (1)
=======================
-----小米手机drill sideways-----
device:手机
name:米1
device:手机
name:米4
----------facet-----------
dim=brand path=[] value=3 childCount=2
小米 (2)
华为 (1)
dim=network path=[] value=2 childCount=2
移动4G (1)
联通4G (1)
=======================
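The difference between the plain drill-down counts and the drill-sideways counts in the output above can be reproduced without Lucene. The sketch below is only a model of the counting rule, not how Lucene implements `DrillSideways`: the class name, the `doc` helper, and the `count` method are hypothetical, and the four in-memory documents copy the ones indexed earlier. The rule assumed here is that a drill-sideways count for a dimension ignores the drill-down filter on that same dimension and keeps all other filters.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class DrillSidewaysSketch {
    /** Builds a document from alternating field names and values. */
    static Map<String, String> doc(String... kv) {
        Map<String, String> m = new HashMap<>();
        for (int i = 0; i + 1 < kv.length; i += 2) {
            m.put(kv[i], kv[i + 1]);
        }
        return m;
    }

    // The same four documents that buildIndex() adds to the index.
    static final List<Map<String, String>> DOCS = Arrays.asList(
            doc("device", "手机", "name", "米1", "brand", "小米", "network", "移动4G"),
            doc("device", "手机", "name", "米4", "brand", "小米", "network", "联通4G"),
            doc("device", "手机", "name", "荣耀6", "brand", "华为", "network", "移动4G"),
            doc("device", "电视", "name", "小米电视2", "brand", "小米"));

    /**
     * Counts the values of dim among documents matching the base query
     * (device) and the drill-down filters. With sideways = true, the filter
     * on dim itself is skipped, so sibling values stay visible.
     */
    static Map<String, Integer> count(String dim, String device,
                                      Map<String, String> drillDowns, boolean sideways) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map<String, String> d : DOCS) {
            if (!device.equals(d.get("device"))) {
                continue; // base query does not match
            }
            boolean ok = true;
            for (Map.Entry<String, String> e : drillDowns.entrySet()) {
                if (sideways && e.getKey().equals(dim)) {
                    continue; // ignore the filter on the dimension being counted
                }
                if (!e.getValue().equals(d.get(e.getKey()))) {
                    ok = false;
                    break;
                }
            }
            String value = d.get(dim);
            if (ok && value != null) {
                counts.merge(value, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, String> drillDowns = new HashMap<>();
        drillDowns.put("brand", "小米");
        System.out.println("drill-down brand:     " + count("brand", "手机", drillDowns, false));
        System.out.println("drill-sideways brand: " + count("brand", "手机", drillDowns, true));
        System.out.println("network (either way): " + count("network", "手机", drillDowns, true));
    }
}
```

With brand = 小米 selected, the plain drill-down count for brand contains only 小米 (2), while the sideways count also includes 华为 (1). The network counts are the same in both modes because no filter is set on network. This matches the figures in the last two facet blocks of the output.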
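Copyright notice: this is an original article by m0_37556444, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.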