Elasticsearch Analysis Mmseg

The Mmseg Analysis plugin integrates Lucene mmseg4j-analyzer:http://code.google.com/p/mmseg4j/ into elasticsearch, support customized dictionary.
Alternatives To Elasticsearch Analysis Mmseg
Project NameStarsDownloadsRepos Using ThisPackages Using ThisMost Recent CommitTotal ReleasesLatest ReleaseOpen IssuesLicenseLanguage
Elasticsearch Analysis Ik15,4976119 days ago17January 15, 2018382apache-2.0Java
The IK Analysis plugin integrates Lucene IK analyzer into elasticsearch, support customized dictionary.
Awesome Elasticsearch4,616
2 months ago2unlicense
A curated list of the most important and useful resources about elasticsearch: articles, videos, blogs, tips and tricks, use cases. All about Elasticsearch!
Crate3,7724110 hours ago13October 25, 2016262apache-2.0Java
CrateDB is a distributed and scalable SQL database for storing and analyzing massive amounts of data in near real-time, even with complex queries. It is PostgreSQL-compatible, and based on Lucene.
Ai_tutorial1,810
10 hours ago
精选机器学习,NLP,图像识别, 深度学习等人工智能领域学习资料,搜索,推荐,广告系统架构及算法技术资料整理。算法大牛笔记汇总
Elassandra1,6331412 years ago17September 01, 202041apache-2.0Java
Elassandra = Elasticsearch + Apache Cassandra
Luke1,444
4 years agon,ullapache-2.0Java
This is mavenised Luke: Lucene Toolbox Project
Fess911122513 hours ago145September 17, 202326apache-2.0Java
Fess is very powerful and easily deployable Enterprise Search Server.
Hibernate Search4788405517 hours ago141October 18, 20213otherJava
Hibernate Search: full-text search for domain model
Elasticsearch Analysis Mmseg350
6 years ago7March 02, 20177apache-2.0Java
The Mmseg Analysis plugin integrates Lucene mmseg4j-analyzer:http://code.google.com/p/mmseg4j/ into elasticsearch, support customized dictionary.
Fast Elasticsearch Vector Scoring348
2 years ago22apache-2.0Java
Score documents using embedding-vectors dot-product or cosine-similarity with ES Lucene engine
Alternatives To Elasticsearch Analysis Mmseg
Select To Compare


Alternative Project Comparisons
Readme

Mmseg Analysis for Elasticsearch

The Mmseg Analysis plugin integrates Lucene mmseg4j-analyzer:http://code.google.com/p/mmseg4j/ into elasticsearch, support customized dictionary.

The plugin ships with analyzers: mmseg_maxword ,mmseg_complex ,mmseg_simple and tokenizers: mmseg_maxword ,mmseg_complex ,mmseg_simple and token_filter: cut_letter_digit .

Versions

Mmseg ver ES version
master 5.x -> master
5.5.2 5.5.2
5.4.3 5.4.3
5.3.2 5.3.2
5.2.2 5.2.2
5.1.2 5.1.2
1.10.1 2.4.1
1.9.5 2.3.5
1.8.1 2.2.1
1.7.0 2.1.1
1.5.0 2.0.0
1.4.0 1.7.0
1.3.0 1.6.0
1.2.1 0.90.2
1.1.2 0.20.1

Package

mvn package

Install

Unzip and place into elasticsearch's plugins folder, download plugin from here: https://github.com/medcl/elasticsearch-analysis-mmseg/releases

Install by command: ./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-mmseg/releases/download/v5.5.2/elasticsearch-analysis-mmseg-5.5.2.zip

Mapping Configuration

Here is a quick example:

1.Create a index

curl -XPUT http://localhost:9200/index

2.Create a mapping

curl -XPOST http://localhost:9200/index/fulltext/_mapping -d'
{
        "properties": {
            "content": {
                "type": "text",
                "term_vector": "with_positions_offsets",
                "analyzer": "mmseg_maxword",
                "search_analyzer": "mmseg_maxword"
            }
        }
    
}'

3.Indexing some docs

curl -XPOST http://localhost:9200/index/fulltext/1 -d'
{"content":"美国留给伊拉克的是个烂摊子吗"}
'

curl -XPOST http://localhost:9200/index/fulltext/2 -d'
{"content":"公安部:各地校车将享最高路权"}
'

curl -XPOST http://localhost:9200/index/fulltext/3 -d'
{"content":"中韩渔警冲突调查:韩警平均每天扣1艘中国渔船"}
'

curl -XPOST http://localhost:9200/index/fulltext/4 -d'
{"content":"中国驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首"}
'

4.Query with highlighting

curl -XPOST http://localhost:9200/index/fulltext/_search  -d'
{
    "query" : { "term" : { "content" : "中国" }},
    "highlight" : {
        "pre_tags" : ["<tag1>", "<tag2>"],
        "post_tags" : ["</tag1>", "</tag2>"],
        "fields" : {
            "content" : {}
        }
    }
}
'

Here is the query result


{
    "took": 14,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 2,
        "max_score": 2,
        "hits": [
            {
                "_index": "index",
                "_type": "fulltext",
                "_id": "4",
                "_score": 2,
                "_source": {
                    "content": "中国驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首"
                },
                "highlight": {
                    "content": [
                        "<tag1>中国</tag1>驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首 "
                    ]
                }
            },
            {
                "_index": "index",
                "_type": "fulltext",
                "_id": "3",
                "_score": 2,
                "_source": {
                    "content": "中韩渔警冲突调查:韩警平均每天扣1艘中国渔船"
                },
                "highlight": {
                    "content": [
                        "均每天扣1艘<tag1>中国</tag1>渔船 "
                    ]
                }
            }
        ]
    }
}

Have fun.

Popular Elasticsearch Projects
Popular Lucene Projects
Popular Data Storage Categories

Get A Weekly Email With Trending Projects For These Categories
No Spam. Unsubscribe easily at any time.
Java
Plugin
Elasticsearch
Lucene