Building a Chinese Semantic Text Similarity (STS) Corpus

Semantic Text Similarity (STS) is a fundamental problem in natural language processing.



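Project significance: training data for English STS is already fairly abundant, whereas Chinese STS corpora remain scarce, and a corpus is the basic starting point for applying deep learning to text.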

Project start date: 2016-06-06 06:06:06

If you cite or use this training set, please credit the authors: 唐善成, 白云悦, 马付玉. 中文语义相似度训练集. 西安科技大学. 2016. IAdmireu/ChineseSTS

Tang Shancheng, Bai Yunyue, Ma Fuyu. Chinese Semantic Text Similarity Training Dataset. Xi'an University of Science and Technology. 2016. IAdmireu/ChineseSTS

