Project Name | Stars | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language | Description |
---|---|---|---|---|---|---|---|---|
Qdrant | 10,572 | a day ago | | | 88 | apache-2.0 | Rust | Vector database for the next generation of AI applications. Also available in the cloud: https://cloud.qdrant.io/ |
Deep Learning For Recommendation Systems | 2,141 | 3 years ago | | | | | | Deep learning based articles, papers, and repositories for recommender systems |
Hora | 2,091 | 2 years ago | 2 | August 07, 2021 | 13 | apache-2.0 | Rust | 🚀 Efficient approximate nearest neighbor search algorithm collections library written in Rust 🦀 |
Reclearn | 1,406 | a year ago | | | 5 | mit | Python | Recommender Learning with Tensorflow2.x |
Lectures Labs | 1,309 | 6 months ago | | | 2 | mit | Jupyter Notebook | Slides and Jupyter notebooks for the Deep Learning lectures at Master Year 2 Data Science from Institut Polytechnique de Paris |
Machine Learning Specialization Coursera | 1,179 | a day ago | | | 4 | mit | Jupyter Notebook | Solutions and notes for the Machine Learning Specialization by Stanford University and Deeplearning.ai on Coursera (2022), taught by Prof. Andrew Ng |
Deeprec | 1,068 | a year ago | | | 8 | gpl-3.0 | Python | An open-source toolkit for deep learning based recommendation with Tensorflow |
Recsys2019_deeplearning_evaluation | 871 | a year ago | | | 1 | agpl-3.0 | Python | Repository for the RecSys 2019 article "Are We Really Making Much Progress? A Worrying Analysis of Recent Neural Recommendation Approaches" and several follow-up studies |
Neurec | 861 | 2 years ago | 1 | November 04, 2019 | 15 | | Python | Next RecSys Library |
Tutorials | 847 | 5 months ago | | | 3 | other | Jupyter Notebook | AI-related tutorials. Access any of them for free → https://towardsai.net/editorial |
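Open-source project: Recommender System with TF2.0
This project mainly reproduces classic recommendation papers, covering Matching (recall) models (MF, BPR, SASRec, etc.) and Ranking models (DeepFM, DCN, etc.).
Motivation:
Project features:
`README.md`, which describes in detail how to train and use the models;
[2021.11.17] A new branch, `reclearn`, was created. It reorganizes the contents of master into a package for learning recommendation algorithms, which can be installed with `pip install reclearn`; see reclearn for details.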
[2021.05.19] Wide&Deep model: the Wide part previously took continuous data as input; it now takes sparse discrete data.
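[2021.05.18] Several updates:
- `data_process` folder: `utils.py` from the CTR models was moved into this folder and renamed `criteo.py`; from now on, all model training uses the processed data from this folder;
- the continuous features (`I1-I13`) are discretized into buckets and merged with the discrete features;
- `tf.one_hot` is no longer used; `tf.nn.embedding_lookup` is used instead, so the encoding is done by lookup (mapping).

[2020.12.20] For Top-K models evaluated with a 1:100 ratio of positive to negative samples (MF-BPR, SASRec, etc.), the previous evaluation code was too slow, so it was reworked (evaluation time is now much shorter); `utils.py` was updated as well.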
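The 1:100 evaluation mentioned above ranks one held-out positive item against 100 sampled negatives per user. The sketch below illustrates that protocol in plain Python. The choice of metrics (HR@K and NDCG@K), the helper names, and the toy scores are assumptions for illustration; the text does not say which metrics the project reports.

```python
import math

def rank_of_positive(pos_score, neg_scores):
    """0-based rank of the positive item among the negatives (ties count against it)."""
    return sum(1 for s in neg_scores if s >= pos_score)

def hr_at_k(rank, k):
    """Hit rate: 1 if the positive item is inside the top k, else 0."""
    return 1.0 if rank < k else 0.0

def ndcg_at_k(rank, k):
    # With a single relevant item the ideal DCG is 1, so NDCG = 1 / log2(rank + 2).
    return 1.0 / math.log2(rank + 2) if rank < k else 0.0

def evaluate(score_rows, k=10):
    """score_rows: list of (pos_score, [neg_scores]) pairs, one per test user."""
    ranks = [rank_of_positive(p, negs) for p, negs in score_rows]
    n = len(ranks)
    return (sum(hr_at_k(r, k) for r in ranks) / n,
            sum(ndcg_at_k(r, k) for r in ranks) / n)

# Two toy users, each with 1 positive and 100 negatives.
rows = [
    (0.9, [0.95] * 2 + [0.1] * 98),   # positive is ranked 3rd (rank 2)
    (0.2, [0.5] * 50 + [0.1] * 50),   # positive is ranked 51st (rank 50)
]
hr, ndcg = evaluate(rows, k=10)
print(hr, ndcg)
```

Computing one rank per user and deriving every metric from it avoids sorting all 101 scores, which is one common way to keep this kind of evaluation cheap.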
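[2020.11.18] The Top-K models no longer take `dense_inputs` and `sparse_inputs`, and `user_inputs` and `seq_inputs` no longer cover multiple feature categories; only the `id` feature is used as input (this makes the models less extensible but easier to read).
[2020.11.18] The BPR and SASRec models were updated, and experimental results were added.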
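The 2021.05.18 update above buckets the continuous features and replaces one-hot encoding with an embedding lookup. As a minimal, framework-free sketch (plain Python rather than TensorFlow; the bucket boundaries, table values, and helper names are made up for illustration), the snippet below maps a continuous value to a bucket index and shows that looking up a row by index gives the same vector as multiplying a one-hot vector by the embedding table.

```python
from bisect import bisect_right

# Hypothetical bucket boundaries for one continuous feature (e.g. one of I1-I13).
BOUNDARIES = [1.0, 5.0, 20.0]  # 4 buckets: (-inf, 1), [1, 5), [5, 20), [20, inf)

def bucketize(value, boundaries=BOUNDARIES):
    """Map a continuous value to a discrete bucket index."""
    return bisect_right(boundaries, value)

# Hypothetical embedding table: one row per bucket, embedding dimension 2.
EMBEDDING_TABLE = [
    [0.1, 0.2],
    [0.3, 0.4],
    [0.5, 0.6],
    [0.7, 0.8],
]

def one_hot(index, depth):
    return [1.0 if i == index else 0.0 for i in range(depth)]

def one_hot_matmul(index, table):
    """One-hot vector times the table: touches every row of the table."""
    vec = one_hot(index, len(table))
    dim = len(table[0])
    return [sum(vec[r] * table[r][c] for r in range(len(table))) for c in range(dim)]

def embedding_lookup(index, table):
    """Direct row lookup: same result without building the one-hot vector."""
    return list(table[index])

idx = bucketize(7.5)
print(idx, embedding_lookup(idx, EMBEDDING_TABLE), one_hot_matmul(idx, EMBEDDING_TABLE))
```

Once a continuous feature has been turned into a bucket index, it can share the same lookup path as the categorical features, which is presumably why the two kinds of data are merged.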
Paper | Model | Published | Author |
---|---|---|---|
Matrix Factorization Techniques for Recommender Systems | MF | IEEE Computer Society, 2009 | Koren, Yahoo Research |
BPR: Bayesian Personalized Ranking from Implicit Feedback | MF-BPR | UAI, 2009 | Steffen Rendle |
Neural Collaborative Filtering | NCF | WWW, 2017 | Xiangnan He |
Self-Attentive Sequential Recommendation | SASRec | ICDM, 2018 | UCSD |
STAMP: Short-Term Attention/Memory Priority Model for Session-based Recommendation | STAMP | KDD, 2018 | Qiao Liu |
Personalized Top-N Sequential Recommendation via Convolutional Sequence Embedding | Caser | WSDM, 2018 | Jiaxi Tang |
Next Item Recommendation with Self-Attentive Metric Learning | AttRec | AAAI, 2019 | Shuai Zhang |
Paper | Model | Published | Author |
---|---|---|---|
Factorization Machines | FM | ICDM, 2010 | Steffen Rendle |
Field-aware Factorization Machines for CTR Prediction | FFM | RecSys, 2016 | Criteo Research |
Wide & Deep Learning for Recommender Systems | WDL | DLRS, 2016 | Google Inc. |
Deep Crossing: Web-Scale Modeling without Manually Crafted Combinatorial Features | Deep Crossing | KDD, 2016 | Microsoft Research |
Product-based Neural Networks for User Response Prediction | PNN | ICDM, 2016 | Shanghai Jiao Tong University |
Deep & Cross Network for Ad Click Predictions | DCN | ADKDD, 2017 | Stanford University, Google Inc. |
Neural Factorization Machines for Sparse Predictive Analytics | NFM | SIGIR, 2017 | Xiangnan He |
Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks | AFM | IJCAI, 2017 | Zhejiang University, National University of Singapore |
DeepFM: A Factorization-Machine based Neural Network for CTR Prediction | DeepFM | IJCAI, 2017 | Harbin Institute of Technology, Noah's Ark Research Lab, Huawei |
xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems | xDeepFM | KDD, 2018 | University of Science and Technology of China |
Deep Interest Network for Click-Through Rate Prediction | DIN | KDD, 2018 | Alibaba Group |
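Some public dataset links have expired and people keep asking me for the data, but the datasets are too large to upload. I therefore provide the following links for download:
The project inevitably contains some bugs. Thanks to the following people for pointing out problems:
wangzhe258369: pointed out that in the DIN model, `tf.keras.layers.BatchNormalization` defaults to `training=False`, in which case the `moving_mean` and `moving_variance` variables in BN are not updated. However, while revising the DIN code I checked the documentation more carefully and found the following:
If the model is trained by calling `fit()`, the `training` argument can be omitted (the official recommendation is to omit it), because during `fit()` the model sets `training` itself according to the current phase (training or inference). This is implemented by the `learning_phase` mechanism.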
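To make the role of the `training` flag concrete, here is a toy, framework-free batch-normalization layer. It is not the Keras implementation; the class name `ToyBatchNorm`, the momentum value, and the 1-D input are illustrative assumptions. It shows that the moving statistics change only when the layer is called with `training=True`, which is the value `fit()` passes during the training phase.

```python
class ToyBatchNorm:
    """Toy 1-D batch norm that tracks moving statistics like a BN layer does."""

    def __init__(self, momentum=0.9, epsilon=1e-3):
        self.momentum = momentum
        self.epsilon = epsilon
        self.moving_mean = 0.0
        self.moving_variance = 1.0

    def __call__(self, batch, training=False):
        if training:
            # Normalize with batch statistics and update the moving averages.
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            m = self.momentum
            self.moving_mean = m * self.moving_mean + (1 - m) * mean
            self.moving_variance = m * self.moving_variance + (1 - m) * var
        else:
            # Inference: use the stored moving statistics and leave them untouched.
            mean, var = self.moving_mean, self.moving_variance
        return [(x - mean) / (var + self.epsilon) ** 0.5 for x in batch]

bn = ToyBatchNorm()
batch = [1.0, 2.0, 3.0]

bn(batch, training=False)
stats_after_inference = (bn.moving_mean, bn.moving_variance)

bn(batch, training=True)
stats_after_training = (bn.moving_mean, bn.moving_variance)
print(stats_after_inference, stats_after_training)
```

In a custom training loop, as opposed to `fit()`, the flag has to be passed explicitly when calling the model.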
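boluochuile: found that SASRec training failed because the validation set must be passed as a `tuple`; fixed.
dominic-z: pointed out a mask problem in the DIN Attention layer. The mask is now derived from `seq_inputs`, since the sequences are padded with 0. (This differs from the code before the rewrite, which used the longest sequence in each mini-batch as the sequence length, so no sequence was ever truncated; the current code uses the more common `padding` approach for convenience.)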
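A minimal sketch of deriving an attention mask from zero-padded `seq_inputs`, written in plain Python rather than TensorFlow. It assumes id 0 is reserved for padding, as the note above states; the scores and the helper name `masked_softmax` are made up for illustration and are not taken from the DIN code.

```python
import math

def masked_softmax(scores, seq_ids, pad_id=0):
    """Softmax over attention scores, giving zero weight to padded positions."""
    mask = [item_id != pad_id for item_id in seq_ids]
    if not any(mask):
        return [0.0] * len(scores)  # fully padded sequence: no attention at all
    # Subtract the max over valid positions for numerical stability.
    max_score = max(s for s, m in zip(scores, mask) if m)
    exps = [math.exp(s - max_score) if m else 0.0 for s, m in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]

seq_ids = [0, 0, 17, 42]         # zero-padded behavior sequence
scores = [5.0, 5.0, 1.0, 1.0]    # raw attention scores, padding included
weights = masked_softmax(scores, seq_ids)
print(weights)
```

Without the mask, the two padded positions would receive most of the attention weight here because their raw scores are the largest.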
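dominic-z: pointed out that the shape of `seq_inputs` in DIN training did not match the model; fixed. The shape should be `(batch_size, maxlen, behavior_num)`, and the related model code was changed accordingly. In addition, the previous name for the number of behaviors, `seq_len`, was ambiguous and was renamed `behavior_num`. The code from before the rewrite was added under the `DIN/old` directory.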
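The `(batch_size, maxlen, behavior_num)` shape can be illustrated with a small padding helper. This is a sketch in plain Python under stated assumptions: the helper names, the choice to pad and truncate at the front, and the example of two ids per step are hypothetical and not taken from the project code.

```python
def pad_behaviors(seq, maxlen, behavior_num, pad_value=0):
    """Keep the last `maxlen` steps, then pad at the front with zeros."""
    seq = seq[-maxlen:]
    padding = [[pad_value] * behavior_num for _ in range(maxlen - len(seq))]
    return padding + [list(step) for step in seq]

def shape(batch):
    """Shape of a nested list laid out as (batch_size, maxlen, behavior_num)."""
    return (len(batch), len(batch[0]), len(batch[0][0]))

# Each step holds behavior_num = 2 ids, e.g. (item_id, category_id).
user_a = [(3, 1), (8, 2)]                  # shorter than maxlen: gets padded
user_b = [(5, 1), (6, 1), (7, 3), (9, 2)]  # longer than maxlen: gets truncated
batch = [pad_behaviors(u, maxlen=3, behavior_num=2) for u in (user_a, user_b)]
print(shape(batch))
```

Here `maxlen` counts time steps and `behavior_num` counts the ids stored per step, which is the distinction the rename from `seq_len` was meant to make clear.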
zhangfangkai, R7788380: pointed out that in the `utils.py` file used for movielens, `trans_score` does not correctly designate positive and negative samples. The lines
data_df.loc[data_df.label < trans_score, 'label'] = 0
data_df.loc[data_df.label >= trans_score, 'label'] = 1
should be changed to:
data_df = data_df[data_df.label >= trans_score]
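The difference between the two versions can be seen on a toy rating list. This sketch uses plain Python tuples instead of the pandas `data_df`; the ratings and the threshold value are made up for illustration.

```python
ratings = [  # (user_id, item_id, rating), toy data
    (1, 10, 5), (1, 11, 2), (2, 10, 4), (2, 12, 1),
]
trans_score = 4  # hypothetical threshold

# Old behaviour: every row is kept and relabelled 0/1 by the threshold.
relabelled = [(u, i, 1 if r >= trans_score else 0) for u, i, r in ratings]

# New behaviour: only rows at or above the threshold are kept.
filtered = [(u, i, r) for u, i, r in ratings if r >= trans_score]

print(relabelled)
print(filtered)
```

With the filtered version, low-rated interactions are dropped from the data rather than kept as explicit negatives.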
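1. For any suggestions or questions about the project, leave a message under `Issue` or send an email to [email protected].
2. The author runs an official account (公众号), 潜心学习的潜心. If you like its content, feel free to follow it.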