Awesome Open Source
Search results for "pre-training"
170 search results found
Llmsurvey
⭐
7,255
The official GitHub page for the survey paper "A Survey of Large Language Models".
Lmops
⭐
3,145
General technology for enabling AI capabilities w/ LLMs and MLLMs
Uer Py
⭐
2,802
Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo
Ofa
⭐
2,142
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Mplug Owl
⭐
1,657
[Official Implementation] mPLUG-Owl & mPLUG-Owl2: Alibaba MLLM Family.
Awesome Self Supervised Gnn
⭐
1,356
Papers about pretraining and self-supervised learning on Graph Neural Networks (GNN).
Spark
⭐
1,355
[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; PyTorch impl. of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"
Prompt In Context Learning
⭐
1,236
Awesome resources for in-context learning and prompt engineering: mastery of LLMs such as ChatGPT, GPT-3, and Flan-T5, with up-to-date and cutting-edge updates.
Awesome Graph Self Supervised Learning
⭐
1,194
Awesome Graph Self-Supervised Learning
Oscar
⭐
995
Oscar and VinVL
Data Juicer
⭐
994
A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ 🍸 🍹 🍷
Tencentpretrain
⭐
951
Tencent Pre-training framework in PyTorch & Pre-trained Model Zoo
Xmodaler
⭐
929
X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).
Bert_language_understanding
⭐
886
Pre-training of Deep Bidirectional Transformers for Language Understanding: pre-train TextCNN
Knowlm
⭐
870
An open-source, knowledgeable large language model framework.
Awesome Clip
⭐
782
Awesome list for research on CLIP (Contrastive Language-Image Pre-Training).
Awesome Vision Language Pretraining Papers
⭐
724
Recent Advances in Vision and Language PreTrained Models (VL-PTMs)
Vl Bert
⭐
680
Code for ICLR 2020 paper "VL-BERT: Pre-training of Generic Visual-Linguistic Representations".
Entity
⭐
596
EntitySeg Toolbox: Towards Open-World and High-Quality Image Segmentation
Swift
⭐
578
LLM training/inference/deployment framework from the ModelScope community. Supports various models such as LLaMA, Qwen, ChatGLM, and Baichuan, as well as training methods like LoRA, ResTuning, NEFTune, etc.
Imagenet21k
⭐
576
Official PyTorch implementation of the paper "ImageNet-21K Pretraining for the Masses" (NeurIPS 2021).
Awesome Timeseries Spatiotemporal Lm Llm
⭐
541
A professional list on Large (Language) Models and Foundation Models (LLM, LM, FM) for Time Series, Spatiotemporal, and Event Data.
Uni Mol
⭐
495
Official Repository for the Uni-Mol Series Methods
Gpt Gnn
⭐
451
Code for KDD'20 "Generative Pre-Training of Graph Neural Networks"
Paddlefleetx
⭐
424
PaddlePaddle large model development suite, providing a full-pipeline development toolchain for large language models, cross-modal large models, biocomputing large models, and other domains.
Uniter
⭐
418
Research code for ECCV 2020 paper "UNITER: UNiversal Image-TExt Representation Learning"
Graphcl
⭐
417
[NeurIPS 2020] "Graph Contrastive Learning with Augmentations" by Yuning You, Tianlong Chen, Yongduo Sui, Ting Chen, Zhangyang Wang, Yang Shen
Piti
⭐
415
PITI: Pretraining is All You Need for Image-to-Image Translation
Azureml Bert
⭐
384
End-to-End recipes for pre-training and fine-tuning BERT using Azure Machine Learning Service
Xpretrain
⭐
382
Multi-modality pre-training
Llm Shearing
⭐
351
[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
Languagebind
⭐
346
[ICLR 2024 🔥] Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
Bigdetection
⭐
291
BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training
Gcc
⭐
288
GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training @ KDD 2020
Awesome Llm4rs Papers
⭐
282
Large Language Model-enhanced Recommender System Papers
Univl
⭐
264
An official implementation for "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation".
Step
⭐
262
Code for our SIGKDD'22 paper Pre-training-Enhanced Spatial-Temporal Graph Neural Network For Multivariate Time Series Forecasting.
Awesome Recommend System Pretraining Papers
⭐
262
Paper list for recommender-system pre-trained models
Chatplug
⭐
251
A Chinese Open-Domain Dialogue System
Vl T5
⭐
245
PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021)
Conceptual 12m
⭐
235
Conceptual 12M is a dataset containing (image-URL, caption) pairs collected for vision-and-language pre-training.
Univtg
⭐
230
[ICCV2023] UniVTG: Towards Unified Video-Language Temporal Grounding
Ponderv2
⭐
229
PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm
Et Bert
⭐
219
The repository of ET-BERT, a network traffic classification model for encrypted traffic. The work has been accepted as a paper at The Web Conference (WWW) 2022.
Gpt 2 Tensorflow2.0
⭐
218
OpenAI GPT-2 pre-training and sequence prediction implementation in TensorFlow 2.0
Probing Vits
⭐
209
Probing the representations of Vision Transformers.
Kaleido Bert
⭐
207
(CVPR2021) Kaleido-BERT: Vision-Language Pre-training on Fashion Domain.
Multimodal_bigmodels_survey
⭐
207
[MIR-2023] A continuously updated paper list for multi-modal pre-trained big models
Gearnet
⭐
196
GearNet and Geometric Pretraining Methods for Protein Structure Representation Learning, ICLR'2023 (https://arxiv.org/abs/2203.06125)
Sigir2020_peterrec
⭐
194
Universal User Representation Pre-training for Cross-domain Recommendation and User Profiling
Mathpile
⭐
192
Generative AI for Math: MathPile
Dragon
⭐
189
[NeurIPS 2022] DRAGON 🐲: Deep Bidirectional Language-Knowledge Graph Pretraining
Electra Pytorch
⭐
188
A simple and working implementation of Electra, the fastest way to pretrain language models from scratch, in PyTorch
All In One
⭐
180
[CVPR2023] All in One: Exploring Unified Video-Language Pre-training
Awesome Mim
⭐
178
[Survey] Masked Modeling for Self-supervised Representation Learning on Vision and Beyond (https://arxiv.org/abs/2401.00897)
Awesome Vision And Language Pre Training
⭐
176
Recent Advances in Vision and Language Pre-training (VLP)
Egovlp
⭐
176
[NeurIPS2022] Egocentric Video-Language Pretraining
Recommendation Systems Without Explicit Id Features A Literature Review
⭐
171
A literature review of large pre-trained foundation recommender models
Samrs
⭐
167
The official repo for [NeurIPS'23] "SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model"
Tupe
⭐
163
Transformer with Untied Positional Encoding (TUPE). Code of the paper "Rethinking Positional Encoding in Language Pre-training". Improves existing models like BERT.
Tcl
⭐
152
code for TCL: Vision-Language Pre-Training with Triple Contrastive Learning, CVPR 2022
Bamboo
⭐
151
Bamboo: 4 times larger than ImageNet; 2 times larger than Objects365; built by active learning.
Mlm Pytorch
⭐
141
An implementation of masked language modeling for PyTorch, made as concise and simple as possible
Pytorch_violet
⭐
130
A PyTorch implementation of VIOLET
Hero
⭐
125
Research code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training"
Saprot
⭐
121
Official implementation of SaProt.
Frozenbilm
⭐
120
[NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional Language Models
Drrepair
⭐
115
DrRepair: Learning to Repair Programs from Error Messages
Scaling Laws Openclip
⭐
112
Reproducible scaling laws for contrastive language-image learning (https://arxiv.org/abs/2212.07143)
3d Vista
⭐
109
Official implementation of ICCV 2023 paper "3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment"
Just Ask
⭐
101
[ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos
Xnlg
⭐
101
AAAI-20 paper: Cross-Lingual Natural Language Generation via Pre-Training
Data_management_llm
⭐
101
A collection of explorations on training data management for large language models
Awesome Pretraining For Graph Neural Networks
⭐
100
A curated list of papers on pre-training for graph neural networks (Pre-train4GNN).
Ontoprotein
⭐
98
Code and datasets for the ICLR2022 paper "OntoProtein: Protein Pretraining With Gene Ontology Embedding"
Pretraining With Human Feedback
⭐
97
Code accompanying the paper Pretraining Language Models with Human Preferences
Bert Tickets
⭐
94
[NeurIPS 2020] "The Lottery Ticket Hypothesis for Pre-trained BERT Networks", Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Zhangyang Wang, Michael Carbin
Vidchapters
⭐
93
[NeurIPS 2023 D&B] VidChapters-7M: Video Chapters at Scale
Proteinworkshop
⭐
93
Benchmarking framework for protein representation learning. Includes a large number of pre-training and downstream task datasets, models and training/task utilities.
Helo Word
⭐
88
Team Kakao&Brain's Grammatical Error Correction System for the ACL 2019 BEA Shared Task
Autoregressive Predictive Coding
⭐
87
Autoregressive Predictive Coding: An unsupervised autoregressive model for speech representation learning
Kgpt
⭐
85
Code and Data for EMNLP2020 Paper "KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation"
Tvlt
⭐
85
PyTorch code for “TVLT: Textless Vision-Language Transformer” (NeurIPS 2022 Oral)
Coco Lm
⭐
82
[NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining
Moleculestm
⭐
81
Multi-modal Molecule Structure-text Model for Text-based Editing and Retrieval, Nat Mach Intell 2023 (https://www.nature.com/articles/s42256-023-00759-
Rsp
⭐
80
The official repo for [TGRS'22] "An Empirical Study of Remote Sensing Pretraining"
Forge_vfm4ad
⭐
76
A comprehensive survey of forging vision foundation models for autonomous driving, including challenges, methodologies, and opportunities.
Slotcon
⭐
70
(NeurIPS 2022) Self-Supervised Visual Representation Learning with Semantic Grouping
Hcmoco
⭐
66
[CVPR 2022 Oral] Versatile Multi-Modal Pre-Training for Human-Centric Perception
Adv Ss Pretraining
⭐
65
[CVPR 2020] Adversarial Robustness: From Self-Supervised Pre-Training to Fine-Tuning
Molgen
⭐
64
Code and pre-trained models for the paper "Domain-Agnostic Molecular Generation with Self-feedback."
Linkbert
⭐
63
[ACL 2022] LinkBERT: A Knowledgeable Language Model 😎 Pretrained with Document Links
Tacl
⭐
63
TaCL: Improving BERT Pre-training with Token-aware Contrastive Learning (NAACL 2022)
Twist
⭐
62
Official code for "Self-Supervised Learning by Estimating Twin Class Distribution"
Amrbart
⭐
61
Code for our paper "Graph Pre-training for AMR Parsing and Generation" in ACL2022
Powerfulpromptft
⭐
59
[NeurIPS 2023 Main Track] This is the repository for the paper titled "Don’t Stop Pretraining? Make Prompt-based Fine-tuning Powerful Learner"
Kebiolm
⭐
58
Improving Biomedical Pretrained Language Models with Knowledge [BioNLP 2021]
Rosita
⭐
53
ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration
Revisiting Contrastive Ssl
⭐
51
Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations. [NeurIPS 2021]
Mix Generation
⭐
51
MixGen: A New Multi-Modal Data Augmentation
Related Searches
Python Pre Training (73)
1-100 of 170 search results