Awesome Open Source

Search results for python foundation models

44 search results found

Unilm ⭐ 16,971

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

LLaVA ⭐ 12,514

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Otter ⭐ 3,322

🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.

NExT-GPT ⭐ 2,602

Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model

Ask Anything ⭐ 2,404

[VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

EVA Series: Visual Representation Fantasies from BAAI

Autodistill ⭐ 1,286

Images to inference with no labeling (use foundation models to train supervised models)

Emu Series: Generative Multimodal Models from BAAI

InternVideo ⭐ 736

InternVideo: General Video Foundation Models via Generative and Discriminative Learning (https://arxiv.org/abs/2212.03191)

ONE-PEACE ⭐ 714

A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities

FasterViT ⭐ 539

[ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Transformers with Hierarchical Attention

Groundinglmm ⭐ 434

Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.

MinD-Video ⭐ 314

Official code base for MinD-Video

Fondant ⭐ 293

Production-ready data processing made easy and shareable

PointLLM ⭐ 276

[arXiv 2023] PointLLM: Empowering Large Language Models to Understand Point Clouds

[ICLR'24 Spotlight] Uni3D: 3D Visual Representation from BAAI

[NeurIPS 2023] This repository includes the official implementation of our paper "An Inverse Scaling Law for CLIP Training"

PonderV2 ⭐ 229

PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm

Official Repository of NeurIPS 2023 - MedFM Challenge

This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"

Stu Net ⭐ 167

The largest pre-trained medical image segmentation model (1.4B parameters) based on the largest public dataset (>100k annotations), up until April 2023.

Lrv Instruction ⭐ 160

[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning

Grid Playground ⭐ 125

Platform for General Robot Intelligence Development

EmerNeRF ⭐ 120

PyTorch Implementation of EmerNeRF: Emergent Spatial-Temporal Scene Decomposition via Self-Supervision

Intelligent App Workshop ⭐ 114

Immersive workshop showcasing the remarkable potential of integrating SoTA foundation models to enhance product experiences and streamline backend workflows. Leverages Microsoft's Copilot stack, Semantic Kernel and Azure primitives to offer an engaging and comprehensive introduction to AI-infused app development and deployment

RS5M: a large-scale vision language dataset for remote sensing

VoxPoser ⭐ 103

VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models

ViP-LLaVA ⭐ 81

ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts

Attackvlm ⭐ 79

Code of the paper: On Evaluating Adversarial Robustness of Large Vision-Language Models

BlackVIP ⭐ 65

Official implementation for CVPR'23 paper "BlackVIP: Black-Box Visual Prompting for Robust Transfer Learning"

Microlens ⭐ 63

A large recommendation dataset with raw text, audio, image, and video content provided (invited talk at DeepMind).

Generative Ai Sagemaker Cdk Demo ⭐ 56

Deploy Generative AI models from Amazon SageMaker JumpStart using AWS CDK

FEMR (Framework for Electronic Medical Records) provides tooling for large-scale, self-supervised learning using electronic health records

FLAIR: A Foundation LAnguage-Image model of the Retina for fundus image understanding.

[ICLR 2024] Generating DP Synthetic Data without Training

Guidance For Natural Language Queries Of Relational Databases On Aws ⭐ 22

Demonstration of Natural Language Query (NLQ) of an Amazon RDS for PostgreSQL database, using SageMaker JumpStart Foundation Models, LangChain, Streamlit, and Chroma.

Official Implementation of Collaborating Foundation models for Domain Generalized Semantic Segmentation

[ICLR24] AutoVP: An Automated Visual Prompting Framework and Benchmark

Kernel Infonce ⭐ 7

Official implementation of ICLR 2024 paper "Contrastive Learning Is Spectral Clustering On Similarity Graph" (https://arxiv.org/abs/2303.15103)

Official implementation of Matrix Variational Masked Autoencoder (M-MAE) for paper "Information Flow in Self-Supervised Learning" (https://arxiv.org/abs/2309.17281)

Matrix Ssl ⭐ 6

Official implementation of paper "Matrix Information Theory for Self-supervised Learning" (https://arxiv.org/abs/2305.17326)

Official PyTorch Implementation for Towards Foundation Models Learned from Anatomy in Medical Imaging via Self-Supervision

Surgical-DINO ⭐ 6

[IPCAI'2024] Surgical-DINO: Adapter Learning of Foundation Model for Depth Estimation in Endoscopic Surgery

Multi Temporal Crop Classification Baseline ⭐ 5

Baseline model for crop type segmentation as part of the HLS FM downstream task evaluations

