Awesome Open Source

Search results for multimodal gpt4

19 search results found

Tree Of Thoughts ⭐ 3,798

Plug-and-play implementation of "Tree of Thoughts: Deliberate Problem Solving with Large Language Models" that elevates model reasoning by at least 70%

Mplug Owl ⭐ 1,657

[Official Implementation] mPLUG-Owl & mPLUG-Owl2: Alibaba MLLM Family.

Build, Deploy, and Scale Reliable Swarms of Autonomous Agents for Workflow Automation. Join our Community: https://discord.gg/DbjBMJTSWD

Democratization of RT-2 "RT-2: New model translates vision and language into action"

Implementation of "PaLM-E: An Embodied Multimodal Language Model"

Vlmevalkit ⭐ 137

Open-source evaluation toolkit for large vision-language models (LVLMs); supports GPT-4V, Gemini, QwenVLPlus, 30+ HF models, and 15+ benchmarks

Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch

Build high-performance AI models with modular building blocks

Implementation of PaLI-3 from the paper "PaLI-3 Vision Language Models: Smaller, Faster, Stronger"

Awesome Multimodal Prompts ⭐ 79

Prompts for GPT-4V and DALL-E 3 that make full use of their multimodal abilities. GPT-4V prompts, DALL-E 3 prompts.

PyTorch implementation of the models RT-1-X and RT-2-X from the paper: "Open X-Embodiment: Robotic Learning Datasets and RT-X Models"

Swarms Pytorch ⭐ 67

Swarming algorithms like PSO, Ant Colony, Sakana, and more in PyTorch 😊

Kosmos X ⭐ 53

The Next Generation Multi-Modality Superintelligence

Democratization of "PaLI: A Jointly-Scaled Multilingual Language-Image Model"

Kosmos2.5 ⭐ 34

My implementation of Kosmos2.5 from the paper: "KOSMOS-2.5: A Multimodal Literate Model"

Mambatransformer ⭐ 31

Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling

Reform Eval ⭐ 19

A benchmark for evaluating the capabilities of large vision-language models (LVLMs)

Implementation of AutoRT: "AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents"

Described ⭐ 14

Automatically describe images sent by users on popular media platforms, incredibly useful for the visually impaired and for complicated imagery.

PegasusX: The Future of Multimodal Embeddings 🦄 🦄

Plug-and-play implementation of "A Generalist Agent" by DeepMind.

The open source community's implementation of the all-new Multi-Modal Causal Attention from "DeepSpeed-VisualChat: Multi-Round Multi-Image Interleave Chat via Multi-Modal Causal Attention"

Multimodal Tot ⭐ 6

Multimodal Tree of Thoughts for DALL-E 3-like automatic self-improvement

Implementation of GATS from the paper: "GATS: Gather-Attend-Scatter" in PyTorch and Zeta

A simple implementation of CLIP that splits an image into quadrants and then computes the embeddings for each quadrant

Mlxtransformer ⭐ 5

Simple Implementation of a Transformer in the new framework MLX by Apple

Related Searches

Python Multimodal (186)

Python Gpt4 (144)

Artificial Intelligence Gpt4 (129)

Deep Learning Multimodal (86)

Artificial Intelligence Multimodal (58)

Machine Learning Multimodal (50)

Pytorch Multimodal (42)

Copyright 2018-2024 Awesome Open Source. All rights reserved.