Fast dataset format and loader
A paper list on multimodal and large language models, used only to record papers I read on the daily arXiv for personal needs.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal models, and Speech AI (Automatic Speech Recognition and Text-to-Speech).
React component library for crafting user-friendly and engaging conversational experiences
Build real-time multimodal AI applications 🤖🎙️📹
Notes for software engineers getting up to speed on new AI developments. Serves as a datastore for https://latent.space writing and product brainstorming, with cleaned-up canonical references under the /Resources folder.
Employee Productivity GenAI Assistant Example is a code sample and architecture pattern designed to improve the efficiency of writing tasks using AWS serverless technologies and Amazon Bedrock's generative AI models.
A minimal codebase for finetuning large multimodal models, supporting llava-1.5, qwen-vl, llava-interleave, llava-next-video, phi3-v etc.
Library to build personalized AI powered by what you've seen, said, or heard. Works with Ollama. Alternative to Rewind.ai. Open. Secure. You own your data. Rust.
The Enterprise-Grade, Production-Ready Multi-Agent Orchestration Framework. Join our community: https://discord.com/servers/agora-999382051935506503
Visualize streams of multimodal data. Fast, easy to use, and simple to integrate. Built in Rust using egui.
Repository for the paper 'MDS-ED: Multimodal Decision Support in the Emergency Department – a benchmark dataset based on MIMIC-IV'.
DataChain 🔗 Process and curate unstructured data using local ML models and LLM calls
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
This repository is used to collect papers and code in the field of AI.
A lightning-fast workflow builder that supports multimodal interaction and highly customizable extensions, and is intuitive to use even without coding knowledge.
A corpus of resources for multimodal machine learning with physiological signals (MMPS).
Phi-3 for Mac: Locally-run Vision and Language Models for Apple Silicon
Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector DB, and RAG.