Stars
Accelerating the development of large multimodal models (LMMs) with a one-click evaluation module, lmms-eval.
[ICLR 2025] From anything to mesh like human artists. Official impl. of "MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers"
[CSUR] A Survey on Video Diffusion Models
[3DV 2025] Code for "FlowMap: High-Quality Camera Poses, Intrinsics, and Depth via Gradient Descent" by Cameron Smith*, David Charatan*, Ayush Tewari, and Vincent Sitzmann
Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
Solving SMPL/MANO parameters from keypoint coordinates.
Threestudio extension of the paper "Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation".
[ICCV 2023] UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
PyTorch code and models for the DINOv2 self-supervised learning method.
OpenMMLab Self-Supervised Learning Toolbox and Benchmark
Meta-Transformer for Unified Multimodal Learning
VRS is a file format optimized to record & playback streams of sensor data, such as images, audio samples, and any other discrete sensors (IMU, temperature, etc), stored in per-device streams of ti…
This repository is built in association with our position paper on "Multimodality for NLP-Centered Applications: Resources, Advances and Frontiers". As a part of this release we share the informati…
ImageBind: One Embedding Space to Bind Them All
Unifying Variational Autoencoder (VAE) implementations in PyTorch (NeurIPS 2022)
Official codebase for I-JEPA, the Image-based Joint-Embedding Predictive Architecture. First outlined in the CVPR paper, "Self-supervised learning from images with a joint-embedding predictive arch…
Physics Informed Deep Learning: Data-driven Solutions and Discovery of Nonlinear Partial Differential Equations
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
PyTorch implementation of Google AI's 2018 BERT
Recipe for a General, Powerful, Scalable Graph Transformer
[CVPR 2023] Executing your Commands via Motion Diffusion in Latent Space, a fast and high-quality motion diffusion model
Implementation of Memorizing Transformers (ICLR 2022), attention net augmented with indexing and retrieval of memories using approximate nearest neighbors, in PyTorch
Learning Graph Normalization for Graph Neural Networks