🤗 HuggingFace Daily Papers
Generation Models Know Space: Unleashing Implicit 3D Priors for Scene Understanding
▲ 63 🎓 H-EmbodVis
  • A video diffusion model is repurposed as a latent world simulator to enhance multimodal large language models with implicit 3D structural priors and p
SAMA: Factorized Semantic Anchoring and Motion Alignment for Instruction-Guided Video Edit
▲ 57 🏢 Baidu
  • SAMA presents a factorized approach to video editing that separates semantic anchoring from motion modeling, enabling instruction-guided edits with pr
FASTER: Rethinking Real-Time Flow VLAs
▲ 41 🎓 The University of Hong Kong
  • Fast Action Sampling for ImmediaTE Reaction (FASTER) reduces real-time reaction latency in Vision-Language-Action models by adapting sampling schedule
3DreamBooth: High-Fidelity 3D Subject-Driven Video Generation Model
▲ 41 🎓 Yonsei University
  • A novel 3D-aware video customization framework is presented that decouples spatial geometry from temporal motion using a 1-frame optimization approach
Bridging Semantic and Kinematic Conditions with Diffusion-based Discrete Motion Tokenizer
▲ 34 🎓 MMLab@NTU
  • A three-stage motion generation framework combines discrete token-based planning with diffusion-based synthesis to improve controllability and fidelit
MonoArt: Progressive Structural Reasoning for Monocular Articulated 3D Reconstruction
▲ 28 🎓 MMLab@NTU
  • MonoArt presents a unified framework for reconstructing articulated 3D objects from single images through progressive structural reasoning that enable
Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distilla
▲ 28 🏢 NVIDIA
  • Nemotron-Cascade 2 is a 30B parameter Mixture-of-Experts model with 3B activated parameters that achieves exceptional reasoning and agentic capabiliti
Cubic Discrete Diffusion: Discrete Visual Generation on High-Dimensional Representation To
▲ 26 🎓 University of Hong Kong
  • CubiD is a discrete generation model for high-dimensional representations that enables fine-grained masking and learns rich correlations across spatia
Multimodal Large Language Models (2), Spatial Blindness (1), 3D Structural Priors (1), Physical Laws (1), Video Diffusion Model (1), Latent World Simulator (1), Spatiotemporal Features (1), Token-Level Adaptive Gated Fusion (1), Generative Priors (1), Embodied Manipulation (1), Semantic Anchoring (1), Motion Alignment (1)
🏛️ Top Research Institutions
Evaluating Counterfactual Strategic Reasoning in Large Language Models
🏛️ National Technical University of Athens AI & Machine Learning
  • Current large language models exhibit significant social biases that are not adequately addressed by existing debiasing
  • The paper proposes UGID, a unified graph isomorphism approach for effectively debiasing large language models.
NavTrust: Benchmarking Trustworthiness for Embodied Navigation
🏛️ University of California, Riverside AI & Machine Learning
  • Automated Alzheimer's Disease detection methods often lack alignment with clinical constructs, leading to potential misd
  • The paper presents Agentic Cognitive Profiling, which realigns automated detection methods with clinical validity.
HierarchicalKV: A GPU Hash Table with Cache Semantics for Continuous Online Embedding Stor
🏛️ NVIDIA Systems & Infrastructure
  • Traditional GPU hash tables waste memory by preserving every inserted key, leading to inefficiencies in high-bandwidth m
  • HierarchicalKV introduces a GPU hash table with cache semantics that optimizes memory usage for continuous online embedd
Towards Exponential Quantum Improvements in Solving Cardinality-Constrained Binary Optimiz
🏛️ University of Cambridge Theory & Algorithms
  • Cardinality-constrained binary optimization is a fundamental computational challenge with wide-ranging applications.
  • Introduction of a Grover-based quantum algorithm that significantly improves the efficiency of solving these optimizatio
Post-Quantum Cryptography from Quantum Stabilizer Decoding
🏛️ Massachusetts Institute of Technology Applications
  • Current post-quantum cryptography relies on a limited number of hardness assumptions, posing significant security risks.
  • A new framework for post-quantum cryptography based on quantum stabilizer decoding techniques.
Exposing Cross-Modal Consistency for Fake News Detection in Short-Form Videos
🏛️ Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) Other CS
  • Short-form video platforms are significant sources of misinformation, where cross-modal relationships can mislead viewer
  • A framework that exposes cross-modal consistency for detecting fake news in short-form videos.
Risk-Based Auto-Deleveraging
🏛️ Columbia University Quantitative Finance
  • The mechanisms of auto-deleveraging in cryptocurrency futures exchanges are poorly understood and under-researched.
  • This paper proposes a risk-based framework to analyze auto-deleveraging mechanisms in the context of cryptocurrency futu
Consistencies in Social Ranking
🏛️ The University of Tokyo Economics
  • Ranking individuals based on performance across different contexts is often inconsistent and problematic.
  • A systematic approach to establish consistencies in social ranking across various coalitions.
AI & Machine Learning: 487 papers, 7 categories
Systems & Infrastructure: 26 papers, 6 categories
Software & Programming: 16 papers, 4 categories
Theory & Algorithms: 26 papers, 5 categories
Applications: 124 papers, 7 categories
Other CS: 28 papers, 8 categories
Quantitative Finance: 11 papers, 9 categories
Economics: 8 papers, 3 categories