Image and Video Segmentation with SAM 2, Gemma 2 for Efficient Language Models, Boosting Small Models with Contrastive Fine-Tuning, and MM-Vet v2 Challenges Large Multimodal Models
Aug 5 2024
Length: 14 mins
Podcast

Failed to add items

Sorry, we are unable to add the item because your shopping cart is already at capacity.

Add to basket failed.

Please try again later

Please try again later

Please try again later

Please try again

Listen for free

View show details

Summary
SAM 2: Segment Anything in Images and Videos Gemma 2: Improving Open Language Models at a Practical Size Coarse Correspondence Elicit 3D Spacetime Understanding in Multimodal Language Model Improving Text Embeddings for Smaller Language Models Using Contrastive Fine-tuning OmniParser for Pure Vision Based GUI Agent SF3D: Stable Fast 3D Mesh Reconstruction with UV-unwrapping and Illumination Disentanglement MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities

Show More Show Less

Average customer ratings

No Reviews are Available

Report a review on Amazon