• Unboxing AI: The Podcast for Computer Vision Engineers

  • By: Unboxing AI
  • Podcast

  • Summary

  • I'm Gil Elbaz, Co-founder and CTO of Datagen. In this podcast, I speak with interesting computer vision thinkers and practitioners. I ask the big questions that touch on the issues and challenges that ML and CV engineers deal with every day. On the way, I hope you uncover a new subject or gain a different perspective, as well as enjoying engaging conversation. It’s about much more than the technical processes – it’s about people, journeys, and ideas. Turn up the volume, insights inside.
    Unboxing AI
Episodes
  • YOLO: Building AI with an Open-Source Community
    Apr 16 2023

    ABSTRACT
    Our guest this episode is Glenn Jocher, CEO and founder of Ultralytics, the company that brought you YOLOv5 and YOLOv8. Gil and Glenn discuss how to build an open-source community on GitHub, the history of YOLO, and even particle physics. They also talk about the progress of AI, diffusion and transformer models, and the importance of simulated synthetic data today. The first episode of season 2 is full of stimulating conversation about the applications of YOLO and the impact of open source on the AI community.

    TOPICS & TIMESTAMPS

    0:00 Introduction
    2:03 First Steps in Machine Learning
    9:40 Neutrino Particles and Simulating Neutrino Detectors
    14:18 Ultralytics
    17:36 GitHub
    21:09 History of YOLO
    25:28 YOLO for Keypoints
    29:00 Applications of YOLO
    30:48 Transformer and Diffusion Models for Detection
    35:00 Speed Bottleneck
    37:23 Simulated Synthetic Data Today
    42:08 Sentience of AGI and Progress of AI
    46:42 ChatGPT, CLIP and LLaMA Open Source Models
    50:04 Advice for Next Generation CV Engineers

    LINKS & RESOURCES

    Linkedin

    Twitter

    Google Scholar

    Ultralytics

    GitHub

    National Geospatial Intelligence Agency

    Neutrino

    Antineutrino

    Joseph Redmon

    Ali Farhadi

    Enrico Fermi

    Kashmir World Foundation

    R-CNN

    Fast R-CNN

    LLaMA model

    MS COCO

    GUEST BIO

    Glenn Jocher is the founder and CEO of Ultralytics, a company focused on enabling developers to create practical, real-time computer vision capabilities, with a mission to make AI easy to develop. He has built one of the largest developer communities on GitHub in the machine learning space, with over 50,000 stars for his YOLOv5 and YOLOv8 releases. Ultralytics' packages are among the leading tools for developing edge-device computer vision, with a focus on object classification, detection, and segmentation at real-time speeds on limited compute resources. Glenn previously worked at the United States National Geospatial-Intelligence Agency and published the first-ever global antineutrino map.



    53 mins
  • Synthetic Data: Simulation & Visual Effects at Scale
    Jan 4 2023

    ABSTRACT

    Gil Elbaz speaks with Tadas Baltrusaitis, who recently released the seminal paper DigiFace-1M: 1 Million Digital Face Images for Face Recognition. Tadas is a true believer in synthetic data and shares his deep knowledge of the subject, along with insights on the current state of the field and what CV engineers need to know. Join Gil as they discuss morphable models, multimodal learning, domain gaps, edge cases, and more.

    TOPICS & TIMESTAMPS

    0:00 Introduction

    2:06 Getting started in computer science

    3:40 Inferring mental states from facial expressions

    7:16 Challenges of facial expressions

    8:40 OpenFace

    10:46 MATLAB to Python

    13:17 Multimodal Machine Learning

    15:52 Multimodals and Synthetic Data

    16:54 Morphable Models

    19:34 HoloLens

    22:07 Skill Sets for CV Engineers

    25:25 What is Synthetic Data?

    27:07 GANs and Diffusion Models

    31:24 Fake it Til You Make It

    35:25 Domain Gaps

    36:32 Long Tails (Edge Cases)

    39:42 Training vs. Testing

    41:53 Future of NeRF and Diffusion Models

    48:26 Avatars and VR/AR

    50:39 Advice for Next Generation CV Engineers

    51:58 Season One Wrap-Up

    LINKS & RESOURCES

    Tadas Baltrusaitis

    LinkedIn

    GitHub

    Google Scholar

    Fake It Till You Make It

    Video 

    GitHub

    DigiFace 1M

    A 3D Morphable Eye Region Model for Gaze Estimation

    HoloLens

    Multimodal Machine Learning: A Survey and Taxonomy 

    3D Face Reconstruction with Dense Landmarks

    OpenFace

    OpenFace 2.0

    Dr. Rana el Kaliouby

    Dr. Louis-Philippe Morency

    Peter Robinson

    Jamie Shotton

    Errol Wood

    Affectiva

    GUEST BIO

    Tadas Baltrusaitis is a principal scientist in the Microsoft Mixed Reality and AI lab in Cambridge, UK, where he leads the human synthetics team. He recently co-authored the groundbreaking paper DigiFace-1M, a dataset of 1 million synthetic images for facial recognition. Tadas is also a co-author of Fake It Till You Make It: Face Analysis in the Wild Using Synthetic Data Alone, among other outstanding papers. His PhD research focused on automatic facial expression analysis in difficult real-world settings, and he was a postdoctoral associate at Carnegie Mellon University, where his primary research lay in the automatic understanding of human behavior, expressions, and mental states using computer vision.



    54 mins
  • SLAM and the Evolution of Spatial AI
    Nov 7 2022

    Host Gil Elbaz welcomes Andrew J. Davison, the father of SLAM. Andrew and Gil dive right into how SLAM started and how it has evolved. They discuss Spatial AI and what it means, along with global belief propagation. Of course, they also talk about robotics: how it is impacted by new technologies like NeRF, and what the current state of the art is.

    Timestamps and Topics

    [00:00:00] Intro

    [00:02:07] Early Research Leading to SLAM

    [00:04:49] Why SLAM

    [00:08:20] Computer Vision Based SLAM

    [00:09:18] MonoSLAM Breakthrough

    [00:13:47] Applications of SLAM

    [00:16:27] Modern Versions of SLAM

    [00:21:50] Spatial AI

    [00:26:04] Implicit vs. Explicit Scene Representations

    [00:34:32] Impact on Robotics

    [00:38:46] Reinforcement Learning (RL)

    [00:43:10] Belief Propagation Algorithms for Parallel Compute

    [00:50:51] Connection to Cellular Automata

    [00:55:55] Recommendations for the Next Generation of Researchers

    Interesting Links:

    Andrew Blake

    Hugh Durrant-Whyte

    John Leonard

    Steven J. Lovegrove

    Alex Mordvintsev

    Prof. David Murray

    Richard Newcombe

    Renato Salas-Moreno 

    Andrew Zisserman

    A visual introduction to Gaussian Belief Propagation

    GitHub: Gaussian Belief Propagation

    A Robot Web for Distributed Many-Device Localisation

    In-Place Scene Labelling and Understanding with Implicit Scene Representation

    Video 

    Video: Robotic manipulation of objects using SOTA

    Andrew Reacting to NeRF in 2020

    Cellular automata

    Neural cellular automata

    Dyson Robotics

    Guest Bio

    Andrew Davison is a professor of Robot Vision in the Department of Computing, Imperial College London. He is also the founder and director of the Dyson Robotics Lab. Andrew pioneered the cornerstone algorithm SLAM (Simultaneous Localisation and Mapping) and has continued to develop it in substantial ways since then. His research focuses on improving SLAM in terms of dynamics, scale, level of detail, efficiency, and semantic understanding of real-time video. SLAM has since evolved into the broader domain of "Spatial AI", leveraging neural implicit representations and a suite of cutting-edge methods to create a full, coherent representation of the real world from video.



    1 hr and 3 mins
