WEBVTT 1 00:00:00.000 --> 00:00:04.904 We're filled with the use cases of language right now, 2 00:00:04.904 --> 00:00:07.907 language AI, which is really exciting. 3 00:00:07.907 --> 00:00:10.276 But we underappreciate 4 00:00:10.276 --> 00:00:13.146 how perceptual and physical 5 00:00:13.146 --> 00:00:18.146 our living and working experiences are. 6 00:00:18.651 --> 00:00:19.919 AI is moving fast. 7 00:00:19.919 --> 00:00:21.788 ROI expectations are rising, 8 00:00:21.788 --> 00:00:23.990 and leaders everywhere are racing to figure out 9 00:00:23.990 --> 00:00:27.160 what it actually takes to build an AI-ready organization. 10 00:00:27.160 --> 00:00:29.829 To help us understand and make sense of this moment, 11 00:00:29.829 --> 00:00:31.998 I'm joined by two people at the very center 12 00:00:31.998 --> 00:00:33.867 of this conversation globally, 13 00:00:33.867 --> 00:00:36.569 Dr. Fei-Fei Li and Dylan Bolden. 14 00:00:36.569 --> 00:00:40.040 Many people call Dr. Li the godmother of AI 15 00:00:40.040 --> 00:00:41.141 and for good reason, 16 00:00:41.141 --> 00:00:43.043 she created ImageNet, 17 00:00:43.043 --> 00:00:44.277 which is the database 18 00:00:44.277 --> 00:00:47.047 that literally taught computers to see. 19 00:00:47.047 --> 00:00:49.849 That breakthrough made possible everything from Google Lens 20 00:00:49.849 --> 00:00:51.384 to self-driving cars. 21 00:00:51.384 --> 00:00:54.354 And today as co-founder and CEO of World Labs, 22 00:00:54.354 --> 00:00:56.923 she's building large world models, 23 00:00:56.923 --> 00:00:59.426 AI that doesn't just process data, 24 00:00:59.426 --> 00:01:04.426 but understands and navigates the physical world around us. 25 00:01:04.798 --> 00:01:07.267 And Dylan Bolden has spent over two decades helping 26 00:01:07.267 --> 00:01:10.437 the world's most complex organizations figure out 27 00:01:10.437 --> 00:01:11.738 what's next. 
28 00:01:11.738 --> 00:01:14.140 He's managing director and senior partner at BCG, 29 00:01:14.140 --> 00:01:17.644 and in that role, he leads the firm's global thinking on AI 30 00:01:17.644 --> 00:01:19.179 and its impact on business. 31 00:01:19.179 --> 00:01:21.514 Dr. Li, Dylan, welcome. 32 00:01:21.514 --> 00:01:22.348 Thank you. 33 00:01:22.348 --> 00:01:23.483 Glad to be here. 34 00:01:23.483 --> 00:01:27.754 You used to be the chief AI scientist at Google Cloud 35 00:01:27.754 --> 00:01:29.089 at one point. 36 00:01:29.089 --> 00:01:32.025 During that time, was there any one moment 37 00:01:32.025 --> 00:01:36.396 that made you realize that spatial intelligence 38 00:01:36.396 --> 00:01:37.497 was the future? 39 00:01:37.497 --> 00:01:40.233 Obviously at this point we know that, but back then. 40 00:01:41.601 --> 00:01:42.869 The short answer is yes. 41 00:01:42.869 --> 00:01:47.307 So I became Google Chief Scientist in 2017, 42 00:01:47.307 --> 00:01:50.543 but that was already 17 years into my career 43 00:01:50.543 --> 00:01:51.811 as an AI scientist. 44 00:01:51.811 --> 00:01:55.815 It's been my life's work thinking about vision, 45 00:01:55.815 --> 00:01:58.218 perception, robotic learning. 46 00:01:58.218 --> 00:02:00.120 At Google Cloud, 47 00:02:00.120 --> 00:02:03.857 what did change me is that I got to work 48 00:02:03.857 --> 00:02:06.860 with all the vertical industries, 49 00:02:06.860 --> 00:02:10.029 from hospitals to financial services, 50 00:02:10.029 --> 00:02:14.968 from energy to agriculture. 51 00:02:14.968 --> 00:02:19.672 And so many of those needs were physical. For example, 52 00:02:19.672 --> 00:02:21.374 in financial services, 53 00:02:21.374 --> 00:02:25.578 insurance damage assessment 54 00:02:25.578 --> 00:02:28.414 or appraisal is very visual. 
55 00:02:28.414 --> 00:02:33.414 In transportation, whether we're talking about driving 56 00:02:34.387 --> 00:02:38.925 or shipping or--all those are very visual 57 00:02:38.925 --> 00:02:43.196 and perceptual--or agriculture. You know, weeds, 58 00:02:44.197 --> 00:02:48.701 understanding weeds and picking weeds or harvesting apples 59 00:02:48.701 --> 00:02:51.871 or counting salmon of a certain size. 60 00:02:51.871 --> 00:02:55.141 All of these are extremely visual-spatial. 61 00:02:55.141 --> 00:02:59.479 So in our world, I know that we're filled 62 00:02:59.479 --> 00:03:03.483 with the use cases of language right now, 63 00:03:03.483 --> 00:03:06.486 language AI, which is really exciting. 64 00:03:06.486 --> 00:03:08.855 But we underappreciate 65 00:03:08.855 --> 00:03:13.855 how perceptual and physical our living 66 00:03:13.860 --> 00:03:16.963 and working experiences are. 67 00:03:16.963 --> 00:03:19.065 You know, I think about businesses 68 00:03:19.065 --> 00:03:22.802 that require physical space, retail, 69 00:03:22.802 --> 00:03:25.672 travel industries, hospitality. 70 00:03:26.706 --> 00:03:28.474 In those environments, 71 00:03:28.474 --> 00:03:32.478 when we have AI that can perceive space, 72 00:03:32.478 --> 00:03:35.081 that can see space, 73 00:03:35.081 --> 00:03:38.751 which of those business models breaks first? 74 00:03:38.751 --> 00:03:43.751 Think about a retailer that can better understand, in real time, based on 75 00:03:43.756 --> 00:03:47.594 how the customer experiences the store, 76 00:03:47.594 --> 00:03:49.562 where inventory needs to be. 
77 00:03:49.562 --> 00:03:53.233 But I think honestly the first place to break is in labor, 78 00:03:53.233 --> 00:03:56.236 figuring out where's the best place 79 00:03:56.236 --> 00:04:00.540 to put labor to get the most out of it within a store. 80 00:04:00.540 --> 00:04:04.244 Where, at certain times of day, 81 00:04:04.244 --> 00:04:07.413 do I put people in certain parts of the operation 82 00:04:07.413 --> 00:04:08.514 within a restaurant? 83 00:04:08.514 --> 00:04:11.784 Do I put them next to the front of the store? 84 00:04:11.784 --> 00:04:13.286 Do I put them in the back? 85 00:04:13.286 --> 00:04:16.089 So really good observation around 86 00:04:16.089 --> 00:04:19.726 how consumers experience the store 87 00:04:19.726 --> 00:04:21.060 I think gets you to better decisions 88 00:04:21.060 --> 00:04:22.262 around how to deploy labor. 89 00:04:22.262 --> 00:04:23.162 So that's one example. 90 00:04:23.162 --> 00:04:25.231 But I'm very interested to hear, 91 00:04:25.231 --> 00:04:26.065 Dr. Li, 92 00:04:26.065 --> 00:04:28.234 your perspective here. 93 00:04:28.234 --> 00:04:31.337 So there's a lot of future opportunities, 94 00:04:31.337 --> 00:04:35.308 but yes, as the technology matures, 95 00:04:35.308 --> 00:04:36.409 I agree with Dylan, 96 00:04:36.409 --> 00:04:41.409 we don't really know what's around in our physical space. 97 00:04:41.714 --> 00:04:43.850 We don't have an easy way 98 00:04:43.850 --> 00:04:46.886 to express our imagination 99 00:04:46.886 --> 00:04:49.856 into a potential physical space 100 00:04:49.856 --> 00:04:53.026 if we're talking about storytelling or design 101 00:04:53.026 --> 00:04:56.763 or optimizing for office space 102 00:04:56.763 --> 00:05:01.367 or designing, you know, hospitality environments 103 00:05:01.367 --> 00:05:05.071 or live events or many of these. 
104 00:05:05.071 --> 00:05:09.475 So I think as this technology matures, 105 00:05:09.475 --> 00:05:12.679 as we push the frontier of this technology, 106 00:05:12.679 --> 00:05:14.480 we will create world models 107 00:05:14.480 --> 00:05:17.817 that can reconstruct physical space, 108 00:05:17.817 --> 00:05:21.087 can understand what's in this physical space, 109 00:05:21.087 --> 00:05:25.291 can predict the next state of this physical space, 110 00:05:25.291 --> 00:05:30.291 can translate imagination into digital 111 00:05:30.363 --> 00:05:32.732 expression of the physical space. 112 00:05:32.732 --> 00:05:36.669 All this will empower many business use cases. 113 00:05:36.669 --> 00:05:39.205 Thank you so much, Dylan Bolden and Dr. Fei-Fei Li. 114 00:05:39.205 --> 00:05:40.340 Thank you. Thank you.