A couple of years ago, Alex Yu and Amit Jain came together to found a company that’d let people capture objects in 3D using their smartphones — no additional equipment required. At the time, Yu was an AI researcher at UC Berkeley, while Jain was an Apple employee building out the Vision Pro’s multimedia experiences.
Their company, Luma, launched a smartphone app in 2021, which quickly gained traction — going on to attract millions of users (just over two million as of publication time). But now, as generative AI tech floods the channels, Yu and Jain hope to evolve Luma into something bigger — and, with any luck, better — than they originally envisioned.
Luma today announced that it’ll begin leveraging a compute cluster of ~3,000 Nvidia A100 GPUs to train new AI models that can — in Yu’s words — “see and understand, show and explain and eventually interact with [the] world.”
The first phase of this plan entails creating models capable of generating 3D objects from text descriptions; Luma launched one such model, called Genie, on its Discord server earlier this year. The next phase will involve developing “next gen” generative AI models that address what Yu characterizes as the “uncanny valley” problem in current-gen GenAI.
“We believe that multimodality is critical for intelligence. To go beyond language models, the next unlock will come from vision,” Yu told TechCrunch in an email interview. “[However,] AI needs to get a lot smarter to deliver the potential the world sees in it.”
To realize this vision (pardon the pun), Luma has raised $43 million in a Series B round with participation from Andreessen Horowitz, among other backers old and new. According to a source familiar with the matter, the round values Luma at between $200 million and $300 million; Luma’s war chest now stands at more than $70 million.
Luma’s current focus — AI models that create 3D models — is an increasingly competitive space. There are object-crafting platforms like 3DFY and Scenario, as well as startups such as Hypothetic, Kaedim, Auctoria and Mirage. Stability AI recently launched a standalone 3D-model-generating tool, as did newer venture Atlas. Even incumbents like Autodesk and Nvidia are beginning to dip their toes into the sector with apps like Get3D, which converts images to 3D models, and ClipForge, which generates models from text descriptions.
So how will Luma’s tools stand apart? Mainly fidelity, Yu says.
“Current models are all being trained on two-dimensional images and, when asked to generate scenes, they mangle spaces, bodies and movements,” he said. “It’s very difficult to generate anything coherent and usable in the first few tries, limiting where you can use the outputs … [We’re bringing] about the most advanced generative photorealistic technolog[ies] in an intuitive app.”
That’s promising a lot, considering how early Luma is in its ambitious new roadmap. An improved version of Genie launches today, but the future, more capable generative AI models are a ways away.
Luma’s wasting no time, though: it plans to double its 24-person workforce by the end of next year while piecing together a model-running server cluster of “thousands” of GPUs. Perhaps it’ll make headway after all; time will tell.
“We’ve been growing the team across generative AI research, engineering, design and product in order to bring our vision to life, and plan to accelerate the pace here significantly following this round,” Yu said. “With Genie, for the first time creating 3D things at scale has become possible with AI, and that’s grown to 100,000 users in just four weeks … [But we want to] build vastly more capable, intelligent, and useful visual models for our users.”