wkat4242 3 days ago

I can't wait for this to become usable. I love VR but the content generation is just sooooo labour intensive. Help creating 3D models would help so much and be the #1 enabler for the metaverse IMO.

  • jsheard 3 days ago

    VR is especially unforgiving of "fake" detailing; you need as much detail as possible in the actual geometry to really sell it. That's the opposite of how these models currently work: they output goopy low-res geometry and approximate most of the detailing with textures, which would immediately register as fake with stereoscopic depth perception.

    • spookie 3 days ago

      Yup. I'm doing a VR project, urban environment. Haven't really found a good enough solution for 3D reconstruction from images.

      Yes, there are Gaussian splatting, NeRF and derivatives, but their outputs _really don't look good_. It's also necessary to have the surface remeshed if you go down that route, and then you need to retexture it.

      The crazy thing is being able to see things at true scale and so close up :)

      • dclowd9901 3 days ago

        Not Meta VR, but one of my favorite things to do in Gran Turismo 7 with my PSVR2 is just “sit” in the cars and look around the cabins. The level of detail the devs put in is on another level.

        • mhh__ 3 days ago

          I love simracing in the rain (rf2 is really punishing in particular and actually looks quite good) mainly for similar reasons.

        • 8n4vidtmkvmk 3 days ago

          Would be nice if they modelled real purchasable cars this way. Functional knobs and a HUD would be even better.

          • Haemm0r 3 days ago

            Knobs in cars are not a thing anymore... No need for 3D for that ;-)

            • brookst 2 days ago

              Instead they have to model screens and the operating systems that allow us to change the temperature with only 6 clicks!

            • anakaine a day ago

              A number of manufacturers are heading back to knobs for frequent functions after customer sentiment whiplash.

      • bhewes 3 days ago

        I find it much easier to remesh and deal with textures from a crappy 3D reconstruction vs working with 2D images only. I also shoot HDRI and photos for PBR. I find sculpting tools super useful for VR, but yeah, it's still an art even with all the AI help.

      • 4ggr0 2 days ago

        > _really don't look good_

        If you use "*" instead of "_" you can write in italic :) just a thought

      • ibrarmalik 3 days ago

        By output do you mean the extracted surface geometry? Or are you directly rendering NeRFs in VR?

        • spookie 3 days ago

          Given the scale it wouldn't be wise to render them directly. There's also the issue of being able to record in real life without changes happening while doing so.

          I should've clarified, but yes, I was talking about the extracted surface geometry.

    • SV_BubbleTime 3 days ago

      Agreed.

      Every time I see text to 3D, it’s ALWAYS textured. That is the obvious giveaway that it is still garbage.

      Show me text to wireframe that looks good and I’ll get excited.

    • newswasboring 2 days ago

      I would push back on this a bit. The best games I've played on my VR headset are not "realistic". "Fake" is more about consistency, I guess: if everything around me in VR looks equally "goopy" then I don't think it will feel weird.

      • taneq 2 days ago

        Doom 2 in GZDoom's VR mode is amazingly convincing for how janky the graphics are. Even with the enemies as 2D sprites it’s quite compelling, so it’s my go-to rebuttal for all the “VR games need amazing graphics” sentiment. Good low-detail artwork is fine.

    • edkennedy 3 days ago

      We need more of the Stable Diffusion toolset in 3D AI creation. Upscaling does incredible things in SD, adding in all kinds of detail and bringing 512x512 to 4K. Inpainting to redo weird arms or deformities. ControlNets like outlines, depth, pose etc. to do variations on an existing model.
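
      For anyone who hasn't used them, a rough sketch of what a depth ControlNet run looks like with diffusers (the model IDs are the common public ones; "depth.png" is just a stand-in for whatever depth map you already have):

      ```python
      import torch
      from PIL import Image
      from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

      # Depth-conditioned generation: the depth map constrains the layout,
      # the prompt drives the appearance. "depth.png" is a placeholder input.
      controlnet = ControlNetModel.from_pretrained(
          "lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16
      )
      pipe = StableDiffusionControlNetPipeline.from_pretrained(
          "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
      ).to("cuda")

      depth = Image.open("depth.png")
      image = pipe("a weathered sci-fi vending machine", image=depth,
                   num_inference_steps=30).images[0]
      image.save("variation.png")
      ```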

      The real fun begins when rigging gets automated. Then full AI scene generation of all the models… then add agency… then the trip never ends.

    • outside415 3 days ago

      This is why I love Half-Life: Alyx. It packs so much detail into the VR space, in a way no other game ever has, and that makes for a truly immersive experience.

    • lomase 2 days ago

      Half-Life: Alyx bakes everything into textures and is the best-looking VR game.

      • mplewis 2 days ago

        Alyx doesn’t use models that look like cheese left in a hot car.

        • lomase 8 hours ago

          We leave that for Half Life 1.

    • TylerE 3 days ago

      I'd liken it to the trend from 5-10 years ago for every game to have randomly generated levels.

      It doesn't feel like an expansive world - it's the same few basic building blocks combined in every possible combination. It doesn't feel intentional or interesting.

    • Liquix 3 days ago

      does displacement mapping not hold up in VR?

      • lawlessone 3 days ago

        I think displacement maps are often made by starting with highly detailed models and baking some of the smaller details down to normal, bump, (reflection?) maps, etc.

        • readyman 3 days ago

          Correct, the process is called texture baking. However, displacement uses displacement maps, which are grayscale, essentially depth maps. Displacement is not very useful for realtime rendering because the displacement then has to be calculated in realtime; it's mostly used in offline rendering. Realtime rendering just uses normal maps.
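
          The two are closely related: a normal map is basically the derivative of a height/displacement map. A rough numpy sketch (the random array is just a stand-in for a real baked height map):

          ```python
          import numpy as np

          # Derive a tangent-space normal map from a grayscale height map.
          height = np.random.rand(256, 256).astype(np.float32)  # stand-in, values in 0..1
          strength = 2.0                        # how pronounced the bumps appear

          dy, dx = np.gradient(height)          # slope of the height field
          normals = np.dstack((-dx * strength, -dy * strength, np.ones_like(height)))
          normals /= np.linalg.norm(normals, axis=2, keepdims=True)

          # Pack [-1, 1] into the usual 0..255 RGB encoding (the familiar blue-ish image).
          normal_map = ((normals * 0.5 + 0.5) * 255).astype(np.uint8)
          ```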

      • amonith 2 days ago

        It absolutely does; I don't quite get the comments here. All VR games use normal assets with all their normal (map) shenanigans. It works the same as on desktop. Sure, if you look at something super close the "illusion" breaks, but that has nothing to do with VR; the same thing happens in flat-screen games.

        Maybe people mistakenly think that most standalone Quest games don't have those maps because they don't work? Well, that's not the case. The standalone games (especially on Q2 vs Q3) just have a very low performance budget. You strip out what you can to make your game render at 90 fps for each eye (each eye uses a different camera perspective, so the scene has to be rendered twice per frame).
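
        To put rough numbers on that budget (ignoring reprojection tricks and the culling/setup work the two eye passes can share):

        ```python
        # Back-of-envelope per-frame budget for a 90 Hz headset.
        refresh_hz = 90
        frame_budget_ms = 1000 / refresh_hz   # ~11.1 ms to deliver a frame
        per_eye_ms = frame_budget_ms / 2      # two full scene passes, one per eye
        print(f"{frame_budget_ms:.1f} ms per frame, ~{per_eye_ms:.1f} ms per eye pass")
        ```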

    • crazygringo 3 days ago

      Funny, pretty much everything on my Meta Quest seems to be "fake" detailing, without much detail in the actual geometry.

      I mean, yes it's obvious because the GPU is only so powerful. The difference against my Xbox is night-and-day.

      But even if VR is unforgiving of it, it's simply what we've got, at least on affordable devices. These models seem to be perfectly fine for current mainstream VR. Maybe Apple Vision is better, I don't know.

  • samspenc 3 days ago

    There are a few services that do this already, but they are all somewhat lacking; hopefully Meta's paper / solution brings some significant improvements in this space.

    The existing ones:

    - Meshy https://www.meshy.ai/ one of the first movers in this space, though its quality isn't that great

    - Rodin https://hyperhuman.deemos.com/rodin newer but folks are saying this is better

    - Luma Labs has a 3D generator https://lumalabs.ai/genie but it doesn't seem that popular

  • tsimionescu 2 days ago

    Regardless of cost, the metaverse is a dead-end concept - at least until we get Ghost in the Shell-style computer-brain interfaces. Second Life was probably peak popularity for this kind of metaverse.

mintone 3 days ago

I've been bullish[1] on this as a major aspect of generative AI for a while now, so it's great to see this paper published.

3D has an extremely steep learning curve once you try to do anything non-trivial, especially in terms of asset creation for VR etc., but my real interest is where this leads in terms of real-world items. One of the major hurdles is that in the real world we aren't as forgiving as we are in VR/games. I'm not entirely surprised to see that most of the outputs are "artistic" ones, but I'm really interested to see where this ends up when we can give AI combined inputs from text/photos/LIDAR etc. and have it make the model for a physical item that can be 3D printed.

[1] https://www.technicalchops.com/articles/ai-inputs-and-output...

  • aabhay 2 days ago

    I have to be a soggy blanket person, but there are some pretty strong reasons why 3D generative AI is going to have a much shallower adoption cycle than 2D. (I founded a company that was likely the first generative AI company for 3D assets)

    1. 3D is actually a broad collection of formats and not a single thing or representation. This is because of the deep relation between surface topology and render performance.

    2. 3D is much more challenging to use in any workflow. It's much more physical, and the ergonomics of 3DOF make it naturally hard to place in as many places as 2D.

    3. 3D is much more expensive to produce per unit value, in many ways. This is why, for example, almost every indie web comic artist draws in 2D instead of 3D. In an AI-first world it might be less “work” but will still be leaps and bounds more expensive.

    In my opinion, the media that have the most appeal for genAI are basically (in order)

    - images
    - videos
    - music
    - general audio
    - 2D animation
    - tightly scoped 3D experiences such as avatars
    - games
    - general 3D models

    My conclusion from being in this space was that there’s likely a world where 3D-style videos generated from pixels are more poised to take off than 3D as a data type.

    • noduerme 2 days ago

      You skipped an important market segment. Porn. The history of technology for making and distributing images, from cave paintings to ancient Rome through the Renaissance, Betamax, Photoshop, Tomb Raider and Instagram filters, has always been driven by improving the rendering of boobs.

    • robust-cactus 2 days ago

      At this point glTF seems pretty darn good and seems broadly usable. It embeds the mesh, textures, and animations right in a single file. It can represent a single model or a scene. It also has a binary format (GLB).

      3D need not be so complicated! We've kinda made it complicated, but a simplification wave is likely coming.

      The big unlock for 3D, though, will have to be some better compression technology. Games and 3D experiences are absolutely massive.
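
      A quick sketch of the single-file round trip with trimesh (the box is just a stand-in asset):

      ```python
      import trimesh

      # Geometry, materials and scene structure all travel in one GLB file.
      mesh = trimesh.creation.box(extents=(1, 1, 1))   # stand-in for a real asset
      mesh.export("box.glb")                           # binary glTF, single file

      scene = trimesh.load("box.glb")                  # comes back as a Scene
      print(scene.geometry)                            # named meshes, materials attached
      ```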

    • catgary 2 days ago

      There’s a middle ground in animation where you use techniques like text2motion with pre-existing assets that, I think, has a lot of appeal for games.

    • fartfeatures 2 days ago

      Doesn't Nvidia's USD format solve the multiple formats issue?

      • unconed 2 days ago

        I think you mean Pixar's.

        USD was designed for letting very large teams collaborate on making animated movies.

        It's actually terrible as an interchange format, e.g. the materials/shading system is both overengineered and underdefined. It is entirely application-specific.

        For the parts of USD that do work well, we have better and/or more widely supported standards, like GLTF and PBR.

        It was a very dumb choice to push this anywhere.

  • slidehero 3 days ago

    >I've been bullish[1] on this as a major aspect of generative AI for a while now, so it's great to see this paper published.

    Me too. My first thought when seeing 2D AI-generated images was that 3D would be a logical next step. Compared to pixels, there's so much additional data to work with when training these models that I just assumed 3D would be an easier problem to solve than 2D image generation. You have 3D point data, edges, edge loops, bone systems etc., and a lot of the existing data is pretty well labeled too.

  • lallysingh 3 days ago

    I'm excited to see the main deterrents to indie game dev, art and sound, get AI'd away. A single developer could use an off-the-shelf game engine and some AI-generated assets (perhaps combined with whatever they can buy cheap) to develop some fun games.

    Still, getting from static models to something that animates is necessary :(

    • vermilingua 3 days ago

      This is exactly what I’m not excited for: indie game discovery is already hard enough, and Steam widening the floodgates has not been a positive experience. Reducing the effort to create games even further is going to DoS the indie game market as we see the same studios that pump out hundreds of hentai games suddenly able to broaden their audiences significantly.

      • livrem 2 days ago

        I think modern game engines reducing the need for game programmers to almost zero caused much of this, but it also resulted in some interesting games when artists could create games without needing to hire programmers.

        It will be interesting to see if AI art (and AI 3D models) will mean that we see interesting games instead created by programmers without having to hire any artists.

        What I do not look forward to is the predictable spam flood of games created without both artists and programmers.

        • l33tman 2 days ago

          To be fair, this is already the case on all the platforms, as you can easily put together a game with free assets from the asset stores (or pay a few dollars for pretty high-quality assets). For every standard game genre you can imagine, I'm sure there are thousands of generic games released every year on every platform (don't have any real numbers but I get that feeling...)

          Rendering the assets with AI or buying them from the asset store is not going to change the number of generic games put out there, I think; at best, AI gen can make some of them a bit more unique.

      • noduerme 2 days ago

        Passable art is common. Original and interesting game mechanics are exceedingly rare, and will continue to be. The relationship between passable art and throwaway games is like that between bland AI content and marketing blogs.

        Really good games will still employ really good artists.

        • vermilingua 2 days ago

          This is my point exactly, but even passable art takes some time to create. I’m not excited for the very-soon-to-arrive tide of VNs, deckbuilders, and JRPGs made with effectively 0 time or effort.

          • brookst 2 days ago

            I’ve never understood the effort = quality view of art. Just because someone spent thousands of hours does not mean it is good art. And plenty of great art is executed quickly.

            It seems as odd to me as bemoaning the way word processors let people write novels without even being good typists.

            • noduerme 2 days ago

              What is an example of some great art that was executed quickly and/or without a great many hours of prior experience on the part of the artist?

              • brookst a day ago

                With apologies for the BuzzFeed listicle: https://www.buzzfeed.com/imaraoshibanjo1/famous-songs-writte...

                Picasso produced 50,000 paintings in his career[1], about two per day every day. So probably considerably more on some days.

                It’s harder to find data on great art from relative novices. But consider the opposite — how much bad art is there from people who put their 10,000 hours or whatever in? I’m willing to believe some correlation between time spent and quality, but I am not willing to believe that tools that make artists more efficient necessarily reduce quality.

                1. https://www.guggenheim.org/teaching-materials/selections-fro...

                • noduerme an hour ago

                  I mean, part of my job is hiring illustrators and designers. I can tell by looking at a portfolio whether someone has put in their (slightly metaphorical) "10,000 hours". And much of that has nothing to do with execution or the tools they used. In fact, thinking that execution and tooling make them better is often a red flag.

                  What I look for is that the artist knows what they want and that the ideas they're putting on the page are thoughtful, coherent, original, and well-executed in a style that's unique enough to justify hiring them personally. And the ability to hone ideas into visual form is not innate, nor have I ever seen it successfully done by someone who didn't spend countless hours trying and failing first.

                  For example, upper management, who spend time looking at and approving art pieces, almost never understand that altering them is going to make them worse. "Add something here" or "take this out" generally undermines the piece when coming from someone not trained and experienced. Writing prompts is much the same as being a manager. You never get exactly the result you expect from what you asked, but that is also because you did not have the exact vision in your own mind of how it would look before it was executed.

                  Practice is about developing that vision. Once you have that vision, execution is the easy part, and you don't really need a tool to draw it for you. In any case, the tool will not draw it the way you see it.

                  So yes, a songwriter who's written tons of songs can suddenly write a good one in 30 minutes. Most of my best songs were written longhand with no edits. That happens sometimes after writing hundreds of songs that you throw away.

                  Similarly, I've been coding for 25 years. Putting my fingers on the keys and typing out code is the easy part. I don't need copilot to do that for me. I don't really need a fancy IDE. What practice gives is the ability to see the best way to do something and how it fits into a larger project.

                  If a tool could read the artist's mind and draw exactly what the artist sees, it would be crystal clear that 10,000 hours of trial and error in image-making results in a thought process that makes great art possible (if the artist is capable of it at all). The effort is mostly in the process of developing that mental skill set.

          • noduerme 2 days ago

            But this is a category that didn't exist yet. So who knows what people without art skills or budgets might do? Probably nothing, but maybe one in ten thousand actually isn't garbage. Just like music at the advent of digital home recording. The market is already so flooded it hardly matters.

            I'm an artist and a gentleman coder and I'm disgusted and offended by careless work. But I don't think I need to die on the hill of stopping infinite crappy game mills from having access to infinite crappy art.

            [edit] I'm also just bitter after years working on pretty great original art / mechanics driven casual games that only garnered tiny devoted fan bases, and so I assume that when it comes to the kinds of long tail copycat games you're talking about, especially with AI art, no one's going to bother playing them anyway.

      • drschwabe 3 days ago

        That's the idea: you won't have to discover games - instead you can just create the game you want.

        • latentsea 2 days ago

          Be careful what you wish for.

      • kilpikaarna 2 days ago

        Lol, yeah, the main deterrent/obstacle to indie game dev has little to do with actual development, and machine generated content is actively making that worse.

      • DrSiemer 2 days ago

        So we should not improve production methods, because it will give us more things for less effort?

        Just let the market sort it out. I for one can't wait for the next Cyriak or Sakupen who can wield the full power of AI assistance to create their unique vision of what a game can be.

  • milofeynman 3 days ago

    Autodesk has been building practical 3d models for years with generative design. I have to imagine it's only getting better with these recent advances, but I'm not immersed in the space.

iamleppert 3 days ago

I tried the recent wave of text/image-to-3D-model services, some touting $100MM+ valuations and tens of millions raised, and found them all to produce unusable garbage.

  • dudus 3 days ago

    SOTA text-to-image 5 years ago was complete garbage. Most people would think the same. Look how good it got now.

    You have to look at this as stepping stone research.

    • raincole 3 days ago

      Did they get such high valuations 5 years ago? Genuine question.

      • gpm 3 days ago

        I'm not sure I'd expect valuations to be at all similar.

        The potential target market is significantly different in scale (I assume, I haven't tried to estimate either). The potential competitors are... already in existence. It seems more likely now that we'll succeed at good 3D generative AI than it seemed, before we got good 2D generative AI, that we would succeed at that...

      • dinglestepup 3 days ago

        No. One partial exception being OpenAI, which got a $1B investment from MS ~5 years ago, before they launched DALL-E v1 (and even before GPT-3).

  • architango 3 days ago

    I have too, and you’re quite right. Also the various 2D-to-3D face generators are mostly awful. I’ve done a deep dive on that and nearly all of them seem to only create slight perturbations on some base model, regardless of the input.

  • ddtaylor 3 days ago

    We tried them too. My wife is a 3D artist, but we needed a lot of assets that frankly weren't that important. The plan was to use the output as a starting point and improve as needed manually.

    The problem is that the output you get is just baked meshes. If the object connects together or has a few pieces you'll have to essentially undo some of that work. Similar problems with textures, as the AI doesn't work the way other artists normally do.

    All of this is also on top of the output being basically garbage. Input photos ultimately fail in ways that would require so much work to fix that it invalidates the concept. By the time you start to get something approaching decent output you've put in more work or money than just having someone make it to begin with, while essentially also losing all control over the art pipeline.

  • jampekka 3 days ago

    The gap from demos/papers to reality is huge. ML has a bad replication crisis.

    • freeone3000 3 days ago

      This is not a “replication crisis”. Running the paper gets you the same results as the author; it’s uniquely replicable. The results not being useful in a product is not the same as a fundamental failure in our use of the scientific process.

      • jampekka 3 days ago

        That is reproducibility. Replicability means that the results hold for replication outside the specific circumstances of one study.

        • fngjdflmdflg 3 days ago

          >Replicability means that the results hold for replication outside the specific circumstances of one study.

          If by "hold for replication outside the specific circumstances of one study" you mean "useful for real world problems" as implied by your previous comment then I don't think you are correct.

          From a quick search it seems there are multiple definitions of reproducibility and replicability, with some using the words interchangeably, but the most favorable one I found to what you are saying is this definition:

          >Replicability is obtaining consistent results across studies aimed at answering the same scientific question, each of which has obtained its own data.

          >[...]

          >In general, whenever new data are obtained that constitute the results of a study aimed at answering the same scientific question as another study, the degree of consistency of the results from the two studies constitutes their degree of replication.[0]

          However, I think this holds true for a lot of the ML research going on. The issue is not that the solutions do not generalize; it's that the solution itself is not useful for most real-world applications. I don't see what replicability has to do with it. You can train a given model with a different but similar dataset and you will get the same quality of non-useful results. I'm not sure exactly what definition of replicability you are using, though; if there is one I missed please point it out.

          [0] https://www.ncbi.nlm.nih.gov/books/NBK547546/

  • dgellow 3 days ago

    Haven’t tried them all, but yeah, pretty bad so far

LarsDu88 3 days ago

This is crazy impressive, and the fact they have the whole thing running with a PBR texturing pipeline is really cool.

That being said, I wonder if the use of signed distance fields (SDFs) results in bad topology.

I saw a recently released paper earlier this week that seems to build "game-ready" topology --- stuff that might actually be riggable for animation. https://github.com/buaacyw/MeshAnything
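
To make the topology concern concrete, here is a rough sketch of the usual SDF-to-mesh extraction step (marching cubes) - the generic technique, not necessarily what 3D Gen does internally - where even a perfect sphere comes out as a dense soup of triangles with none of the edge loops an artist would lay down:

```python
import numpy as np
from skimage import measure

# Sample a signed distance field for a sphere on a 64^3 grid, then extract
# the zero level set with marching cubes.
n = 64
axis = np.linspace(-1.0, 1.0, n)
x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
sdf = np.sqrt(x**2 + y**2 + z**2) - 0.7      # negative inside, positive outside

verts, faces, normals, _ = measure.marching_cubes(sdf, level=0.0)
print(len(faces), "triangles for a plain sphere")  # thousands of near-uniform
                                                   # triangles, no edge loops
```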

  • jsheard 3 days ago

    The obvious major caveat with MeshAnything is that it only scales up to outputs with about 800 polygons, so even if their claims about the quality of their topology hold up it's not actually good for much as it stands. For reference a modern AAA game character model can easily exceed 100,000 polygons, and models made to be rendered offline can be an order of magnitude bigger still.

    • LarsDu88 3 days ago

      I do some 3D modeling for my side project (https://roguestargun.com), and I suspect those 800 polygons with good topology may be more useful to a lot of 3D artists than blobby, fully textured, SDF-derived models.

      A low-poly model with good topology can be very easily subdivided and details extruded for higher definition, à la Ian Hubert's famous vending machine tutorial: https://www.youtube.com/watch?v=v_ikG-u_6r0
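
      A tiny trimesh sketch of the face-count side of that (plain midpoint subdivision here rather than a proper subsurf modifier, and the box is just a stand-in for a low-poly asset):

      ```python
      import trimesh

      # Each round of subdivision quadruples the face count while keeping the
      # shape, so a clean low-poly base gets you to fine detail very quickly.
      mesh = trimesh.creation.box(extents=(1, 2, 0.5))
      print("base faces:", len(mesh.faces))

      for i in range(3):
          mesh = mesh.subdivide()              # splits every triangle into four
          print(f"after {i + 1} round(s):", len(mesh.faces))
      ```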

      And of course I'm sure those folks in Shanghai making the Mesh Anything paper did not have access to the datasets or compute power the Meta team had.

explaininjs 3 days ago

Looks fine, but you can tell the topology isn’t good from the fact that they don't show any wireframes.

  • tobyjsullivan 3 days ago

    They seem to admit as much in Table 1, which indicates this model is not capable of "clean topology". Somewhat annoyingly, they do not discuss topology anywhere else in the paper (at least, I could not find the word "topology" via Ctrl+F).

  • jsheard 3 days ago

    Credit where it's due, unlike most of these papers they do at least show some of their models sans textures on page 11, so you can see how undefined the actual geometry is (e.g. none of the characters have eyes until they are painted on).

    • SV_BubbleTime 3 days ago

      Sans texture is not wireframe though. They have a texture, it’s just all white.

      The wireframe is going to be unrecognizably bad.

      Still a ways to go.

  • torginus 3 days ago

    AFAIK there's no topology - it outputs signed distance fields, not meshes.

    • explaininjs 3 days ago

      This is incorrect.

      > Given a text prompt provided by the user, Stage I creates […] a 3D mesh.

  • dyauspitr 3 days ago

    That doesn’t matter for things like 3D printing and CNC machining. Additionally, there are other mesh fixer AI tools. This is going to be gold for me.

    • jsheard 3 days ago

      However if you 3DP/CNC these you'll only get the base shape, without any of the fake details it painted over the surface.

      Expectation vs. reality: https://i.imgur.com/82R5DAc.png

      • dyauspitr 3 days ago

        That’s still not bad. I can use the normal and texture maps to generate appropriate depth maps to put the details in and do some final Wacom touch ups. Way better than making the whole thing from scratch.

    • eropple 3 days ago

      > That doesn’t matter for things like 3D printing and CNC machining

      It absolutely does. But great, let's look forward to Printables being ruined by off-model nonsense.

      • dyauspitr 3 days ago

        Why does it matter? As long as there are no holes, my Vectric software doesn’t care.

        • TylerE 3 days ago

          If your normals are flipped, your cnc cutter is going to try to cut from inside up to the surface. That's no bueno.

          • dyauspitr 3 days ago

            Inverting the normals is pretty straightforward.
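
            e.g. a rough sketch with trimesh (this handles the uniformly flipped case; the icosphere is just a stand-in for an exported mesh):

            ```python
            import trimesh

            # Flip a mesh inside-out, then repair the face orientation. This only
            # works cleanly on manifold geometry; broken topology is a manual job.
            mesh = trimesh.creation.icosphere(subdivisions=3)
            mesh.invert()                             # simulate flipped normals on export
            print("valid volume?", mesh.is_volume)    # False: normals point inward

            trimesh.repair.fix_normals(mesh)          # re-orient faces to point outward
            print("valid volume?", mesh.is_volume)    # True again
            ```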

            • TylerE 3 days ago

              If ALL of them are inverted, yes.

              If the topology is a disaster...no.

              If you're hand massaging every poly you're rather defeating the purpose.

              • slidehero 3 days ago

                >If you're hand massaging every poly you're rather defeating the purpose.

                That's a bit of an overstatement. Fixing normals is far less time consuming than creating a mesh from scratch. This is particularly a win for people who lack the artistic skill to create the meshes in the first place.

                I've got a lot of technical 3D skills after using 3DSMax for years as a hobby. Unfortunately I lack the artistic skills to create good looking objects. This would definitely allow me to do things I couldn't before.

      • SV_BubbleTime 3 days ago

        It matters so much more; GP is just being hopeful and soon to be disappointed.

  • nuz 3 days ago

    Such a silly argument. Fixing topology is a nearly solved problem in geometry processing. (Or just start with a good topology and 'paste' a texture onto it like they develop techniques for here.)

    • RicoElectrico 3 days ago

      It's an essential skill for reading scientific papers to notice what isn't there. It's as important as what is there.

      In my field, analog IC design, if we face a wall we often do some literature review with a colleague, and more often than not the results are not relevant for commercial application. Forget about Monte Carlo; sometimes there aren't even full PVT corners.

      • jampekka 3 days ago

        This is indeed a side effect of research papers being read more outside academia (which is strictly a good thing in itself).

        In research one learns that most (almost all) papers oversell their results and a lot of stuff is hidden in the "Limitations" section. This is a significant problem, but not that big a problem within academia as everybody, at least within the field, knows to take the results with a grain of salt. But those outside academia, or outside the field, often don't take this into account.

        Academic papers should be read a bit like marketing material or pitch decks.

    • zemo 3 days ago

      Depends what you're talking about and what your criteria are. In gamedev, studios typically use a retopology tool like TopoGun (https://www.topogun.com/) to aid in the creation of efficient topologies, but it's still a manual task, as different topologies have different tradeoffs in terms of poly count, texture detail, options for how the model deforms when animated, etc. For example, you may know that you're working on a model of a player character in a 3rd person game where the camera is typically behind you, so you want to spend more of your budget on the _back_ of the model than the _front_, because the player is typically looking at their character's back. If your criterion is "find the minimum number of polygons", sure, it's solved. That's just one of many different goals, and not the goal that is typically used in gamedev, which I assume to be a primary audience of this research.

      • efilife 3 days ago

        Fyi, we use asterisks to put emphasis on text on HN

    • explaininjs 3 days ago

      No… it’s not. But if you know something I don’t the 5 primes will certainly be happy to pay you handsomely for the implementation!

      • nuz 3 days ago
        • spookie 3 days ago

          I love that tool but it really doesn't fix bad topology.

          It gets you somewhere closer, but not a fix.

          Moreover, depending on what you have at hand, the resolution of your remeshing might destroy a LOT of detail or be unable to accommodate thin sections.

          Retopo isn't a solved problem. It only is for really basic, convex meshes.

        • explaininjs 3 days ago

          A piece of software that hasn’t been touched in 5 years, let alone adopted in any professional production environment? Cool…

          • portaouflop 3 days ago

            AFAICT it’s used in professional applications and software does not need to be constantly updated, especially if it’s not for the web.

            • explaininjs 3 days ago

              If the claim was that the problem was solved, sure it might make sense that the package does not need to be touched (in reality the field isn’t as slow as you presume, but I digress).

              Instead, the claim is that it’s “nearly^{TM}” solved, so the proof being an abandoned repo from half a decade ago actually speaks volumes: it’s solved except for the hard part, and nobody knows how to solve the hard part.

              • portaouflop 2 days ago

                Well yeah, I wasn’t claiming the problem is solved; that was GP.

                I don’t think you can truly “solve” any problem if you think about it.

                • explaininjs 2 days ago

                  I'm talking about satisfying business needs, not philosophical mumbo jumbo.

                  If you need a model to move, you give it to a rigger. If the mesh is bad, they will first need to remesh it or the rigging won't work right. This is the problem. They will solve this problem using a manual, labor-intensive process. It's not particularly difficult; any 3D artist considering themselves a professional ought to be able to do it. But it's not the sort of thing where you just press a button and turn some knobs and you're done, either. It takes a lot of work, and in particularly bad cases it's easier to just start from scratch - remeshing a garbage mesh is indeed harder than modeling from scratch in many cases. Once either of these mechanisms has been applied, the problem will be solved and the rigger can move on to rigging.

                  So yes, algorithms exist that pretend to remesh, and every professional modeling system has one built in (because your sales guys don't want to be the ones without one), but professionals do not use them in production environments (my original claim, if you recall) because their results are so bad. Indeed, I'm told several meme accounts exist dedicated to how badly they screw things up when folks do try to take the shortcut.

                  If this project was aiming to solve the problem (which is possible, as I have just explained), they would not have given up 5 years ago. Because it sure isn't solved now.

    • TrevorJ 3 days ago

      Hard disagree, as someone in the industry.

w_for_wumbo 3 days ago

I think this is another precursor step in recreating our reality digitally. As long as you're able to react to the person's state, with enough metrics you're able to recreate environments and scenarios within a 'safe environment' for people to push through and learn to cope with the scenarios they don't feel safe to address in the 'real' world.

When the person then emerges from this virtual world, it'll be like an egg hatching into a new birth, having learned the lessons in their virtual cocoon.

If you don't like this idea, it's an interesting thought experiment regardless, as we can't verify that we're not already in a form of this.

floppiplopp 2 days ago

Interesting, but what are the practical uses of 3D assets beyond gaming? Where does it create a real advantage over what we already use as visual information and user interfaces? I cannot see VR replacing the interactions we have. It requires cumbersome, expensive hardware; it floods the users with additional mostly useless information (image, sound, 3D itself) they have to process; it's slow and expensive to create and maintain. In short: it's inefficient compared to established tech, which will always run circles around the lame attempt at imitating real-world interactions in a 3D virtual space. The potential availability of very expensively (in terms of computing power) generated assets doesn't change that. It's still hard to do right, and even if done right, it seems like only a gimmick hardly anyone can stomach for more than a couple of hours at best. It's information overload to most people, and they have better alternatives.

  • Ukv 2 days ago

    > Interesting, but what are the practical uses of 3d assets beyond gaming

    Probably many areas that we already use 3D assets/texturing for. Maybe objects to fill out an architectural render, CG in movies/TV shows, 3D printing, or just as an inspiration/mock-up to build off of. I'd imagine this generator is less useful for product design/manufacturing at the moment due to the lack of precise constraints - but maybe once we get the equivalent of ControlNets.

    If weights are released, it may also serve as a nice foundation model, or synthetic data generator, for other 3D tasks (including non-generative tasks like defect detection), in the same way Stable Diffusion and Segment Anything have for 2D tasks.

    > I cannot see VR replacing the interactions we have. It requires cumbersome, expensive hardware

    Currently sure, but it's been a reasonably safe bet that hardware will get smaller and cheaper. Something like the Bigscreen Beyond already has a fairly small form factor.

    But, I feel you're basing judgement of a 3D generator on one currently-niche potential use of 3D assets, that being VR/AR user interfaces (and in particular ones intended to replace a phone rather than, for instance, the interactive interfaces within VR games/experiences).

    > The potential availability of very expensively (in terms of computing power) generated assets doesn't change that

    Even just comparing computing power and not the human labour required, this is probably going to be an extremely cheap way to generate assets. The paper reports 30 seconds for AssetGen, then a further 20 seconds for TextureGen - both being feed-forward generators. They don't mention which GPU, but previous similar models have ran in a couple of minutes on consumer GPUs.

vletal 3 days ago

Seems like simple enough 3D-to-3D will be possible soon!

I'll use it to upscale 8x all the meshes and textures in the original Mafia and Unreal Tournament, write a goodbye letter to my family, and disappear.

I think the kids will understand when they grow up.

GaggiX 3 days ago

In the comparison between the models, only Rodin seems to produce clean topology. Hopefully in the future we will see a model with the strengths of both, ideally from Meta, as Rodin is a commercial model.

  • cchance 3 days ago

    Yeah, would be cool if we had something open that competed with Rodin, but just like ElevenLabs for voice, it seems closed is gonna be ahead for a while.

999900000999 3 days ago

Would love for an artist to provide some input, but I imagine this could be really good if it generates models that you can edit or use as a starting point later.

Or, just throw a PS1 filter on top and make some retro games

  • dorkwood 3 days ago

    Sure.

    The paper doesn't show topology, UVs or the output texture, so we're left to assume the models look something like what you'd find when using photogrammetry: triangulated blobs with highly segmented UV islands and very large textures. Fine for background elements in a 3D render, but unsuitable for use in a game engine or real-time pipeline.

    In my job I've sometimes been given 3D scans and asked to include them in a game. They require extensive cleanup to become usable, if you care about visual quality and performance at all.

  • raytopia 3 days ago

    Unless the topology is good it may not be worth it.

  • doctorpangloss 3 days ago

    > for an artist to provide some input

    Sure, the results are excellent.

    > Or, just throw a PS1 filter on top and make some retro games

    There's so many creative ways to use these workflows. Consider how much people achieved with NES graphics. The biggest obstacles are tools and marketplaces.

    • testfrequency 3 days ago

      I question that you’re actually a 3D artist. I’m an artist (as is my partner) and we both agree this looks better than most examples... but it still looks incredibly lackluster, poorly colored, and texturally continues to have a weird uncanny smoothness that is distracting/obviously generated.

      I don’t have time to leave a longer reply, and I still need to read over their entire white paper later tonight, but I’m surprised to see someone who claims to be an artist be convinced that this is “incredible”.

anditherobot 3 days ago

Can this potentially support:

- Image input to 3D model output

- 3D model (format) as input

Question: What is the current state-of-the-art commercially available product in that niche?

  • egnehots 3 days ago

    This is a pipeline for text to 3D.

    But for the 3D generation step it's using a model that is more flexible:

    https://assetgen.github.io/

    It can be conditioned on text or image.

  • moffkalast 3 days ago

    Meshroom, if you have enough images ;)

localfirst 3 days ago

Can somebody please, please integrate SAM with 3D primitive RAGging? That's the golden chalice solution for a 3D modeler; the "blobs" generated by Luma and the like aren't very useful.

tiborsaas 2 days ago

This is probably the best way to build the Metaverse. Publish all the research, let people build products on top of it, and soon we'll need a place and platform to make use of all the instant assets in virtual spaces.

Well played, Meta.

rebuilder 3 days ago

I’m puzzled by the poor texture quality in these. The colours are just bad - it looks like the textures are blown out (the detail at the bright end clips into white) and much too contrasty (the turkey does that transition from red to white via a band of yellow). I wonder why that is - was the training data just done on the cheap?

  • firtoz 3 days ago

    It seems to do very well compared to the alternatives; however, there's a long way to go yet.

f0e4c2f7 3 days ago

Is there a way to try this yet?

polterguy1000 3 days ago

Meta 3D Gen represents a significant step forward in the realm of 3D content generation, particularly for VR applications. The ability to generate detailed 3D models from text inputs could drastically reduce the labor-intensive process of content creation, making it more accessible and scalable. However, as some commenters have pointed out, the current technology still faces challenges, especially in producing high-quality, detailed geometry that holds up under the scrutiny of VR’s stereoscopic depth perception. The integration of PBR texturing is a promising feature, but the real test will be in how well these models can be refined and utilized in practical applications. It’s an exciting development, but there’s still a long way to go before it can fully meet the needs of VR developers and artists.

  • guiomie 3 days ago

    That would be great. I've learnt some Unity building my own little VR game, and I dread having to learn Blender or any other tool to make more detailed shapes/models. I've tried a few GenAI tools to create 3D models and the quality is not usable.

  • xena 3 days ago

    Generally these things are useless for 3D artists because the wireframe is unusable.

kgraves 3 days ago

Can this be used for image to 3D generation? What is the SOTA in this area these days?

  • Fripplebubby 3 days ago

    I think what they did here was go text prompt -> generate multiple 2D views -> reconstruction network to go from multiple 2D images to a 3D representation -> mesh extraction from the 3D representation.

    That's a long way of saying, no, I don't think this introduces a component that specifically goes 2D -> 3D from a single 2D image.

carbocation 3 days ago

For starters, I'd love to just see a rock-solid neural network replacement for screened Poisson surface reconstruction. (I have seen MeshAnything and I don't think that's the end-game.)
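
For context, the classical (non-neural) version looks roughly like this in Open3D; the sampled sphere is just a stand-in for a real scan:

```python
import open3d as o3d

# Screened Poisson reconstruction fits an implicit indicator function to an
# oriented point cloud and extracts a watertight mesh from it.
pcd = o3d.geometry.TriangleMesh.create_sphere(radius=1.0).sample_points_poisson_disk(2000)
pcd.estimate_normals()
pcd.orient_normals_consistent_tangent_plane(30)   # Poisson needs oriented normals

mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(pcd, depth=8)
print(mesh)   # triangle mesh; density values can be used to trim weak regions
```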

ziofill 2 days ago

Is this heading toward 3D games that are entirely "hallucinated"? That would be amazing.

Simon_ORourke 3 days ago

Are those guys still banging on about that Metaverse? That's taken a decided back seat to all the AI innovation in the past 18 months.

  • yieldcrv 3 days ago

    Meta has spent like $50bn on their Metaverse line item since 2021 and hasn't stopped

    that probably means a bunch of H100s now for this Meta 3D Gen thing, and other yet unannounced things still incubating in a womb of datasets

  • dvngnt_ 3 days ago

    Zuck has said before that ML will help make the "metaverse" more viable.

    He still needs a moat with its own ecosystem, like the iPhone.

nightowl_games 2 days ago

Is this just a paper or can I run the program and generate some stuff?

timeon 3 days ago

Why do these pages want to bother visitors with popups? Just use "only essential" as the default.

antman 2 days ago

Work like this is the only way to revive the now-defunct Metaverse. I was wondering whether Meta would fund research such as this that could lower the financial barrier to entry for Metaverse participants.

surfingdino 3 days ago

Not sure how adding GenAI is going to make VR any better. I wanted to type "it's like throwing good money after bad", but that's not quite right. Both are black holes where VC money is turned into papers and demos.

  • Filligree 3 days ago

    The ultimate end goal is a VR game with infinite detail. Sword Art Online, however, remains fiction. Perhaps for the best.