blopker 2 days ago

I took a look at a project I maintain[0], and wow. It's so wrong in every section I saw. The generated diagrams make no sense. The text sections take implementation details that don't matter and present them to the user like they need to know them. It's also outdated.

I hope actual users never see this. I dread thinking about having to go around to various LLM generated sites to correct documentation I never approved of to stop confusing users that are tricked into reading it.

[0]: https://deepwiki.com/blopker/codebook

  • andybak 2 days ago

    I just tried it on several of my repos and I was rather impressed.

    This is another one of those bizarre situations that keeps happening in AI coding related matters where people can look at the same thing and reach diametrically opposed conclusions. It's very peculiar and I've never experienced anything like it in my career until recently.

    • DrewADesign 2 days ago

      > at the same thing

      But you’re not looking at the same thing — you’re looking at two completely different sets of output.

      Perhaps their project uses a more obscure language, has a more complex architecture, or resembles another project that’s tripping up the interpretation of it. You can have excellent results without it being perfect for everything. Nothing is perfect and it’s important for people making these things to know how, right?

      In my career I’ve never seen such aggressive dismissal of people’s negative experiences without even knowing if their use case is significantly different.

    • statusfailed 2 days ago

      Which repos worked well? I've had the same experience as op- unhelpful diagrams and bad information hierarchy. But I'm curious to see examples of where it's produced good output!

    • esperent 2 days ago

      > people can look at the same thing and reach diametrically opposed conclusions. It's very peculiar and I've never experienced anything like it in my career until recently

      React vs other frameworks (or no framework). Object oriented vs functional. There's loads of examples of this that predate AI.

      • alansammarone 2 days ago

        I don't think it's quite the same. The cases you mention are more like two alternative but roughly functionally equivalent things. People still argue and use both, but the argument is different. Even if people don't explicitly acknowledge it, at some level they understand it's a difference in taste.

        This feels to me more like the horses vs cars thing, computers vs... something (no computers?), crypto vs "dollar-pegged" money, etc. It's deeper. I'm not saying the AI people are the "car" people, just that...there will be one opinion that will exist in 5-20 years, and the other will be gone. Which one... we'll see.

        • esperent 2 days ago

          > People still argue and use both, but the argument is different

          React vs no framework is at least in the same ballpark as AI vs no AI. Some people are determined to prove to the world that React/AI/functional programming solves everything. Some people are determined to prove the opposite. Most people just quietly use them without feeling like they need to prove anything.

      • Xss3 2 days ago

        This is such an apples to oranges comparison that it makes me suspicious of your motives here.

        Bad documentation full of obvious errors and nonsense is very different to having an opinion on OO vs Functional programming.

        Even that sentence sounds insane because who would ever compare the two?!

  • frumiousirc 2 days ago

    I have a fairly large code base that has been developed over a decade that deepwiki has indexed. The results are mixed but how they are mixed gives me some insight into deepwiki's usefulness.

    The code base has a lot of documentation in the form of many individual text files. Each describes some isolated aspect of the code in dense, info-rich and not entirely easily consumable (by humans) detail. As numerous as these docs are, the code has many more aspects that lack explicit documentation. And there is a general lack of high-level documentation that ties each isolated doc into some cohesive whole.

    I formed a few conclusions about the deepwiki-generated content: First, it is really good where it regurgitates information from the code docs while being rather bad or simply missing for aspects not covered by the provided docs. Second, deepwiki is so-so for providing a high layer of documentation that sort of ties things together. Third, it is highly biased about the importance of various aspects by their code docs coverage.

    The lessons I take from this are: deepwiki does better ingesting narrative than code. I can spend less effort on polishing individual documentation (not worrying about how easy it is for humans to absorb). I should instead spend that effort to fill in gaps, both details and to provide higher-level layers of narrative to unify the detailed documentation. I don't need to spend effort on making that unification explicit via sectioning, linking, ordering, etc as one may expect for a "manual" with a table of contents.

    In short, I can interpret deepwiki's failings as identifying gaps that need filling by humans while leaning on deepwiki (or similar) to provide polish and some gap putty.

    • Xss3 2 days ago

      If documenting the why rather than the how you often end up tying high level concepts together.

      E.g. If you describe how the user service exists you won't necessarily capture where it is used.

      If you document why the user service exists you will often mention who or what needs it to exist, the thing that gives it a purpose. Do this throughout and everything ends up tied together at a higher level.

  • NewsaHackO 2 days ago

    > The text sections take implementation details that don't matter and present them to the user like they need to know them. It's also outdated.

    The point of the wiki is to help people learn the codebase so they can possibly contribute to the project, not for end users. It absolutely should explain implementation details. I do agree that it goes overboard with the diagrams. I’m curious, I’ve seen other moderately sized repo owners rave about how DeepWiki did very well in explaining implementation details. What specifically was it getting wrong about your code in your case? Is it just that it’s outdated?

    • blopker 2 days ago

      I dunno, it seems to be real excited about a VS Code extension that doesn't exist and isn't mentioned in the actual documentation. There's just too many factual errors to list.

      • NewsaHackO 2 days ago

        >I dunno, it seems to be real excited about a VS Code extension that doesn't exist and isn't mentioned in the actual documentation. There's just too many factual errors to list.

        There is a folder for a VS Code extension here[0]. It seems to have a README with installation instructions. There is also an extension.ts file, which seems to me to be at least the initial prototype for the extension. Did you forget that you started implementing this?

        [0] https://github.com/blopker/codebook/blob/c141f349a10ba170424...

        • curl-up 2 days ago

          This thread should end up in the hall of fame, right next to the Dropbox one.

          From a fellow LLM-powered app builder, I wish you best of luck!

          • raincole 2 days ago

            Yeah, this is a thread worth saving. Even just as an example of multiple people who can't read as well as an LLM.

          • oblio 2 days ago

            Plot twist, OP has a doc mentioning it as unreleased.

        • sceptic123 2 days ago

          In that folder is CHANGELOG.md[0] that indicates that this is unreleased. I'd say that including installation instructions for an unreleased version of the extension is exactly the issue that is being flagged.

          [0] https://github.com/blopker/codebook/blob/main/vscode-extensi...

          • NewsaHackO 2 days ago

            You are going to want to reread the file you are quoting buddy. That changelog is indicative that the extension has been released. The Unreleased section seems to list features that are not yet included in the released version of the VS Code extension, and the future plans are features that have not been developed yet.

        • throwaway290 2 days ago

          here the maintainer says it doesn't exist. there's basically no way another interpretation is "more correct". files that are present may not be intended for use, or may be deprecated, internal, WIP, etc. this is why we need maintainers.

          • NewsaHackO 2 days ago

            Maintainers are not gods, and don't get to rewrite plainly true facts. In the Changelog, it actually says it is a "Initial release of Codebook VS Code extension".

            • throwaway290 an hour ago

              compared to an llm they are an authoritative source...

        • blopker 2 days ago

          I brought up this issue because I thought it illustrated my previous points nicely.

          Yes, there is a VS Code folder in that repo. However, it doesn't exist as an actual extension. It's an experiment that does not remotely work.

          The LLM generated docs have confidently decided that not only does it exist, but it is the primary installation method.

          This is wrong.

          Edit: I've now had to go into the Readme of this extension to add a note to LLMs explicitly to not recommend it to users. I hate this.

          • ninininino 2 days ago

            Is it possible that a random person who discovered your repo from Google search would make the same mistake the LLM did and assume it works and not realize it was an unfinished experiment?

            • rng-concern 2 days ago

              Yes, and so the value of the person's opinions on the repo is low. Far lower than real documentation written by someone who knows more, that would not have made that mistake.

              The value proposition here is that these llm docs would be useful, however in this case they were not.

              • NewsaHackO 2 days ago

                >Far lower than real documentation written by someone who knows more, that would not have made that mistake.

                But his own documentation did say that there was a VSCode extension, with installation instructions, a README, changelog, etc. From what he said, it doesn't even compile or remotely work. It would be extremely aggravating to attempt to build the project with the maintainer's own documentation, spend an hour trying to figure out what's wrong, and then contact the maintainer for him to say, "oh yeah, that documentation's not correct, that doesn't even compile even though I said it did 2 months ago lol." It is extremely ironic that he is so gung-ho about DeepWiki getting this wrong.

                • ninininino 2 days ago

                  Yes, this is my point. It seems like the creator was a little bit lazy to create such a full fledged readme.md with so much polish but -entirely neglect to mention the whole thing is broken and unfinished-.

                  That seems about as annoying as a random wiki mis-explaining your system.

                  That being said, I am still biased towards empathizing with the library author since contributing to open source should be seen as being a great service already in and of itself, and I'd default to avoiding casting blame at an author for not doing things "perfectly" or whatever when they are already doing volunteer work/sharing code they could just keep private.

                  • blopker 2 days ago

                    This.

                    The WIP code was committed with the expectation that very few people would see it because it was not linked anywhere in the main readme. It's a calculated risk, so that the code wouldn't get out of date with main. The risk changed when their LLM (wrongly) decided to elevate it to users before it was ready.

                    It's clear DeepWiki is just a sales funnel for Devin, so all of this is being done in bad faith anyway. I don't expect them to care much.

                  • NewsaHackO 2 days ago

                    >That being said, I am still biased towards empathizing with the library author since contributing to open source should be seen as being a great service already in and of itself, and I'd default to avoiding casting blame at an author for not doing things "perfectly" or whatever when they are already doing volunteer work/sharing code they could just keep private

                    This is true, and the only reason for this was more so his dismissive view of DeepWiki than a criticism of the project itself or of the author as a programmer. LLMs hallucinate all the time, but there is usually a method to the way they do so. Particularly, for it to just say a repo had a VSCode extension portion with nothing pointing to it would not be typical at all for an LLM like DeepWiki.

        • Phelinofist 2 days ago

          What a plot twist

          • NewsaHackO 2 days ago

            It’s funny, I accidentally put a link to the commit instead of the current repo file because I was investigating whether or not he committed it versus he recently took over the project and didn’t realize the previous owner had started one. But he is the one who actually committed the code. I guess LLMs are so good now that they’re stopping developers from hallucinating about code they themselves wrote.

        • raincole 2 days ago

          Wow. Better advertisement for LLM in three comments than anything OpenAI could come up with.

          • lionkor 2 days ago

            It might be internal, unfinished, a prototype, in testing and not yet for public use. It might exist but do something else.

            This is not an ad for LLMs. If you think this is good, you should probably not ever touch code that humans interact with.

  • rmnclmnt 2 days ago

    I fear the consequences will be even darker:

    - Users are confused by autogenerated docs and don’t even want to try using a project because of it

    - Real curated project documentation is no longer corrected by user feedback (because they never reach it)

    - LLMs are trained on wrong autogenerated documentation: a downward spiral for hallucinations! (Maybe this one could then force users to go look for the official docs? But not sure at this point…)

    • vissi 2 days ago

      > LLMs are trained on wrong autogenerated documentation: a downward spiral for hallucinations! (Maybe this one could then force users go look for the official docs? But not sure at this point…)

      On this, I think, we should have some kind of AI-generated meta-tag, like this: https://github.com/whatwg/html/issues/9479

      • bt1a 2 days ago

        I wonder what incentives for adherence to the use of this meta-tag might exist? For example, imagine I send you my digital resume and it has an AI-generated footer tag on display? Maybe a bad example- I like the idea of this in general, but my mind wanders to the fact that large entities completely ignored the wishes of robots.txt when collecting the internet's text for their training corpuses

        • mrdevlar 2 days ago

          Large entities aside, I would use this to mark my own generated content. Would be even more helpful if you could get the LLM to recognise it which would allow you to prevent ouroboros situations.

          Also, no one is reading your resume anymore and big corps cannot be trusted with any rule as half of them think the next-word-machine is going to create God.

  • onion2k 2 days ago

    I went to the lodash docs and asked about how I'd use the 'pipeline' operator (which doesn't exist) and it correctly pointed out that pipeline isn't a thing, and suggested chain() for normal code and flow() for lodash fp instead. That's pretty much spot on. If I was guessing I'd suggest that the base model has a lot more lodash code examples in the training data, which probably makes a big difference to the quality of the output.
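
    As a rough illustration of what a flow()-style helper does (left-to-right function composition), here is a minimal Python sketch. It is not lodash's implementation: the helper and the sample functions are hypothetical, and unlike lodash's flow it assumes every function takes exactly one argument.

```python
from functools import reduce
from typing import Any, Callable


def flow(*funcs: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Compose single-argument functions left to right (hypothetical helper)."""

    def composed(value: Any) -> Any:
        # Feed the value through each function in the order given.
        return reduce(lambda acc, func: func(acc), funcs, value)

    return composed


def add_one(x: int) -> int:
    return x + 1


def double(x: int) -> int:
    return x * 2


# Equivalent to double(add_one(3)).
add_then_double = flow(add_one, double)
result = add_then_double(3)
```

    The order matters: flow(add_one, double) applies add_one first, which is the reading order a pipeline operator would give.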

    • billyp-rva 2 days ago

      The lack of a pipeline operator in JS (and JS libraries like lodash) has also been discussed online a lot.

      • onion2k 2 days ago

        Exactly the point. If there's a lot of data in the training set the results will be better.

        • billyp-rva 2 days ago

          I guess I'm trying to emphasize the distinction between information in the repo (code) vs. information elsewhere (discussions) that the model looks at.

  • rwmj 2 days ago

    I tried it on a big OCaml project (https://deepwiki.com/libguestfs/virt-v2v) and it seems correct albeit very superficial. It helps that the project is extensively documented and the code well commented, because my feeling is that it's digesting those code comments along with the documentation to produce the diagrams. It seems decent as a starting point to understanding the shape of the project if I'd never seen it before. This is the sort of thing you could do yourself but it might take an hour or more, so having it done for you is a productivity gain.

  • skissane 2 days ago

    > It's so wrong in every section I saw.

    Not talking about this tool, but in general-incorrect LLM-generated documentation can have some value - developer knows they should write some docs, but are staring at a blank screen and not sure what to write so they don’t. Then developer runs an LLM, gets a screenful of LLM-generated docs, notices it is full of mistakes, starts correcting them-suddenly, a screenful of half-decent docs.

    For this to actually work, you need to keep the quantity of generated docs a trickle rather than a flood-too many and the developer’s eyes glaze over and they miss stuff or just can’t be bothered. But a small trickle of errors to correct could actually be a decent motivator to build up better documentation over time.

    • aswegs8 2 days ago

      At some point it will be less wrong (TM) and it'll be helpful. Feels generally like a good bet.

      • Xss3 2 days ago

        Will it though?

        Fundamentally this is an alignment problem.

        There isn't a single AI out there that won't lie to your face, reinterpret your prompt, or just decide to ignore your prompt.

        When they try to write a doc based off code, there is nothing you can do to prevent them from making up a load of nonsense and pretending it is thoroughly validated.

        Do we have any reason to believe alignment will be solved any time soon?

        • aswegs8 5 hours ago

          Why should this be an issue? We are producing more and more correct training data and at some point the quality will be sufficient. To me it's not clear what speaks against this.

  • vultour 2 days ago

    > I hope actual users never see this

    I have bad news for you, this website has been appearing near the top of the search results for some time now. I consciously avoid clicking on it every time.

  • bulbar a day ago

    Please don't correct the AI documentation. Just let those projects die as they deserve.

  • NicoJuicy 2 days ago

    What model did you use?

  • ewoodrich 2 days ago

    > The text sections take implementation details that don't matter and present them to the user like they need to know them.

    Yeah this seems to be a recurring issue on each of the repos I've tried. Some occasionally useful tables or diagrams buried in pages of distracting irrelevant slop.

  • blibble 2 days ago

    they will

    it's the first result on google for just about anything technical I search for

  • bn-l 2 days ago

    This is made by “Devin” I believe.

jasonjmcghee 2 days ago

This gets posted pretty frequently.

231 points | 77 days ago | 53 comments

https://news.ycombinator.com/item?id=45002092

  • cuuupid 2 days ago

    YMMV, my experience with DeepWiki is that it’s decent but the DX of the documentation is horrible and the diagrams are often just incorrect.

    Worth mentioning this is a Cognition / Devin on-ramp and has been posted on HN a few times in just a couple months, feels a little sales-y to me.

    • jorvi 2 days ago

      From the title I assumed it would generate docs to put in the repo.

      But it's docs outside the dev's purview on a deepwiki url, used to shepherd people into Devin. Wow. Talk about slimy.

      • 63stack 2 days ago

        Just another parasitic way of extracting value out of open source

    • oblio 2 days ago

      These comments should be pushed to the top of the pile.

63stack 2 days ago

This is one of those sites I filtered out from my Kagi search results. Too often I stumble onto this when I'm looking for something, and it's never ever useful. There is never a time I want to look at flowcharts when looking for documentation, a solution to an error message I'm facing, or a syntax for something.

  • eloisius a day ago

    Same. This is just the next iteration of all those spam sites back in the 2010s that used to mirror GitHub issues, but wowee it uses AI and you can chat with it! Forever grateful to Kagi for the ability to block sites from my results.

WhyNotHugo 2 days ago

I tried a few different repositories (both my own and various other people’s projects). They all yield the same:

    No repositories found

    No repositories matching "https://git.sr.ht/~whynothugo/ImapGoose" were found.

Probably broken/down right now?

  • frumiousirc 2 days ago

    deepwiki doesn't spider. Repos are indexed upon request. The request dialog accepts a non-github URL.

    • moffkalast 2 days ago

      Now I'm wondering who requested my repo lmao.

  • h4ck_th3_pl4n3t 2 days ago

    Maybe they only support github?

    • Vinnl 2 days ago

      Yeah, I wanted to try it on my (GitLab) repo as well, but it also said "No repositories found". Clicking "Index any public repo" pops up a dialog that says "Search for a GitHub repository" and "or Enter the URL of a public GitHub repository".

      So looks like it's not actually any repository.

      • WhyNotHugo a day ago

        Yeah, the site is pretty ambiguous. It says “any repository”, but they require a GitHub URL.

        That explains why none of the projects which I tried worked.

        I wonder why they’d use a decentralised protocol but then only support a single host.

  • dataviz1000 2 days ago

    I've looked at mine and it takes 10 to 15 minutes to process.

ofalkaed 2 days ago

I am quite impressed, even if it was not completely right and provided a few rather humorous charts, it is close enough assuming you are following along with the code. A great improvement over the alternatives for getting oriented in unfamiliar code, should save me a great deal of time.

The only issues I have with it are that the layout is not great on small screens, poor experience on my 13" laptop, and I really wish you could hide the "Ask Devin" dialog. The experience is pretty good on my tablet though, I would prefer to use the tablet for reading/annotating the code and have deepwiki on the laptop but not that big of a deal.

tylerrecall 2 days ago

Interesting approach. The challenge I keep hitting with AI-generated documentation is that it lacks the persistent context of how the codebase actually evolved - the decisions, the "why we didn't do X" knowledge, the patterns that emerged over time.

I'm working on RecallBricks (memory infrastructure for AI coding tools) and seeing similar problems: AI tools are great at answering questions about code right now, but they don't remember the conversation you had last week about why you chose this architecture over that one.

For documentation specifically, have you thought about combining the AI-generated docs with a memory layer that captures decision history? Like "this API endpoint exists because of issue #247 where users needed X functionality." That context makes docs way more useful than just describing what the code does.

Curious how you're handling the "outdated docs" problem mentioned above - do you have triggers to regenerate when code changes significantly?

shevy-java 2 days ago

How many errors does that contain - does anyone know stats for that?

I see "AI summaries" on github all the time. It's like a wall of text and seems to be designed to be super-verbose but without seemingly being very informative.

  • portaouflop 2 days ago

    It’s very bad. So bad that you need to filter it out of search results; but it’s being pushed hard on HN, I wonder if there is some concerted bot action that influences this

  • sceptic123 2 days ago

    "If I had more time I would have written a shorter letter"

dvt 2 days ago

As always, these kinds of things are good for "simple" stuff (e.g. stuff you don't really need AI for) but totally suck for "complicated" or "weird" things. For example, I curiously ran it on one of my OSS projects: https://github.com/dvx/lofi

It's a cute little Electron-based mini Spotify player that gets maybe like 200 users a day and has 1.3k stars on GitHub. Code quality is pretty high and it's more or less "feature-complete." There's a lot of simple/typical React stuff in there, but there's also some weird stuff I had to do. For example, native volume capture is weird. But even weirder is having to mess with the Electron internal window boundaries (so people can move their Lofi window wherever they want to).

We're essentially suppressing window rect constraints using some funky ObjectiveC black magic[1]. The code isn't complicated[1], but it's weird and probably very specific to this use case. When I ask what "constraints" does, DeepWiki totally breaks, telling me it doesn't even have access to those source files[2] (which it does).

Visualizations were also actually disabled on MacOS a few versions ago (because of the janky way you need to hook into the audio driver), but, again DeepWiki doesn't really notice[3]. There have been issues/patch notes about this, so I feel those should be getting crawled.

[1] https://github.com/dvx/lofi/blob/master/src/native/black-mag...

[2] https://deepwiki.com/search/what-is-constraints_cc5c0478-e45...

[3] https://deepwiki.com/search/how-do-macos-visualizations-wo_d...

afro88 2 days ago

Do we need this, when we have tools like Claude Code, Codex etc that you can talk to about the codebase they are started in?

  • ijustlurk 2 days ago

    Agreed, nice idea in theory. But as a codebase owner I’d rather build tailored markdown files with a CLI agent to publish as my docs. And as a codebase consumer I probably only care about a codebase if I’m modifying or running it, which means a CLI agent makes the most sense and I can ask questions/generate .md files as we go.

  • anuramat 2 days ago

    > codebase they are started in

    what about the dependencies? you could just clone them as well (which is what I do occasionally), but deepwiki is faster (for indexed repos) and free

aDyslecticCrow 2 days ago

This is a nice idea in theory. But you need excellent docs in the first place for it to work.

And if a human spent painstaking effort writing excellent docs, the least bit of respect I can give them is to read it.

  • andybak 2 days ago

    > But you need excellent docs in the first place for it to work.

    Are you sure? I just tried it on projects of mine that have almost zero documentation and it did a fairly good job.

    • aDyslecticCrow 2 days ago

      Really? How large is your project?

      There is a very clear point in codebase size where LLMs tend to falter without very clear written down overview descriptions of the system structure. I have a hard time seeing that this system would be immune to that.

      I have encountered LLMs seemingly knowing more about a system than they should because there are many similar ones in their training set; but that just led me to be extra sceptical when they pull up functions that don't exist. (I've fought LLMs about json libraries quite a bit)

cyberax 2 days ago

I insta-banned this site in Kagi. The trigger for me: utter disrespect for the user with unhideable glassy floating chatbox at the bottom of the page.

And WTF with these floating boxes popping up everywhere?!? They are tailor-made to trigger anxiety in people with OCD. They look like a notification that keeps grabbing your attention as you scroll the text. Example: https://aws.amazon.com/blogs/aws/secure-eks-clusters-with-th...

CyberShadow 2 days ago

Looks like it's impossible for me to use this service - when I try to submit the form, I get a reCAPTCHA challenge. By the time I complete it (Google requires me to make several attempts, each one being several pages), the page errors out in the background with "reCAPTCHA execution timeout".

  • lionkor 2 days ago

    Try solving it slowly, some captchas love that.

theletterf 2 days ago

You need great pre-existing docs for something like this to work properly.

AI must RTFM. https://passo.uno/from-tech-writers-to-ai-context-curators/

  • alansammarone 2 days ago

    It certainly helps, but in my experience you get 60-80% of the benefit just with code (except in legacy or otherwise terrible code, for example with misleading/outdated comments everywhere, bad variable/function names, etc - in that case more like 40%).

dkersten 2 days ago

I don’t want to talk to my documentation. I just want the facts searchable and easily readable.

  • input_sh 2 days ago

    I agree wholeheartedly, at best I want a "smarter" search bar where I don't have to guess the exact wording of what I'm looking for, but the reply should still be a verbatim quote from the docs, not something regurgitated to be less accurate.
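
    A minimal Python sketch of that kind of search, under stated assumptions: it ranks passages by how many query words they share (ignoring word order and case) and returns the matching passages verbatim, never a paraphrase. The function names and the sample passages are hypothetical, and a real "smarter" search bar would likely add stemming, synonyms, or embeddings on top of this.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search_verbatim(query: str, passages: list[str], limit: int = 3) -> list[str]:
    """Return up to `limit` passages, unmodified, ranked by shared words."""
    query_tokens = tokenize(query)
    scored = []
    for index, passage in enumerate(passages):
        overlap = len(query_tokens & tokenize(passage))
        if overlap > 0:
            # Negative overlap sorts best matches first; index breaks ties.
            scored.append((-overlap, index, passage))
    scored.sort()
    return [passage for _, _, passage in scored[:limit]]


# Hypothetical documentation passages, used only as sample data.
docs = [
    "Use chain() to wrap a value and call methods in sequence.",
    "Install the package with your package manager of choice.",
    "Use flow() to compose functions from left to right.",
]
hits = search_verbatim("how to compose functions", docs)
```

    Because the results are slices of the original list, every hit is a verbatim quote from the docs by construction.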

rckt 2 days ago

This doesn't work. It's better to prompt an agent with specific questions per subject. Having this general AI interpretation of a doc can be amazingly misleading. Nice idea, but unfortunately absolutely useless and even time wasting at the moment.

juliangmp 2 days ago

I mean no offense to the people that created this, but this has been a domain I blocked in duckduckgo's search results for a while now.

I really don't like how AI summaries creep up in SEO rankings and make it harder for me to find the actual, official documentation.

df0b9f169d54 2 days ago

I wanted to try the tool with a repo I know. After a few attempts to select cars, buses, crosswalks, I got "capchat timeout error".

alansammarone 2 days ago

This is an interesting thread. There are many instances of "this is bad, doesn't work, don't like it", and many instances of "it works reasonably well here, look: <url>".

Seems like a consistent pattern.

  • portaouflop 2 days ago

    It’s a propaganda and psyop operation on HN if you ask me. This stuff is laughably bad and I wonder who would actually use it for real work beyond a “huh this is cool” at first glance.

    HN is super susceptible to propaganda in the AI age unfortunately; I think at this point a lot of the comments and posts on here are from bots as well

  • internet_points 2 days ago

    There was some article here on how llm's are like gambling, in that sometimes you get great payouts and oftentimes not, and as psych 101 taught us, that kind of intermittent reward is addictive.

    • alansammarone 2 days ago

      Interesting point, never thought of it like that, and I think there is some truth to that view. On the other hand, IIRC, this works best in instances where it's pure chance (you have no control over the likelihood of reward) and the probability is within some range (optimal is not 50%, I think, could be wrong).

      I don't think either of these is true of LLMs. You obviously can improve its results with the right prompt + context + model choice, to a pretty large degree. The probability...hard to quantify, so I won't try. Let's just say that you wouldn't say you are addicted to your car because you have a 1% chance of being stuck in the middle of nowhere if it breaks down and 99% chance of a reward. The threshold I'm not sure.

Ultimatt 2 days ago

This worked well for me for some things I've recently been learning/working on. One improvement I'd add: the citations of where information has come from aren't hyperlinks; it would be very useful if they were!

1317 2 days ago

I tried it with my repo, it was impressive at first

but then as i kept going along it just got tiring, it kept calling everything sophisticated even when it wasn't

it's the same as all the other AI slop, it's really impressive the first time you see it

and then you keep seeing it and get tired of its patterns of speech etc and oh it's just making up nonsense

and now the ai slop "documentation" is up on the public internet for all to see with no way for me to remove it :)

vijaybritto 2 days ago

The generated diagrams are arbitrary and make no sense. This needs improvement.

typpilol 2 days ago

I find it's better than context7, but that's not saying much

  • bn-l 2 days ago

    Context7 uses the real documentation if I’m not mistaken and just provides you a RAG MCP

    • typpilol 2 days ago

      I don't think so, as it generates stuff even for projects without any documentation

ramon156 2 days ago

I saw this idea before Claude Code, Gemini CLI, etc. were a thing. It's not relevant anymore (unless you surpass these tools).

Cool idea, bad timing

  • alansammarone 2 days ago

    I don't know the specifics of this particular tool, I assume it's at most using a couple of passes of (some frontier model with specific system prompt + custom tools, for example code-specific rag + some form of "summarize"). By at most I mean "probably isn't doing anything crazier than that".

    But it seems to be producing docs that are better than I tend to see with basic "summarize this repo for me"-style prompts, which is what I usually use on a first pass.

virajk_31 2 days ago

Is the documentation generated using LLMs? Anyway this would only work if the documentation is truly top notch and completely accurate

bittermandel 2 days ago

I use this heavily to navigate the neondatabase/neon repo and it has been invaluable

twp 2 days ago

deepwiki.com is untrustworthy AI slop. A true cancer.

deepwiki.com's generated page on my project contains several glaring errors. I hate to think of the extra support burden I will have to bear because of deepwiki.com publishing wrong information.

I asked the authors of the site (Andrew Gao) to remove their page on my project, but they ignored my request.

killerstorm 2 days ago

Tangentially related: AI assistants (ChatGPT, Claude and Gemini) are unable to get public code from GitHub. (I.e. specifically assistants you use via web site, not Codex, Claude Code, etc.)

Again, since this might be hard to believe and the situation is rather insane: flagship AI assistants cannot get publicly available code. They can get bits and pieces from the README, but that might degrade response quality, as the answer is then often based on guesswork.

Example via GPT-5 Thinking, my request:

```
Can you read code from https://github.com/killerstorm/auto-ml-runner/blob/master/ru... ?

If yes, show me some port of code and how you got it.
```

(That's a normal URL a user can access from the browser; you get the same result if you post the top-level repo URL https://github.com/killerstorm/auto-ml-runner/.)

Thinking: 2 minutes. (IT WAS THINKING FOR TWO MINUTES JUST TO ACCESS ONE FILE!)

```
Short answer: yes...

Why I’m not pasting a snippet right this second:

In this environment, GitHub’s code pages are loading their chrome but not returning the file body
```

So, actually, no, it cannot read it, but it believes it can. That's rather problematic.

Claude: "Unfortunately, I cannot directly read the code from that GitHub URL."

Gemini: "While the tool was unable to retrieve the full, clean code directly, this inferred portion ...". I.e. it just imagined the code. The snippet has nothing to do with the code in the repo.

This is a rather fucktacular situation as agents are not sure if they read the code, and they might hallucinate subtly wrong code trying to be helpful.

Since I can fetch this via curl, it seems like GitHub is deliberately blocking AI agents, including those of their partner OpenAI.
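For anyone who wants to reproduce the "plain HTTP client can read it" half of this, here is a minimal sketch in Python. It assumes the file is fetched through `raw.githubusercontent.com`, which serves the file body without the page chrome; the comment above doesn't say which URL was passed to curl, so that choice is an assumption. The function names and the example owner/repo/path are hypothetical, and the URL rewrite does not handle branch names that contain slashes.

```python
from urllib.parse import urlsplit
from urllib.request import urlopen


def blob_to_raw_url(blob_url: str) -> str:
    """Map a github.com "blob" page URL to its raw.githubusercontent.com counterpart."""
    parts = urlsplit(blob_url)
    segments = parts.path.strip("/").split("/")
    # Expected path shape: owner / repo / "blob" / ref / path...
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        raise ValueError(f"not a GitHub blob URL: {blob_url}")
    owner, repo, _, ref, *path = segments
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{ref}/{'/'.join(path)}"


def fetch_text(url: str, timeout: float = 10.0) -> str:
    """Download the file body as text (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Hypothetical example URL, not the truncated one from the comment.
    raw = blob_to_raw_url("https://github.com/example-owner/example-repo/blob/main/src/app.py")
    print(raw)
```

Calling `fetch_text(raw)` on a real public file returns its contents, which is the step the assistants in the examples above failed to complete.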

marginalia_nu 2 days ago

So I gave it a spin on two of my repos.

One is the extremely sprawling MarginaliaSearch repo[M1].

Here it did a decent job of capturing the architecture, though, to be fair, it is well documented in the repo itself. It successfully identifies the most important components, which is also good.

But when describing the components, it only really succeeds where the components themselves are very self-contained and easy to grok. It did a decent job with e.g. the buffer pool[M2], but even then fails to define some concepts that would have made it easier to follow, e.g. what is a pin count in buffer management? This is standard terminology and something the model should know.
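For readers who haven't met the term: in the usual textbook sense, a pin count is a per-page counter of how many callers are currently using a cached page, and a page may only be evicted while that counter is zero. The sketch below is a generic Python illustration of that idea; the class and method names are hypothetical and it is not taken from the MarginaliaSearch code.

```python
class BufferPool:
    """Toy buffer pool: pages with a non-zero pin count cannot be evicted."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = {}       # page_id -> page contents
        self.pin_counts = {}  # page_id -> number of callers currently using the page

    def pin(self, page_id, load):
        """Return the page, loading it via load(page_id) if absent, and bump its pin count."""
        if page_id not in self.pages:
            if len(self.pages) >= self.capacity:
                self._evict_one()
            self.pages[page_id] = load(page_id)
            self.pin_counts[page_id] = 0
        self.pin_counts[page_id] += 1
        return self.pages[page_id]

    def unpin(self, page_id):
        """Signal that one caller is done with the page."""
        if self.pin_counts.get(page_id, 0) == 0:
            raise RuntimeError(f"page {page_id!r} is not pinned")
        self.pin_counts[page_id] -= 1

    def _evict_one(self):
        """Drop the first unpinned page; fail if every page is still in use."""
        for page_id, count in self.pin_counts.items():
            if count == 0:
                del self.pages[page_id]
                del self.pin_counts[page_id]
                return
        raise RuntimeError("all pages are pinned; nothing can be evicted")
```

A real pool would add a replacement policy (LRU, clock, and so on) and thread safety; the sketch only shows why the counter exists.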

I get the impression it lifts a lot of its facts from the comments and documentation that already exist, which may lead it to propagate outdated falsehoods about the code.

[M1] https://deepwiki.com/MarginaliaSearch/MarginaliaSearch

[M2] https://deepwiki.com/MarginaliaSearch/MarginaliaSearch/5.2-b...

The other is the SlopData[S1] repo, which contains a small library for columnar data serialization.

This one I wasn't very impressed with. It produced more documentation than was necessary, mostly amending what was already there with incorrect statements it seems to have pulled out of its posterior[S2][S3].

The library is very low-abstraction, and there simply isn't a lot of architecture to diagram, but the model seems to insist that there must be a lot of architecture and then produces excessive diagrams as a result.

[S1] https://deepwiki.com/MarginaliaSearch/SlopData

[S2] https://deepwiki.com/MarginaliaSearch/SlopData#storage-types (performance numbers are completely invented, in practice reading compressed data is typically faster than plain data)

[S3] https://deepwiki.com/MarginaliaSearch/SlopData/6.3-zip-packa... (the overview section is false, all these tables are immutable).

So overall it gives me a bit of a broken clock vibe. When it's right, it's great. When it isn't, it's not very useful. Good at the stuff that is already easy, borderline useless for the stuff that isn't.

voodooEntity 2 days ago

So i just tried this on 2 repositories.

1. On (https://github.com/voodooEntity/gits) -> https://deepwiki.com/voodooEntity/gits

This is a long-term Go project i work on, and it already has very detailed documentation.

While going through the AI docs of deepwiki, i could see how it profited from my existing documentation; most of it is the same content in different words. What i liked were the visualisations (even if some of them are, well, "special"): they show some insights into workflows that i have in my mind and that might benefit others who aren't the author.

While trying out the search/chat, i have to admit it gave better answers than i expected.

Since i have a thorough knowledge of how to do stuff efficiently with the lib, i tested the chat by asking it for the most efficient way to achieve XYZ. While it listed all the possibilities (all of them correct), it also correctly pointed out which is the most "efficient" way.

I also gave it a question that i know from experience confuses people when they first try the lib, and it answered it correctly.

All in all, a pleasant result.

2. On (github.com/electronicarts/CnC_Renegade/) -> https://deepwiki.com/electronicarts/CnC_Renegade/

For those who don't know, CnC Renegade is a very old game (~2000) that was coded by the original Westwood. It's mainly C++ (some C) and plain code through and through. There is no real documentation in the repo other than some basic info on dependencies etc.

First of all, i saw that the resulting documentation, well... lacked documentation, i guess? It just explained, across multiple pages, what's in the main README (which is not really a lot). So from the "docs generating" perspective, no gain here.

Then i tried to chat with it about the code, and it seemed to have a basic understanding of it. For me it's harder to validate the results (tbh i only read over the code once when it was released, out of curiosity), but it seemed like it was no total loss.

Conclusion: it seems to me that, to get very good basic documentation out of it, the repo must already have good basic documentation. Apart from the graphics it added, i didn't really see a gain compared to the already existing documentation.

Based on the chat results, i'd say those might be decent and helpful if you are digging into a new codebase, especially a more complex one, and searching for a specific thing in 1000s of LOC across multiple files.

Would i use it in the future? I'll maybe try, but only the chat feature; for the generated docs, as elaborated, i don't see any use.

esafak 2 days ago

It works! I love using it for open source repos.