Show HN: Vaev – A browser engine built from scratch (It renders google.com)

github.com

134 points by monax 8 hours ago

We’ve been working on Vaev, a minimal web browser engine built from scratch. It supports HTML/XHTML, the CSS cascade, @page rules for pagination, and print-to-PDF rendering. It even handles calc(), var(), and percentage units—and yes, it renders Google.com (mostly).

This is an experimental project focused on learning and exploration. Networking is basic (http:// and file:// only), and grid layouts aren’t supported yet, but we’re making progress fast.

We’d love your thoughts and feedback.

khimaros 6 hours ago

i find myself requesting this whenever i see a new minimalist browser pop up:

it would be great to standardize alternative browsers on a consistent subset of web standards and document them so that "smolweb" enthusiasts can target that when building their websites and alternative browser makers can target something useful without boiling the ocean

i personally prefer this approach to brand new protocols like Gemini, because it retains backward compatibility with popular browsers while offering an off ramp.

  • idle_zealot 4 hours ago

    > standardize alternative browsers on a consistent subset of web standards and document them so that "smolweb" enthusiasts can target that

    Could such a standard be based on the subset of HTML/CSS acceptable in emails? Maybe with a few extra things for interactivity.

  • userbinator 5 hours ago

    The subset could just be an older version of the spec, e.g. HTML 4.01 and CSS 2.1.

    (My opinion as another one who has been slowly working on my own browser engine.)

    • Inviz 3 minutes ago

      Cat is out of the bag. Nobody wants their CSS without flexbox anymore. It has to include that.

    • robocat 4 hours ago

      Pick a subset aimed directly at accessibility.

      The least-needed features are often accessibility nightmares (e.g. animation - although usually not semantic).

      The accessible subset could then be government standardized and used as a legal hammer against over-complex HTML sites.

      For a while search engines helped because they encouraged sites to focus on being more informative (html DOCUMENTS).

      I think web applications are a huge improvement over Windows applications, however dynamic HTML is a nightmare. Old school forms were usable.

      (edited to improve) Disclosure: wrote a js framework and SPA mid 00's (so I've been more on the problem side than the solution side).

    • poisonborz 4 hours ago

      That's easy to specify but contains a lot of bloat and unused features. A slimmer but more modern set would be useful.

    • ghayes 4 hours ago

      I feel like some of the newer standards like CSS Grid instead of tables might be the best way to go. Many HTML/CSS improvements were not just bloat but actually better standards to build on.

      • edoceo 4 hours ago

        Right! Crazy fonts or cursors, not on smolweb (as another user put it) but Flex and Grid are almost necessary. There are loads of things that could be dropped (it feels like).

        I just want one of these browsers to give me a proper ComboBox (text, search and drop-down thing)

      • userbinator 3 hours ago

        You still need to have tables.

        • dmd 3 hours ago

          And <marquee>, of course.

    • stevage 3 hours ago

      But older versions contain lots of crap we don't need (eg <blink> tags) and miss out on useful stuff (grid layout).

    • 5- 4 hours ago

      > slowly working on my own browser engine

      care to tell us more?

  • graypegg 5 hours ago

    I think that would be really neat for small scale web publishing, but making it a subset of browser standards could be a really difficult sell to the people making browsers. While it's easier to build a browser to a subset of such a massive set of specs, the subset will drift towards a "similar but slightly incompatible standard" pretty soon after it's decided on. Following the development of Ladybird has given me an appreciation for just how often the "spec" for the web changes. (in small ways, daily.) That locks new browser implementations into a diverging standards track that would be very difficult to get off of.

    I think something like a reference implementation (Ladybird, Servo or even Vaev maybe?) getting picked up as the small-web living standard feels like the best bet for me since that still lets browser projects get the big-time funding for making the big-web work in their browser too. "It's got to look good in Ladybird/Vaev/etc".

    An idea: a web authoring tool built around libweb from Ladybird! (Or any other new web implementation that's easily embeddable) The implied standard-ness of whatever goes in that slot would just come for free. (Given enough people are using it!)

    • userbinator 5 hours ago

      > small-web living standard

      The phrase "living standard" is an oxymoron, invented by the incumbents who want to pretend they're supporting standards while weaponising constant change to keep themselves incumbent.

    • shiomiru 4 hours ago

      > I think something like a reference implementation (Ladybird, Servo or even Vaev maybe?) getting picked up as the small-web living standard feels like the best bet for me since that still lets browser projects get the big-time funding for making the big-web work in their browser too.

      A "standard" should mean there is a clear goal to work towards for authors and browser vendors. For example, if a browser implements CSS 2.1 (the last sanely defined CSS version), its vendor can say "we support CSS 2.1", authors who care enough can check their CSS using a validator, and users can report if a CSS 2.1 feature is implemented incorrectly.

      With a living standard (e.g. HTML5), all you get is a closed circle of implementations which must add a feature before it is specified. Restricting the number of implementations to one and omitting the descriptive prose sounds even worse than the status quo.
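      To illustrate the kind of check a fixed target makes possible, here is a rough Python sketch of a linter that flags declarations outside an agreed subset. The allowlist is a hypothetical handful of CSS 2.1 properties, not the full spec, and the regex-based parsing ignores nested at-rules, so treat it as an illustration rather than a validator.

```python
import re
from typing import List

# Hypothetical allowlist: a small sample of CSS 2.1 properties, not the full set.
ALLOWED_PROPERTIES = {
    "color", "background-color", "margin", "padding", "border",
    "font-family", "font-size", "display", "float", "width", "height",
}

COMMENT_RE = re.compile(r"/\*.*?\*/", re.DOTALL)  # CSS comments
BLOCK_RE = re.compile(r"\{([^{}]*)\}")            # innermost declaration blocks


def unsupported_properties(css: str) -> List[str]:
    """Return property names used in `css` that are not in the allowlist.

    Names are reported once each, in order of first appearance.
    """
    css = COMMENT_RE.sub("", css)
    found: List[str] = []
    for block in BLOCK_RE.findall(css):
        for declaration in block.split(";"):
            # Split on the first colon only, so values containing ':' are fine.
            name, sep, _value = declaration.partition(":")
            name = name.strip().lower()
            if sep and name and name not in ALLOWED_PROPERTIES and name not in found:
                found.append(name)
    return found


if __name__ == "__main__":
    sample = "body { color: red; gap: 4px } .a { grid-template-columns: 1fr 1fr }"
    for prop in unsupported_properties(sample):
        print("outside the subset:", prop)
```

      A real validator would parse the stylesheet properly and check values too; the point here is only that a frozen property list gives authors something mechanical to test against.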

DarkmSparks 2 hours ago

I know it's a tangent, but the idea of ripping out Android WebView into a standalone cross-platform project in its own right pops into my head every time this problem arises. Keep meaning to check if anyone's actually done it already.

  • flexagoon 2 hours ago

    What do you mean by that? WebView is just Chrome embedded inside of an Android app. Same thing already exists on Windows (Edge WebView2), macOS (WKWebView) and Linux (WebKitGTK). There's also a library that wraps all of them into a single interface:

    https://github.com/webview/webview

    The entire point of WebView is that it's a browser embedded inside of a different application, how do you expect it to be a "standalone project"?

abhisek 8 hours ago

What’s the long term goal of this project beyond learning? Building a browser to support the modern web is a humongous amount of work IMHO.

  • monax 7 hours ago

    The main goal is great support for static document rendering, as it's being used at the core of the paper-muncher [1] PDF rendering engine, meant to replace wkhtmltopdf at Odoo. But we don't exclude general web browsing and JavaScript support at some point.

    [1] https://github.com/odoo/paper-muncher

    • dmkolobov 7 hours ago

      Ooh blast from the past!

      At a previous company we moved off of wkhtmltopdf to a nodejs service which received static html and rendered it to pdf using phantomjs. These days you probably use puppeteer.

      The trick was keeping the page context open to avoid chrome startup costs and recreating `page`. The node service would initialize a page object once with a script inside which would communicate with the server via a named Linux pipe. Then, for each request:

      1. node service sends the static html to the page over the pipe

      2. the page script receives the html from the pipe, inserts it into the DOM, and sends an “ack” back over the pipe

      3. the node service receives the “ack” and calls the pdf rendering method on the page.

      I don’t remember why we chose the pipe method: I’m sure there’s a better way to pass data to headless contexts these days.

      The whole thing was super fast (~20ms) compared to WK, which took at least 30 seconds for us, and would more often than not just time out.
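      A minimal Python sketch of the request flow in the three steps above, assuming a worker thread as a stand-in for the long-lived headless page and two in-process queues in place of the named pipe. `WarmRenderer`, its methods, and the fake PDF bytes are all hypothetical; no real browser or PDF library is involved.

```python
import queue
import threading
import time


class WarmRenderer:
    """Stand-in for a headless page that is started once and then reused."""

    def __init__(self, startup_seconds: float = 0.05) -> None:
        self._requests: queue.Queue = queue.Queue()  # plays the role of the pipe to the page
        self._acks: queue.Queue = queue.Queue()      # acks travelling back to the service
        self._startup_seconds = startup_seconds
        self._document = ""
        self.startups = 0
        self._thread = threading.Thread(target=self._page_loop, daemon=True)
        self._thread.start()

    def _page_loop(self) -> None:
        # Simulated one-time startup cost (launching the browser, creating the page).
        time.sleep(self._startup_seconds)
        self.startups += 1
        while True:
            html = self._requests.get()
            if html is None:           # shutdown signal
                break
            self._document = html      # step 2: "insert the HTML into the DOM"
            self._acks.put("ack")      # step 2: acknowledge

    def render(self, html: str) -> bytes:
        """Send HTML to the warm page, wait for the ack, then 'print' it.

        Intended for one caller at a time; concurrent callers would need a lock.
        """
        self._requests.put(html)             # step 1: send the HTML
        ack = self._acks.get(timeout=5)      # step 3: wait for the ack
        if ack != "ack":
            raise RuntimeError("unexpected reply from page")
        # Placeholder for the real PDF call; these bytes are not a valid PDF.
        return b"%PDF-FAKE\n" + self._document.encode("utf-8")

    def close(self) -> None:
        self._requests.put(None)
        self._thread.join()


if __name__ == "__main__":
    renderer = WarmRenderer()
    for page in ("<p>one</p>", "<p>two</p>"):
        print(len(renderer.render(page)), "bytes")
    print("startups:", renderer.startups)
    renderer.close()
```

      The only design point is that the startup cost is paid once by the worker, and each `render` call after that is a send plus a wait for the ack.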

      • sshine 5 hours ago

        Sounds like fun considering how real the problem is.

        • dmkolobov 3 hours ago

          It was!

          I remember the afternoon I had the idea: it was beer Friday, and it took a few hours to write up a basic prototype that rendered a PDF in a few hundred milliseconds. That was the first time I’d written a 100x speed improvement. Felt like a real rush.

    • giovannibonetti 3 hours ago

      At work we recently switched from Wkhtmltopdf to Typst, which is a breath of fresh air. It is very fast and generates PDFs from scratch without needing to involve HTML or a browser engine. It is implemented in Rust and distributed as a self-contained binary.

      This blog post convinced us that the switch was worth it: https://zerodha.tech/blog/1-5-million-pdfs-in-25-minutes/

      • stevage 2 hours ago

        Oh interesting. I use their "old stack" for a couple of much smaller projects and it works fine, but it does seem a bit ridiculous to be starting up a whole chrome instance just to convert one file format to another.

    • Teever 4 hours ago

      So cool to see Odoo mentioned on HN. I've worked with it before and like it a lot.

      I've made posts about it on HN before but they've never gained traction. I hope that this takes off.

      You guys make neat software.

  • pierrelf 7 hours ago

    Looks like Skift is a hobby OS like SerenityOS, which Ladybird was spun out of. Maybe they intend to follow the same path?

    • monax 7 hours ago

      I intend to keep Skift and Vaev together for as long as possible since everything is meant to be cross-platform. I don’t see any architectural conflict that would motivate such a change.

danpalmer an hour ago

I'm interested in why C++ was chosen for this. Browsers are notoriously hard to secure; they're effectively meant to be RCE vulnerabilities! Securing C++ binaries is hard, and in recent years numerous organisations and companies have called out C++ as the root cause of many classes of security vulnerability. With languages like (but not limited to) Rust, we now have better options.

  • userbinator 16 minutes ago

    More people are going to keep doing it just to spite the Rust trolls like you.

  • landr0id an hour ago

    I had the same thought. The project's description:

    > secure HTML/CSS engine

    No offense to these folks, but I see no evidence of any fuzzing which makes it hard to believe there aren't some exploitable bugs in the codebase. Google has world-class browser devs and tooling, yet they still write exploitable bugs :p (and sorry Apple / Mozilla, you guys have world-class browser devs but I don't know enough about your tooling. Microsoft was purposefully omitted)

    Yeah, very few of those bugs are in the renderer, but they still happen!

quibono 4 hours ago

Are you open to contributions? I would love if there was a non-chromium alternative to wkhtmltopdf!

  • flexagoon 2 hours ago

    wkhtmltopdf is not chromium though? "Wk" literally stands for WebKit.

    There's also https://weasyprint.org/ which doesn't use any browser engine, but rather a custom renderer.

    And both of those (and Prince) can be used as a backend by Pandoc (https://pandoc.org/)

  • 5- 4 hours ago
    • edoceo 4 hours ago

      Yea, Prince is awesome. Not FOSS tho. I make some GPL or MIT licensed software and wish there was something as good as Prince with a more open license.

est an hour ago

oh the irony. I remember decades ago when google.com was held up as an example of minimal HTML design: to save as much bandwidth as possible, they didn't even close their HTML tags.

Now google.com is loads of JS crap. The SERP refuses to render without full-blown JS, CSS and cookies.

busymom0 7 hours ago

I wish one of these projects would make a browser which only renders text (so texts and links) and no additional support for media (images, videos, audio etc).

I know there is Lynx but having a non-terminal based browser which could do it would be cool.
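For what it's worth, the "text and links only" part is easy to prototype. Below is a rough Python sketch using the standard library's `html.parser`, assuming it is enough to drop script, style and media elements and collect the remaining text plus link targets. `TextAndLinks` and `render_text` are made-up names, and this does no layout, networking or CSS at all.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class TextAndLinks(HTMLParser):
    """Collect visible text and (link text, href) pairs; ignore media and scripts."""

    # Elements whose contents are dropped entirely (an illustrative choice).
    SKIPPED = {"script", "style", "video", "audio", "svg"}

    def __init__(self) -> None:
        super().__init__()
        self.parts: List[str] = []
        self.links: List[Tuple[str, str]] = []
        self._skip_depth = 0
        self._href: Optional[str] = None   # href of the <a> currently open, if any
        self._link_text: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "a":
            self._href = dict(attrs).get("href")
            self._link_text = []
        # <img> and other tags carry no text, so they are simply ignored.

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1
        elif tag == "a" and self._href is not None:
            self.links.append((" ".join(self._link_text), self._href))
            self._href = None

    def handle_data(self, data):
        text = " ".join(data.split())      # collapse whitespace
        if not text or self._skip_depth:
            return
        self.parts.append(text)
        if self._href is not None:
            self._link_text.append(text)


def render_text(html: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Return the page's text and its links, with media dropped."""
    parser = TextAndLinks()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts), parser.links


if __name__ == "__main__":
    page = '<h1>Hi</h1><img src="x.png"><p>Read <a href="/docs">the docs</a>.</p>'
    text, links = render_text(page)
    print(text)
    print(links)
```

A graphical front end would still need text layout and link hit-testing on top of this; the sketch only covers the filtering step.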

  • Telemakhos 5 hours ago

    You might be interested in Richard Stallman's method of browsing the web:

    > I generally do not connect to web sites from my own machine, aside from a few sites I have some special relationship with. I usually fetch web pages from other sites by sending mail to a program (see https://git.savannah.gnu.org/git/womb/hacks.git) that fetches them, much like wget, and then mails them back to me. Then I look at them using a web browser, unless it is easy to see the text in the HTML page directly. I usually try lynx first, then a graphical browser if the page needs it. [0]

    I know you wanted something other than lynx, but you could do this with EWW (the Emacs web browser) or any graphical browser, provided that your proxy wget dropped the images.

    [0] https://www.stallman.org/stallman-computing.html

    • stevage 2 hours ago

      Wow. At a certain level, I'm glad that people so peculiar exist.

  • tos1 7 hours ago

    Something like Dillo? (You can disable image rendering in Dillo).

  • monax 7 hours ago

    For distraction-free reading?

  • dymk 7 hours ago

    Something like Reader View in safari / firefox?

  • revskill 6 hours ago

    Then google will use text to show ads.

    • busymom0 6 hours ago

      Text based ads would be less distracting.

      • jdironman 40 minutes ago

        Not if they're inline and you've read an ad before you realized it. Like a YouTube sponsor segment. lol

        • busymom0 29 minutes ago

          Fair. However, text based ads are also very easy to filter out using some sort of extension, no?

    • II2II 5 hours ago

      Remembering when Google only served text based ads.

mdaniel 6 hours ago

[flagged]

  • n2d4 6 hours ago

    The fact that other browsers are huge engineering efforts only makes it more interesting to many. It's arguably one of the hardest things a programmer could build, how could you not wanna build one!

    • hawk_ 5 hours ago

      Yes but why do it in C++? There is no compiler enforced safe mode and you're by definition implementing an engine to run hostile code in it.

      • npalli 4 hours ago

        Since 2012, all future browsers will be written in Rust, and it looks like that will always be the case. Perhaps programming a browser in Rust is a painful activity that nobody seems to have managed to complete (people have been writing parts of one since the Servo days). Talking about safety though, nonstop, yeah no shortage of that.

        • umanwizard 2 hours ago

          FWIW nobody has written a new complete browser engine in any language since then, not just Rust.

      • userbinator 5 hours ago

        I personally have had enough of the "security" bullshit after seeing what it's done to "secure" control over the population and put that in the hands of the enemy.