• stetech@lemmy.world
    23 hours ago

    Honest question, since I have no clue about web/browser engines other than being able to maybe name 4-5 of them (Ladybird, Servo, Webkit, Gecko, … shit, what was Chromium’s called again?):

    What makes browsers/browser engines so difficult that they need millions upon millions of LOC?

    Naively thinking, it’s “just” XML + CSS + JS, right? (Edit: and then the networking stack/hyperlinks)

    So what am I missing? (Since I’m obviously either forgetting something and/or underestimating how difficult engines for the aforementioned three are to build…)

    • qqq@lemmy.world
      22 hours ago

      JavaScript alone is not a simple beast. It needs to be optimized to deal with modern JavaScript web apps, so it needs a JIT. It also needs sandboxing, plus all of the standard web APIs it has to implement. All of this also needs to be robust. Browsers ingest the majority of what people see on the Internet and they have to handle every single edge case gracefully. Robust software is actually incredibly difficult, and good error handling often adds a lot more code complexity. Security in a browser is also not easy: you’re parsing a bunch of different untrusted HTML, CSS, and JavaScript. You’re also executing untrusted code.

      Then there is the monster that is CSS and layout. I can’t imagine being the people that have to write code dealing with that; it’d drive me crazy.

      Then there are all of the image formats, HTML5 canvases, videos, PDFs, etc. These all have to be parsed safely and displayed correctly as well.

      There is also the entire HTTP spec that I didn’t even think to bring up. Yikes, that is a monster too: you have to support all versions. Then there is all of that networking state and TLS + PKI.

      There is likely so much that I’m still leaving out, like how all of this will also be cross platform and sometimes even cross architecture.
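To make the "support all versions" and "handle every edge case" points a bit more concrete, here is a minimal Python sketch of just one tiny piece: reading the first line of an HTTP response. The names `StatusLine` and `parse_status_line` are made up for illustration, and the fallback policy (treat anything without a status line as an HTTP/0.9 body) is an assumption, not what any particular browser does. HTTP/2 and HTTP/3 use binary framing and are not covered at all.

```python
import re
from typing import NamedTuple


class StatusLine(NamedTuple):
    """Hypothetical container for a parsed HTTP/1.x status line."""
    major: int
    minor: int
    status: int
    reason: str


# "HTTP/<digit>.<digit> <3-digit code>[ <reason phrase>]"
_STATUS_RE = re.compile(rb"HTTP/(\d)\.(\d) (\d{3})(?: ([^\r\n]*))?")


def parse_status_line(raw: bytes) -> StatusLine:
    """Parse the first line of a response from an untrusted server.

    Assumption for illustration: anything that does not look like a
    status line is treated as an HTTP/0.9 response, which has no status
    line or headers, only a body.
    """
    line = raw.rstrip(b"\r\n")
    match = _STATUS_RE.fullmatch(line)
    if match is None:
        # No recognizable status line: fall back instead of crashing.
        return StatusLine(0, 9, 200, "")
    major, minor, status, reason = match.groups()
    # The reason phrase is optional and may be missing entirely.
    text = (reason or b"").decode("latin-1")
    return StatusLine(int(major), int(minor), int(status), text)


if __name__ == "__main__":
    for sample in (b"HTTP/1.1 200 OK\r\n", b"HTTP/1.1 204\r\n", b"<html>hi"):
        print(sample, "->", parse_status_line(sample))
```

Even this toy has three code paths for a single line of input, which is roughly the shape of the problem everywhere else in a browser.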

      • stetech@lemmy.world
        14 hours ago

        Thanks for these explanations, that makes a lot more sense now. I didn’t even think to consider browsers might be using something other than an off-the-shelf implementation for image/other file formats… lol

        • qqq@lemmy.world
          13 hours ago

          Sorry, I didn’t mean to imply they don’t use shared libs. They definitely do, but they still have to integrate them into the larger system and put consistent interfaces over them.

          • stetech@lemmy.world
            13 hours ago

            Yeah, I realize that. My go-to comparison would be PDF: where Firefox has PDF.js (I think?), Chromium just… seemingly implements basically the entire (exhaustive!) standard.

      • vaguerant@fedia.io
        22 hours ago

        Adding on to this: while this article is fast approaching 20 years old, it gets into the quagmire that is web standards. ~10 (now ~30) years of untrained amateurs (and/or professionals) doing their own interpretations of what the web standards mean, plus another decade or so before that in which there were no standards, has led to a situation where browsers need to gracefully handle millions of contradictory instructions coming from different authors’ web sites.

        Here’s a bonus: the W3C standards page. Try scrolling down it.
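
As a rough illustration of what "gracefully handle" means for markup, here is a small Python sketch built on the standard library's `html.parser`. It is not the real HTML parsing algorithm (that one has far more rules); the class `LenientTreeBuilder`, the helper `recover`, and the recovery policy (close whatever is still open, ignore closing tags that match nothing) are simplifications assumed for this example.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag (partial list, for illustration).
VOID = {"br", "img", "hr", "input", "meta", "link"}


class LenientTreeBuilder(HTMLParser):
    """Hypothetical toy recovery: records what a forgiving parser might do."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open elements
        self.events = []  # log of (action, tag) decisions

    def handle_starttag(self, tag, attrs):
        self.events.append(("open", tag))
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            # Stray closing tag: drop it rather than raise an error.
            self.events.append(("ignored-close", tag))
            return
        # Close any elements the author forgot to close along the way.
        while self.stack:
            top = self.stack.pop()
            if top == tag:
                self.events.append(("close", tag))
                break
            self.events.append(("implied-close", top))

    def close(self):
        super().close()
        # End of input: everything still open gets closed implicitly.
        while self.stack:
            self.events.append(("implied-close", self.stack.pop()))


def recover(markup):
    builder = LenientTreeBuilder()
    builder.feed(markup)
    builder.close()
    return builder.events


if __name__ == "__main__":
    for event in recover("<p><b>bold<i>both</b></span>"):
        print(event)
```

Feeding it `<p><b>bold<i>both</b></span>` produces a sensible sequence of decisions instead of an exception. A real engine has to make choices like this for every element type, and make the same choices as every other engine.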