How a Browser Works: A Beginner-Friendly Guide to Browser Internals

Backend-focused developer learning. I write about internal workings, fundamentals, and real project learnings. Sharing my journey, insights, and mistakes while building in public.

What happens after I type a URL and press Enter?

When you type a URL into the address bar and press Enter, the browser doesn’t just “open a website.” It starts a carefully coordinated process involving many internal components working together. Think of the browser as a small operating system whose job is to take text (HTML, CSS, JavaScript) from the network and turn it into pixels on your screen. From the moment you press Enter, the browser works out which server to contact, fetches files from it over the network, parses those files to understand what they mean, calculates how everything should look, and finally displays the result.

What a browser actually is (beyond “it opens websites”)

At its core, a browser is a document reader, interpreter, and renderer. It reads documents written in web languages (HTML, CSS, JavaScript), interprets their meaning, applies rules about layout and styling, and renders the final visual output. Unlike a PDF reader or image viewer, a browser deals with interactive, dynamic content. The page can change after loading, respond to clicks, animate, and fetch more data—all of which the browser must continuously manage.

Main parts of a browser (high-level overview)

A browser is best understood as a collection of cooperating components. There is the User Interface, which you directly interact with. There is a Browser Engine, which acts as a coordinator between the UI and the rest of the system. The Rendering Engine is responsible for turning HTML and CSS into visual content. A Networking layer handles downloading resources from the internet. A JavaScript Engine executes JavaScript code. Together, these components form a pipeline that converts URLs into visible, interactive pages.

User Interface: address bar, tabs, buttons

The User Interface is everything you see and click: the address bar, back and forward buttons, reload button, tabs, bookmarks, and menus. While it looks simple, its role is crucial. When you enter a URL or click a link, the UI passes that action to the browser engine. The UI itself does not understand HTML or CSS; it simply gathers user input and displays the final output produced by other components.

Browser Engine vs Rendering Engine (simple distinction)

The Browser Engine is like a manager. It decides when to start loading a page, when to stop it, and how different components should communicate. The Rendering Engine, on the other hand, is the worker that actually understands web content. It takes HTML and CSS and figures out what should appear on the screen. In simple terms, the browser engine coordinates the process, while the rendering engine does the visual interpretation.

Networking: how a browser fetches HTML, CSS, JS

Once the browser knows which URL to load, the networking component steps in. It sends requests over the network to fetch the main HTML file and then the additional resources that file references, such as CSS files, JavaScript files, images, and fonts. Several of these requests usually run in parallel to improve speed. The networking layer handles protocols such as HTTP and HTTPS, along with responses, errors, and caching, but from the perspective of the rest of the browser, it simply delivers raw data to be processed.
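To make the networking step a little more concrete, here is a minimal sketch of what a request looks like as text. It splits a URL into its parts and builds a bare-bones HTTP/1.1 GET request. The function name build_get_request and the example URL are made up for illustration, and a real browser sends many more headers and also deals with TLS, caching, and newer protocol versions.

```python
from urllib.parse import urlsplit

def build_get_request(url: str) -> tuple[str, int, bytes]:
    """Return (host, port, raw request bytes) for a simple GET (illustrative only)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Default ports: 443 for https, 80 for http.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # Simplification: the Host header omits the port even when it is non-default.
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Accept: text/html\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return host, port, request.encode("ascii")

host, port, raw = build_get_request("https://example.com/index.html?lang=en")
print(host, port)
print(raw.decode("ascii"))
```

The bytes in raw are what would travel to the server once a connection to host on port is open. Everything the server sends back is the "raw data" that the rest of the browser then processes.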

HTML parsing and DOM creation

When the HTML file arrives, the rendering engine does not treat it as plain text. Instead, it parses it: it reads the HTML character by character, groups the characters into tokens such as opening tags, closing tags, and text, and works out how they nest. During this process, the browser builds the DOM (Document Object Model). The DOM is a tree-like structure where each HTML element becomes a node. Just like a family tree, elements have parents, children, and siblings. This tree representation makes it easier for the browser and JavaScript to understand and manipulate the page.
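The idea of turning tags into a tree can be sketched in a few lines using Python's built-in html.parser module. The Node and DomBuilder classes and the outline helper are hypothetical names for this example, and the sketch assumes well-formed HTML where every opening tag has a matching closing tag. Real browsers follow much more forgiving rules.

```python
from html.parser import HTMLParser

class Node:
    """A very small stand-in for a DOM node."""
    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        self.text = ""

class DomBuilder(HTMLParser):
    """Builds a tree of Node objects; assumes every tag is properly closed."""
    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        # A new element becomes a child of the element we are currently inside.
        node = Node(tag, parent=self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        # Closing a tag moves us back up to the parent.
        if self.current.parent is not None:
            self.current = self.current.parent

    def handle_data(self, data):
        self.current.text += data.strip()

def outline(node, depth=0):
    """Return the tree as indented lines, one per node."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines

builder = DomBuilder()
builder.feed("<html><body><h1>Hi</h1><p>Hello <b>world</b></p></body></html>")
dom_outline = outline(builder.root)
print("\n".join(dom_outline))
```

The printed outline shows the parent and child relationships: h1 and p are siblings under body, and b is a child of p.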

CSS parsing and CSSOM creation

CSS files go through a similar process. The browser parses the CSS rules and builds the CSSOM (CSS Object Model). The CSSOM represents all styling rules in a structured form, including how styles inherit and which rules apply to which elements. While the DOM answers “what elements exist,” the CSSOM answers “how should each element look.”
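As a rough sketch of the same idea for CSS, the snippet below turns a small stylesheet into a list of (selector, declarations) pairs. The parse_css function is a hypothetical helper, not how any real browser does it, and it assumes flat rules only: no comments, no nesting, and no at-rules such as @media.

```python
import re

def parse_css(css: str) -> list[tuple[str, dict[str, str]]]:
    """Parse flat 'selector { prop: value; }' rules (illustrative only)."""
    rules = []
    # Each match is one "selector { declarations }" block.
    for selector, body in re.findall(r"([^{}]+)\{([^{}]*)\}", css):
        declarations = {}
        for decl in body.split(";"):
            if ":" in decl:
                prop, value = decl.split(":", 1)
                declarations[prop.strip()] = value.strip()
        rules.append((selector.strip(), declarations))
    return rules

css_rules = parse_css("""
h1 { color: navy; font-size: 32px; }
p  { color: gray; }
.hidden { display: none; }
""")
for selector, declarations in css_rules:
    print(selector, declarations)
```

The result is structured data the browser can query ("which declarations apply to h1?") instead of a blob of text, which is the essential job of the CSSOM.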

How DOM and CSSOM come together

The browser cannot display the page using only the DOM or only the CSSOM. It combines both to create the Render Tree. The render tree includes only the elements that will actually be displayed (for example, elements with display: none are excluded) and attaches the computed styles to each visible node. This structure is the bridge between content and appearance.
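Here is a small sketch of that combination step. It walks a toy DOM, looks up a style for each element, and skips anything with display: none, including its children. The Element and RenderNode classes and the build_render_tree function are invented for this example, and styles are matched by tag name only, whereas a real browser matches full selectors and applies the cascade and inheritance.

```python
from dataclasses import dataclass, field

@dataclass
class Element:
    tag: str
    children: list["Element"] = field(default_factory=list)

@dataclass
class RenderNode:
    tag: str
    style: dict[str, str]
    children: list["RenderNode"] = field(default_factory=list)

def build_render_tree(element, styles):
    """Return a RenderNode for a visible element, or None if it is hidden."""
    style = styles.get(element.tag, {})
    if style.get("display") == "none":
        return None  # hidden elements (and their subtrees) are left out
    node = RenderNode(element.tag, dict(style))
    for child in element.children:
        rendered = build_render_tree(child, styles)
        if rendered is not None:
            node.children.append(rendered)
    return node

dom = Element("body", [Element("h1"), Element("aside"), Element("p", [Element("b")])])
styles = {
    "h1": {"color": "navy"},
    "aside": {"display": "none"},
    "p": {"color": "gray"},
}
render_tree = build_render_tree(dom, styles)
print([child.tag for child in render_tree.children])  # ['h1', 'p']
```

The aside element still exists in the DOM, but it never makes it into the render tree, so nothing later in the pipeline spends time on it.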

Layout (reflow), painting, and display

Once the render tree is ready, the browser performs layout, also known as reflow. During layout, the browser calculates the exact position and size of each element: where it goes on the screen, how wide it is, how tall it is, and how it relates to other elements. After layout comes painting, where the browser fills in pixels: colors, borders, text, shadows, and images. Finally, the result is sent to the screen, and you see the page. If something changes later, such as the window being resized or content being updated, the browser repeats only the parts it needs to: a change in size or position triggers layout again, while a change in color alone may need only a repaint.
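A heavily simplified sketch of layout: stack block boxes from top to bottom and give each one a position and a width based on the viewport. The Box class, the layout_blocks function, and the fixed heights and margin are all assumptions for illustration. In a real browser, heights come from the content itself, so a narrower window would also change them as text wraps.

```python
from dataclasses import dataclass

@dataclass
class Box:
    name: str
    height: int
    x: int = 0
    y: int = 0
    width: int = 0

def layout_blocks(boxes, viewport_width, margin=10):
    """Stack boxes vertically and return the total height used."""
    y = margin
    for box in boxes:
        box.x = margin
        box.y = y
        box.width = viewport_width - 2 * margin
        y += box.height + margin
    return y

boxes = [Box("h1", 40), Box("p", 60), Box("p", 30)]
total_height = layout_blocks(boxes, viewport_width=800)
for box in boxes:
    print(box)

# "Resizing the window" means running layout again with a new width.
layout_blocks(boxes, viewport_width=400)
print([box.width for box in boxes])
```

Calling layout_blocks a second time with a different width is the toy version of a reflow: the same boxes get new geometry, and painting would then have to happen again with the updated positions.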

A very basic idea of parsing (simple math example)

Parsing is not unique to browsers; it’s a general concept of turning raw input into structure. Imagine the expression 2 + 3 × 4. A parser does not just see characters; it understands that multiplication has higher priority than addition. Internally, it might represent this as a tree where × is evaluated before +. Similarly, when a browser parses HTML or CSS, it doesn’t just read text—it builds structured trees that capture meaning and relationships.
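The math example can be written as a tiny parser. This sketch handles only whole numbers, +, and × (treated as *), with no parentheses and no error handling. The function names are just for this example. The grammar encodes precedence: an expression is a sum of terms, and each term is a product of numbers, so multiplication ends up deeper in the tree.

```python
import re

def tokenize(text):
    # Numbers and the operators + and * (the × sign is treated as *).
    return re.findall(r"\d+|[+*]", text.replace("×", "*"))

def parse_expression(tokens):
    # expression := term ("+" term)*
    node = parse_term(tokens)
    while tokens and tokens[0] == "+":
        tokens.pop(0)
        node = ("+", node, parse_term(tokens))
    return node

def parse_term(tokens):
    # term := number ("*" number)*
    node = int(tokens.pop(0))
    while tokens and tokens[0] == "*":
        tokens.pop(0)
        node = ("*", node, int(tokens.pop(0)))
    return node

def evaluate(node):
    # Leaves are plain integers; inner nodes are (operator, left, right).
    if isinstance(node, int):
        return node
    op, left, right = node
    if op == "+":
        return evaluate(left) + evaluate(right)
    return evaluate(left) * evaluate(right)

tree = parse_expression(tokenize("2 + 3 × 4"))
print(tree)            # ('+', 2, ('*', 3, 4))
print(evaluate(tree))  # 14
```

The tree puts 3 × 4 underneath the +, so it is computed first and the answer is 14, not 20. The structure, not the left-to-right order of characters, carries the meaning.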

Seeing the browser as a flow, not a list of parts

The most important thing for beginners is not memorizing component names, but understanding the flow. A URL leads to network requests. Network responses lead to parsing. Parsing leads to DOM and CSSOM. DOM and CSSOM lead to the render tree. The render tree leads to layout, paint, and finally pixels on the screen. Each step builds on the previous one.


The end. See you in the next article 😁👍

Byte by Byte: Web Browser Internals

Part 2 of 6

Byte by Byte: Web Internals is a beginner-friendly series that explains how the internet and browsers work behind the scenes. From typing a URL to seeing a web page on your screen, each article breaks down complex concepts into simple language.
