In an attempt to reach the lowest level of software I could, I decided it would be a great idea to build an operating system from scratch. Honestly, this was a very naïve decision on my part because I didn’t realize how difficult it would be to build one end to end myself. So I have officially invited someone to help me. My friend Madhav has joined me on this adventure; he’s experienced in embedded systems and some fun low-latency stuff. I met him last year at orientation. At the beginning of this month, I sat next to him in my CS246 lecture and told him I was attempting to build an operating system from scratch, and he was crazy enough to tell me he wanted to join. So now we’re both cooking it up together.
So this is a quick little update on the plan and what has already been accomplished.
The plan ✨
Learn the basics
What the heck does an operating system even do?
What is a CPU lol?
What is a boot loader, kernel, CPU virtualization, processes, etc.?
Build a boot loader
Requires a pretty good understanding of assembly. Good thing I can slay in ARMv8.
Build the kernel
Figure out how to manage system resources such as CPU, memory, and I/O devices
Figure out how to handle the creation, scheduling, and termination of processes, enabling multitasking and ensuring that each process runs efficiently without interfering with others.
Figure out how to manage memory (scary)
Figure out file manipulation, process control, and networking from the OS.
The end goal: Building an OS that allows users to run simple commands (mkdir, cd, mv, ls, etc.)
Updates:
About a week ago I decided I wanted to start cooking this up. Since then I have brushed up on the OS basics and have built out a simple bootloader myself. This leaves Madhav and me to work on the kernel, which I’m very excited about. Also, a kernel is needed to fully build out the bootloader, since the bootloader’s final job is to load it.
Introduction to OS
I have watched a lot of videos on operating systems within the last week and also read a lot of articles. I’m going to dump everything I’ve learned.
What are operating systems?
The OS is by far the most important piece of software on your computer. It essentially does all the communication with your computer’s hardware: it handles file management and memory management, and it manages the CPU, disk, etc. The way it manages the CPU is quite interesting. To start a program, it initializes the registers and sets the program counter so that execution can begin. It also does this super cool thing with the CPU called virtualization, which essentially convinces each running program that it is the only thing running on the device. This allows you to have MrBeast on in one window and VS Code in another.
What is a CPU and how does it work?
You can think of the Central Processing Unit (CPU) as the brain of your computer. It executes all of the programs you run on your computer. The CPU interacts closely with primary storage, or main memory, referring to it for both instructions and data. The CPU can be broken down into two parts: the control unit and the arithmetic/logic unit.
Control Unit:
It’s pretty intuitive, but this part of the CPU directs how stored program instructions are carried out. It doesn’t actually execute the instructions itself; instead, it fetches and decodes them and directs the other parts of the CPU to execute them. If you ever write assembly code, the machine instructions it gets assembled into are what ends up here.
The way this is done is through the power of ✨registers✨.
These are temporary storage areas for instructions or data. Registers work under the direction of the control unit to accept, hold, and transfer instructions or data. In assembly, registers look like this:
ADD X4, X2, X1
Basically, what this is saying is: take the contents held in register X2 and register X1, add them, and store the resulting sum in X4. You can also write a lot of other fun things in assembly. You can grab values from given memory addresses, you can store things, and you can jump around memory. It’s a fun time.
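To make that ADD instruction a bit more concrete, here is a minimal Python sketch of a toy register file. It is only a model for illustration and not how a real CPU is built. The `registers` dictionary and the `execute` function are names I made up for this example, and it only understands the one `ADD` form shown above.

```python
# Toy model of a register file. The names and the register count are illustrative only.
registers = {f"X{i}": 0 for i in range(8)}

def execute(instruction: str) -> None:
    """Execute a single 'ADD Xd, Xn, Xm' instruction on the toy register file."""
    opcode, operands = instruction.split(maxsplit=1)
    dest, src1, src2 = [name.strip() for name in operands.split(",")]
    if opcode != "ADD":
        raise ValueError(f"unsupported opcode: {opcode}")
    # X registers are 64 bits wide, so the sum wraps around at 2**64.
    registers[dest] = (registers[src1] + registers[src2]) % 2**64

registers["X1"] = 5
registers["X2"] = 7
execute("ADD X4, X2, X1")
print(registers["X4"])  # prints 12
```

The source registers X1 and X2 are left untouched; only the destination X4 changes.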
There are a few different types of registers:
An accumulator, which collects the result of computations.
An address register, which keeps track of where a given instruction or piece of data is stored in memory.
A storage register, which temporarily holds data taken from or about to be sent to memory.
A general-purpose register, which is used for several functions.
The Arithmetic/Logic Unit:
Again, very intuitive: this part performs all the basic arithmetic and logic operations.
What is memory and how does the CPU interact with it?
Primary memory, main storage, internal storage, main memory, and RAM (Random Access Memory) are all used synonymously.
Memory stores program instructions or data for only as long as the program they pertain to is in operation. Most types of memory only store items while the computer is turned on; data is destroyed when the machine is turned off.
Where is data stored?
Data can be stored in registers, RAM, a floppy disk, or the hard disk.
All of these forms of storage trade off speed against whether the data is kept temporarily or permanently.
As soon as you turn your computer on, firmware stored in read-only memory (ROM), called the basic I/O system (BIOS), starts running and performs a power-on self-test (POST). This just checks that all the major components are functioning properly. The BIOS then loads the bootloader from disk, and the bootloader in turn loads the OS from the hard drive into the system’s RAM.
Compiling programs
When you go to compile a program, the OS creates a process to run the compiler. The OS essentially has a scheduler, which decides the following:
which process to run next
when to context switch between different processes
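One simple way a scheduler can make both of those decisions is round-robin: give each process a fixed time slice, then move on to the next one. The sketch below is a toy Python model of that idea and not what our kernel does (we haven't written it yet). The `round_robin` function, the process names, and the burst times are all made up for illustration.

```python
from collections import deque

def round_robin(burst_times: dict[str, int], time_slice: int) -> list[str]:
    """Return the order in which processes get the CPU under round-robin scheduling."""
    ready_queue = deque(burst_times.items())
    timeline = []
    while ready_queue:
        # Pick which process to run: always the one at the front of the queue.
        name, remaining = ready_queue.popleft()
        timeline.append(name)
        # Let it run for one time slice.
        remaining -= time_slice
        if remaining > 0:
            # Context switch: an unfinished process goes to the back of the queue.
            ready_queue.append((name, remaining))
    return timeline

# Hypothetical workloads, measured in time slices of CPU work still needed.
order = round_robin({"browser": 3, "editor": 1, "compiler": 2}, time_slice=1)
print(order)
# prints ['browser', 'editor', 'compiler', 'browser', 'compiler', 'browser']
```

Because every process gets a turn before anyone gets a second one, each program looks like it is making steady progress, which is where the "only thing running" illusion comes from.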
Structure of process
The process structure in an operating system includes several key components. Each process is assigned a unique identifier (PID) and contains a memory image, which holds the process's code and data (static), as well as a stack and heap (dynamic). Additionally, the process interacts with the CPU through elements like the program counter (PC), current operands, and the stack pointer.
When the operating system creates a process to compile code, it first allocates memory and generates a memory image. It then loads the code and data from the disk and creates the runtime stack and heap. The OS also opens the basic files: standard input, standard output, and standard error (stdin, stdout, and stderr). Finally, it initializes the CPU registers and sets the program counter to point to the first instruction to execute.
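The pieces of a process and the creation steps above can be sketched as a small Python data structure. This is a rough model under my own assumptions: the `Process` class, its field names, the `create_process` function, and the simple counter used to hand out PIDs are all hypothetical and only mirror the description in the text.

```python
from dataclasses import dataclass, field
from itertools import count

_next_pid = count(1)  # hypothetical PID allocator: just counts upward from 1

@dataclass
class Process:
    pid: int                      # unique identifier
    code: bytes                   # static part of the memory image
    data: bytes                   # static part of the memory image
    stack: list = field(default_factory=list)   # dynamic
    heap: dict = field(default_factory=dict)    # dynamic
    open_files: list = field(default_factory=list)
    program_counter: int = 0
    stack_pointer: int = 0

def create_process(code: bytes, data: bytes, entry_point: int = 0) -> Process:
    """Follow the creation steps from the text on a toy process model."""
    # Allocate the memory image and load the code and data into it;
    # the runtime stack and heap start out empty.
    process = Process(pid=next(_next_pid), code=code, data=data)
    # Open the three basic files.
    process.open_files = ["stdin", "stdout", "stderr"]
    # Initialize the registers: point the program counter at the first instruction.
    process.program_counter = entry_point
    return process

first = create_process(code=b"\x01\x02", data=b"")
second = create_process(code=b"\x03", data=b"hello")
print(first.pid, second.pid, first.open_files)
```

Each call hands out a new PID, and every process starts with its own empty stack and heap rather than sharing them.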
Processes transition through various states during their lifecycle. A process in the "running" state is actively executing on the CPU, while a "ready" process is waiting to be scheduled for execution. A "blocked" process is suspended and not ready to run, usually waiting for an event like a disk interrupt signaling that data is ready. A "new" process is in the creation phase, yet to begin execution, and a "dead" process has terminated its execution.
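These states and the moves between them can be written down as a tiny state machine. The sketch below uses the usual textbook five-state transitions, which I am assuming here since the text doesn't list them explicitly. The `TRANSITIONS` table and the `transition` function are names made up for this example.

```python
# Allowed moves between the five states (assumed textbook model).
TRANSITIONS = {
    "new": {"ready"},                         # creation finished, waiting for the CPU
    "ready": {"running"},                     # picked by the scheduler
    "running": {"ready", "blocked", "dead"},  # preempted, waiting on an event, or finished
    "blocked": {"ready"},                     # the event (e.g. a disk interrupt) arrived
    "dead": set(),                            # terminated: no way out
}

def transition(current: str, target: str) -> str:
    """Return the new state, or raise ValueError if the move is not allowed."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current} -> {target}")
    return target

# Walk one process through a plausible lifetime.
state = "new"
for target in ["ready", "running", "blocked", "ready", "running", "dead"]:
    state = transition(state, target)
print(state)  # prints dead
```

Note that a blocked process can't jump straight back to running: once its event arrives it becomes ready and has to wait for the scheduler like everyone else.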
Building my bootloader
I built my bootloader using QEMU to emulate the machine the OS will run on, along with NASM to assemble the code. The entire project is written in ARM assembly, which allows me to manage the low-level functionality critical to starting up the system. One of the key milestones I’ve achieved so far is successfully printing text to the screen, signaling that the bootloader has been initialized properly. This output serves as a basic but essential step in verifying that the system is ready for further development.
Currently, my work is focused on building out the bootloader without a kernel, starting with the essential system initialization. This involves setting up the CPU, configuring the registers, and preparing memory spaces like the stack. The goal is to ensure that the system is in a stable state and can handle low-level functionality. Although I haven't yet created or loaded an OS kernel, my immediate focus is on fully developing the bootloader’s ability to manage hardware and memory initialization. The next steps will involve gradually adding more advanced features, such as disk reading routines and memory management, in preparation for future kernel loading once the bootloader is fully functional.
see you soon with more updates :)