Learning The Linux File System

Before we get started, let’s avoid any confusion. There are two meanings to the term “File System” in the wonderful world of computing. First, there is the system of files and the directory structure that all of your data is stored in. Second, there is the format scheme that is used to write data on mass storage devices like hard drives and SSDs. We are going to be talking about the first kind of file system here because the average user interacts with it every time they use a computer, while the format that data is written in on their storage devices is usually of little concern to them. The many different file systems that can be used on storage are really only interesting to hardware geeks and are best saved for another discussion. Now that that’s cleared up, we can press on.

Back to Computing 101

A computer file is a block of discrete information generated by a computer program and saved to a long-term storage device for retrieval later. In Linux, everything is a file. Even the devices hooked to your computer are represented as files. The novice computer user can usually identify some files because there are many standard formats. Files with the .mp3 extension are automatically associated with audio that is encoded in mp3 format, for instance.

There are literally hundreds of thousands of files on even the most lightweight computer system and just throwing all of them onto a storage device in no particular order would cause all kinds of confusion. The concept of directories was developed in the early days of computing to make things more humanly manageable. A directory is really just a kind of file that holds addresses for other files and directories. The root of the Unix/Linux file system is represented by a single /. The term ‘root’ as used here should not be confused with the ‘root user’ which is an entirely different concept. In this sense, ‘root’ refers to the top, base, beginning, start or root of the file system.

A very standard way of teaching these concepts to computer science students has been to picture a storage device as a filing cabinet: each file is an individual document, and directories are the manila folders used to separate files into logical groups. This symbolism was so deeply ingrained that developers automatically used pictures of file folders as icons to represent directories in the first Graphical User Interfaces for computers 40 years ago. This is also why a directory is referred to as a ‘folder’ in GUI environments these days. The terms ‘directory’ and ‘folder’ are pretty much interchangeable. If you are working at a command line, you generally refer to directories and, when working in a GUI, they’re folders.

Most operating systems allow programs and users to affix ownership attributes to files that control who can open, modify, execute or remove them. Files can be hidden from plain view and/or encrypted. You can create aliases and links for files that make them go by more than one name or appear in two places at once.
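If you'd like to see the "more than one name" idea in action, here is a minimal Python sketch. It assumes a Linux (or other Unix-like) machine, and the file names are made up purely for illustration. It creates one file in a scratch folder, then gives it a second name with a hard link and a third with a symbolic link.

```python
import os
import tempfile


def demo_links():
    """Create one file, then give it extra names two different ways."""
    # Work in a scratch folder that is deleted automatically afterwards.
    with tempfile.TemporaryDirectory() as workdir:
        original = os.path.join(workdir, "notes.txt")        # hypothetical name
        hard = os.path.join(workdir, "notes-hardlink.txt")   # hypothetical name
        soft = os.path.join(workdir, "notes-symlink.txt")    # hypothetical name

        with open(original, "w") as f:
            f.write("hello\n")

        os.link(original, hard)     # hard link: a second name for the same data
        os.symlink(original, soft)  # symbolic link: a pointer to the original's path

        # A hard link shares the original's inode (its "address" on the disk).
        same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
        is_symlink = os.path.islink(soft)

        # Reading through the symbolic link gives back the original's contents.
        with open(soft) as f:
            via_symlink = f.read()

        return same_inode, is_symlink, via_symlink


if __name__ == "__main__":
    print(demo_links())
```

All three names lead to the same contents, which is what makes a file seem to be in more than one place at once.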

Each distribution of Linux creates a standard set of directories in the / directory to store data in. It varies a bit from distro to distro, but many are universal and serve the same function regardless. I go through many of the most important ones in the video. Other operating systems do the same thing, albeit somewhat differently.
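To see which directories your own distro puts in /, something like the following rough Python sketch will do. The function name is my own invention, and the list you get back will vary from system to system.

```python
import os


def top_level_directories(root="/"):
    """Return the sorted names of the directories directly under root."""
    with os.scandir(root) as entries:
        # is_dir() follows symbolic links, so linked directories are included too.
        return sorted(entry.name for entry in entries if entry.is_dir())


if __name__ == "__main__":
    for name in top_level_directories():
        print("/" + name)
```

Run it on two different distros and compare the output to see what is shared and what isn't.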

So What?

Depending on your level of experience and education when it comes to computers, this may all seem very familiar to you, even boring. Most folks who’ve been knocking around with computers professionally or as a hobby for many years don’t even give these basic concepts a second thought, so much so that they may just skim right over them when they try to teach a newcomer how to use a computer. This usually leads to confusion and grief for the new user down the road and lots of what might seem like stupid questions for the teacher. Giving a student a good grounding in the basics of the file system, no matter what OS you’re teaching them, is essential to their success. Don’t go too fast or assume they’ll understand it in a jiffy.

Most people who don’t get formalized training sort of pick all this info up on the street, as it were. If they’ve learned everything they know on Windows, then they’ll be totally confused when they come to Linux.

Doing things the UNIX way…

The file system we use in Linux today comes to us from Unix, developed back in the late 1960s to run computer systems that filled up not just rooms but entire floors of office buildings. These systems had hundreds of users and a myriad of storage devices and network connections to contend with. The system Unix’s developers came up with to manage files on those machines is incredibly simple yet powerful. It scales to practically any amount of data you care to throw at it. New storage devices or network resources are readily mounted in empty directories anywhere in the file system, so adding more space to work with is super easy. Unlike Windows with its ABCs or Mac’s drive icons, Linux users don’t have to bother with exactly what chunk of oxidized metal or which array of flash memory chips the data is stored on. This concept is called abstraction. Repeat after me, please: “Abstraction is good!” Once you get your head wrapped around abstraction, you start to see the power it gives you.

Let’s say you have a machine with a single hard drive in it and you like to make videos with it. You make so many videos that you start to run out of space to store your creations. You could add a hard drive to your system to store more videos on, but then you may worry that you’d have to move data all around and screw up your workflow. Instead of putting them in your convenient ‘Videos’ folder in your home folder, you’ll have to stick them on the new drive. Wrong! Move all of your videos into a temporary folder on your desktop, add the new drive, mount it in the now empty Videos folder and then move all the videos back. Done. Your videos still show up in the same place, but now you have freed up a whole bunch of space on your original drive. Plus, now you have a ton more to work with on the new drive but it all looks and acts just the same as before. Cool, huh? Unix and Linux admins have been doing this sort of thing for 50 years and no user on the system ever knew.
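If you're curious whether a folder really lives on a different drive, Python can tell you. This is only a sketch: `storage_info` is a name I made up, and `~/Videos` is just the example folder from the story above, so it may not exist on your machine. Actually mounting a drive needs administrator rights and is not shown here.

```python
import os


def storage_info(path):
    """Report which device a path lives on and whether it is a mount point."""
    path = os.path.abspath(os.path.expanduser(path))
    return {
        "path": path,
        # Paths on separate drives or partitions report different device IDs.
        "device_id": os.stat(path).st_dev,
        # True when another file system has been attached at this directory.
        "is_mount_point": os.path.ismount(path),
    }


if __name__ == "__main__":
    for folder in ("/", "~", "~/Videos"):
        try:
            print(storage_info(folder))
        except FileNotFoundError:
            print(f"{folder} does not exist on this machine")
```

If a drive is mounted on the Videos folder, it should report a different device ID from your home folder, even though nothing looks different in the file manager.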

The Windows file system came from a very different place altogether. It has its roots in MS-DOS, Microsoft’s early operating system for the PC. DOS didn’t care about letter case. As far as it was concerned, MyFile, myfile and MYFILE were all the same thing. MS developers have added extensions to the system that allow upper and lower case to be displayed in file names, but Windows still doesn’t really care about case. The Linux file system IS case sensitive, so MyFile, myfile and MYFILE would be treated as completely separate files. It’s important to know this when trading files back and forth with Windows machines.
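You can check how your own system treats letter case with a quick experiment. The Python sketch below (the function name is my own) writes MyFile, myfile and MYFILE into a scratch folder and counts how many files end up there. On a typical Linux file system I would expect 3, and on a file system that ignores case, 1.

```python
import os
import tempfile


def count_case_variants(names=("MyFile", "myfile", "MYFILE")):
    """Write each name into a scratch folder and count the resulting files."""
    with tempfile.TemporaryDirectory() as workdir:
        for name in names:
            # On a case-insensitive file system, later names overwrite earlier ones.
            with open(os.path.join(workdir, name), "w") as f:
                f.write(name)
        return len(os.listdir(workdir))


if __name__ == "__main__":
    print("Distinct files created:", count_case_variants())
```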

Another thing to keep in mind is that the character combinations you are allowed to use for file names are also slightly different between Windows and Linux. Moving files from a Windows system to a Linux system usually isn’t much of a problem but moving files to or storing Linux files on a Windows system can cause some weirdness. Samba is the network program used to allow Windows and Linux machines to talk to one another and it tries to reconcile these sorts of things, sometimes with unexpected results. Samba tries to set certain attributes the same way for each file system. For example, a hidden file is represented by a ‘.’ in front of the file name in Linux while Windows has an attribute that must be set for a file that determines whether it’s hidden or not. Samba will sometimes set the hidden attribute in Windows when you move a Linux ‘dot file’ across the network. It just disappears!
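The Linux side of this is simple enough to sketch in a few lines of Python. The function and file names here are made up for illustration. A file counts as hidden if its name starts with a dot, and nothing else is involved.

```python
import os
import tempfile


def split_hidden(folder):
    """Sort a folder's entries into (hidden, visible) by the leading-dot rule."""
    names = sorted(os.listdir(folder))
    hidden = [name for name in names if name.startswith(".")]
    visible = [name for name in names if not name.startswith(".")]
    return hidden, visible


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        for name in (".secret", "readme.txt"):  # hypothetical file names
            with open(os.path.join(workdir, name), "w") as f:
                f.write("demo\n")
        print(split_hidden(workdir))
```

Windows has no equivalent naming rule, which is why a tool like Samba has to translate the dot into a file attribute.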

Linux has some very distinct advantages when it comes to how it treats file ownership. A file automatically belongs to you if you create it while logged into your account. The file’s permissions control whether other users on the machine can open your files or do anything to them, so you decide what you allow. You can also set a file to be executable or not. Most average users don’t need to worry about this, but if you intend to write your own programs or put together bash scripts then you’ll want to look into it. All of these permissions can be changed easily by clicking in the GUI, or you can use some really nifty command line tools.
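Here is a small Python sketch of ownership and the executable bit at work, assuming a Linux or other Unix-like system. The script name and helper functions are hypothetical. It creates a file, locks it down so only the owner can read and write it, then marks it executable for the owner.

```python
import os
import stat
import tempfile


def describe(path):
    """Return the permission string (as 'ls -l' shows it) and the owner's user ID."""
    info = os.stat(path)
    return stat.filemode(info.st_mode), info.st_uid


def demo_permissions():
    with tempfile.TemporaryDirectory() as workdir:
        script = os.path.join(workdir, "hello.sh")  # hypothetical script name
        with open(script, "w") as f:
            f.write("#!/bin/bash\necho hello\n")

        os.chmod(script, 0o600)   # owner may read and write; nobody else may do anything
        before = describe(script)[0]

        os.chmod(script, 0o700)   # additionally let the owner execute it
        after, owner = describe(script)

        # A file you create belongs to the account that created it.
        return before, after, owner == os.geteuid()


if __name__ == "__main__":
    print(demo_permissions())
```

The `os.chmod` call does the same job as the `chmod` command line tool.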

Just the Tip of the Iceberg

All of the above is just an introduction to the power of Linux’s file system. It can do some really amazing things to make managing vast amounts of data simple, secure and easy. Taking the time to get a good grip on what happens when you click ‘Save’ will go a long way to making you a Linux power user.

Have fun!

Joe Collins
Joe Collins worked in radio and TV stations for over 20 years where he installed, maintained and programmed computer automation systems. Joe also worked for Gateway Computer for a short time as a Senior Technical Support Professional in the early 2000s and has offered freelance home computer technical support and repair for over a decade.

Joe is a fan of Ubuntu Linux and Open Source software and recently started offering Ubuntu installation and support for those just starting out with Linux through EzeeLinux.com. The goal of EzeeLinux is to make Linux easy and start new users on the right foot so they can have the best experience possible.

Joe lives in historic Portsmouth, VA in a hundred-year-old house with three cats, three kids and a network of computers built from scrounged parts, all happily running Linux.