What Happens When You Type “ls -l *.c” In A Linux Shell?

Adam Taylor
8 min read · Apr 12, 2021

Hello, there! Today, my project partner Anthony Armour and I will be discussing what happens when you enter the command ls -l *.c in your Linux terminal!

The ls command is one of the most used commands, if not the most used command, located in the /usr/bin directory. So what exactly is this command used for, and what does it print to STDOUT? The ls command lists files and directories in Linux and other Unix-like operating systems. If you refer to the picture at the top of this article, you can see that this is exactly what ls did for us in that instance.

Now that we understand what ls does and what we can use it for, let's look at what flags are and which flag we are using with our command. Flags are a way to set options and pass in arguments to the commands you run. In this instance, the flag that we are using is -l, which lists our files and directories in long format: permissions, link count, owner, group, size, and modification time, followed by the name. The ls command has many, many optional flags that change what gets printed to your terminal. It is a simple command to understand, yet it has a huge number of possible outputs. A few notable flags:

  • -a, which lists all files, including hidden files whose names start with ‘.’
  • -d, which lists directories themselves rather than their contents (for example, ls -d */)
  • -ls, which combines -l (long format) with -s (the size of each file in blocks) (yes, you read that right! ls -ls is a real thing.)
  • -r, which reverses the sort order
  • -t, which sorts by modification time, newest first

Refer to the picture below to get to know ALL of the available flags for the ls command!

Great! We are making good progress on understanding exactly what is going on in the full command! If you ask me, whenever you are tasked with understanding something as “complex” as a command with different flags, wildcards, and whatnot, breaking things down into their own little pieces and understanding each piece by itself and then coming back later to understand the bigger picture is always the way to go.

So, while reading through that previous paragraph, you should have read over the word “wildcard”. If you’re new to programming, I really hope the question “What exactly is a wildcard?” popped in your head. And if so, allow me to supply you with an answer.

A wildcard in the shell is usually written as an asterisk ‘*’, but a question mark ‘?’ is a wildcard as well. Google defines a wildcard as follows:

A wildcard is a symbol used to replace or represent one or more characters. The most common wildcards are the asterisk (*), which represents one or more characters and question mark (?) that represents a single character.

So, in the command that we are using, ls -l *.c, the wildcard is the asterisk ‘*’, and what comes after it, the .c, narrows down what ends up on STDOUT. (One note on the definition above: in the shell, ‘*’ matches zero or more characters, not one or more.) Now, what exactly is *.c doing in this context? By now we know that ls lists the files and directories inside our present working directory (also known as PWD), and we know what our -l flag does, so all we have left to account for is our wildcard! The shell, not ls, handles it: before ls even starts, the shell replaces *.c with the names of all the files in our PWD that end with .c (skipping hidden files, whose names start with ‘.’). ls then receives those names as ordinary arguments and lists them in long format.

Here is an example of me running ls -l in Anthony's and my simple_shell directory.

And here is an example of me running ls -l *.c in that same directory.

As you can see, *.c is behaving exactly as I have described above. Out of all of the files and directories in our PWD, only those with a .c extension are displayed, so we can confirm that we have the exact output that we expected!

Now, I would like to take a step back and put more time into explaining how important wildcards are and what they are capable of. Obviously, the wildcard that we used in our command was an asterisk ‘*’, but like I mentioned before, there are others that we have at our disposal.

  • %: The percent symbol is used in SQL to match any character (including an underscore) zero or more times.
  • ?: A question mark matches exactly one character. For example, “c?mp” matches “camp” and “comp.” The question mark can also be used more than once. For example, “c??p” would match both of the above examples and “coop.” In MS-DOS and the Windows command line, trailing question marks can also match nothing at all. For example, “co??” would match “comp” and “coop,” but because the question marks are trailing it would also match “cop,” even though it's not four characters.
  • Open and close brackets ([]): With Unix shells, Windows PowerShell, and programming languages that support regular expressions, the open and close bracket wildcards match a single character in a range. For example, [a-z] matches any character “a” through “z,” which means anything not in that range, like a number, would not be matched.

Each of these wildcards has its own uses. Getting familiar with them right out of the gate will help with many different things. The asterisk wildcard, for example, makes sorting through bigger directories that hold many different files much, much easier for the user. Here is another great example of wildcards being your friends.

In MS-DOS, you can use the dir command with the pattern c?mp to list files whose names contain c, then any single character, then mp. For example, comp, camp, c2mp, and c-mp would all be matched.

Now, to move forward a little and get off of the wildcard topic, I will be touching on what happens in the background when you enter ls -l *.c .

  • When you enter the command ls -l *.c in your terminal and press Enter, everything you typed is passed to the shell as a single string (a string being a sequence of characters). The shell then splits that string on whitespace, which gives us three tokens: ls, -l, and *.c. The tokens are stored in an array of strings (in C, an array of char pointers). This process is known as tokenization.
  • Next, the shell checks whether the command name, which is the first token, is a defined alias. If it is, the shell replaces the alias name with the command text assigned to it. Aliases are usually defined in one of the following files:
  1. ~/.bashrc
  2. ~/.bash_profile
  3. /etc/bashrc
  4. /etc/profile.
  • The next step is to check whether the command is a shell builtin. If it is, the shell executes it directly, without invoking another program. For example, cd is a builtin command, while ls is not. So instead of ls being executed immediately, the shell has to find the executable for ls.
  • Next, bash has to locate the command. By this point it has also expanded *.c into the list of matching file names. Because the name ls contains no slash, bash searches the directories listed in $PATH, an environment variable that stores the locations of common executable programs, separated by colons. The search is performed by a series of internal functions such as find_user_command(), find_user_command_in_path(), and find_in_path_element(). Each directory in PATH is checked in order for an executable named ls, and bash calls stat() to test whether the file exists in each one.
  • Then, when bash finds the file at /usr/bin/ls, it creates a child process with fork(), and the child calls execve() to run the file while the shell waits for it to finish. Many smaller steps happen inside execve(): the kernel reads the program from disk, identifies its binary format, and invokes the matching handler, which loads the binary into memory.
  • Finally, the program is in memory and ready to execute. But how does ls read the directories and files from our disk? When ls has to list a directory, it calls a library function such as readdir() to read the directory contents, which in turn invokes a system call (getdents on Linux) that asks the underlying filesystem for the directory's entries. The filesystem-specific code that serves this request depends on how the disk holding the path is formatted. In our case the shell has already handed ls the matching file names, so for the -l output ls mainly needs each file's metadata, such as permissions, owner, and size, which lives in the file's inode and is fetched with a stat()-style call. Once everything has been retrieved, ls writes the listing to STDOUT and exits. The final step is for the shell to print the prompt again, using the format stored in the shell variable PS1, so that it is ready for the next command.

I hope you enjoyed reading Anthony's and my article. If so, please feel free to leave a like and share it with your peers! There is always more to learn when it comes to the C programming language. Keep your heads high and keep plugging away at it! You will be the next Dennis Ritchie before you know it! Happy programming!

Edit: Many thanks to Christopher Caswell for looking over the article and shelling out some recommendations on how to improve our writing.
