
Files and Directories

Python source code is saved in a text file. Python 2 expected source files to contain only ASCII characters unless an encoding was declared, but Python 3 reads source files as UTF-8 by default, so we can use non-ASCII characters, even in variable names. This isn't recommended, though. ASCII is encoded and displayed consistently on practically every system, while other characters rely on every editor, terminal and font involved handling the encoding correctly.
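As a small illustration of the point above, the sketch below uses a non-ASCII variable name, which Python 3 accepts. The names are made up for the example.

```python
# Python 3 accepts non-ASCII letters in identifiers (made-up name for illustration).
größe = 42
print(größe)

# str.isidentifier() reports whether a string would be a legal name.
print("größe".isidentifier())   # True
print("2fast".isidentifier())   # False: names cannot start with a digit

# str.isascii() (Python 3.7+) is a quick way to spot names that go beyond ASCII.
print("größe".isascii())        # False
```

A check like the last line could be used to flag names that may not display correctly on every system.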

So, the difference between a text file (*.txt) and a Python file (*.py), as far as the operating system and file system are concerned, is the sequence of characters after the dot. Naming files with a dot and a short sequence to show what they are for is common. On Unix-like systems, it is more convention than requirement. On Windows, it is how the system knows what to do with a file.

If you double-click a Python file on Windows, the OS will examine the filename and then try to run the interpreter to execute it. That assumes you have Python installed and haven't changed the default file associations.

On Linux, a visual file manager isn't part of the operating system. One is almost always bundled with your distribution, but it can be different on other systems. How each behaves can also be different. Often, double-clicking will attempt to make the same guesses as Windows, based on filename, but it may also consult other information, such as the first line of the file or permissions. More on those below.

More commonly when working in Cybersecurity, we run programs from a shell. To do that, we put in the path to the file and the filename like this: /home/student/prg.py

That command would try to execute the file prg.py, in the directory student, which itself is in the directory home, which in turn is in the root of the filesystem (/).

If the file is in the current directory that our shell is working in, we still have to give the path. If we were in the /home/student directory, we could still type the command above. But it does seem long-winded. Instead, we can shorten it to ./prg.py. The . character, when part of a path, means "here". If we are the student user, then /home/student would be our "home directory". In that case, we could also use the special character ~ to represent home, and execute the program like this: ~/prg.py. This way, even if our shell is not working in the home directory, the command still works.
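The two shorthand forms above can be explored from Python as a rough sketch. Note that in a real shell it is the shell itself that expands ~; here os.path.expanduser stands in for that step. The file name prg.py is taken from the text and does not need to exist.

```python
import os

# File name from the text; it does not need to exist for this demonstration.
name = "prg.py"

# "./prg.py" is relative to the current working directory.
relative = os.path.join(".", name)
print(os.path.abspath(relative))      # e.g. /home/student/prg.py when run from there

# "~/prg.py" is relative to the current user's home directory,
# regardless of the current working directory.
print(os.path.expanduser(os.path.join("~", name)))
```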

If you've used the Windows command prompt (cmd), this might seem like an unnecessary extra step. In cmd, if something executable is in the current directory, just typing its name will execute it. PowerShell behaves more like Linux here: it expects .\ before the name of a program in the current directory.

The cause of the difference is the PATH environment variable. An equivalent exists on Windows and Linux. It is simply a list of places to look for an executable when we type its name into the shell. On Linux, ls is usually an executable file, located at /bin/ls, but we never need to type the full path. On Windows, cmd also searches the current directory before the directories in the path, as if . were the first entry. On Linux, the current directory is not searched by default. You can add . to the path to make it behave that way, but doing so is strongly discouraged.

Why is it considered a bad idea to add . to the path?

If an attacker can place a file with execute permissions in a directory that you might visit, then they could name that file ls, or something else you would type without thinking. If the current directory is in the path, you could end up executing malicious code under your user account rather than getting a list of files!
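The lookup described in this section can be explored from Python. This is only a sketch: shutil.which performs a search similar to the shell's, but it is not the shell's own implementation.

```python
import os
import shutil

# The directories that are searched, in order.
search_dirs = os.environ.get("PATH", "").split(os.pathsep)
print(search_dirs)

# shutil.which() returns the first matching executable, or None if not found.
print(shutil.which("ls"))     # often /bin/ls or /usr/bin/ls on Linux

# Check whether "." has been added to the search path.
if "." in search_dirs:
    print("Warning: the current directory is on the path")
else:
    print("The current directory is not on the path")
```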

Permissions

Linux doesn't just assume all files that end in .py should be executed with the Python interpreter. The file must also have the correct permission set.

Every file on a typical Linux filesystem has a series of permission flags. They show up in a directory listing if you use ls -l, like below:

Directory listing example showing permissions

The first column has entries that look like this: drwxr-xr-x

Each of the 10 characters in those sequences is either a dash (-) or a letter. The first character gives the file type: typically a dash (for a regular file) or d (for a directory).

Then we have three groups of three characters, each of which is either rwx or that sequence with some or all of the characters replaced with a dash. In each group, if the r character is present, permission is granted to read the file. If the w character is present, permission is granted to write to the file (that is, to edit or overwrite it). Deleting or renaming a file is controlled by the write permission on the directory that contains it, not on the file itself. If the x character is present, permission is granted to execute the file. In the case of a directory, "execute" means being able to enter it.

The three groups represent the permissions given to the user (owner), group and other (everyone else).
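To make the layout concrete, here is a minimal sketch that builds the same drwxr-xr-x string with Python's stat module and splits it into its parts. The numeric mode is chosen for illustration only.

```python
import stat

# 0o040755 is a directory (the 0o040000 bit) with permissions 755.
mode = 0o040755
listing = stat.filemode(mode)
print(listing)                          # drwxr-xr-x

# Split the ten characters into the file type and the three permission groups.
file_type = listing[0]
user, group, other = listing[1:4], listing[4:7], listing[7:10]
print(file_type, user, group, other)    # d rwx r-x r-x
```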

Why would you set your own permissions, as file owner, to be restricted? Try to think of good reasons before reading the answer.

The simple answer is safety. Using the shell is powerful and efficient, but you can also use it to make some powerful and efficient mistakes, such as executing something dangerous or overwriting a file you need. Having control over permissions like this allows you to decide what can be changed, what can be executed, and so on.

Changing Permissions with chmod

The chmod command can be used to change these permissions. You can use it on files you own; the root user can use it on any file.

The simplest syntax is like this:

chmod u+x hello.sh

This adds the execute permission (x) for the user (u) on the file hello.sh. Here is an example of this being used. Note the output of ls before and after:

Example of the effects of chmod u+x to add execute permissions
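The same change can be made from Python. The sketch below is one possible equivalent of chmod u+x, using a throwaway temporary file as a stand-in for hello.sh; it assumes a Unix-like system.

```python
import os
import stat
import tempfile

# Create a throwaway file standing in for hello.sh.
with tempfile.NamedTemporaryFile(suffix=".sh", delete=False) as f:
    path = f.name

before = stat.S_IMODE(os.stat(path).st_mode)

# Equivalent of "chmod u+x": keep the existing bits and add user-execute.
os.chmod(path, before | stat.S_IXUSR)
after = stat.S_IMODE(os.stat(path).st_mode)

print(oct(before), "->", oct(after))
print(stat.filemode(os.stat(path).st_mode))   # the user group now includes x

os.remove(path)   # clean up the temporary file
```

The bitwise OR mirrors the + operator: existing permissions are preserved and only the user-execute bit is added.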

Here is another example:

Example of using equals to set permissions

Note that by using = instead of +, we replace the whole rwx group for others (o) rather than just adding to it. Whatever it was before, it becomes exactly what we specify. For example, chmod o=rx hello.sh sets it to r-x.

Finally, we can remove individual privileges like this:

Example of using "minus" to remove permissions

Note the use of the - character to remove permissions, and also that you can put more than one specifier before the operator: og in this case, so we are affecting the group and other permissions, not the user permissions.
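The = and - operators can be thought of as operations on permission bits. The sketch below works on a plain number rather than a real file. The starting mode and the og-x command are assumptions chosen for illustration, not the commands from the screenshots.

```python
import stat

mode = 0o754   # rwxr-xr--

# "chmod o=rx": clear all of other's bits, then set exactly r and x.
set_other = (mode & ~stat.S_IRWXO) | stat.S_IROTH | stat.S_IXOTH

# "chmod og-x": clear the execute bit for group and other, leaving user alone.
removed = mode & ~(stat.S_IXGRP | stat.S_IXOTH)

print(oct(set_other))   # 0o755
print(oct(removed))     # 0o744
```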

She bangs, she bangs, ooh baby, when she moves... ahem

Here we can see a shell program being executed:

Example of executing a shell script

The code is here:

hello.sh
#!/bin/bash
echo "Hello World!"

Note that the file has a line that says #!/bin/bash. This line tells the system that the file should be interpreted by the given executable. In this case, it is run through a bash shell. Everything that follows (just one line in this case) is made up of bash commands. The hash (#) is sometimes called "sharp" and the exclamation point (!) is often called "bang". The two-character sequence #!, then, is commonly known as a "shebang", written here as "sh-bang".

For Python, this is often #!/usr/bin/env python, or something similar (for example #!/usr/bin/env python3 on systems where the command is named python3). Rather than naming a specific Python interpreter, we ask env to find whichever python comes first in the current environment's path. This is useful because it is common to have multiple versions of Python available, in different locations. When we discuss virtualenv, you will see why this is so important.

The first character is a hash, so Python itself ignores the line because it is considered to be a comment. This means that on Windows, where the sh-bang sequence means nothing, it doesn't interfere.
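To show what the system reads from that first line, here is a simplified sketch of a sh-bang parser. The function name parse_shebang is made up for this example, and splitting on whitespace is a simplification of how real systems pass arguments to the interpreter.

```python
def parse_shebang(first_line):
    """Return the interpreter command as a list, or None if there is no sh-bang."""
    if not first_line.startswith("#!"):
        return None
    # Drop the "#!" marker, then split the rest into command and argument.
    return first_line[2:].strip().split()

print(parse_shebang("#!/bin/bash"))             # ['/bin/bash']
print(parse_shebang("#!/usr/bin/env python"))   # ['/usr/bin/env', 'python']
print(parse_shebang('echo "Hello World!"'))     # None
```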

Test Your Knowledge

Download the example file below on a Linux system:

some.py
a=6
print("".join([chr(ord(i)+a) for i in "B_ffi\x1aQilf^"])) 
Now make it executable just for you.

Use chmod u+x some.py, making sure you are in the same directory as the downloaded file.

It still won't run, and you will get an error like the one below if you try:

Error when running python from the shell

This is because the shell doesn't know you want it to be interpreted by Python and tries to interpret it as a shell script.

Use a sh-bang line to tell the shell to use the Python interpreter.

The first line should read #!/usr/bin/env python.

What output did you get?