Files and Directories
Python source code is saved in a text file. Python 2 assumed source files were ASCII unless the file declared another encoding, but Python 3 reads source as UTF-8 by default, so we can use non-ASCII characters even in variable names. This isn't recommended, though. ASCII will be encoded and displayed correctly on practically any system, while non-ASCII characters depend on every editor, terminal and font in the chain handling UTF-8 properly.
So, the difference between a text file (*.txt) and a Python file (*.py), as far as the operating system, file system, etc. care, is the sequence of characters after the dot. Naming files with a dot and a sequence to represent what they are for is common. On Unix-like systems, it is more convention than requirement. On Windows, it is how the system knows what to do with a file.
If you double-click a Python file on Windows, the OS examines the filename extension and tries to launch the interpreter associated with it to execute the file, provided you have Python installed and haven't changed the default associations.
On Linux, a visual file manager isn't part of the operating system. One is almost always bundled with your distribution, but it can be different on other systems. How each behaves can also be different. Often, double-clicking will attempt to make the same guesses as Windows, based on filename, but it may also consult other information, such as the first line of the file or permissions. More on those below.
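The filename-based guess described above can be sketched in a few lines of Python. This is only an illustration: the KNOWN_TYPES table and the guess_type function are made-up names, and real systems keep their own, much larger association tables.

```python
from pathlib import PurePosixPath

# Hypothetical association table, for illustration only.
KNOWN_TYPES = {
    ".py": "Python source",
    ".txt": "plain text",
    ".sh": "shell script",
}

def guess_type(filename):
    """Guess what a file is for from the characters after the last dot."""
    suffix = PurePosixPath(filename).suffix.lower()
    return KNOWN_TYPES.get(suffix, "unknown")

print(guess_type("/home/student/prg.py"))  # Python source
print(guess_type("notes.txt"))             # plain text
print(guess_type("README"))                # unknown (no dot, so no suffix)
```

Nothing about the file's contents is checked here, which is why a guess based on the name alone can be wrong.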
More commonly when working in Cybersecurity, we run programs from a shell. To do that, we put in the path to the file and the filename like this: /home/student/prg.py
That command would try to execute the file prg.py, in the directory student, which itself is in the directory home, which in turn is in the root of the filesystem (/).
If the file is in the current directory that our shell is working in, we still have to give the path. If we were in the /home/student directory, we could still type the command above. But it does seem long-winded. Instead, we can shorten it to ./prg.py. The . character, when part of a path, means "here". If we are the student user, then /home/student would be our "home directory". In that case, we could also use the special character ~ to represent home, and execute the program like this: ~/prg.py. This way, even if our shell is not working in the home directory, the command still works.
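To see that the three spellings above name the same file, here is a minimal sketch. The resolve function is a hypothetical helper written for this page, and it only understands POSIX-style paths. In real code, os.path.expanduser and os.path.abspath do this job using your actual home and current directories.

```python
import posixpath

def resolve(path_text, cwd, home):
    """Expand ~ and . the way the examples above describe (POSIX paths only)."""
    if path_text == "~" or path_text.startswith("~/"):
        path_text = home + path_text[1:]
    # join() keeps an absolute path_text as it is; normpath() drops "./" parts.
    return posixpath.normpath(posixpath.join(cwd, path_text))

examples = [
    ("/home/student/prg.py", "/tmp"),   # full path, shell working anywhere
    ("./prg.py", "/home/student"),      # "here", shell working in /home/student
    ("~/prg.py", "/tmp"),               # home shortcut, shell working anywhere
]
for text, cwd in examples:
    # Each line ends with /home/student/prg.py
    print(text, "->", resolve(text, cwd, home="/home/student"))
```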
If you've used the Windows command prompt (cmd), then this might seem like an unnecessary extra step. In cmd, if something executable is in the current directory, just typing its name will execute it. (PowerShell behaves more like Linux here and expects .\ in front of the name.)
The cause of the difference is the PATH environment variable. An equivalent exists on Windows and Linux. It is simply a list of places to look for an executable when we type in its name to the shell. On Linux, ls is usually an executable file, located at /bin/ls. But we don't ever type the path. On Windows, cmd behaves as though the path also contained ., which means whatever directory you are currently in is searched for executables that match commands you type. On Linux, we don't do this. You can make it do this, but it is severely frowned upon.
Why is it considered a bad idea to add . to the path?
If an attacker can place a file with execute permissions in a directory that you might visit, then they could name that file ls or something else you would use without thinking. If the current directory is in the path, then you could find yourself executing malicious code under your user account rather than getting a list of files!
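The lookup described above, and a check for the risky entries, can be sketched like this. The function names current_dir_entries and find_on_path are made up for this example, and the search is simplified compared with what a real shell does. shutil.which is the standard-library function that performs the real lookup.

```python
import os
import shutil

def current_dir_entries(path_value, sep=":"):
    """Return the path entries that stand for the current directory."""
    # On Unix-like systems an empty entry is treated like "." as well.
    return [entry for entry in path_value.split(sep) if entry in ("", ".")]

def find_on_path(command, path_value, sep=":"):
    """Return the first executable file called command on the path, or None."""
    for directory in path_value.split(sep):
        candidate = os.path.join(directory or ".", command)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None

path_value = os.environ.get("PATH", "")
print("ls found at:", find_on_path("ls", path_value, os.pathsep))
print("shutil.which says:", shutil.which("ls"))
print("entries meaning 'here':", current_dir_entries(path_value, os.pathsep))
```

If the last line prints anything other than an empty list, your shell will search whatever directory you happen to be in.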
Permissions
Linux doesn't just assume all files that end in .py should be executed with the Python interpreter. The file must also have the correct permission set.
Every file on a typical Linux filesystem has a series of permission flags. They show up in a directory listing if you use ls -l, like below:
The first column has entries that look like this: drwxr-xr-x
Each of the 10 characters in those sequences is either a dash (-) or a character giving permissions. The first character is typically either a dash (for a file) or d (for a directory).
Then we have three groups of three characters which are either rwx or that sequence with some or all of the characters replaced with a dash. In each one, if the r character is present, it means permission is granted to read the file. If the w character is present, permission is granted to write to the file (edit or overwrite it; deleting a file is controlled by write permission on the directory that contains it). If the x character is present, permission is granted to execute it. In the case of a directory, "execute" means to be able to enter it.
The three groups represent the permissions given to the user (owner), group and other (everyone else).
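As a way of checking your reading of these strings, here is a small sketch that splits one into its parts. The describe_mode function is a hypothetical helper, and it deliberately ignores the less common special flags that can appear in place of x. For a real file, stat.filemode(os.stat(path).st_mode) produces the same kind of string that ls -l shows.

```python
def describe_mode(mode_text):
    """Split a 10-character ls -l mode string into type and permissions."""
    if len(mode_text) != 10:
        raise ValueError("expected 10 characters, like drwxr-xr-x")
    kind = {"-": "file", "d": "directory"}.get(mode_text[0], "other")
    groups = {}
    for index, who in enumerate(("user", "group", "other")):
        triplet = mode_text[1 + 3 * index : 4 + 3 * index]
        groups[who] = {
            "read": triplet[0] == "r",
            "write": triplet[1] == "w",
            "execute": triplet[2] == "x",  # simplification: special flags ignored
        }
    return kind, groups

kind, groups = describe_mode("drwxr-xr-x")
print(kind)                      # directory
print(groups["user"]["write"])   # True
print(groups["group"]["write"])  # False
print(groups["other"]["execute"])  # True
```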
Why would you set your own permissions, as file owner, to be restricted? Try to think of good reasons before reading the answer.
The simple answer is safety. Using the shell is powerful and efficient, but you can also use it to make some powerful and efficient mistakes, such as executing something dangerous, deleting a file you need, and so on. Having control over permissions like this allows you to decide what can be changed, what can be executed, and so on.
Changing Permissions with chmod
The chmod command can be used to change these permissions. You can use it on files you own (the root user can use it on any file).
The simplest syntax is like this:
chmod u+x hello.sh
This says the user has the execute permission added for the file hello.sh. Here is an example of this being used. Note the output of ls before and after:
Here is another example:
Note that now, using = instead of + we are changing the whole group of rwx for others (o), not just adding. So no matter what it was before, it becomes what we say. We could set it to r-x with chmod o=rx hello.sh.
Finally, we can remove individual privileges like this:
Note the use of the - character to remove permissions, and also that you can put more than one specifier before the operator: og in this case, so we are affecting the group and other permissions, not the user permissions.
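The three operators (+, = and -) can also be tried from Python, which makes it easy to print the permissions after each step. This is a sketch that assumes a Unix-like system. The helper names (show, add_bits, set_other, remove_bits, demo) are made up here, the starting permissions are chosen arbitrarily, and og-w is just one possible removal picked for illustration.

```python
import os
import stat
import tempfile

def show(path):
    """Return the ls -l style permission string for path."""
    return stat.filemode(os.stat(path).st_mode)

def add_bits(path, bits):
    """Like chmod with +: switch the given bits on, leave the rest alone."""
    os.chmod(path, stat.S_IMODE(os.stat(path).st_mode) | bits)

def set_other(path, bits):
    """Like chmod o=...: replace the whole 'other' group of three."""
    current = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, (current & ~stat.S_IRWXO) | bits)

def remove_bits(path, bits):
    """Like chmod with -: switch the given bits off, leave the rest alone."""
    os.chmod(path, stat.S_IMODE(os.stat(path).st_mode) & ~bits)

def demo():
    """Run the three kinds of change on a temporary copy of hello.sh."""
    seen = []
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "hello.sh")
        with open(path, "w") as handle:
            handle.write('#!/bin/bash\necho "Hello World!"\n')
        os.chmod(path, 0o664)                           # known starting point
        seen.append(show(path))                         # -rw-rw-r--
        add_bits(path, stat.S_IXUSR)                    # chmod u+x hello.sh
        seen.append(show(path))                         # -rwxrw-r--
        set_other(path, stat.S_IROTH | stat.S_IXOTH)    # chmod o=rx hello.sh
        seen.append(show(path))                         # -rwxrw-r-x
        remove_bits(path, stat.S_IWGRP | stat.S_IWOTH)  # chmod og-w hello.sh
        seen.append(show(path))                         # -rwxr--r-x
    return seen

for line in demo():
    print(line)
```

Notice that the last step only changes the group column: others had no write permission to remove.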
She bangs, she bangs, ooh baby, when she moves... ahem
Here we can see a shell program being executed:
The code is here:
hello.sh
#!/bin/bash
echo "Hello World!"
Note that the first line of the file says #!/bin/bash. This line tells the system that it should be interpreted by the given executable. In this case, it is run through a bash shell. Everything that follows (just one line in this case) is made up of bash commands. Traditionally, the hash (#) is pronounced "shh" and the exclamation point (!) is pronounced "bang". The two-character sequence #!, then, is often pronounced as "sh-bang" (and usually written "shebang").
For Python, this is often #!/usr/bin/env python, or something similar, such as #!/usr/bin/env python3. Rather than name a specific Python interpreter, we are asking the env program to find whichever Python interpreter comes first on the current environment's path. This is useful because it is common to have multiple versions of Python available, in different locations. When we discuss virtualenv, you will see why this is so important.
The first character is a hash, so Python itself ignores the line because it is considered to be a comment. This means that on Windows, where the sh-bang sequence means nothing, it doesn't interfere.
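To make the idea concrete, here is a rough sketch of how a sh-bang line can be picked apart. The parse_shebang function is a hypothetical helper, not something the system exposes. It assumes the usual Linux behaviour of treating the first word as the interpreter and the rest of the line as a single argument.

```python
def parse_shebang(first_line):
    """Return [interpreter] or [interpreter, argument], or None if no #! line."""
    if not first_line.startswith("#!"):
        return None
    # Assumption: everything after the interpreter is one single argument.
    return first_line[2:].strip().split(None, 1)

print(parse_shebang("#!/bin/bash"))            # ['/bin/bash']
print(parse_shebang("#!/usr/bin/env python"))  # ['/usr/bin/env', 'python']
print(parse_shebang('echo "Hello World!"'))    # None
```

In the second case the system runs /usr/bin/env, and env in turn looks for python on the path.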
Test Your Knowledge
Download the example file below on a Linux system:
some.py
a=6
print("".join([chr(ord(i)+a) for i in "B_ffi\x1aQilf^"]))
Now make it executable just for you.
Use chmod u+x some.py, making sure you are in the same directory as the downloaded file.
It still won't run, and you will get an error like the one below if you try:
This is because the shell doesn't know you want it to be interpreted by Python and tries to interpret it as a shell script.
Use a sh-bang line to tell the shell to use the Python interpreter.
The first line should read #!/usr/bin/env python.
What output did you get?