HTTP - the basics
Recap Materials
You should know all of this, but I have added it for reference.
HTTP (Hypertext Transfer Protocol) provides the basis for communication across the web. Understanding HTTP, how data is transferred between devices, and how that data can be manipulated allows us to tamper with data passing between machines on the web and to influence the behaviour of remote devices.
What is HTTP?
HTTP is a text-based client-server communication protocol that defines how web servers and clients should behave in response to commands:
- Client sends a request to the server
- Server responds to this request
Example
Bob (client) would like to get a file from Alice (server):
- Bob asks for (requests) the file: 'do you have a copy of X?'
- Alice responds with the file: 'yes, here you go'
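The exchange between Bob and Alice can be sketched with Python's standard library. This is a minimal illustration rather than a real deployment: the handler name `AliceHandler`, the in-memory `FILES` table, and the path `/X` are all made up for the example, and the server only listens on the local machine.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory "files" that Alice (the server) can hand out.
FILES = {"/X": b"contents of X"}


class AliceHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = FILES.get(self.path)
        if body is None:
            # Alice does not have the file Bob asked for.
            self.send_error(404, "Not Found")
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def bob_asks_alice(path):
    """Start Alice on a free local port, send one request as Bob, return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), AliceHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
        conn.request("GET", path)      # Bob: "do you have a copy of X?"
        response = conn.getresponse()  # Alice: "yes, here you go"
        result = (response.status, response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
```

Calling `bob_asks_alice("/X")` returns status 200 with the file contents, while a path Alice does not know about comes back with status 404.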
When you enter a URL in a web browser, the URL is converted into an HTTP request and forwarded to the server for processing. Clients that connect to a server using HTTP are called user agents. While web browsers are the most common, many other clients communicate over HTTP; for example, Google's web indexing crawler, Internet of Things (IoT) devices, and anything else that makes use of web content.
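To make the URL-to-request step concrete, here is a rough sketch of the text a user agent might send for a given URL. The function name `build_get_request` and the `example-agent/0.1` User-Agent string are invented for illustration, and a real browser sends many more headers than this.

```python
from urllib.parse import urlsplit


def build_get_request(url, user_agent="example-agent/0.1"):
    """Return the raw text of a simple HTTP 1.1 GET request for url.

    Simplification: assumes the URL carries no username or password.
    """
    parts = urlsplit(url)
    # The path and query string go on the request line; an empty path means "/".
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    lines = [
        f"GET {target} HTTP/1.1",
        f"Host: {parts.netloc}",         # the host part of the URL moves into a header
        f"User-Agent: {user_agent}",     # identifies the client software
        "Connection: close",
        "",                              # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines)


print(build_get_request("http://example.com/index.html?page=2"))
```

Note that the scheme and host do not appear on the request line itself: the host is carried in the Host header, and the scheme only decides how (and on which port) the connection is made.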
Stateless Communication
One important feature to consider with HTTP is that it is a stateless protocol. This means that each request to a server is independent of any other requests that have been made.
This is important as it affects the way we deal with our content. If we wish to remember who a user is, then we need some kind of session management (more on this later). Because session management is layered on top of HTTP rather than built into it, it can often be exploited to make the server behave in a different way than expected.
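A small local experiment shows what statelessness means in practice. In this sketch (a toy, with a made-up `GreeterHandler` and a hypothetical `user` cookie), the server knows only what each individual request tells it, so a request that omits the cookie is treated as coming from a stranger even though the same client identified itself a moment earlier.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class GreeterHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # No memory of earlier requests: the only identity information
        # available is whatever this request carries in its Cookie header.
        cookie = self.headers.get("Cookie", "")
        name = cookie[len("user="):] if cookie.startswith("user=") else "stranger"
        body = f"hello {name}".encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_greeting(port, cookie=None):
    """Send one GET request, optionally with a Cookie header, and return the body."""
    conn = http.client.HTTPConnection("127.0.0.1", port)
    headers = {"Cookie": cookie} if cookie else {}
    conn.request("GET", "/", headers=headers)
    body = conn.getresponse().read().decode()
    conn.close()
    return body


def demo():
    """Run three independent requests against a local server and return the replies."""
    server = HTTPServer(("127.0.0.1", 0), GreeterHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_port
        return [
            fetch_greeting(port, cookie="user=bob"),
            fetch_greeting(port),                      # same client, no cookie
            fetch_greeting(port, cookie="user=alice"),  # client simply claims a new name
        ]
    finally:
        server.shutdown()
        server.server_close()
```

The third request hints at why this matters for security: the toy server believes whatever name the client supplies, so anything that identifies a user has to be something the client cannot simply make up.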
Ports and Versions
While you may still find servers operating HTTP 1.0, the most common version is HTTP 1.1. This adds support for persistent connections (not to be confused with stateless communication). Where each HTTP 1.0 request by default required a new connection to the server (i.e. open socket, make request, close socket), later versions of HTTP allow multiple transactions to be carried over a single connection. This reduces latency by removing the overhead of repeatedly opening sockets.
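The difference between reusing a connection and opening a new one per request can be observed with a local test server. The sketch below is an assumption-laden illustration: `CountingHandler` and `count_connections` are invented names, and the server counts how many TCP connections it accepts while the client sends the same number of requests either over one connection or over a fresh connection each time.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets http.server keep connections open
    connections = 0                # TCP connections accepted so far

    def setup(self):
        # setup() runs once per accepted connection, not once per request.
        CountingHandler.connections += 1
        super().setup()

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        # A Content-Length is needed so the client knows where the body ends
        # without the server having to close the connection.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def count_connections(requests, reuse):
    """Send `requests` GETs and return how many connections the server accepted."""
    CountingHandler.connections = 0
    server = HTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = None
        for _ in range(requests):
            if conn is None or not reuse:
                if conn is not None:
                    conn.close()
                conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
            conn.request("GET", "/")
            conn.getresponse().read()  # drain the body before reusing the socket
        if conn is not None:
            conn.close()
        return CountingHandler.connections
    finally:
        server.shutdown()
        server.server_close()
```

With `reuse=True` the three requests share a single connection; with `reuse=False` each request pays for its own socket setup, which is the HTTP 1.0 style behaviour described above.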
HTTP is currently at version 3 (HTTP/3). However, the main differences between 1.1 and versions 2 and 3 are optimisations to the way communications are handled, so for the purposes of web hacking these later versions make little difference.
By default, HTTP operates on port 80. Other common ports for HTTP services include 8080 and 8000; services discovered on these ports during the recon phase are worth investigating. HTTPS (secure HTTP) operates on port 443 and sometimes on 8443.
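As a small illustration of how the default ports apply, the following sketch works out which port a client would connect to for a given URL. The helper `effective_port` and the `DEFAULT_PORTS` table are names made up for this example; only the port numbers themselves come from the text above.

```python
from urllib.parse import urlsplit

# Default ports used when the URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url):
    """Return the TCP port a client would connect to for this URL."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port                 # explicit port, e.g. :8080 or :8443
    return DEFAULT_PORTS[parts.scheme]    # fall back to the scheme's default


for url in (
    "http://example.com/",
    "https://example.com/",
    "http://example.com:8080/admin",
    "https://example.com:8443/",
):
    print(url, "->", effective_port(url))
```

An explicit port in the URL always wins, which is why a service found on 8080 or 8443 has to be requested with the port written out.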