Remembering State in HTTP
By default HTTP is stateless (it has no memory of what has happened before). This means that every interaction with the server is treated as an individual, new occurrence. This can be a good thing: it greatly simplifies server design and reduces resource requirements, and it is also hard or impossible to get a stateless service stuck in a bad state.
However, the stateless design is not without problems: as each interaction is treated as a separate occurrence, the server has no way of knowing whether two (or more) requests came from the same browser.
Sessions
Sessions are one mechanism that allows some form of state to be retained between connections to the server. In the web context, a session is a data structure that can be used to store temporary data about how the user is interacting with the application. For example, you could store a user's name (or unique ID) in the session, so you don't have to query the database each time you need it.
Sessions on the Server Side
While the exact mechanism used to maintain session data will differ between applications (for example using temporary files, databases, or internal memory) the core concept remains the same.
I like to think of sessions as a simple hash table data structure (or Python dictionary). Each session is assigned a unique session ID, which is then used as a key to retrieve any other data that is stored about the session.
Sticking with the Python dictionary analogy, session data could look like this:
# Key     : Data
"4242aab" : { "userId": 42,
              "userName": "Dan",
              "permissions": "Admin"
            },
"4242aac" : { "userId": 24,
              "userName": "James",
              "permissions": "user"
            }
So each user has a unique session ID that relates to a record in the session object. Each user should only be able to access their own information, and there will be a row in the data structure for each user connected to the server.
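As a minimal sketch of this idea, a server-side session store could be modelled in Python as a dictionary keyed by a randomly generated session ID. The names `sessions`, `create_session` and `get_session` are illustrative assumptions, not part of any particular framework.

```python
import secrets

# In-memory session store: session ID -> data held about that session.
# A real application might use temporary files or a database instead.
sessions = {}


def create_session(user_id, user_name, permissions):
    """Create a session record and return its newly generated session ID."""
    session_id = secrets.token_hex(16)  # 32 hex characters, hard to guess
    sessions[session_id] = {
        "userId": user_id,
        "userName": user_name,
        "permissions": permissions,
    }
    return session_id


def get_session(session_id):
    """Return the data stored for a session ID, or None if it is unknown."""
    return sessions.get(session_id)


dan_id = create_session(42, "Dan", "Admin")
james_id = create_session(24, "James", "user")
print(get_session(dan_id)["userName"])   # Dan
print(get_session("not-a-real-id"))      # None
```

Using a long random value as the key, rather than something predictable such as the user ID, is what makes it hard for one user to guess the key of another user's record.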
We also need a way of linking a particular request to a user. There are several ways of maintaining this information; each usually relies on passing some token to the browser, which is then resubmitted with any requests made. The token usually contains the session ID and can therefore be used to identify who is making the request. How the token is stored may differ between applications (for example using web storage or IndexedDB), but it is generally a cookie.
Session Timeline
- The user visits a web page and logs in to the site. The server validates the user's credentials and creates a new session object containing the authorised user's ID. The session ID for this object is returned to the client.
- The client stores the session id (usually in a cookie), and passes it to the server with any future requests.
- When the server receives a request with a session id attached, it looks it up in the session datastore and makes the information stored available for any further processing that takes place.
- When the client logs out, the information in the session datastore is removed.
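The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `USERS` table, the plain-text password and the function names are hypothetical, and a real site would check hashed passwords held in a database.

```python
import secrets

# Hypothetical credential store, for illustration only.
USERS = {"dan": {"password": "correct horse", "userId": 42}}

sessions = {}  # session ID -> session data


def login(username, password):
    """Step 1: validate credentials and create a session, returning its ID."""
    user = USERS.get(username)
    if user is None or user["password"] != password:
        return None
    session_id = secrets.token_hex(16)
    sessions[session_id] = {"userId": user["userId"]}
    return session_id


def handle_request(session_id):
    """Step 3: look up the session ID sent by the client with a request."""
    session = sessions.get(session_id)
    if session is None:
        return "Not logged in"
    return f"Hello user {session['userId']}"


def logout(session_id):
    """Step 4: remove the session data when the client logs out."""
    sessions.pop(session_id, None)


sid = login("dan", "correct horse")  # step 2: the client stores this ID
print(handle_request(sid))           # Hello user 42
logout(sid)
print(handle_request(sid))           # Not logged in
```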
Client Side Sessions
Some web frameworks make use of client side sessions. Here the data associated with the session is stored on the client's machine and forwarded with each request as part of a cookie.
This can have some advantages with distributed systems, as we don't need to work out how to transfer state between different servers in our cluster. However, there is an obvious problem: as the session data is client side, we need some way of ensuring the data has not been tampered with. This usually involves some sort of keyed hash or signature that is validated each time a request is made. We will see an example of this in the Lab.
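One common way to detect tampering is to attach an HMAC computed with a secret that only the server knows. The sketch below is an assumption about how such a scheme could look, not the mechanism of any specific framework; `SECRET_KEY`, `sign_session` and `verify_session` are hypothetical names.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical server-side secret; it must never be sent to the client.
SECRET_KEY = b"change-me"


def sign_session(data):
    """Serialise session data and append an HMAC so changes can be detected."""
    payload = base64.urlsafe_b64encode(json.dumps(data).encode("utf-8"))
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode("ascii") + "." + signature


def verify_session(cookie_value):
    """Return the session data if the signature matches, otherwise None."""
    payload, _, signature = cookie_value.rpartition(".")
    expected = hmac.new(
        SECRET_KEY, payload.encode("utf-8"), hashlib.sha256
    ).hexdigest()
    # Compare as bytes, in constant time.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


cookie = sign_session({"userId": 24, "permissions": "user"})
print(verify_session(cookie))  # {'userId': 24, 'permissions': 'user'}

# A client that swaps in a different payload cannot produce a matching signature.
forged_payload = base64.urlsafe_b64encode(
    b'{"userId": 24, "permissions": "Admin"}'
).decode("ascii")
forged = forged_payload + "." + cookie.rpartition(".")[2]
print(verify_session(forged))  # None
```

Note that in this sketch the data is only signed, not encrypted, so the client can still read it even though it cannot change it undetected.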
Cookies
The next problem we need to overcome is how to send the unique session ID between the client and server.
Cookies are one approach to token management. Cookies can be stored on the client's machine by the server, then passed back to the server when requests are made. This allows the server to track each session through the cookie data.
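To make the round trip concrete, here is a small sketch using Python's standard `http.cookies` module. The cookie name `sessionid` and the example values are assumptions chosen for illustration.

```python
from http.cookies import SimpleCookie

# Server side: build the Set-Cookie header sent with the response.
response_cookie = SimpleCookie()
response_cookie["sessionid"] = "4242aab"
print(response_cookie.output())  # Set-Cookie: sessionid=4242aab

# Client side: the browser sends the value back in a Cookie request header.
request_header = "sessionid=4242aab; theme=dark"

# Server side: parse the incoming header to recover the session ID.
request_cookie = SimpleCookie()
request_cookie.load(request_header)
session_id = request_cookie["sessionid"].value
print(session_id)                # 4242aab
```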
Note
There are other ways of maintaining state. These take advantage of the way requests work:
- Modify the URL and append the token (it becomes a parameter of the GET request)
- Store state in 'hidden' form fields
Both work, but can be error prone (as they rely on the person implementing them to do a good job, rather than 'well developed' session management libraries).
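As a rough sketch of the URL approach, the token can be appended as a query parameter and read back when the request arrives. The parameter name `sessionid` and the helper names are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def add_token(url, session_id):
    """Append the session ID to a URL as a query parameter."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    query["sessionid"] = [session_id]
    return urlunsplit(parts._replace(query=urlencode(query, doseq=True)))


def read_token(url):
    """Extract the session ID from a requested URL, or None if it is absent."""
    values = parse_qs(urlsplit(url).query).get("sessionid")
    return values[0] if values else None


link = add_token("http://example.com/basket?page=2", "4242aab")
print(link)              # http://example.com/basket?page=2&sessionid=4242aab
print(read_token(link))  # 4242aab
```

Every link the application generates has to be passed through something like `add_token`; missing one loses the session, which is part of why this approach is error prone.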
Cookies and Sessions
Sessions are a common way to store per-user data during a set of interactions with a server. This allows us to work around the stateless nature of HTTP.
Sessions are created by the PHP page, and can have variables set or unset during the lifetime of the session.
PHP (and other systems) keep track of sessions using cookies. A unique identifier is stored in the browser as a cookie and transmitted as part of all requests. When a request is made, the unique identifier in the session cookie is mapped to the relevant set of user parameters stored in the server's memory.
Note
Note that sessions should only last for a single browsing session (for example, until the browser is closed, or the cookie expires).
To store data permanently we need to use databases.
Important
While more permanent session cookies can seem like a good idea, as they save a user from having to log in each time they visit a site, it is worth setting an expiry date (to get the user to periodically log back in). We will see what can be done with a session cookie in the session on XSS.
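The difference between a cookie that lasts for the browsing session and one with an explicit expiry shows up in the `Set-Cookie` header. The sketch below uses Python's `http.cookies` module purely to print the headers; the one hour lifetime is an arbitrary, illustrative value.

```python
from http.cookies import SimpleCookie

SESSION_ID = "c18b3494fe3227a1b6a18d6652db6ad5"

# Session cookie: no expiry attribute, so it lasts for the browsing session.
session_cookie = SimpleCookie()
session_cookie["PHPSESSID"] = SESSION_ID
session_header = session_cookie.output()
print(session_header)
# Set-Cookie: PHPSESSID=c18b3494fe3227a1b6a18d6652db6ad5

# Persistent cookie: the browser keeps it for at most 3600 seconds.
persistent_cookie = SimpleCookie()
persistent_cookie["PHPSESSID"] = SESSION_ID
persistent_cookie["PHPSESSID"]["max-age"] = 3600
persistent_header = persistent_cookie.output()
print(persistent_header)
# Set-Cookie: PHPSESSID=c18b3494fe3227a1b6a18d6652db6ad5; Max-Age=3600
```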
In PHP the session ID is stored in the PHPSESSID cookie.
Like any other cookie we can inspect the value in the browser:
document.cookie
"PHPSESSID=c18b3494fe3227a1b6a18d6652db6ad5"
Session Hijacking
While we can't directly modify the session parameters stored on the server (unless we can control the page where these settings are made), it may be possible to take control of another user's session. If we can snarf the ID token from the cookie (through XSS, sniffing, etc.) then we can set our own session ID cookie to use this value.
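The reason this works is that the server identifies the requester purely by the session ID presented. The following toy sketch (the store contents and the `who_am_i` helper are hypothetical) shows that a request carrying a stolen ID is indistinguishable from the real user's request.

```python
# Server-side store with one logged-in user (values are illustrative).
sessions = {"c18b3494fe3227a1b6a18d6652db6ad5": {"userId": 42, "userName": "Dan"}}


def who_am_i(cookie_header):
    """Identify the requester purely from the PHPSESSID value they present."""
    for part in cookie_header.split(";"):
        name, _, value = part.strip().partition("=")
        if name == "PHPSESSID" and value in sessions:
            return sessions[value]["userName"]
    return None


victim_request = "PHPSESSID=c18b3494fe3227a1b6a18d6652db6ad5"
attacker_request = "PHPSESSID=c18b3494fe3227a1b6a18d6652db6ad5"  # stolen value

print(who_am_i(victim_request))     # Dan
print(who_am_i(attacker_request))   # Dan
print(who_am_i("PHPSESSID=guess"))  # None
```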
Note
While the examples here are for PHP, the principle is the same for other common web technologies.
For example, ASP.NET stores its session IDs in the ASP.NET_SessionId cookie.
In summary
HTTP is a stateless protocol. However, for the modern, interactive web we need some way of remembering who is using the site. In this step we discussed HTTP sessions, and how they can be used to keep track of a user's interactions with a server.
PHP and other dynamic languages make use of sessions to identify unique users. The session ID is stored in a cookie, and transmitted to the server with each request. This opens the possibility of a security flaw: if we can obtain another user's session token, the server will identify us as that user. We will discuss session hijacking in more detail in the XSS tasks.