Building Web sites and services.
Recap
This should be a recap of the material you were taught for web development. Even if you know it, it's probably best to have a read; it has been six months since you had to use it.
Before we discuss the protocols used to send data on the web, we will have a quick recap of the way web services[1] are built.
Being able to read and understand the structure of a web service is useful when performing a pen-test. It helps us understand the structure of a web site, and what data is sent to the server (and how). The HTML markup can help us understand the elements on the page, and can give us some insight into how any backend services may function.
Additionally, modern web development makes heavy use of client-side functionality to build sites and add features, so an appreciation of how JavaScript works helps us understand the requests made to build the pages.
Important
There is a world of difference between being able to "Read" and "Write" languages. In this module you won't really be expected to be able to write HTML or JavaScript. However, being able to read the source and understand the structure and logic of a site is extremely useful when it comes to testing web security[2].
Use the Source Luke.
The first thing we need is a way to view the source code for a web-page. This will let us see the raw markup for the page, and will stop the browser from rendering any content. This means that we can see any comments, or "hidden" elements that are used to build a page.
We have a couple of ways to view the source:
View Source
Our first option is the View source function. We can access this in most browsers by right clicking and selecting the appropriate menu option, or using the Ctrl+U shortcut.
"View source" will open a copy of the source code the browser has received in a new window, and is useful for browsing the full contents, or searching the contents of a page.
Important
View source will only show us what the browser has received in the HTTP response. This means that for sites built using client-side technologies, like React, we only get a skeleton page containing the scripts used to build the rest of the page.
In this case, we will need to rely on the Inspect Element tool.
You can see this in action by viewing the source for Aula. Here we just get a whole bunch of scripts in the header, and a rather interesting "Load Youtube so the editor works" as the main body.
Inspector Tool
The Inspector Tool (Firefox / Chrome, no idea what it is called in Safari) gives us a "real time" view of the source code. This lets us check the generated source for a page, and will include anything that has been added using client-side technologies.
We can access the inspector tool for part of the page by right-clicking and hitting Inspect. (Shortcut Ctrl+Shift+C)
As well as viewing and searching the source, a nice feature of the inspector tool is that it allows us to make changes to the markup for the page. This can be really useful when developing sites, as it lets us change CSS values, or play with the way things are rendered.
Additionally, when it comes to pen-testing websites it can act like a poor man's Burp Suite, letting us modify form field types, or other parameters controlled client side, before sending the modified request back to the server.
Task
There are two flags in this page's source code.
Both flags are "hidden" within comments. The flag format is 5067{<some text>} (for example 5067{exampleFlag}).
- First flag is below this list
- Second flag is hidden elsewhere on the page.
HTML
HTML is a markup language designed to provide structure and meaning to elements in a document. For example, is a given block of text a header, a paragraph, or part of a list? This helps provide meaning to the text, so the browser knows how to display it correctly. It's a pretty simple[3] language to read and understand (although I find writing it to be much harder).
You should have covered the basics of HTML in the first year, so we are not going into too much detail here. While we don't need to worry about the bulk of the HTML syntax, there are a few things that it's worth being mindful of when examining a site. If you want to go deeper into HTML syntax, there are some suggested pages in the Further Reading.
Comments
As a developer, comments are important because they let us keep a note of the page logic or the way we were thinking when we were building the page. This can make working with other people on large projects easier, as the source contains the explanation, and is even more useful when we come back to a project after some time[4].
As comments are ignored by the browser, they will not be shown to the end user. However, they will still be visible if we view the source code.
In HTML, comments are represented by wrapping the commented text in <!-- and --> markers.
Example
<!-- This text is commented -->
This text is not commented
This text is not commented
Comments can be a double-edged sword. While explaining the logic of the code is great for maintenance, the comments can still be viewed as part of a security audit, giving the attacker some insight into the logic on the page.
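To see how little effort this takes, here is a minimal sketch of pulling the comments out of a page's source with JavaScript. The function name extractComments and the sample page are made up for illustration, and the regular expression is a quick audit aid rather than a full HTML parser.

```javascript
// Hypothetical helper: pull every HTML comment out of a page's raw source.
// A regular expression is enough for a quick look, although it can be
// fooled by comment-like text inside scripts or attribute values.
function extractComments(html) {
  const pattern = /<!--([\s\S]*?)-->/g;
  return Array.from(html.matchAll(pattern), (match) => match[1].trim());
}

// Illustrative page source (invented for this example).
const pageSource = `
<h1>Login</h1>
<!-- TODO: remove the test account before go-live -->
<p>Please sign in.</p>
<!-- 5067{exampleFlag} -->
`;

// Lists the TODO note and the example flag, neither of which the
// browser would ever render on the page.
console.log(extractComments(pageSource));
```

The same idea works on source saved from View Source, which makes it a quick first pass before reading a large page by hand.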
HTML Tags give Structure
As a markup language, HTML gives us a way to add structure to a document. Placing content within tags tells the browser that some formatting, or other logic, needs to be applied to it.
Opening tags consist of angle brackets <> containing the tag type, for example <strong>.
The majority of tags are then closed using closing tags, which have a slash added after the opening bracket, for example </strong>.
Be mindful that some HTML elements, such as images (<img>), are "empty" and do not have a closing tag.
Example
<!-- Bold Text -->
<strong> Bold Text </strong>
<!-- A List -->
<ul>
<li>Item 1</li>
<li>Item 2</li>
</ul>
Bold Text
- Item 1
- Item 2
Tag Attributes
Elements can also have attributes. These represent extra information that is used when processing the element. For example, a link element will have an href attribute that represents the URI the link directs you to.
Attributes are represented in the opening tag using <key>=<value> pairs.
Attribute Examples
<a href="http://www.example.org">example.org</a>
There are a lot of different attributes, some are more interesting to us than others.
For example:
- In forms, the disabled attribute can be used to stop a user entering a value in an element. As we can remove this attribute using the Inspector tool, we are still able to enter data.
- There are various on attributes (onload, onerror etc.) that tell an element to run JavaScript when a given condition is met. These can help point us to interesting functions called by the page, and can be useful for XSS.
We are going to cover some useful attributes in our session on XSS. In the meantime, you can find more details on attributes in Further Reading.
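As a rough illustration of hunting for the on attributes mentioned above, the sketch below scans a chunk of markup for event handlers and reports the JavaScript each one would run. The helper name findEventHandlers, the sample markup, and the reportBroken function it mentions are all hypothetical, and the regular expression is an approximation that only handles quoted attribute values.

```javascript
// Hypothetical helper: list the on* event-handler attributes in some markup.
// This is a rough regular-expression scan, not a real HTML parser, so treat
// the results as pointers for manual review.
function findEventHandlers(html) {
  const pattern = /\s(on[a-z]+)\s*=\s*(?:"([^"]*)"|'([^']*)')/gi;
  return Array.from(html.matchAll(pattern), (match) => ({
    attribute: match[1].toLowerCase(),
    code: match[2] !== undefined ? match[2] : match[3],
  }));
}

// Illustrative markup (invented for this example).
const sampleMarkup = `
<button class="evilButton" onclick="showAlert()">Click Me</button>
<img src="missing.png" onerror="reportBroken(this)">
`;

for (const handler of findEventHandlers(sampleMarkup)) {
  console.log(`${handler.attribute} -> ${handler.code}`);
}
```

Each function name that turns up this way is a candidate for a closer look in the page's scripts.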
Forms
Forms are a common way to allow a user to send data to the server.
Understanding what the form is sending, and where it is being sent, is therefore useful when trying to understand the application logic.
A form element can be defined with the following attributes:
- method: HTTP method to use to send the form data
- action: URI to send the form data to
We will look at how the method affects the way data is transferred in the next article.
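As a rough preview of why these two attributes matter, the sketch below mimics what a browser does when a form is submitted, assuming the default application/x-www-form-urlencoded encoding. The function name describeFormRequest, the example.org URIs, and the field names are all made up for illustration.

```javascript
// Sketch of how a form's method and action shape the outgoing request,
// assuming the default encoding (application/x-www-form-urlencoded).
function describeFormRequest(method, action, fields) {
  const encoded = new URLSearchParams(fields).toString();
  const verb = method.toUpperCase();
  if (verb === "GET") {
    // GET: the fields are appended to the action URI as a query string.
    const url = new URL(action);
    url.search = encoded;
    return { method: verb, url: url.toString(), body: null };
  }
  // POST: the URI is unchanged and the fields travel in the request body.
  return { method: verb, url: action, body: encoded };
}

// Hypothetical forms, for illustration only.
const searchRequest = describeFormRequest(
  "get", "http://www.example.org/search", { q: "web security" });
const loginRequest = describeFormRequest(
  "post", "http://www.example.org/login", { user: "alice", pass: "s3cret" });

console.log(searchRequest.url);  // http://www.example.org/search?q=web+security
console.log(loginRequest.body);  // user=alice&pass=s3cret
```

Reading the method and action from the page source therefore tells us both where to look for the data (URI or body) and which endpoint is handling it.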
Important
HTML5 has built-in form validation, as do various client-side libraries. This means that the browser will check that the data the user has input matches what is expected (for example, that an email address field contains an email address).
This can be really useful, as it can save the user having to send data to the server, just to have the request fail.
However, as the checks are client side, it is easy enough to manipulate the form types, or the data being sent. As an attacker, remember that client-side validation is useless. As a developer, make sure to check the data on the server side as well.
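As an illustration of checking on the server side, here is a minimal sketch of a validation function that could run in a JavaScript backend. The field names (email, age), the age limits, and the deliberately simple email pattern are assumptions for this example, not a recommended production rule set.

```javascript
// Hypothetical server-side check: never trust that the browser validated
// anything, and re-check every field when the request arrives.
function validateSignup(fields) {
  const errors = [];

  // Treat a missing or non-string email as empty so the pattern test fails.
  const email = typeof fields.email === "string" ? fields.email.trim() : "";
  if (!/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email)) {
    errors.push("email: not a valid address");
  }

  // Form values arrive as strings, so convert before range checking.
  const age = Number(fields.age);
  if (!Number.isInteger(age) || age < 18 || age > 120) {
    errors.push("age: must be a whole number between 18 and 120");
  }

  return errors;
}

// A well-formed submission, then one a tester might send after editing
// the form in the Inspector.
console.log(validateSignup({ email: "alice@example.org", age: "30" }));
console.log(validateSignup({ email: "not-an-email", age: "-1" }));
```

The first call returns an empty list and the second returns two errors, regardless of what the form in the browser allowed.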
JavaScript
JavaScript[6] was designed to allow a developer to run client-side code in the user's browser. It is now used for both front-end and backend services, the latter through JavaScript server technologies like Node.js[5].
Common uses for JavaScript include validating and displaying data, making amazing popups appear, and adding extra functionality like showing and hiding toolbars. Modern Web Services™, such as Aula, may also make use of JavaScript to build the whole page programmatically. We are going to take a closer look at using some JS payloads when we discuss XSS.
JavaScript code can be included in a program in one of two ways:
- Inline code is generally used for short snippets of code, with the JavaScript program itself included between <script> tags.
- Source code can also be loaded from another file using the src attribute of a script tag.
Like Python, when the program is loaded, anything at the base level of the file (i.e. outside of a function) will be executed. This means we can have a script run on page load. However, developers will usually use functions to break the script into logical blocks of functionality. These functions can then be attached to HTML elements, or the page itself, using attributes.
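The difference between base-level code and code inside a function can be sketched as follows. The events list and the showGreeting function are invented for this illustration, and the final call stands in for a user triggering an attribute such as onclick.

```javascript
// Record the order in which things happen.
const events = [];

// Base-level statement: runs as soon as the script is loaded.
events.push("base level: ran on load");

// Function body: defined on load, but not executed yet.
function showGreeting() {
  events.push("showGreeting: ran when called");
}

// Still base level, and showGreeting has not run at this point.
events.push("base level: function defined but not called");

// Stand-in for the user clicking an element with onclick="showGreeting()".
showGreeting();

console.log(events);
```

When auditing a script, the base-level statements tell us what happens on every page load, while the functions tell us what can be triggered later.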
Example
In this example, we have an inline script containing a function that pops an alert box.
We have attached the function to a button using its onclick attribute.
<button class="evilButton" onclick="showAlert()">Click Me</button>
<script>
function showAlert(){
alert("Hello")
}
</script>
Like HTML, the key thing to remember with JavaScript is that it is a client-side technology. This means that we have full control over how it works, and can bypass the functionality. For example, we could remove JS-based form validation so we can submit input other than what is expected.
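A minimal sketch of that kind of bypass is shown below. The names isValidQuantity and buildOrder, and the rule that a quantity must be between 1 and 10, are hypothetical stand-ins for whatever a real page defines. The reassignment in the middle is the sort of line a tester could type into the browser console.

```javascript
// Stand-in for a validation function defined by the page's own script.
let isValidQuantity = function (value) {
  return Number.isInteger(value) && value >= 1 && value <= 10;
};

// Stand-in for the page's submit handler: it only builds the request
// data if the client-side check passes.
function buildOrder(quantity) {
  if (!isValidQuantity(quantity)) {
    return null; // the page would show an error and send nothing
  }
  return { item: "widget", quantity: quantity };
}

// With the original check in place, a negative quantity is rejected.
const blockedOrder = buildOrder(-5);

// Replace the check with one that accepts everything.
isValidQuantity = function () {
  return true;
};

// The same input now passes and would be sent to the server.
const forcedOrder = buildOrder(-5);

console.log(blockedOrder, forcedOrder);
```

Nothing on the server changed here, which is why the server has to repeat any check it cares about.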
Summary
In this article we have had a quick recap of the elements that make up a web page. Understanding the structure and logical flow of a site can be useful during a security audit, as it can let you identify areas that could be exploited.
While we may not need to write correct HTML and JavaScript for a security audit, understanding how they are used to structure a page will give us a better appreciation of the application flow.
Both of these technologies are also client side, meaning we have some control over how they behave. If a developer is relying on client-side technologies for input validation, or for ensuring the correct type of data is sent to the server, bypassing that functionality may allow us to exploit the server.
Further Reading
You don't need to read these, but I find them useful for reference.
- Mozilla Developer Network has some great documents on web development. MDN HTML
- Details of Elements in the HTML5 Specification
- Element Attribute and Event Reference from the HTML5 Specification
- A Re-Introduction to JavaScript
- [1] "Web Service" or "Web Site"? I tend to think of the site as a traditional web page, and a service as something used to get data. However, text is just data, so let's use web service to describe a generic page served over the net.
- [2] The same could be said for any programming, and pen-testing. While you don't need to be fluent in <insert language here>, being able to work out how it shapes the flow of a program's logic is incredibly useful.
- [3] Remember, simple does not necessarily mean easy.
- [4] As "Good" developers I obviously don't need to tell you this.
- [5] Personally, I am still not sold on the benefits of using JavaScript server side, especially when there are other languages that seem a much better fit (like Python). But it's a big thing, so I must be wrong.
- [6] Technically it's ECMAScript, but the name JavaScript seems to have stuck.