
Finding "Hidden" pages

Last week we focused on mapping the "visible" content of a site.
However, it is common for web applications to have content and functionality that is not reachable from the main navigation. This could be features used for debugging and testing, or old content that has not been removed. A site's authorisation levels may also conceal functionality from us. For example, an admin interface will be shown in the navigation menus for admin users, but hidden from others. However, if we can discover the page, we may be able to make use of that functionality.

Other examples of things we might be able to find include:

  • Temporary / backup files: editors such as Emacs create temporary files while you are working. These may help you view the source of interpreted files.
  • Functionality that has been hidden from the user.
  • Backup archives, for example site or database backups.
  • Configuration files or directories

In the rest of this article we will look at some techniques and tools for enumerating the site with brute force.

Important

This kind of scanning can (and possibly will) be counted as an attack on an organisation. Do NOT do it unless you have permission to scan.

The University picks up these types of scans and will block you (for at least 24 hours); this may affect your access to other university services. The hacking lab should be fine, as we host the services on the internal network, which is separate from the university one.

I will make this, and any workarounds for the problem, clear if we are doing any scanning on remote systems hosted on University infrastructure (for any of you who are 100% remote).

Inferring from Existing Content

One way we might find hidden data is by inferring its existence from existing content.

For example, we might have a Reports section of the site, containing a list of published reports.

  • ImportantReport2019
  • ImportantReport2020

We might be able to find unlisted items by changing the year to the current one.
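We can script this kind of probe directly. Below is a minimal sketch (the host and the /reports/ path are assumptions for illustration) that loops over candidate years with curl and reports the status code for each:

$for year in 2019 2020 2021 2022; do
    echo -n "ImportantReport${year}: "
    # Print only the HTTP status code for each candidate report
    curl -s -o /dev/null -w '%{http_code}\n' "http://www.example.org/reports/ImportantReport${year}"
done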

Brute Forcing Web Content

One way of finding this hidden content is to collate a list of common directories used in URLs, and perform a brute-force attack using it. This technique is often known as "Directory Busting" / "Dirbusting".

There are several tools for doing this, including Nikto, DirBuster, Gobuster, FFUF or even nmap. The tool you use is largely a matter of personal preference[2], as each has its own set of features.

However, for each of the tools the basic premise is simple (a minimal curl sketch follows the list):

  1. Collect a list of URLs for common directories
  2. For each item in the list:
    1. Make a request to the server for the page
    2. Check the status code / data returned
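Here is a minimal sketch of that loop, using curl against the local example site scanned later in this article. Real tools add threading, filtering and retries on top of this idea:

$while read -r word; do
    # Request each candidate path and record just the status code
    code=$(curl -s -o /dev/null -w '%{http_code}' "http://127.0.0.1/${word}")
    # Report anything that is not a plain 404
    [ "$code" != "404" ] && echo "/${word} -> ${code}"
done < /usr/share/wordlists/dirb/common.txt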

Response Codes

We usually use the response status codes returned while scanning to determine whether a page exists.

  • 2xx Success
    • 200 OK -- the page was returned without error
  • 3xx Redirection
    • 301 / 302 Redirect -- the page has moved; the client is told to follow the redirect to the new address
  • 4xx Client Error
    • 400 Bad Request -- the server could not make sense of the request
    • 401 Unauthorised -- there has been some issue authenticating with the server
    • 403 Forbidden -- the server actively rejects the request
    • 404 Not Found -- no such page

For a scan, we can interpret the codes as follows:

  Code  Meaning for the scan
  200   The page exists on the server.
  3xx   The page redirects the user elsewhere (usually to something like a login page). This can indicate functionality that requires authorisation / authentication.
  400   Can indicate that the URL uses a custom naming scheme; it may also indicate a problem with the wordlist.
  401   The page exists but requires authentication.
  403   The page exists but requires authorisation. This can also indicate a directory, so we may want to scan recursively.
  404   No page at this URL.

Non-Standard Return Codes

Some applications make use of non-standard return codes. For example, an application may return 200 (OK) but then display an error message, rather than returning a 404 (Not Found). This often happens in single-page applications, where requesting missing data takes you to a standard "home" page. This can make your job much harder, as it's not easy to see whether a page has data or is an error message.

One way of dealing with this is to look at the response sizes. As the "missing" pages all go to the same default page, they will all be the same size. Having identified one piece of content that is missing, you can rule all[1] of the other pages with this response size out of the scan.
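FFUF makes this easy with its size filter. For example, assuming we have identified that the "missing" pages come back as 4242 bytes (a made-up value; substitute the size you observe), we can drop those responses from the output. FFUF can also try to work this out itself with -ac (auto-calibration):

$ffuf -w /usr/share/wordlists/dirb/common.txt -u http://127.0.0.1/FUZZ -fs 4242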

Scanning with Gobuster / FFUF

Let's take a look at actually scanning a site with some tools.
We will look at both Gobuster and FFUF, so you can compare the two approaches and syntaxes. I personally prefer these as they are pure command-line tools (and I tend to work in the terminal), but feel free to try different tools and find one that works for you.

The first thing we are going to need is a wordlist to scan with. The location can differ depending on your choice of pentesting OS, but on both Parrot and Kali the lists live in /usr/share/wordlists. You can also download wordlists from SecLists. The wordlist folders are organised by category, so you get ones appropriate for password cracking, or for enumeration.

You have a lot of choice when it comes to wordlists, but I like to use a smaller one for my initial recon. If nothing shows up, I will move to a larger, more complex list. You are also free to customise lists by adding new elements to them. Depending on what I discover during the mapping phase, I might add elements that are specific to a web framework, or common words from the site (a quick sketch of this is below).
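For example, to build a customised copy of the common list (the extra words here are hypothetical ones gathered from a site during mapping):

$cp /usr/share/wordlists/dirb/common.txt mylist.txt
$printf '%s\n' reports invoices staff-portal >> mylist.txt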

Tip

Unless I tell you otherwise, any scanning here will use common.txt.
This should be at /usr/share/wordlists/dirb/common.txt.

If we have a specific item to find (like flag.html), it's worth checking that it is in the list. Otherwise the scanner won't find it.
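A quick way to check is to grep the list for an exact entry (most entries in common.txt are written without an extension, so it is worth checking both forms):

$grep -x -e 'flag' -e 'flag.html' /usr/share/wordlists/dirb/common.txt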

Example

In the example below, we have a scan of last year's skills test, using the common.txt wordlist. The two tools use slightly different syntax, but we supply the same information to both.

Here we use the parameters:

  • dir Directory scanning mode
  • -t 30 Use 30 threads for the scanning process. This means we make multiple concurrent requests, which can speed things up. However, we should take network speeds and the load on the server into account here; 30-50 threads is fine, using 1000 probably won't help much.
  • -w [wordlist] The wordlist we want to use
  • -u [address] The address we want to scan
$gobuster dir -w /usr/share/wordlists/dirb/common.txt -t 30 -u 127.0.0.1
===============================================================
Gobuster v3.1.0
by OJ Reeves (@TheColonial) & Christian Mehlmauer (@firefart)
===============================================================
[+] Url:                     http://127.0.0.1
[+] Method:                  GET
[+] Threads:                 30
[+] Wordlist:                /usr/share/wordlists/dirb/common.txt
[+] Negative Status codes:   404
[+] User Agent:              gobuster/3.1.0
[+] Timeout:                 10s
===============================================================
2021/08/25 15:26:38 Starting gobuster in directory enumeration mode
===============================================================
/about                (Status: 200) [Size: 4162]
/admin_area           (Status: 403) [Size: 234] 
/backend              (Status: 403) [Size: 234] 
/cgi-bin              (Status: 403) [Size: 234] 
/console              (Status: 200) [Size: 1985]
/secrets              (Status: 403) [Size: 234]  
/setup                (Status: 200) [Size: 2625] 

===============================================================
2021/08/25 15:27:35 Finished
===============================================================

FFUF has similar syntax, although it defaults to directory-style scanning, with 40 threads (as the output below shows).

  • -w [wordlist]: Wordlist to use
  • -u [URL]: Address to scan

Notice that we add a FUZZ keyword to the URL; this tells FFUF where it should place the words used in the attack.

$ffuf -w /usr/share/wordlists/dirb/common.txt -u http://127.0.0.1/FUZZ

        /'___\  /'___\           /'___\       
      /\ \__/ /\ \__/  __  __  /\ \__/       
      \ \ ,__\\ \ ,__\/\ \/\ \ \ \ ,__\      
        \ \ \_/ \ \ \_/\ \ \_\ \ \ \ \_/      
        \ \_\   \ \_\  \ \____/  \ \_\       
         \/_/    \/_/   \/___/    \/_/       

       v1.3.1 Kali Exclusive <3
________________________________________________

:: Method           : GET
:: URL              : http://127.0.0.1/FUZZ
:: Wordlist         : FUZZ: /usr/share/wordlists/dirb/common.txt
:: Follow redirects : false
:: Calibration      : false
:: Timeout          : 10
:: Threads          : 40
:: Matcher          : Response status: 200,204,301,302,307,401,403,405
________________________________________________

                        [Status: 200, Size: 3846, Words: 737, Lines: 126]
about                   [Status: 200, Size: 4162, Words: 798, Lines: 129]
admin_area              [Status: 403, Size: 234, Words: 27, Lines: 5]
backend                 [Status: 403, Size: 234, Words: 27, Lines: 5]
cgi-bin                 [Status: 403, Size: 234, Words: 27, Lines: 5]
console                 [Status: 200, Size: 1985, Words: 411, Lines: 53]
secrets                 [Status: 403, Size: 234, Words: 27, Lines: 5]
setup                   [Status: 200, Size: 2625, Words: 590, Lines: 101]
:: Progress: [4614/4614] :: Job [1/1] :: 73 req/sec :: Duration: [0:00:56] :: Errors: 0 ::

Both scans give us the same results, and mostly the same information.

We can see that the scan has found the following pages exist (200 status code). If we haven't visited them as part of our mapping process, we can add them to the list:

  • about (existed in mapping)
  • console ("Hidden" page, the Werkzung Debugger included in flask)
  • setup ("Hidden" page, used to setup the databases when the program is first run)

We also have several URLs that return a 403 Forbidden code.

  • admin_area
  • cgi-bin
  • backend
  • secrets

This can mean one of two things: either we need authorisation to view the page,
or it is a directory, but directory listings are turned off for the server.

In this case we can "recursively" check each of the URLs we find, to see if there is any hidden content in a directory.

An example of doing this with FFUF is below.

  $ffuf -w /usr/share/wordlists/dirb/common.txt -u http://127.0.0.1/cgi-bin/FUZZ

    /'___\  /'___\           /'___\       
   /\ \__/ /\ \__/  __  __  /\ \__/       
   \ \ ,__\\ \ ,__\/\ \/\ \ \ \ ,__\      
    \ \ \_/ \ \ \_/\ \ \_\ \ \ \ \_/      
     \ \_\   \ \_\  \ \____/  \ \_\       
      \/_/    \/_/   \/___/    \/_/       

   v1.3.1 Kali Exclusive <3
  ________________________________________________

  :: Method           : GET
  :: URL              : http://127.0.0.1/cgi-bin/FUZZ
  :: Wordlist         : FUZZ: /usr/share/wordlists/dirb/common.txt
  :: Follow redirects : false
  :: Calibration      : false
  :: Timeout          : 10
  :: Threads          : 40
  :: Matcher          : Response status: 200,204,301,302,307,401,403,405
  ________________________________________________

  backdoor                [Status: 403, Size: 234, Words: 27, Lines: 5]
  :: Progress: [4614/4614] :: Job [1/1] :: 84 req/sec :: Duration: [0:01:01] :: Errors: 0 ::

You can see we identify another URL with a 403 status code, meaning we should launch another scan against cgi-bin/backdoor/*.

Important

Many of the tools have a "recursive" mode that will attempt to do this for you. However, they can be a bit hit and miss; it is worth double-checking that the results are as expected and that the traversal has happened.
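For example, FFUF supports recursion when the FUZZ keyword is the last part of the URL; the depth value here is just a sensible starting point:

$ffuf -w /usr/share/wordlists/dirb/common.txt -u http://127.0.0.1/FUZZ -recursion -recursion-depth 2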

Most of the scanners support options to help with the search process. These include:

Specifying File Extensions

We can ask the tool to append various extensions to the items in the word list.

For example, we might want to search for .html and .php files on a standard website, or tailor our search based on the technologies we have identified.

Some useful extensions to look for are:

  • Backups and Archives: .bak, .tmp, .zip, .tar
  • Database Dumps: .sql
  • Text Files: .txt
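For example, with Gobuster we can append some of these extensions to every word in the list (reusing the local target from the earlier scans):

$gobuster dir -w /usr/share/wordlists/dirb/common.txt -t 30 -u 127.0.0.1 -x .bak,.zip,.sql,.txt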

Note

Gobuster has a -d parameter that will search for backup versions of discovered files automatically in dir mode.

Dealing with SSL

Often, sites in CTFs will use self-signed certificates to enable HTTPS. However, tools may reject the certificate as invalid.

In this case we can ask the scanner to ignore SSL errors (usually -k, like curl, but check the documentation).
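For example, with Gobuster the flag is -k (FFUF should not need an equivalent, as to the best of my knowledge it skips certificate validation by default):

$gobuster dir -w /usr/share/wordlists/dirb/common.txt -t 30 -u https://127.0.0.1 -k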

Cookies

If a site requires us to be authenticated, and uses session cookies, we can pass them as part of the request.

  • Grab the session cookie (e.g. PHPSESSID) from the inspector tool
  • Pass it to the scanner using the relevant flag; this changes between scanners, so check the manpage

For example, to pass the cookie "PHPSESSID=SOMETOKEN" we can use:

$gobuster dir -w /usr/share/wordlists/dirb/common.txt -t 30 -u 127.0.0.1 -c "PHPSESSID=SOMETOKEN"
$ffuf -w /usr/share/wordlists/dirb/common.txt -u http://127.0.0.1/FUZZ -b "PHPSESSID=SOMETOKEN"

Note that Gobuster takes cookies with -c, while FFUF uses -b.

Custom Headers

We may also need to authenticate using other header elements, for example HTTP Basic Auth. In this case we can pass those to the scanner as well; again, check the manpage for the correct flags.

For example, for HTTP Basic auth we might pass the following header: Authorization: Basic ZGFuZzpzd29yZGZpc2g=

$gobuster dir -w /usr/share/wordlists/dirb/common.txt -t 30 -u 127.0.0.1 -H "Authorization: Basic ZGFuZzpzd29yZGZpc2g=" 
$ffuf -w /usr/share/wordlists/dirb/common.txt -u http://127.0.0.1/FUZZ -H "Authorization: Basic ZGFuZzpzd29yZGZpc2g="

Page Scanning

By default most tools will only search for the items in the wordlist without any modification. As pages generally have a file extension (like .html or .php), we need to tell our scan to take the extension into account. Scanning without extensions will still find directories and REST-style URLs, but it will miss individual pages, so our results will be limited.

It's often easy to work out the extension, as we will have kept track of the types of pages returned during the mapping phase. It's unlikely (but not impossible) that a site will mix extension types.

To convert the searches above to look for both HTML and PHP pages we can use:

$gobuster dir -w /usr/share/wordlists/dirb/common.txt -t 30 -u 127.0.0.1 -x .html,.php
$ffuf -w /usr/share/wordlists/dirb/common.txt -u http://127.0.0.1/FUZZ -e .html,.php

Scanning Parameters

As well as searching for pages and directories, we can use the tools to fuzz parameters on sites. This can be useful for identifying content in a single-page application, or for identifying hidden parameters in a form field.

For identifying pages, it's a similar process to standard directory scanning.
However, rather than fuzzing the URL of the page, we fuzz the parameter that requests the page. We will have identified the relevant parameters in our mapping phase.

This type of scanning can also be useful when looking for things like SQL injection or XSS, as we can feed a list of potential payloads to the page and monitor the responses for successful attempts (a sketch of this follows the examples below).

Example

Imagine we have a single-page application with the following URL, where <some text> denotes the page we are requesting:

www.example.org/content.html?page=<some text>

We can scan for more content using the common wordlist with the following commands.

With Gobuster we need to use fuzz mode rather than dir mode. We can then add FUZZ to the parameter we wish to check.

gobuster fuzz -u https://www.example.org/content.html?page=FUZZ -w /usr/share/wordlists/dirb/common.txt

With FFUF the syntax is reasonably simple, adding FUZZ to the URL at the place we want to substitute our dictionary words.

$ffuf -w /usr/share/wordlists/dirb/common.txt -u http://127.0.0.1/content.html?page=FUZZ

We might also want to scan for parameter names, for example if we have identified an endpoint that can take multiple parameter types. Let's imagine we identify an endpoint which allows us to view and modify user details.

  • www.example.org/users?id=123&view=1
  • www.example.org/users?id=123&edit=1

We could try a set of different parameter names here, to see if we expose new functionality. Examples might be:

  • setPassword
  • editPassword

Example

As before, we can use fuzz mode with Gobuster to search for the information we want.

gobuster fuzz -u https://www.example.org/content.html?FUZZ=1 -w /usr/share/wordlists/dirb/common.txt

Again, with FFUF the syntax is reasonably simple, adding FUZZ to the URL at the place we want to substitute our dictionary words.

$ffuf -w /usr/share/wordlists/dirb/common.txt -u http://127.0.0.1/content.html?FUZZ=1
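The same idea extends to attack payloads. As a rough sketch (the SecLists path here is an assumption; substitute whichever payload list you have, and the matched string is illustrative), we can feed XSS payloads into the page parameter and use FFUF's -mr flag to keep only responses that reflect them:

$ffuf -w /usr/share/seclists/Fuzzing/XSS/XSS-BruteLogic.txt -u 'http://127.0.0.1/content.html?page=FUZZ' -mr 'alert'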

Summary

In this article we have looked at brute-force scanning of sites and services. We examined the concept behind scanning, and looked at ways of doing it with several tools.

While we have covered the core concepts, it's worth exploring the tools in depth, as they have other useful functionality. In the lab session, you will get a chance to practise using different scanners.


  [1] Obviously, there could be an interesting page that has the same response size as the missing ones. In that case it's unlikely you are going to find it.

  [2] Personally, I like Gobuster, but have been using FFUF a lot recently.
