Viewing Sites without visiting them.
It can often be useful to examine a website by eye. Seeing the structure of the site and the information it contains can help us identify key people in the organisation and gain some understanding of its infrastructure.
However, our definition of passive recon is that we don't interact with the target's servers, so browsing the live site is ruled out. Instead, we can use caching services, which take a snapshot of a page at a given time. As the cached data is held on third-party servers, we can view a version of the page without actually visiting the site.
There are several services that we could use, including Google Cache and the Wayback Machine.
Google Cache.
Google keeps a cached copy of pages it has indexed. We can make use of this to access a page without connecting to the target server.
To request the cached version of a page you have two options:
- Prefix the URL with cache: in the search box, for example
cache:www.coventry.ac.uk
- Do a normal search, then click the drop-down next to the URL in the results and select Cached.
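The first option can also be scripted by building the search URL by hand. The sketch below is only an illustration: the helper name google_cache_query is hypothetical, it only constructs the URL and sends nothing, and it assumes Google still honours the cache: operator, which may no longer be the case.

```python
from urllib.parse import urlencode


def google_cache_query(target: str) -> str:
    """Return a Google search URL asking for the cached copy of `target`.

    Hypothetical helper: it only builds a string, no request is made.
    """
    # Drop any scheme so the operator sees a bare host/path,
    # as in the cache:www.coventry.ac.uk example.
    bare = target.split("://", 1)[-1]
    return "https://www.google.com/search?" + urlencode({"q": f"cache:{bare}"})


if __name__ == "__main__":
    print(google_cache_query("www.coventry.ac.uk"))
```

Opening the printed URL in a browser sends the request to Google rather than to the target's own servers.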
Wayback Machine
Sometimes we may want to look at an older version of the site, to view information on former employees, old technologies or other interesting information [1]. For example, consider the "reworking" of many European sites to remove any "sensitive" information post-GDPR.
The Wayback Machine (http://web.archive.org) takes snapshots of sites, allowing us to view previous versions. Unlike Google Cache, it keeps multiple versions captured over time.
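Snapshots can be addressed directly by URL, and the Internet Archive also offers a small availability endpoint that reports the closest capture to a given date. The following is a minimal sketch under some assumptions: the function names are made up for illustration, the timestamp is a YYYYMMDDhhmmss string (a prefix such as "2004" is also accepted), and the code only builds URLs and parses a response that you have fetched yourself. Any request made with these URLs goes to the archive, not to the target.

```python
import json
from urllib.parse import urlencode

WAYBACK = "https://web.archive.org/web"
AVAILABILITY = "https://archive.org/wayback/available"


def snapshot_url(target: str, timestamp: str) -> str:
    """URL of the archived copy of `target` closest to `timestamp`."""
    return f"{WAYBACK}/{timestamp}/{target}"


def availability_url(target: str, timestamp: str = "") -> str:
    """URL that asks the archive whether it holds a snapshot of `target`."""
    params = {"url": target}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY}?{urlencode(params)}"


def closest_snapshot(response_text: str):
    """Extract the closest snapshot URL from an availability response.

    Returns None when the archive reports no usable snapshot.
    """
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


if __name__ == "__main__":
    print(snapshot_url("www.coventry.ac.uk", "2004"))
    print(availability_url("www.coventry.ac.uk", "2004"))
```

Keeping URL construction separate from the network call makes it easy to review exactly which hosts will be contacted before anything is sent.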
NOTE: While the text-based information is stored by the Wayback Machine, links to images or other external assets may still resolve to the target website. This means that some requests may show up in the target's server logs.
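One way to check for this before opening an archived page in a browser is to scan its saved HTML for assets that would be fetched from somewhere other than the archive. The sketch below is an assumption-laden illustration, not a complete audit: the names AssetCollector and live_assets are hypothetical, only a handful of tags are inspected, and any host that is not an archive host is flagged, whether it belongs to the target or to another third party.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hosts we treat as "safe" because they belong to the archive itself.
ARCHIVE_HOSTS = {"web.archive.org", "archive.org"}


class AssetCollector(HTMLParser):
    """Collect URLs a browser would fetch automatically when rendering."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in {"img", "script", "iframe"} and attrs.get("src"):
            self.urls.append(attrs["src"])
        elif tag == "link" and attrs.get("href"):
            self.urls.append(attrs["href"])


def live_assets(html: str) -> list:
    """Return asset URLs whose host is not one of the archive hosts."""
    collector = AssetCollector()
    collector.feed(html)
    leaks = []
    for url in collector.urls:
        host = urlparse(url).netloc
        # Relative URLs have no host and resolve against the archive page.
        if host and host not in ARCHIVE_HOSTS:
            leaks.append(url)
    return leaks
```

An empty result only means that none of the inspected tags point away from the archive. Assets loaded by scripts or stylesheets are not covered by this check.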
For example, the screenshot below shows us what the Coventry University website looked like in 2004.
It is interesting to note that it is not just the homepage that is archived: links to other sections also point to archived versions. For example, the Staffnet page gives us information on the mail infrastructure that was in use at the time (SquirrelMail).
Summary
Both Google Cache and the Wayback Machine give us a way of accessing a website without directly connecting to the organisation's servers. This can help us to perform our reconnaissance tasks without alerting the target. Additionally, it may be possible to view old versions of a page that contain useful information.
[1] Sadly, the Wayback Machine can't actually take us back in time, to before the Plague, when things were better and less purple.