Viewing Sites without visiting them.

It can often be useful to examine a website by eye. Seeing the structure of the site and the information it contains can help us identify key people in the organisation and gain some understanding of its infrastructure.

However, our definition of passive recon is that we don't interact with the target's servers. Here we can use caching services, which take a snapshot of a page at a given time. As the cached data is held on third-party servers, we can view a version of the page without actually visiting the site.

There are several services that we could use, including Google Cache and the Wayback Machine.

Google Cache.

Google keeps a cached copy of many of the pages it has indexed. We can make use of this to access a page without connecting to the target server.

To request the cached version of a page you have two options:

Prefix the URL with cache:, for example cache:www.coventry.ac.uk
Do a normal search, then click the drop-down next to the URL and select Cached.

Google Cache Example
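The first option above can be sketched in a few lines of Python. This is a minimal illustration only: the helper names google_cache_query and google_search_url are hypothetical, and it assumes Google still honours the cache: operator, which may not hold. The code only builds the strings and makes no network request.

```python
from urllib.parse import quote


def google_cache_query(page_url: str) -> str:
    """Build the text to type into the search box for the cache: operator."""
    # Drop the scheme so the query matches the form used in the text above.
    for scheme in ("https://", "http://"):
        if page_url.startswith(scheme):
            page_url = page_url[len(scheme):]
    return "cache:" + page_url


def google_search_url(query: str) -> str:
    """Wrap a query in an ordinary Google search URL (percent-encoded)."""
    return "https://www.google.com/search?q=" + quote(query, safe="")


query = google_cache_query("http://www.coventry.ac.uk")
print(query)
print(google_search_url(query))
```

Either string is sent to Google, not to the target, so building and using it stays within our definition of passive recon.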

Wayback Machine

Sometimes we may want to look at an older version of the site, to view information on old employees, technologies or other interesting information¹. For example, consider the "reworking" of many European sites to remove any "sensitive" information post-GDPR.

The Wayback Machine (http://web.archive.org) takes snapshots of sites, allowing us to view previous versions. Unlike Google Cache, it keeps multiple versions captured over time.
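As a rough sketch, the Internet Archive's availability endpoint can be asked for the snapshot closest to a given date. The helper names below are hypothetical. The sample response mirrors the general shape that endpoint returns, but its values are made up for illustration and are not a real capture record. The code builds the request URL and parses a response string; it does not make a network request itself.

```python
import json
from urllib.parse import urlencode

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(target: str, timestamp: str = "") -> str:
    """Build a query asking for the snapshot closest to `timestamp`."""
    params = {"url": target}
    if timestamp:
        # Timestamps are digits in YYYYMMDDhhmmss order; a prefix is enough.
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response_text: str):
    """Return (timestamp, snapshot_url), or None if nothing is archived."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["timestamp"], closest["url"]


# Illustrative response only: the values below are invented for this sketch.
sample = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "timestamp": "20040604000000",
            "url": "http://web.archive.org/web/20040604000000/http://www.coventry.ac.uk/",
            "status": "200",
        }
    }
})

print(availability_url("www.coventry.ac.uk", "20040101"))
print(closest_snapshot(sample))
```

The request goes to archive.org rather than to the target, so looking up which snapshots exist does not touch the organisation's servers.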

NOTE: While the text-based information is stored by the Wayback Machine, links to images or other external assets may still be resolved against the target website. This means that viewing some archived pages may show up in the target's server logs.
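One way to check for this before opening an archived page in a browser is to scan its saved HTML for assets that would load from a host other than the archive. The sketch below uses only the Python standard library. The class and function names, the set of archive hosts, the list of auto-loading tags, and the sample HTML (including its paths) are all assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hosts we treat as "the archive" (an assumption for this sketch).
ARCHIVE_HOSTS = {"web.archive.org", "archive.org"}
# Tags whose src/href a browser typically fetches automatically.
AUTO_LOADED_TAGS = {"img", "script", "link", "iframe"}


class AssetCollector(HTMLParser):
    """Collect src/href values from tags that load without a click."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag not in AUTO_LOADED_TAGS:
            return
        for name, value in attrs:
            if name in ("src", "href") and value:
                self.urls.append(value)


def external_assets(html: str):
    """Return asset URLs that would be fetched from a non-archive host."""
    collector = AssetCollector()
    collector.feed(html)
    leaks = []
    for url in collector.urls:
        host = urlparse(url).hostname
        # Relative URLs have no host and resolve against the archive itself.
        if host and host not in ARCHIVE_HOSTS:
            leaks.append(url)
    return leaks


# Hypothetical archived page used only to exercise the function.
sample_html = """
<img src="/web/20040604000000im_/http://www.example.org/logo.gif">
<img src="http://www.example.org/banner.gif">
<script src="https://web.archive.org/static/example.js"></script>
<a href="http://www.example.org/staff">Staff</a>
"""

print(external_assets(sample_html))
```

Ordinary links (the a tag) are ignored here because they only contact the target if we click them; anything the function returns would be requested as soon as the page renders.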

For example, the screenshot below shows what the Coventry University website looked like in 2004.

Wayback Machine for Coventry University 2004

It is interesting to note that this is not just the homepage: links to other sections also point to archived versions. For example, the Staffnet page gives us information on the mail infrastructure that was in use at the time (SquirrelMail).

Wayback Machine Staffnet page
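Archived links follow a recognisable pattern: the archive host, a capture timestamp, then the original URL. A small helper can tell us whether a link stays inside the archive and, if so, which capture and original page it refers to. This is a sketch under assumptions: the function names are hypothetical, the regular expression covers only the common URL form (including the two-letter modifier such as im_ seen on embedded assets), and the example URLs use a placeholder domain.

```python
import re

# web.archive.org/web/<timestamp>[<modifier>_]/<original URL>
SNAPSHOT_RE = re.compile(
    r"^https?://web\.archive\.org/web/(\d{1,14})(?:[a-z]{2}_)?/(.+)$"
)


def split_snapshot_url(url: str):
    """Return (timestamp, original_url) for an archived link, else None."""
    match = SNAPSHOT_RE.match(url)
    if match is None:
        return None
    return match.group(1), match.group(2)


def snapshot_url(timestamp: str, original: str) -> str:
    """Build an archived link for `original` as captured near `timestamp`."""
    return f"http://web.archive.org/web/{timestamp}/{original}"


archived = snapshot_url("20040604000000", "http://www.example.org/staffnet/")
print(archived)
print(split_snapshot_url(archived))
print(split_snapshot_url("http://www.example.org/staffnet/"))
```

A link for which split_snapshot_url returns None points at the live site, so following it would contact the target directly.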

Summary

Both Google Cache and the Wayback Machine give us a way of accessing a website without directly connecting to the organisation's servers. This can help us to perform our reconnaissance tasks without alerting the target. Additionally, it may be possible to view old versions of a page that contain useful information.

¹ Sadly, the Wayback Machine can't actually take us back in time, to before the Plague, when things were better and less purple.