bug fix
aa7401 committed Jan 2, 2019
1 parent dba24bc commit d240889022d2070c8313e68f8b244222316deb22
Showing 9 changed files with 118 additions and 23 deletions.
@@ -88,6 +88,7 @@ Before you can work with Git you need to update the repository configuration. Fo
2. Update your email (using your university email) using `git config user.email 'bloggsj@uni.coventry.ac.uk'`
3. Update your command-line editor choice using `git config core.editor nano` (the editor must be installed!)
4. Cache your credentials (username/password) for an hour using `git config credential.helper 'cache --timeout=3600'`
5. Update the path to your _git hooks_ directory using `git config core.hooksPath ./.githooks` (more on this in a later lab).

## 5 Local Setup

@@ -30,21 +30,28 @@ We will be working through some exercises that make use of all of these.

### 1.1 The Uniform Resource Locator

1. Start up the server script in the `exercises/02_http/01_url/` directory.
1. Install the following node packages. Refer to the previous lab if you get stuck at this point:
1. `koa`
2. `koa-router`
3. `koa-bodyparser`
4. `koa-static`
5. `js2xmlparser`
2. Access the root URL and notice that the message **Hello World** is displayed in the browser.
3. Access the `/hello` URL. This should result in the same message being displayed.
2. Open the `index.js` script and study lines 1-10.
1. The first line is the _shebang_. It tells the operating system which application is needed to run the script.
2. Line 3 contains the string `'use strict'`. Strict mode prevents certain 'unsafe' commands from running and enables you to catch a lot of potential coding errors. You should always add this to the top of your scripts.
3. Lines 5 and 6 import the `koa` and `koa-router` packages and store them in a pair of immutable variables (constants).
4. Lines 6 and 7 use these two packages to create new JS objects and store these in another two immutable variables.
5. Line 10 stores the number `8080` in an immutable variable called `port`. We will use this later. We should always create named constants to store key numbers because this makes our script easier to follow.
Study the `index.js` script in the `exercises/02_http/01_url/` directory.

1. If you study lines 4-10 of `index.js` you will see a list of the modules we need to install. Refer to the previous lab if you get stuck at this point:
1. `koa`
2. `koa-router`
3. `koa-bodyparser`
4. `koa-static`
5. `js2xmlparser`
2. The first line is the _shebang_. It tells the operating system which application (here `node`) is needed to run the script.
3. Lines 4-10 import the module packages we need for our script to work. Koa is a modular framework: on its own it does very little and depends on plugins (middleware) for its functionality.
4. Lines 11-15 are where we configure the koa _middleware_.
5. We need two global variables in our script, one to store a list of items and the second to store the port number we will be using:
1. The `let` keyword defines a _mutable variable_ which can change its value.
2. The `const` keyword defines an _immutable variable_. Once a value is assigned it cannot be reassigned. These are sometimes called _constants_.
6. The main part of the script defines the _routes_ and we will be covering these in more detail as we progress through the lab.
7. Right at the end (line 123) we start the server on the defined port and _export_ the _koa object_ `app`. By exporting it we can import the script into our automated test suite (briefly covered in the previous lab).
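
The difference between `let` and `const` can be tried out in isolation. The following is a minimal sketch that reuses the names `port` and `names` from the script. The values placed in the list and the `message` variable are made up for illustration.

```javascript
'use strict'

const port = 8080 // immutable binding: it cannot be reassigned
let names = []    // mutable binding: it can be reassigned

names = ['bloggsj'] // allowed because names was declared with let
names.push('doej')  // changing the contents of the array is also allowed

let message
try {
  port = 3000 // reassigning a const throws an error at runtime
} catch (err) {
  message = err.name
}

console.log(port, names, message)
```

Note that `const` only protects the binding. An array or object stored in a `const` can still have its contents changed.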

Now start the server:

1. Access the root URL and notice that the message **Hello World** is displayed in the browser.
2. Access the `/hello` URL. This should result in the same message being displayed.
3. Locate the code for these two routes. Can you see how they work?
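
As a hint for the last step, a route is essentially a mapping from a URL path to a handler function. The sketch below is a dependency-free stand-in and does not use the `koa-router` API. The names `hello`, `routes` and `dispatch` are hypothetical, and it assumes both paths share one handler, which may not be how `index.js` is written.

```javascript
'use strict'

// One handler that produces the message for the response.
const hello = () => 'Hello World'

// A route table: each URL path is mapped to a handler function.
const routes = new Map([
  ['/', hello],
  ['/hello', hello]
])

// Look up the handler for a path and run it, or report that nothing matched.
const dispatch = path => {
  const handler = routes.get(path)
  return handler ? handler() : 'Not Found'
}

console.log(dispatch('/'))
console.log(dispatch('/hello'))
console.log(dispatch('/missing'))
```

In the real script the router performs the lookup for you: you register a path together with a handler, and the router calls that handler when a matching request arrives.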

### 1.2 URL Parameters

File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -1,18 +1,20 @@
#!/usr/bin/env node

/* IMPORTING MODULES */
const Koa = require('koa')
const Router = require('koa-router')
const bodyParser = require('koa-bodyparser')
const staticFiles = require('koa-static')
const js2xmlparser = require('js2xmlparser')

/* CONFIGURING THE MIDDLEWARE */
const app = new Koa()
const router = new Router()

const bodyParser = require('koa-bodyparser')
app.use(router.routes())
app.use(bodyParser())

const staticFiles = require('koa-static')
app.use(staticFiles('./public'))

const js2xmlparser = require('js2xmlparser')

/* GLOBAL VARIABLES */
const port = 8080
let names = []

@@ -118,5 +120,4 @@ function formatHTML(list) {
return data
}

app.use(router.routes())
module.exports = app.listen(port, () => console.log(`listening on port ${port}`))
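
The order of the `app.use()` calls in this file matters. Koa runs middleware in the order it is registered, and a route handler that responds without calling `next()` prevents later middleware from running. With `router.routes()` registered ahead of `bodyParser()`, routes that read a request body may find it unparsed, so this is worth checking against the routes in the rest of `index.js` (not shown in this hunk). The sketch below is a simplified, synchronous, dependency-free illustration and is not Koa's implementation. The names `compose`, `parseBody`, `route` and `run` are hypothetical.

```javascript
'use strict'

// Simplified stand-in for a middleware cascade: each function receives the
// context and a next() function that runs the following middleware.
const compose = middleware => ctx => {
  const dispatch = i => {
    if (i < middleware.length) middleware[i](ctx, () => dispatch(i + 1))
  }
  dispatch(0)
  return ctx
}

// Hypothetical body parser: stores the parsed body, then hands over control.
const parseBody = (ctx, next) => {
  ctx.requestBody = JSON.parse(ctx.rawBody)
  next()
}

// Hypothetical route handler: responds without calling next().
const route = ctx => {
  ctx.response = ctx.requestBody ? `hello ${ctx.requestBody.name}` : 'no body available'
}

// Run a request through the middleware in the given order.
const run = order => compose(order)({ rawBody: '{"name":"bloggsj"}' }).response

console.log(run([route, parseBody])) // the route runs before the parser
console.log(run([parseBody, route])) // the parser runs before the route
```

In this sketch only the second ordering gives the handler access to the parsed body.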
@@ -1,4 +1,3 @@
'use strict'

const puppeteer = require('puppeteer')
const fs = require('fs')
@@ -0,0 +1,72 @@


const puppeteer = require('puppeteer')
const fs = require('fs')
const request = require('request')
//const csv = require('fast-csv')

const getRates = async query => {
const width = 1920
const height = 926
const browser = await puppeteer.launch({ headless: false})
const page = await browser.newPage()
await page.setViewport({ width: width, height: height })
await page.goto('https://www.amazon.co.uk/s/ref=sr_pg_1?keywords=javascript', { waitUntil: 'domcontentloaded' })
await page.waitFor(5000)
console.log('ready to grab page content')
//const html = await page.content()
const dom = await page.evaluate(() => {
// This callback runs inside the browser page and cannot assign to Node.js variables,
// so the data is returned and the record count is calculated afterwards.
const elements = document.querySelectorAll('li#result_1 > div')
// const hotels = []
// elements.forEach((element) => {
// const quoteJson = {}
// try {
// //quoteJson.quote = element.innerText.replace(/ +/g, ',')
// quoteJson.country = element.querySelector('span.col:first-child').innerText
// //quoteJson.currencyStr = element.querySelector('span.col:nth-child(2)').innerText
// quoteJson.currency = element.querySelector('span.col:nth-child(2)').innerText.split(' (')[0]
// quoteJson.code = element.querySelector('span.col:nth-child(2)').innerText.split(' (')[1].replace(')', '')
// quoteJson.rate = parseFloat(element.querySelector('span.col:nth-child(3)').innerText)
// } catch (err) {
// return new Error('oops')
// }
// hotels.push(quoteJson)
// })
// return hotels
return Array.from(elements).map(element => element.innerText)
})
const records = dom.length
console.log(`found ${records} records`)
await browser.close()
return dom
}

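// Pass the result to the callback on success and the error on failure.
const getCurrency = callback => getRates().then(data => callback(null, data), err => callback(err))

getCurrency( (err, data) => {
// Stop here if the scrape failed or returned no list of records.
if(err || !Array.isArray(data)) return console.log('oops!')
console.log(`found ${data.length} CURRENCY codes`)
console.log(data.length)
fs.writeFileSync('currency.json', JSON.stringify(data, null, 2))
})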

/*
https://www.amazon.co.uk/s/ref=sr_pg_2?rh=n%3A266239%2Ck%3Ajavascript&page=2&d=1&keywords=javascript&ie=UTF8&qid=1546457800
https://www.amazon.co.uk/s/ref=sr_pg_2?page=2&keywords=javascript
https://www.amazon.co.uk/s/ref=sr_pg_3?keywords=javascript
https://www.amazon.co.uk/JavaScript-Definitive-Guide-Guides/dp/0596805527/ref=sr_1_3?ie=UTF8&qid=1546457942&sr=8-3&keywords=javascript
simple search (note the number refers to the pagination of the results):
https://www.amazon.co.uk/s/ref=sr_pg_1?keywords=javascript
uses the ISBN10 number:
https://www.amazon.co.uk/dp/0596805527
DOM EXTRACTION
use the Chrome plugin Element Locator.
li#result_1 > div > div:nth-of-type(2) > div > div:nth-of-type(2)
*/
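
The script wraps a promise-returning function in a callback-style function. For the callback to be useful it has to be invoked in both outcomes: with `(null, data)` on success and with `(err)` on failure. The following is a minimal sketch of that adapter pattern. `fetchData` is a hypothetical stand-in for the scraper, `withCallback` is a hypothetical helper, and the currency codes are made-up sample values.

```javascript
'use strict'

// Hypothetical stand-in for the scraper: resolves with data or rejects with an error.
const fetchData = fail => fail
  ? Promise.reject(new Error('scrape failed'))
  : Promise.resolve(['GBP', 'USD'])

// Adapter: report success as callback(null, data) and failure as callback(err).
// Passing both handlers to then() means an error thrown inside the callback
// does not cause the callback to be called a second time.
const withCallback = (promise, callback) =>
  promise.then(data => callback(null, data), err => callback(err))

// Error-first callback that returns early when something went wrong.
const report = (err, data) => {
  if (err) return console.log(`oops: ${err.message}`)
  console.log(`found ${data.length} records`)
}

withCallback(fetchData(false), report)
withCallback(fetchData(true), report)
```

Returning early in the error branch keeps the success path from touching `data` when it is undefined.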
@@ -0,0 +1,15 @@
{
"name": "02_scraping",
"version": "1.0.0",
"description": "",
"main": "index.js",
"scripts": {
"test": "echo \"Error: no test specified\" && exit 1"
},
"author": "",
"license": "ISC",
"dependencies": {
"puppeteer": "^1.11.0",
"request": "^2.88.0"
}
}
