node-proxy-fetch v1.0.0
Fetch web content behind a firewall.
Inspiration
Fetching content from other websites on the client side usually results in either a CORS error or a 403 Forbidden error. A typical workaround is to fetch the content through a proxy server, but that is also often blocked by "Are you a human?" checks.
node-proxy-fetch uses Puppeteer to load the actual page, grabs the generated HTML, transforms it, and serves it.
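The rendering step described above can be sketched roughly as follows. This is a minimal illustration and not the package's actual source: renderPage and its launch parameter are hypothetical names, and Puppeteer is loaded lazily so that a stub browser can be injected when trying the sketch without Chrome installed.

```javascript
// Sketch only: not node-proxy-fetch's real implementation.
// `renderPage` and the `launch` option are hypothetical names.
async function renderPage(targetURL, { waitFor = 5000, launch } = {}) {
  // Use the injected launcher if given, otherwise load Puppeteer on demand.
  const launchBrowser = launch ?? (async () => {
    const { default: puppeteer } = await import('puppeteer')
    return puppeteer.launch()
  })

  const browser = await launchBrowser()
  try {
    const page = await browser.newPage()
    await page.goto(targetURL)
    // Give client-side JavaScript time to run before reading the DOM.
    await new Promise(resolve => setTimeout(resolve, waitFor))
    // page.content() returns the serialized HTML of the rendered page.
    return await page.content()
  } finally {
    // Always close the browser, even if navigation fails.
    await browser.close()
  }
}
```

Because a real browser executes the page, content generated by scripts is present in the returned HTML, which a plain HTTP request would not see.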
Usage
In your proxy server code, assuming you're using Express:
// Packages:
import express from 'express'
import fetch from 'node-proxy-fetch'
// Constants:
const app = express()
// Functions:
app.get('/', async (req, res) => {
  const webpage = await fetch({
    // The page to fetch
    targetURL: 'https://www.npmjs.com/package/solid-custom-scrollbars',
    type: 'DOCUMENT',
    puppeteerOptions: {
      // The origin (protocol://domain.tld) used to rewrite relative paths
      baseURL: 'https://www.npmjs.com'
    }
  })
  res.send(webpage)
})
app.get('/image', async (req, res) => {
  const image = (await fetch({
    targetURL: 'https://picsum.photos/1000',
    type: 'BLOB'
  })).data
  res.send(image)
})
app.listen(3000)
Usage with Heroku
If you're using this package with Heroku, be sure to add puppeteer-heroku-buildpack as your app's buildpack.
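Chrome generally has to run without its sandbox on Heroku dynos, which is what the buildpack's own documentation recommends. A possible options object is sketched below. It assumes that launchArguments is passed through to Chrome as command-line flags, and herokuFetchOptions is a hypothetical name used only for illustration.

```javascript
// Assumption: launchArguments is forwarded to Chrome as command-line flags.
// `herokuFetchOptions` is a hypothetical name, not part of the package.
const herokuFetchOptions = {
  targetURL: 'https://www.npmjs.com',
  type: 'DOCUMENT',
  puppeteerOptions: {
    baseURL: 'https://www.npmjs.com',
    // Disable Chrome's sandbox, which is not available on Heroku dynos.
    launchArguments: ['--no-sandbox', '--disable-setuid-sandbox']
  }
}
```

An object like this would be passed to fetch in place of the inline options used in the route handlers under Usage.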
Usage with AWS
If you want to use this package with AWS, try out the sister package aws-proxy-fetch, or check out this guide.
API
targetURL
string
The target URL that you want to fetch.
type
FetchType = 'DOCUMENT' | 'BLOB'
The type of content you are fetching.
axiosOptions
AxiosOptions - OPTIONAL
Options for Axios, only used when type is BLOB.
config
AxiosRequestConfig<any> - OPTIONAL
headers
AxiosRequestHeaders - OPTIONAL
puppeteerOptions
PuppeteerOptions - OPTIONAL
baseURL
string
The base URL, in the form protocol://domain.tld. All relative paths in the fetched HTML are rewritten against it.
waitFor
number - OPTIONAL
The number of milliseconds to wait before scraping the HTML. This gives the page's JavaScript time to run. Defaults to 5000.
transformExternalLinks
boolean - OPTIONAL
Whether to rewrite relative paths using baseURL. Defaults to true.
launchOptions
Partial<PuppeteerOptions> - OPTIONAL
Launch options for Puppeteer.
launchArguments
string[] - OPTIONAL
Launch arguments for Puppeteer.
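To make the baseURL and transformExternalLinks options more concrete, here is one way relative paths could be rewritten against a base URL. It is only a sketch: absolutizeLinks is a hypothetical helper, it handles only quoted href and src attributes, and the package's real transformation may work differently.

```javascript
// Hypothetical helper; the package's real transformation may differ.
function absolutizeLinks(html, baseURL) {
  // Match href="..." and src="..." attributes (double or single quoted).
  return html.replace(/(\s)(href|src)=(["'])(.*?)\3/g, (match, space, attr, quote, value) => {
    // Leave in-page fragments untouched.
    if (value.startsWith('#')) return match
    try {
      // new URL(value, base) resolves relative paths and keeps absolute URLs as they are.
      return `${space}${attr}=${quote}${new URL(value, baseURL).href}${quote}`
    } catch {
      // Anything that cannot be parsed as a URL is left unchanged.
      return match
    }
  })
}

const html = '<a href="/package/express">express</a><img src="logo.png">'
console.log(absolutizeLinks(html, 'https://www.npmjs.com'))
// <a href="https://www.npmjs.com/package/express">express</a><img src="https://www.npmjs.com/logo.png">
```

Without a step like this, relative links in the proxied HTML would resolve against the proxy server's own origin rather than the original site.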
License
MIT