1.0.2 • Published 4 years ago
crawlx-cloudscraper v1.0.2
This plugin overrides the attempts callback of crawlx, using Puppeteer to bypass Cloudflare's anti-DDoS page.
const x = require("crawlx").default;
const cfPlugin = require("crawlx-cloudscraper")({
  targetUrl: "https://www.apotea.se/",
  waitForSelector: '#main-wrapper'
});
x.use(cfPlugin);
x({ url: "https://www.apotea.se/" }).then(t => {
  console.log(t.res.statusCode);
});
Results:
Start Bypassing: limit concurrency to 0.
Start Bypassing: https://www.apotea.se/
Finish Bypassing: {"user-agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36","cookie":"shopper=************; ASP.NET_SessionId=**************; _culture=sv; __cfduid=***********; cf_clearance=***********"}
Finish Bypassing: resume concurrency to 2
200
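The "Finish Bypassing" line shows that a bypass ends with a user-agent and a cookie string. The sketch below illustrates how such headers could be merged into later requests. It is not the plugin's actual code: `applyBypassHeaders` is a hypothetical helper, the header values are placeholders, and the `headers` field on the request options is an assumption.

```javascript
// Hypothetical helper (not part of crawlx-cloudscraper): merge headers
// captured during a bypass into the options of a later request.
// Captured headers win over headers already present on the request.
function applyBypassHeaders(requestOptions, bypassHeaders) {
  return {
    ...requestOptions,
    headers: { ...(requestOptions.headers || {}), ...bypassHeaders },
  };
}

// Placeholder values standing in for the masked ones in the log above.
const capturedHeaders = {
  "user-agent": "Mozilla/5.0 (placeholder)",
  cookie: "cf_clearance=placeholder-token",
};

const nextRequest = applyBypassHeaders(
  { url: "https://www.apotea.se/", headers: { accept: "text/html" } },
  capturedHeaders
);
console.log(nextRequest.headers);
```

Returning a new object instead of mutating the request options keeps the original task untouched if a retry needs to start from a clean state.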
Options:
const pluginOptions = {
  // required
  targetUrl: "https://www.apotea.se/",
  waitForSelector: '#main-wrapper', // the page counts as bypassed once this element exists
  // optional
  statusAllowed: [503], // responses with a 503 status code will be handled
  attempts: 2,
  userAgent: "", // empty: use crawlx's default user agent
  fileDir: require('os').homedir(),
  fileName: ".crawlx-cf.json", // file that stores the header information
  log: console.log,
  delayForBypass: 6000,
};
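To make the role of these options concrete, here is a minimal sketch of how they could be resolved and used. It is not the plugin's implementation: `resolveOptions`, `shouldHandle`, `headerFilePath`, `saveHeaders`, and `loadHeaders` are hypothetical names. Treating the optional values listed above as defaults and storing the headers as plain JSON are both assumptions.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Assumption: the optional values from the listing above act as defaults.
const DEFAULTS = {
  statusAllowed: [503],
  attempts: 2,
  userAgent: "",
  fileDir: os.homedir(),
  fileName: ".crawlx-cf.json",
  log: console.log,
  delayForBypass: 6000,
};

// Hypothetical: check the two required options, then merge over the defaults.
function resolveOptions(userOptions = {}) {
  for (const key of ["targetUrl", "waitForSelector"]) {
    if (!userOptions[key]) {
      throw new Error(`Missing required option: ${key}`);
    }
  }
  return { ...DEFAULTS, ...userOptions };
}

// Hypothetical: a response is handled only if its status code is listed.
function shouldHandle(statusCode, options) {
  return options.statusAllowed.includes(statusCode);
}

// Hypothetical: full path of the file that stores the header information.
function headerFilePath(options) {
  return path.join(options.fileDir, options.fileName);
}

// Hypothetical: persist headers as JSON (the real file format is not documented here).
function saveHeaders(options, headers) {
  fs.writeFileSync(headerFilePath(options), JSON.stringify(headers, null, 2));
}

// Hypothetical: return the stored headers, or null if the file is missing or invalid.
function loadHeaders(options) {
  try {
    return JSON.parse(fs.readFileSync(headerFilePath(options), "utf8"));
  } catch (err) {
    return null;
  }
}
```

With helpers like these, a caller could run `resolveOptions({ targetUrl, waitForSelector })`, try `loadHeaders` first, and fall back to a fresh bypass only when `shouldHandle` reports a blocked response.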