A simple way to play with a site's requests and responses.
Tested in puppeteer/chromium only!
- Breaking changes in v1
- Known issues in managed mode
- Installing
- Why?
- Motivation
- API
- Samples
- Troubleshooting
- TODO
Cause: a major shift of focus from proxying to a convenient API on top of request interception.
- removed enableLegacyCookieHandling
- RequestMode.native is now the default
- removed the default handlers in requestHandlers - you need to add them manually
- removed the options gotHooks, proxy, agent - moved to buildGotHttpHandler
- Grease ciphers are missing.
- Some headers are missing.
- [CORS] OPTIONS requests (preflight requests) are not sent before the actual request is executed.
- [CORS] Headers are close to correct, but not identical to the browser's.
- WebSockets are handled by the browser (an IP leak may occur if you use a proxy in the package but not in Puppeteer itself).
- Performance can suffer under high load.
Using npm
npm install automation-extra-interception-proxy
Using yarn
yarn add automation-extra-interception-proxy
This package solves the following problems:
From time to time you need to get information out of a browser request. By default only the header information is easy to reach. You can also simply read every response, but from time to time that will throw errors for one of the following reasons.
First, the page may already be closed, in which case your code will throw an error.
Second, some sites use service workers to request some of their data. Unfortunately, you can't handle this situation without making the request manually and then converting the result for puppeteer.
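The first failure mode can be seen with plain Puppeteer: reading a response body after the page has gone away rejects. Below is a minimal sketch of the defensive wrapper you would otherwise write by hand. `safeReadBody` is a hypothetical helper name, not part of this package; it relies only on the `url()` and `text()` methods of Puppeteer's `HTTPResponse`.

```javascript
// Hypothetical helper: read a response body without letting a closed page
// or an unavailable body crash the listener.
async function safeReadBody(response) {
  try {
    return { url: response.url(), body: await response.text() };
  } catch (error) {
    // Typical causes: the page was closed, or the body is no longer available.
    return { url: response.url(), body: null, error: error.message };
  }
}

// With real Puppeteer this would be wired up roughly as:
//   page.on('response', async (response) => {
//     console.log(await safeReadBody(response));
//   });
```

This keeps the listener alive, but it still cannot see requests made by service workers.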
If you just want to adjust some requests or responses, you have to do that manually.
Example: you want to get the original request/response and make some adjustments. This package makes that easy: you get what you want with a single function call.
Yes, puppeteer already has proxy support through additional process arguments. But you have to maintain the proxy credentials manually for each request (?, not sure). Also, you can't use a SOCKS proxy (?, not sure).
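For comparison, this is roughly what the manual approach looks like with plain Puppeteer: the proxy address goes into Chromium's `--proxy-server` launch argument, and credentials are supplied per page through `page.authenticate`. `splitProxyUrl` is a hypothetical helper and the proxy URL is a placeholder.

```javascript
// Hypothetical helper: split a proxy URL into the pieces plain Puppeteer needs.
function splitProxyUrl(proxyUrl) {
  const parsed = new URL(proxyUrl);
  return {
    // Passed to Chromium as a launch argument, without credentials.
    launchArg: `--proxy-server=${parsed.protocol}//${parsed.host}`,
    // Passed to page.authenticate() on every page that uses the proxy.
    credentials: parsed.username
      ? {
          username: decodeURIComponent(parsed.username),
          password: decodeURIComponent(parsed.password),
        }
      : null,
  };
}

const { launchArg, credentials } = splitProxyUrl('http://user:secret@proxy.example:8080');
// const browser = await puppeteer.launch({ args: [launchArg] });
// const page = await browser.newPage();
// if (credentials) await page.authenticate(credentials);
```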
Even in cooperative mode you cannot make your decisions asynchronously. Here you can build a chain of handlers which will process the request decision one by one. A handler can also declare that its decision is final, so the remaining handlers in the chain are not asked. A handler can also adjust the request/response for the next one.
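The chaining idea can be illustrated without the package. The following is a minimal sketch, not the package's implementation: `runHandlerChain`, the `{ final: true }` return convention, and the three demo handlers are all hypothetical.

```javascript
// Run async handlers in order; each may mutate the shared request object.
// A handler that returns { final: true } stops the chain (hypothetical convention).
async function runHandlerChain(handlers, request) {
  const visited = [];
  for (const { name, handle } of handlers) {
    visited.push(name);
    const decision = await handle(request);
    if (decision && decision.final) break;
  }
  return visited;
}

const demoHandlers = [
  // Adjusts the request for the handlers that follow.
  { name: 'add-header', handle: async (req) => { req.headers['x-demo'] = '1'; } },
  // Declares its decision final for ad urls, so later handlers are skipped.
  { name: 'block-ads', handle: async (req) => (req.url.includes('/ads/') ? { final: true } : undefined) },
  { name: 'log', handle: async (req) => { req.logged = true; } },
];
```

Running `runHandlerChain(demoHandlers, { url: 'https://example.test/ads/banner.js', headers: {} })` resolves to `['add-header', 'block-ads']`: the second handler ends the chain, so `log` is never asked.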
We live in a world where almost every website has an internal API. When you look at the network tab in Chrome DevTools, it's easy to see what goes where. The data is already yours, but you can't just take what you want; you have to fight for the information you are after. So let's fight together!
- wrapPage
- IConfig
- continue
- ignoreResponseBodyIfPossible
- flushLocal
- recordError
- recordInternalError
- recordWarning
- RequestMode
- RequestStage
- IRequestOptions
- _bodyError
- IAbortReason
- ConfigurableMixin
- applyLoggableMixin
- NetworkMixin
- InterceptionProxyRequest
- UserListener
- _OriginalRequestStateManager
Add interception ability to the page (sample)
- page Puppeteer.Page Page for future interceptions
- config IConfig?
Returns Promise<InterceptionProxyPageConfig>
Plugin configuration object
Puppeteer's "Cooperative Intercept Mode" priority
This package uses its own way to manage cooperation
Use only if you know what it does
- ignore - The plugin does nothing with the original request
- native - The plugin just listens to the original request/response data, and all requests are fulfilled by puppeteer itself. Some plugin functionality may be unavailable.
- managed - The plugin performs all requests through requestHandlers or by itself. All plugin features are available.
Default - managed
Type: RequestMode
You can handle all of the plugin's messages
Type: any
Request timeout in milliseconds (actual execution only)
Type: number
If you did not change the request or response, let puppeteer handle the request itself
Default: false
Type: boolean
If you do not use the plugin's response object, the response will not be retrieved from puppeteer, for better performance
Applies to native mode only
Type: boolean
src/interfaces/classes.ts:42-42
Sends the gathered response back to puppeteer immediately.
If the response has not been collected yet, getResponse is called first.
Returns Promise<void>
src/interfaces/mixins.ts:14-14
If you use this specific method, the global ignoreResponseBodyIfPossible will be ignored
Type: boolean
src/interfaces/mixins.ts:43-43
Flush local configuration
- key any? If provided, only that specific parameter is flushed at the local level
Returns void
src/interfaces/mixins.ts:55-59
Pass an error to the logger
- message any Flow description
- error any? Original error object
- meta ...any Non-specific meta information
Returns void
src/interfaces/mixins.ts:65-68
Pass an internal error to the logger
- message any Flow/error description
- meta ...any Non-specific meta information
Returns void
src/interfaces/mixins.ts:74-77
Pass a warning to the logger
- message any Flow/error description
- meta ...any Non-specific meta information
Returns void
src/interfaces/network.ts:7-21
Plugin mode for handling requests
src/interfaces/network.ts:11-11
The plugin does nothing with the original request
Type: string
src/interfaces/network.ts:16-16
The plugin just listens to the original request/response data, and all requests are fulfilled by puppeteer itself. Some plugin functionality may be unavailable.
Type: string
src/interfaces/network.ts:20-20
The plugin performs all requests by itself. All plugin features are available.
Type: string
src/interfaces/network.ts:26-65
Current stage of the request
src/interfaces/network.ts:35-35
We got a new request from puppeteer, which includes all the necessary information about it.
At this stage we can adjust the request.
Type: string
src/interfaces/network.ts:42-42
The request is in progress.
At this stage we can no longer adjust the request, but there is no response yet to move forward with.
Type: string
src/interfaces/network.ts:50-50
We got the response to the request, which may have been modified by the user, and the user can now adjust the response.
At this stage we can adjust the response. The user can no longer override the request.
Type: string
src/interfaces/network.ts:57-57
We sent the final response of the request to the browser.
It's too late to adjust the request or the response.
Type: string
src/interfaces/network.ts:64-64
The page was closed and we are unable to do anything.
From a technical perspective it looks just the same as sentResponse
Type: string
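The stage descriptions above boil down to a small lookup of what can still be adjusted. This is only a sketch: apart from `sentResponse`, which the text names, the stage identifiers used here (`gotRequest`, `sentRequest`, `gotResponse`, `closed`) are assumed placeholders and may not match the package's actual values, and `canAdjust` is a hypothetical helper.

```javascript
// What each stage still allows, following the descriptions above.
// Only 'sentResponse' is named in the docs; the other keys are assumptions.
const STAGE_RULES = {
  gotRequest: { request: true, response: false },   // no response exists yet
  sentRequest: { request: false, response: false }, // request in flight
  gotResponse: { request: false, response: true },
  sentResponse: { request: false, response: false },
  closed: { request: false, response: false },
};

// Hypothetical guard: may `target` ('request' or 'response') be adjusted at `stage`?
function canAdjust(stage, target) {
  const rules = STAGE_RULES[stage];
  if (!rules) throw new RangeError(`Unknown stage: ${stage}`);
  return rules[target] === true;
}
```

A listener could call such a guard before mutating a request, rather than discovering too late that the change has no effect.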
src/interfaces/network.ts:72-100
The plugin's request options. This request differs significantly from Puppeteer's request.
Can be modified. All changes will be applied to the actual Puppeteer request and will be executed
src/interfaces/network.ts:78-78
Request method.
Once the request has been executed, you will be unable to change this property.
Type: Method
src/interfaces/network.ts:85-85
Request url.
Once the request has been executed, you will be unable to change this property.
Type: string
src/interfaces/network.ts:92-92
Request headers.
Once the request has been executed, you will be unable to change this property.
Type: Headers
src/interfaces/network.ts:99-99
Request body.
Once the request has been executed, you will be unable to change this property.
Type: (string | Buffer | undefined)
src/interfaces/network.ts:108-108
Type: string
src/interfaces/network.ts:129-129
This option will override the response
- aborted - An operation was aborted (due to user action).
- accessdenied - Permission to access a resource, other than the network, was denied.
- addressunreachable - The IP address is unreachable. This usually means that there is no route to the specified host or network.
- blockedbyclient - The client chose to block the request.
- blockedbyresponse - The request failed because the response was delivered along with requirements which are not met ('X-Frame-Options' and 'Content-Security-Policy' ancestor checks, for instance).
- connectionaborted - A connection timed out as a result of not receiving an ACK for data sent.
- connectionclosed - A connection was closed (corresponding to a TCP FIN).
- connectionfailed - A connection attempt failed.
- connectionrefused - A connection attempt was refused.
- connectionreset - A connection was reset (corresponding to a TCP RST).
- internetdisconnected - The Internet connection has been lost.
- namenotresolved - The host name could not be resolved.
- timedout - An operation timed out.
- failed - A generic failure occurred.
Type: ErrorCode
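These values are the same error codes that Puppeteer's own `HTTPRequest.abort()` accepts. A minimal sketch of validating a reason before using it; `ABORT_REASONS` and `assertAbortReason` are hypothetical names, not exports of this package.

```javascript
// The abort reasons listed above.
const ABORT_REASONS = new Set([
  'aborted', 'accessdenied', 'addressunreachable', 'blockedbyclient',
  'blockedbyresponse', 'connectionaborted', 'connectionclosed',
  'connectionfailed', 'connectionrefused', 'connectionreset',
  'internetdisconnected', 'namenotresolved', 'timedout', 'failed',
]);

// Hypothetical guard: return the reason unchanged, or throw if it is not a known code.
function assertAbortReason(reason) {
  if (!ABORT_REASONS.has(reason)) {
    throw new TypeError(`Unknown abort reason: ${reason}`);
  }
  return reason;
}

// With plain Puppeteer request interception this could be used as:
//   await request.abort(assertAbortReason('blockedbyclient'));
```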
src/mixins/ConfigurableMixin.ts:89-229
Extends ConfigurableMixinBase
Plugin general configuration.
src/mixins/LoggableMixin.ts:13-48
- base any
Returns any
src/mixins/NetworkMixin.ts:7-52
Extends base
Expects this.stage at runtime
Extends RequestBase
The plugin's request. This request differs significantly from Puppeteer's request.
- initial INewRequestInitialArgs
- requestOptions IRequestOptions
src/classes/_OriginalRequestStateManager.ts:6-6
Type: function (response: (Puppeteer.HTTPResponse | null)): void
src/classes/_OriginalRequestStateManager.ts:13-59
Keeps track of the flow state
Dependencies: data-urls
npm install data-urls
import { dataUrlHandler } from 'automation-extra-interception-proxy/handlers/dataUrl'
Dependencies: got
npm install got
import { buildGotHttpHandler } from 'automation-extra-interception-proxy/handlers/gotHttp'
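Since v1 registers no handlers by default, managed mode needs them passed in explicitly. The sketch below shows how that wiring might look. The array shape of `requestHandlers`, the options object given to `buildGotHttpHandler`, the string form of `proxy`, and the `buildManagedConfig` helper are all assumptions; check them against the package's typings. The package exports are injected as parameters so the sketch does not depend on a particular import path.

```javascript
// Hypothetical helper: assemble a managed-mode config from the two handlers.
// `handlers` is expected to hold the package's dataUrlHandler and
// buildGotHttpHandler exports; `gotOptions` carries gotHooks/proxy/agent.
function buildManagedConfig(handlers, gotOptions = {}) {
  return {
    requestMode: 'managed',
    // Assumption: requestHandlers is an ordered array of handlers.
    requestHandlers: [
      handlers.dataUrlHandler,
      handlers.buildGotHttpHandler(gotOptions),
    ],
  };
}

// Intended use (not executed here):
//   const config = buildManagedConfig(
//     { dataUrlHandler, buildGotHttpHandler },
//     { proxy: 'http://proxy.example:8080' },
//   );
//   await InterceptionUtils.wrapPage(page, config);
```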
/**
 * This example shows how to enable interceptions for a single page.
 *
 * This code gets some wallpaper image urls from bing.com
 *
 * This code could break if their behavior changes.
 */
// require libs
const puppeteer = require('puppeteer');
const InterceptionUtils = require('automation-extra-interception-proxy');
// do everything async
(async () => {
  // launch some browser
  const browser = await puppeteer.launch({
    headless: false,
  });
  // get some page
  const page = await browser.newPage();
  // attach interception commands
  await InterceptionUtils.wrapPage(page, {
    requestMode: "managed",
  });
  // create promise callback for async processing
  let callback;
  const promise = new Promise((resolve) => { callback = resolve; });
  // add some listener
  page.interceptions.addRequestListener('bing-images', async request => {
    // filter out anything else
    if (request.url !== 'https://www.bing.com/hp/api/model') {
      // just letting you know that we got something else here
      console.log('Ignoring', request.url.slice(0, 50));
      return;
    }
    // get response data
    const response = await request.getResponse();
    // grab data directly from their api response
    const apiData = response.json;
    // do anything you like
    const imageUrls = apiData.MediaContents.map(({ ImageContent }) =>
      `https://www.bing.com${ImageContent.Image.Url}`);
    // back to the async thread
    callback(imageUrls);
  }); // end of listener
  // go to our destination and wait for the response
  const [imageUrls] = await Promise.all([
    promise,
    page.goto('https://www.bing.com/'),
  ]);
  // print our image urls
  console.log('imageUrls', imageUrls);
  // not necessary: cleaning up our listener
  page.interceptions.deleteLocalRequestListener('bing-images');
  // closing browser
  await browser.close();
})(); // end of our thread
You are probably using an old version of puppeteer. Try upgrading first.
If you don't want to upgrade, or cookies still do not work, switch the package to v0.9.0 or below and enable enableLegacyCookieHandling.
Yes, the implementation is still raw.
- handlers
- move to separate packages
- documentation
- support Grease cipher - as another handler
- finalize cors managed requests - need to pass cors test
- add tests
- plugin flow
- documentation
- improve docs command
- describe wrapPage
- describe InterceptionProxyPlugin class
- websocket support
Copyright © 2021-2025, Utyfua. Released under the MIT License.