url-sanitizer v2.0.4
URL Sanitizer
URL sanitizer for Node.js, browsers and web sites. It sanitizes not only regular URLs but also data URLs and blob URLs, and it can also parse URLs and verify URIs.
Install
npm i url-sanitizer
For browsers and web sites, standalone ESM builds are available in the dist/ directory.
- node_modules/url-sanitizer/dist/url-sanitizer.min.js
- node_modules/url-sanitizer/dist/url-sanitizer-wo-dompurify.min.js
Or, download them from Releases.
NOTE: url-sanitizer-wo-dompurify.min.js is built without DOMPurify.
If you use it, make sure DOMPurify is exposed globally, e.g. window.DOMPurify.
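Because the build without DOMPurify expects a global, it may help to check for that global before loading the file. The following is a minimal sketch: `hasGlobalDOMPurify` is a hypothetical helper, not part of url-sanitizer, and it only assumes that DOMPurify exposes a `sanitize` function.

```javascript
// Hypothetical helper (not part of url-sanitizer): report whether a
// DOMPurify-like object is exposed on the given global scope.
function hasGlobalDOMPurify(scope = globalThis) {
  const purifier = scope.DOMPurify;
  return Boolean(purifier) && typeof purifier.sanitize === 'function';
}

if (!hasGlobalDOMPurify()) {
  // Warn early instead of failing later inside the sanitizer.
  console.warn('DOMPurify is not exposed globally; load it before url-sanitizer-wo-dompurify.min.js.');
}
```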
Usage
import urlSanitizer, {
isURI, isURISync, parseURL, parseURLSync, sanitizeURL, sanitizeURLSync
} from 'url-sanitizer';
sanitizeURL(url, opt)
Sanitize the given URL.
- blob, data and file schemes must be explicitly allowed.
- Given a blob URL, returns a sanitized data URL.
Parameters
Returns Promise<string?> Sanitized URL, nullable.
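The `only` option accepts a combined scheme such as git+https when the listed schemes cover each part. One way to picture that check is sketched here. `schemeMatchesOnly` is a hypothetical function written for illustration and is an assumption about the matching rule, not the library's implementation; it does not sanitize anything.

```javascript
// Illustrative only: one possible reading of the `only` option,
// not the library's actual implementation.
function schemeMatchesOnly(url, only) {
  // Take everything before the first ':' as the scheme.
  const index = url.indexOf(':');
  if (index < 1) {
    return false;
  }
  const scheme = url.slice(0, index).toLowerCase();
  const allowed = new Set(only);
  // A combined scheme such as 'git+https' passes when every part is listed.
  return scheme.split('+').every(part => allowed.has(part));
}

const only = ['data', 'git', 'https'];
console.log(schemeMatchesOnly('http://example.com', only)); // false
console.log(schemeMatchesOnly('https://example.com/', only)); // true
console.log(schemeMatchesOnly('git+https://example.com/foo.git', only)); // true
```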
// Sanitize tags and quotes
const res1 = await sanitizeURL('https://example.com/?<script>alert(1)</script>');
// => 'https://example.com/'
const res1_2 = await sanitizeURL('https://example.com/" onclick="alert(1)"');
// => 'https://example.com/'
// Can parse and sanitize data URL
const res2 = await sanitizeURL('data:text/html,<div><script>alert(1);</script></div><p onclick="alert(2)"></p>', {
allow: ['data']
})
// => 'data:text/html,%3Cdiv%3E%3C/div%3E%3Cp%3E%3C/p%3E'
console.log(decodeURIComponent(res2));
// => 'data:text/html,<div></div><p></p>'
// Also can parse and sanitize base64 encoded data
const base64data3 = btoa('<div><script>alert(1);</script></div>');
const res3 = await sanitizeURL(`data:text/html;base64,${base64data3}`, {
allow: ['data']
})
// => 'data:text/html,%3Cdiv%3E%3C/div%3E'
console.log(decodeURIComponent(res3));
// => 'data:text/html,<div></div>'
const base64data3_2 = btoa('<div><img src="javascript:alert(1)"></div>');
const res3_2 = await sanitizeURL(`data:text/html;base64,${base64data3_2}`, {
allow: ['data']
});
// => 'data:text/html,%3Cdiv%3E%3Cimg%3E%3C/div%3E'
console.log(decodeURIComponent(res3_2));
// => 'data:text/html,<div><img></div>'
// Can parse and sanitize blob URL
const blob4 = new Blob(['<svg><g onload="alert(1)"/></svg>'], {
type: 'image/svg+xml'
});
const url4 = URL.createObjectURL(blob4);
const res4 = await sanitizeURL(url4, {
allow: ['blob']
});
// => 'data:image/svg+xml,%3Csvg%3E%3Cg%3E%3C/g%3E%3C/svg%3E'
console.log(decodeURIComponent(res4));
// => 'data:image/svg+xml,<svg><g></g></svg>'
// Deny if the scheme matches the `deny` list
const res5 = await sanitizeURL('web+foo://example.com', {
deny: ['web+foo']
});
// => null
// Allow only if the scheme matches the `only` list
const res6 = await sanitizeURL('http://example.com', {
only: ['data', 'git', 'https']
});
// => null
const res6_2 = await sanitizeURL('https://example.com/"onmouseover="alert(1)"', {
only: ['data', 'git', 'https']
});
// => 'https://example.com/'
// `only` also allows combination of the schemes in the list
const res7 = await sanitizeURL('git+https://example.com/foo.git?<script>alert(1)</script>', {
only: ['data', 'git', 'https']
});
// => 'git+https://example.com/foo.git'
sanitizeURLSync(url, opt)
Synchronous version of sanitizeURL().
- data and file schemes must be explicitly allowed.
- blob scheme is not supported and returns null. Use the async sanitizeURL() for blob.
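Since the synchronous version returns null for blob URLs, a caller may want to decide up front which version to use. The sketch below uses only the standard URL API; `needsAsyncSanitizer` is a hypothetical helper and not something url-sanitizer exports.

```javascript
// Hypothetical helper (not part of url-sanitizer): decide whether a URL
// needs the async sanitizeURL() because its scheme is blob.
function needsAsyncSanitizer(url) {
  try {
    return new URL(url).protocol === 'blob:';
  } catch {
    // Input that the URL API cannot parse is not treated as a blob URL.
    return false;
  }
}

console.log(needsAsyncSanitizer('blob:nodedata:82ecc5a4-aea8-48d7-a407-64e2ef0913da')); // true
console.log(needsAsyncSanitizer('data:text/html,<div></div>')); // false
console.log(needsAsyncSanitizer('https://example.com/')); // false
```

When the helper returns true, the call would go to `await sanitizeURL(url, { allow: ['blob'] })`; otherwise `sanitizeURLSync(url, opt)` is enough.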
parseURL(url)
Parse the given URL.
- Blob URLs are simply parsed and not yet sanitized.
Parameters
- url string URL input.
Returns Promise<ParsedURL> Result.
ParsedURL
Object with additional properties based on URL API.
Type: object
Properties
- input string URL input.
- valid boolean Is valid URI.
- data object? Parsed result of data URL, nullable.
- href string? Sanitized URL input.
- origin string? Scheme, domain and port of the sanitized URL.
- protocol string? Protocol scheme of the sanitized URL.
- username string? Username specified before the domain name.
- password string? Password specified before the domain name.
- host string? Domain and port of the sanitized URL.
- hostname string? Domain of the sanitized URL.
- port string? Port number of the sanitized URL.
- pathname string? Path of the sanitized URL.
- search string? Query string of the sanitized URL.
- hash string? Fragment identifier of the sanitized URL.
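For orientation, the properties from href through hash share their names with the standard URL API. The sketch below uses only that standard API, with a made-up example URL, to show what each of those names holds; input, valid and data are added by the library and do not appear here.

```javascript
// Standard WHATWG URL API, available as a global in Node.js and browsers.
// The URL itself is an arbitrary example.
const url = new URL('https://user:pass@www.example.com:8080/path?foo=bar#baz');

const fields = {
  href: url.href,         // 'https://user:pass@www.example.com:8080/path?foo=bar#baz'
  origin: url.origin,     // 'https://www.example.com:8080'
  protocol: url.protocol, // 'https:'
  username: url.username, // 'user'
  password: url.password, // 'pass'
  host: url.host,         // 'www.example.com:8080'
  hostname: url.hostname, // 'www.example.com'
  port: url.port,         // '8080'
  pathname: url.pathname, // '/path'
  search: url.search,     // '?foo=bar'
  hash: url.hash          // '#baz'
};

console.log(fields);
```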
const res1 = await parseURL('javascript:alert(1)');
/* => {
input: 'javascript:alert(1)',
valid: false
} */
const res2 = await parseURL('https://www.example.com/?foo=bar#baz');
/* => {
input: 'https://www.example.com/?foo=bar#baz',
valid: true,
data: null,
href: 'https://www.example.com/?foo=bar#baz',
origin: 'https://www.example.com',
protocol: 'https:',
hostname: 'www.example.com',
pathname: '/',
search: '?foo=bar',
hash: '#baz',
...
} */
// base64 encoded SVG '<svg><g onclick="alert(1)"/></svg>'
const res3 = await parseURL('data:image/svg+xml;base64,PHN2Zz48ZyBvbmNsaWNrPSJhbGVydCgxKSIvPjwvc3ZnPg==');
/* => {
input: 'data:image/svg+xml;base64,PHN2Zz48ZyBvbmNsaWNrPSJhbGVydCgxKSIvPjwvc3ZnPg==',
valid: true,
data: {
mime: 'image/svg+xml',
base64: false,
data: '%3Csvg%3E%3Cg%3E%3C/g%3E%3C/svg%3E'
},
href: 'data:image/svg+xml,%3Csvg%3E%3Cg%3E%3C/g%3E%3C/svg%3E',
origin: 'null',
protocol: 'data:',
pathname: 'image/svg+xml,%3Csvg%3E%3Cg%3E%3C/g%3E%3C/svg%3E',
...
} */
// base64 encoded PNG
const res4 = await parseURL('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==');
/* => {
input: 'data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==',
valid: true,
data: {
mime: 'image/png',
base64: true,
data: 'iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg=='
},
href: 'data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==',
origin: 'null',
protocol: 'data:',
pathname: 'image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==',
...
} */
// Note that blob URLs are parsed but not yet sanitized
const blob5 = new Blob(['<svg><g onload="alert(1)"/></svg>'], {
type: 'image/svg+xml'
});
const url5 = URL.createObjectURL(blob5);
const res5 = await parseURL(url5);
/* => {
input: 'blob:nodedata:82ecc5a4-aea8-48d7-a407-64e2ef0913da',
valid: true,
data: null,
href: 'blob:nodedata:82ecc5a4-aea8-48d7-a407-64e2ef0913da',
origin: 'null',
protocol: 'blob:',
pathname: 'nodedata:82ecc5a4-aea8-48d7-a407-64e2ef0913da',
...
} */
parseURLSync(url)
Synchronous version of parseURL().
isURI(uri)
Verify if the given URI is valid and registered.
Parameters
- uri string URI input.
Returns Promise<boolean> Result.
- Always true for web+* and ext+* schemes, except web+javascript, web+vbscript, ext+javascript, ext+vbscript.
- false for javascript and vbscript schemes.
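The scheme rules above can be pictured as a small decision function. This is only a sketch of the stated rules: `classifyScheme` and its return labels are hypothetical, and any scheme the rules do not cover is left to the registered-scheme lookup, which is not modelled here.

```javascript
// Illustrative sketch of the scheme rules described above; the names and
// return labels are made up and this is not the library's source.
const BLOCKED = new Set(['javascript', 'vbscript']);

function classifyScheme(scheme) {
  const name = scheme.toLowerCase();
  if (BLOCKED.has(name)) {
    return 'rejected';
  }
  // web+* and ext+* are accepted unless the suffix is a blocked scheme.
  const match = /^(?:web|ext)\+(.+)$/.exec(name);
  if (match) {
    return BLOCKED.has(match[1]) ? 'rejected' : 'accepted';
  }
  // Everything else depends on whether the scheme is registered.
  return 'registry lookup';
}

console.log(classifyScheme('web+foo')); // 'accepted'
console.log(classifyScheme('web+javascript')); // 'rejected'
console.log(classifyScheme('javascript')); // 'rejected'
console.log(classifyScheme('https')); // 'registry lookup'
```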
const res1 = await isURI('https://example.com/foo');
// => true
const res2 = await isURI('javascript:alert(1)');
// => false
const res3 = await isURI('mailto:foo@example.com');
// => true
const res4 = await isURI('foo:bar');
// => false
const res5 = await isURI('web+foo:bar');
// => true
const res6 = await isURI('web+javascript:alert(1)');
// => false
isURISync(uri)
Synchronous version of isURI().
urlSanitizer
Instance of the sanitizer.
urlSanitizer.get()
Get a list of registered URI schemes.
Returns Array<string> Array of registered URI schemes.
- Includes schemes registered at iana.org by default.
- Historical schemes omitted.
- moz-extension scheme added.
- Also includes custom schemes added via urlSanitizer.add().
const schemes = urlSanitizer.get();
// => ['aaa', 'aaas', 'about', 'acap', 'acct', ...]
urlSanitizer.has(scheme)
Check if the given scheme is registered.
Parameters
- scheme string Scheme.
Returns boolean Result.
const res1 = urlSanitizer.has('https');
// => true
const res2 = urlSanitizer.has('foo');
// => false
urlSanitizer.add(scheme)
Add a scheme to the list of registered URI schemes.
- javascript and vbscript schemes cannot be registered; add() throws for them.
Parameters
- scheme string Scheme.
Returns Array<string> Array of registered URI schemes.
console.log(urlSanitizer.has('foo'));
// => false
const res = urlSanitizer.add('foo');
// => ['aaa', 'aaas', 'about', 'acap', ... 'foo', ...]
console.log(urlSanitizer.has('foo'));
// => true
urlSanitizer.remove(scheme)
Remove a scheme from the list of registered URI schemes.
Parameters
- scheme string Scheme.
Returns boolean Result.
- true if the scheme is successfully removed, false otherwise.
console.log(urlSanitizer.has('aaa'));
// => true
const res1 = urlSanitizer.remove('aaa');
// => true
console.log(urlSanitizer.has('aaa'));
// => false
const res2 = urlSanitizer.remove('foo');
// => false
urlSanitizer.reset()
Reset sanitizer.
Returns void
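To make the get/has/add/remove/reset contract easier to picture, here is a minimal Set-based stand-in. `SchemeRegistry` is a hypothetical class, not the library's source. It assumes that get() returns a sorted array and that reset() restores the default scheme list; neither detail is confirmed by the text above.

```javascript
// Illustrative stand-in for the scheme list; not the library's source.
class SchemeRegistry {
  #defaults;
  #schemes;

  constructor(defaults) {
    this.#defaults = [...defaults];
    this.#schemes = new Set(this.#defaults);
  }

  // Assumption: the list is returned in sorted order.
  get() {
    return [...this.#schemes].sort();
  }

  has(scheme) {
    return this.#schemes.has(scheme);
  }

  add(scheme) {
    // javascript and vbscript can never be registered.
    if (/^(?:java|vb)script$/i.test(scheme)) {
      throw new Error(`Invalid scheme: ${scheme}`);
    }
    this.#schemes.add(scheme);
    return this.get();
  }

  // Set#delete already returns true only when something was removed.
  remove(scheme) {
    return this.#schemes.delete(scheme);
  }

  // Assumption: resetting restores the default scheme list.
  reset() {
    this.#schemes = new Set(this.#defaults);
  }
}

const registry = new SchemeRegistry(['aaa', 'https', 'mailto']);
console.log(registry.add('foo')); // ['aaa', 'foo', 'https', 'mailto']
console.log(registry.remove('foo')); // true
console.log(registry.remove('foo')); // false
registry.remove('aaa');
registry.reset();
console.log(registry.has('aaa')); // true
```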
Acknowledgments
The following resources have been of great help in the development of the URL Sanitizer.
- DOMPurify
- Uniform Resource Identifier (URI) Schemes - IANA
- Encoding -- determine the character encoding of a text file. - file/file
Copyright (c) 2023 asamuzaK (Kazz)