url-regex-unsafe v3.0.2
Regular expression matching for URLs. A maintained, browser-friendly version of url-regex. This package is vulnerable to CVE-2020-7661. Works in Node v10.12.0+ and browsers.
Foreword
url-regex-unsafe is a fork of url-regex-safe, which is a fork of url-regex. url-regex-safe has resolved CVE-2020-7661 on Node by including RE2 for Node.js usage. However, RE2 does not support lookahead assertions in regular expressions, which leads to some limitations. To avoid these limitations, url-regex-unsafe gets rid of RE2 and uses built-in RegExp instead. This means that url-regex-unsafe is still vulnerable to CVE-2020-7661.
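Because the built-in RegExp engine backtracks, untrusted input is where this vulnerability matters most. One common mitigation is to cap the input length before matching. The sketch below is not part of this package and is not a complete fix: a short crafted string can still be expensive to match. The helper name `findUrlsBounded` and the limit `MAX_INPUT_LENGTH` are hypothetical, and `makeRegex` stands for the function exported by url-regex-unsafe, passed in as a parameter so the snippet runs on its own.

```javascript
// Sketch only: MAX_INPUT_LENGTH is an arbitrary illustrative limit, and
// `makeRegex` stands in for the function exported by url-regex-unsafe.
const MAX_INPUT_LENGTH = 2000;

function findUrlsBounded(text, makeRegex, maxLength = MAX_INPUT_LENGTH) {
  if (typeof text !== 'string') {
    throw new TypeError('text must be a string');
  }
  // Reject oversized input instead of truncating it, because a cut could
  // split a URL in half and produce a misleading partial match.
  if (text.length > maxLength) {
    throw new RangeError(`input exceeds ${maxLength} characters`);
  }
  // String.prototype.match returns null when nothing matches.
  return text.match(makeRegex()) || [];
}

// Usage, assuming the package is installed:
//   const urlRegexUnsafe = require('url-regex-unsafe');
//   const urls = findUrlsBounded(userInput, urlRegexUnsafe);
```

Choosing the limit is a trade-off between accepting legitimate long text and bounding worst-case matching time, so it is worth measuring against your own inputs.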
Install
npm:
npm install url-regex-unsafe
yarn:
yarn add url-regex-unsafe
Usage
Node
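const urlRegexUnsafe = require('url-regex-unsafe');
const str = 'some long string with url.com in it';
// String.prototype.match returns null when nothing matches, so fall back to an empty array.
const matches = str.match(urlRegexUnsafe()) || [];
for (const match of matches) {
  console.log('match', match);
}
// With exact: true, the whole string must be a URL.
console.log(urlRegexUnsafe({ exact: true }).test('github.com'));
Browser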
VanillaJS
This is the solution for you if you're just using <script> tags everywhere!
<script src="https://unpkg.com/url-regex-unsafe"></script>
<script type="text/javascript">
  (function () {
    var str = 'some long string with url.com in it';
    // String.prototype.match returns null when nothing matches, so fall back to an empty array.
    var matches = str.match(urlRegexUnsafe()) || [];
    for (var i = 0; i < matches.length; i++) {
      console.log('match', matches[i]);
    }
    console.log(urlRegexUnsafe({ exact: true }).test('github.com'));
  })();
</script>
Bundler
Assuming you are using browserify, webpack, rollup, or another bundler, you can simply follow Node usage above.
TypeScript
This package has built-in support for TypeScript.
Options
| Property | Type | Default Value | Description |
|---|---|---|---|
| exact | Boolean | false | Only match an exact String. Useful with regex.test(str) to check if a String is a URL. We set this to false by default in order to match String values such as github.com (as opposed to requiring a protocol or www subdomain). We feel this more closely resembles real-world intended usage of this package. |
| strict | Boolean | false | Force URLs to start with a valid protocol or www if set to true. If true, then it will allow any TLD as long as it is a minimum of 2 valid characters. If it is false, then it will match the TLD against the list of valid TLDs using tlds. |
| auth | Boolean | false | Match against Basic Authentication headers. We set this to false by default since it was deprecated in Chromium, and otherwise it leaves the user with unwanted URL matches (a false default also more closely resembles real-world intended usage of this package). |
| localhost | Boolean | true | Allows localhost in the URL hostname portion. See test/test.js for more insight into the localhost test and how it will return a value which may be unwanted. A pull request would be considered to resolve the "pic.jp" vs. "pic.jpg" issue. |
| parens | Boolean | false | Match against Markdown-style trailing parentheses. We set this to false because it should be up to the user to parse for Markdown URLs. |
| apostrophes | Boolean | false | Match against apostrophes. We set this to false because we don't want the String background: url('http://example.com/pic.jpg'); to result in http://example.com/pic.jpg'. See this issue for more information. |
| trailingPeriod | Boolean | false | Match against trailing periods. We set this to false by default since real-world usage would want example.com rather than example.com. as the match (unlike url-regex, which matches the trailing period). |
| ipv4 | Boolean | true | Match against IPv4 URLs. |
| ipv6 | Boolean | true | Match against IPv6 URLs. |
| tlds | Array | tlds | Match against a specific list of tlds, or the default list provided by tlds. |
| returnString | Boolean | false | Return the RegExp as a String instead of a RegExp (useful for custom logic, such as we did with Spam Scanner). |
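As a rough illustration of how these options are passed, the sketch below wraps two common calls: validating a single value with exact: true, and extracting every match from a longer text. The helper names `isUrl` and `extractUrls` are hypothetical and not part of this package; `makeRegex` stands for the function exported by url-regex-unsafe and is injected as a parameter so the snippet runs on its own.

```javascript
// Sketch only: `makeRegex` stands in for the function exported by
// url-regex-unsafe. Both helpers are hypothetical conveniences.

// Check whether an entire string is a URL. The caller's options are kept,
// but exact is always forced to true.
function isUrl(value, makeRegex, options = {}) {
  return makeRegex({ ...options, exact: true }).test(value);
}

// Collect every URL-like match in a text using the given options.
function extractUrls(text, makeRegex, options = {}) {
  // String.prototype.match returns null when nothing matches.
  return text.match(makeRegex(options)) || [];
}

// Usage, assuming the package is installed:
//   const urlRegexUnsafe = require('url-regex-unsafe');
//   isUrl('github.com', urlRegexUnsafe);
//   extractUrls(text, urlRegexUnsafe, { strict: true, localhost: false });
```

Creating a fresh regex on each call, as above, avoids sharing one RegExp object (and its lastIndex state) between unrelated calls.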
Quick tips and migration from url-regex
You must override the default and set strict: true if you do not wish to match github.com by itself (www.github.com still matches when strict is true, because strict mode accepts a leading www as well as a protocol).
Unlike the deprecated and unmaintained package url-regex, we do a few things differently:
- We set strict to false by default (url-regex had this set to true).
- We added an auth option, which is set to false by default (url-regex matches against Basic Authentication, effectively having this set to true, although this is a deprecated behavior in Chromium).
- We added parens and ipv6 options, which are set to false and true by default, respectively (url-regex had parens set to true, and ipv6 was non-existent or effectively set to false).
- We added an apostrophes option, which is set to false by default (url-regex had this set to true).
- We added a trailingPeriod option, which is set to false by default (which means matches won't contain trailing periods, whereas url-regex had this set to true).
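If you are migrating and want behavior closer to url-regex, the differences listed above can be collected into a single options preset. This is a hypothetical sketch: the names `URL_REGEX_LEGACY_OPTIONS` and `withLegacyOptions` are not part of this package, and the preset has not been checked against url-regex output, so treat it as an approximation.

```javascript
// Hypothetical preset built from the differences listed above. It only
// approximates url-regex behavior and has not been verified against it.
const URL_REGEX_LEGACY_OPTIONS = Object.freeze({
  strict: true,
  auth: true,
  parens: true,
  ipv6: false,
  apostrophes: true,
  trailingPeriod: true
});

// Merge caller overrides on top of the legacy-style preset.
function withLegacyOptions(overrides = {}) {
  return { ...URL_REGEX_LEGACY_OPTIONS, ...overrides };
}

// Usage, assuming the package is installed:
//   const urlRegexUnsafe = require('url-regex-unsafe');
//   const regex = urlRegexUnsafe(withLegacyOptions({ ipv6: true }));
```

Keeping the preset frozen and merging into a new object means individual call sites can relax one setting without affecting the others.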
Contributors
| Name | Website |
|---|---|
| ocavue | https://github.com/ocavue/ |
| Nick Baugh | http://niftylettuce.com/ |
| Kevin Mårtensson | |
| Diego Perini | |
License
MIT © ocavue