url-reference v0.8.0
URLReference
"URL or Relative Reference"
The URLReference class is designed to overcome shortcomings of the URL class.
Features
- Supports Relative and scheme-less URLs.
- Supports Nullable Components.
- Distinct Rebase, Normalize and Resolve methods.
- Resolve is Behaviourally Equivalent with the WHATWG URL Standard.
Examples
new URLReference ('filename.txt#top', '//host') .href
// => '//host/filename.txt#top'
new URLReference ('?do=something', './path/to/resource') .href
// => './path/to/resource?do=something'
new URLReference ('take/action.html') .resolve ('http://🌲') .href
// => 'http://xn--vh8h/take/action.html'API
Summary
The module exports a single class URLReference with nullable properties (getters/setters):
scheme,username,password,hostname,port,pathname,pathroot,driveletter,filename,query,fragment.
It has three key methods:
rebase,normalizeandresolve.
It can be converted to an ASCII, or to a Unicode string via:
- the
hrefgetter and thetoStringmethod.
Terminology
Special: The WHATWG URL standard uses the phrase "special URL" for URLs that have a "special scheme". A "special scheme" is a scheme that is equivalent to
http,https,ws,wss,ftporfile.Hierarchical: The path of an URL may either be hierarchical or opaque. An hierarchical path is subdivided into smaller components, an opaque path is not.
(!) The path of a "special" URL is always hierarchical. The path of a non-special URL is hierarchical if the URL has an authority or otherwise if its path starts with a path-root
/.Rebase is a generalisation of reference transformation as defined in the RFC3986 (URI) that supports constructing relative references. The base argument may be a relative reference, in addition to an absolute URL.
Non-strict: The RFC3986 (URI) standard defines a strict and a non-strict variant of reference transformation. The non-strict variant ignores the scheme of the input if it is equivalent to the scheme of the base. The WHATWG uses the non-strict behaviour for "special" URLs and the strict behaviour for other URLs.
URLReference
Constructor
new URLReference ()new URLReference (input)new URLReference (input, base)
Constructs a new URLReference object. The result may represent a relative URL. The resolve method can be used to ensure that the result represents an absolute URL.
Arguments input and base are optional. Each may be a string to be parsed, or an existing URLReference object. If a base argument is supplied, then input is rebased onto base after parsing.
const r1 = new URLReference ();
// r.href == '' // The 'empty relative URL'
const r2 = new URLReference ('/big/trees/');
// r.href == '/big/trees/'
const r3 = new URLReference ('index.html', '/big/trees/');
// r.href == '/big/trees/index.html'
const r4 = new URLReference ('README.md', r3);
// r.href == '/big/trees/README.md'Parsing behaviour
The parsing behaviour is adapted according to the scheme of input or the scheme of base otherwise.
- The hostname is parsed as an opaque hostname string.
- The parsing and validation of a hostname as a domain is done in the resolve method instead.
- The invalid
\code-points before the host and in the path are converted to/if the input has a special scheme or if it has no scheme at all. - Windows drive letters are detected if the scheme is equivalent to
fileor if no scheme is present. - If no scheme is present and a windows drive letter is detected then then the scheme is implicitly set to
file.
const r1 = new URLReference ('\\foo\\bar', 'http:')
// r1.href == 'http:/foo/bar'
const r2 = new URLReference ('\\foo\\bar', 'ofp:/')
// r2.href == 'ofp:/\\foo\\bar'
const r3 = new URLReference ('/c:/path/to/file')
// r3.href == 'file:/c:/path/to/file'
// r3.hostname == null
// r3.driveletter == 'c:'
const r4 = new URLReference ('/c:/path/to/file', 'http:')
// r4.href == 'http:/c:/path/to/file'
// r4.hostname == null
// r4.driveletter == nullMethods
Rebase – urlReference .rebase (base)
The base argument may be a string or an URLReference object.
Rebase implements a slight generalisation of reference transformation as defined in RFC3986 (URI). Rebase returns a new URLReference instance, or throws an error if the base argument reprensents an URL with an opaque path.
Rebase applies a "non-strict" reference transformation to URLReferences that have a "special scheme". This legacy behaviour is required to achieve compatibility with the WHATWG URL Standard.
Note: A "non-strict reference transformation" ignores the scheme of the input if it matches the scheme of the base. This has a surprising consequence: An URLReference that has a special scheme may still behave as a relative URL:
const base = new URLReference ('http://host/dir/')
const rel = new URLReference ('http:?do=something')
const rebased = rel.rebase (base)
// rebased.href == 'http://host/dir/?do=something'Rebase applies a "strict" reference transformation to non-special URLReferences. The strict variant does not remove the scheme from the input.
const base = new URLReference ('ofp://host/dir/')
const abs = new URLReference ('ofp:?do=something')
const rebased = abs.rebase (base)
// rebased.href == 'ofp:?do=something'It is not possible to rebase a relative URLReference on a base that has an opaque path.
const base = new URLReference ('ofp:this/is/an/opaque-path/')
const rel = new URLReference ('filename.txt')
// const rebased = rel.rebase (base) // throws:
// TypeError: Cannot rebase <filename.txt> onto <ofp:this/is/an/opaque-path/>
const base2 = new URLReference ('ofp:/not/an/opaque-path/')
const rebased = rel.rebase (base2) // This works as expected
// rebased.href == 'ofp:/not/an/opaque-path/filename.txt'Normalize – urlReference .normalize ()
Normalize collapses dotted segments in the path, removes default ports and percent encodes certain code-points. It behaves in the same way as the WHATWG URL constructor, except for the fact that it supports relative URLs. Normalize always returns a new URLReference instance.
Resolve
urlReference .resolve ()urlReference .resolve (base)
The optional base argument may be a string or an existing URLReference object.
Resolve returns a new URLReference that represents an absolute URL, or throws an error if this is not possible. It uses the same forceful error correcting behaviour as the WHATWG URL constructor.
Note: An unpleasant aspect of the WHATWG behaviour is that if the input is a non-file special URL, and the input has no authority, then the first non-empty path component will be coerced to an authority:
const r1 = new URLReference ('http:/foo/bar')
// r.host == null
// r.pathname == '/foo/bar'
const r2 = r1.resolve ('http://host/')
// The scheme of r1 is ignored because it matches the base.
// Thus the hostname is taken from the base.
// r2.href == 'http://host/foo/bar'
const r3 = r1.resolve ()
// r1 does not have an authority, so the first non-empty path
// component `foo` is coerced into an authority for the result.
// r1.href == 'http://foo/bar'Resolve does additional processing and checks on the authority:
- Asserts that file-URLs and web-URLs have an authority.
- Asserts that the authority of web-URLs is not empty.
- Asserts that file-URLs do not have a username, password or port.
- Parses opaque hostnames of file-URLs and web-URLs as a domain or an ip-address.
String – urlReference .toString ()
Converts the URLReference to a string. This preserves unicode characters in the URL, unlike the href getter which ensures that the result consists of ASCII code-points only.
new URLReference ('take/action.html') .resolve ('http://🌲') .toString ()
// => 'http://🌲/take/action.html'
new URLReference ('take/action.html') .resolve ('http://🌲') .href
// => 'http://xn--vh8h/take/action.html'Properties
Access to the components of the URLReference goes through the following getters/setters. All properties are nullable, however some invariants are maintained.
schemeusernamepasswordhostnameportpathname
driveletterpathrootfilename
queryfragment
Licence
MIT Licenced.
2 years ago