UriParser
class UriParser
Provides an RFC 3986 compliant solution to URL parsing.
UriParser provides a method for parsing URLs that accurately complies with
the RFC specification. Unlike the built-in function parse_url(), the parser
in this library is based on the ABNF definition of the generic URI syntax. In
other words, this library rejects invalid URLs and parses valid ones exactly
as defined in the specification.
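To illustrate the difference in strictness, the following sketch uses only PHP built-ins. It shows parse_url() returning components for a URL that contains a raw space, which RFC 3986 does not permit. The helper hasOnlyUriCharacters() is hypothetical and performs only a character-level check; it is not the ABNF-based parsing that this library performs.

```php
<?php

/**
 * Hypothetical helper: checks that a string consists solely of characters
 * that RFC 3986 permits somewhere in a URI (unreserved, reserved, and "%").
 * This is a character-level check only, not a full ABNF parser.
 */
function hasOnlyUriCharacters(string $uri): bool
{
    return preg_match('/^[A-Za-z0-9\-._~:\/?#\[\]@!$&\'()*+,;=%]*\z/', $uri) === 1;
}

// A raw space in the path is invalid in a URI; it must be written as %20.
$input = 'http://example.com/foo bar';

// parse_url() does not validate its input and still returns components.
$parts = parse_url($input);

var_dump($parts['path']);                                         // string(8) "/foo bar"
var_dump(hasOnlyUriCharacters($input));                           // bool(false)
var_dump(hasOnlyUriCharacters('http://example.com/foo%20bar'));   // bool(true)
```

A strict parser is expected to reject the first input instead of returning partial components.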
While the intention of this library is to provide an accurate implementation for URL parsing, it is possible to use this library for parsing any kind of valid URI, since the parsing is based on the generic URI syntax. Some of the features are simply better suited to dealing with URLs. The parser, however, does not provide any additional validation based on the URI scheme.
While the RFC specification does not allow UTF-8 characters in URIs, they are still commonly used, especially in user input. To accommodate this, the parser provides two additional compatibility modes that permit UTF-8 in some of the URI components and provide simple support for international domain names.
Constants
MODE_RFC3986 | Parsing mode that conforms strictly to the RFC 3986 specification
MODE_UTF8 | Parsing mode that allows UTF-8 characters in some URI components
MODE_IDNA2003 | Parsing mode that also converts international domain names to ASCII
Methods
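__construct() | Creates a new instance of UriParser.
setMode(int $mode) | Sets the parsing mode.
parse(string $uri) | Parses the URL using the generic URI syntax.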
Details
at line 61
__construct()
Creates a new instance of UriParser.
at line 85
setMode(int $mode)
Sets the parsing mode.
The parser supports three different parsing modes, as indicated by the available parsing mode constants. The modes are as follows:
MODE_RFC3986 adheres strictly to the RFC specification and does not allow any non-ASCII characters in the URIs. This is the default mode.
MODE_UTF8 allows UTF-8 characters in the user information, path, query and fragment components of the URI. These characters will be converted to the appropriate percent-encoded sequences.
MODE_IDNA2003 also allows UTF-8 characters in the domain name and converts the international domain name to ASCII according to the IDNA 2003 standard.
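The conversion of UTF-8 characters to percent-encoded sequences can be sketched with PHP built-ins. The function percentEncodeNonAscii() below is a hypothetical helper written for illustration; it is not the library's implementation and assumes the input is valid UTF-8.

```php
<?php

/**
 * Hypothetical helper, not part of the library: percent-encodes every
 * non-ASCII byte of a string and leaves ASCII characters untouched.
 */
function percentEncodeNonAscii(string $component): string
{
    // Without the /u modifier the pattern matches single bytes, so each byte
    // of a multi-byte UTF-8 character becomes its own %XX sequence.
    return preg_replace_callback(
        '/[^\x00-\x7F]/',
        function (array $match): string {
            return rawurlencode($match[0]);
        },
        $component
    );
}

// "ö" is encoded in UTF-8 as the bytes C3 B6, and "ä" as C3 A4.
echo percentEncodeNonAscii("/f\u{F6}\u{F6}/b\u{E4}r"), "\n"; // /f%C3%B6%C3%B6/b%C3%A4r
```

ASCII characters, including existing percent-encoded sequences, pass through unchanged in this sketch.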
at line 102
Uri|null parse(string $uri)
Parses the URL using the generic URI syntax.
This method returns the Uri instance constructed from the components
parsed from the URL. The URL is parsed using either the absolute URI
pattern or the relative URI pattern, based on which one matches the
provided string. If the URL cannot be parsed as a valid URI, null is
returned instead.
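Because parse() signals failure by returning null, callers should check the result before using it. The sketch below wraps that check in a hypothetical helper named parseOrFail(). The namespace Riimu\Kit\UrlParser is an assumption, since this page does not state it, and the code expects the library to be loaded through an autoloader before the helper is called.

```php
<?php

// Assumed namespace; adjust it to match the installed library.
use Riimu\Kit\UrlParser\UriParser;

/**
 * Hypothetical helper: parses $input in the given mode and returns the
 * resulting Uri instance, or throws when the parser returns null.
 */
function parseOrFail(string $input, int $mode)
{
    $parser = new UriParser();
    $parser->setMode($mode);

    $uri = $parser->parse($input);

    if ($uri === null) {
        throw new InvalidArgumentException("Not a valid URI: $input");
    }

    return $uri;
}
```

A call such as parseOrFail($userInput, UriParser::MODE_UTF8) would then accept UTF-8 in the path and query while still rejecting strings that do not match the generic URI syntax.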