\idna_convert
Encode/decode Internationalized Domain Names.
The class converts internationalized domain names (see RFC 3490 for details),
as used by various registries worldwide, between their original (localized) form
and their encoded form as used in the DNS (Domain Name System).
The class provides two public methods, encode() and decode(), which do exactly
what you would expect them to do. You may pass complete domain names,
simple strings, and complete email addresses. This means you can
use any of the following notations:
- www.nörgler.com
- xn--nrgler-wxa
- xn--brse-5qa.xn--knrz-1ra.info
Unicode input may be given as a UTF-8 string, a UCS-4 string, or a UCS-4 array.
Unicode output is available in the same formats.
You can select your preferred format via set_parameter().
ACE input and output are always expected to be ASCII.
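As a rough illustration of the encode()/decode() pairing described above, the following sketch wraps both calls in a small helper. The helper name idn_round_trip() and the file name in the commented-out require_once line are assumptions made for this example and are not part of the class.

```php
<?php
// Assumption: the file name below is a guess; adjust it to your installation.
// require_once 'idna_convert.class.php';

/**
 * Encode a name to its ACE form and decode it back again.
 *
 * $converter is expected to expose encode() and decode() as listed in this
 * reference. idn_round_trip() itself is a hypothetical helper.
 */
function idn_round_trip($converter, string $name): array
{
    $ace = $converter->encode($name);     // e.g. a Unicode name to its xn-- form
    $unicode = $converter->decode($ace);  // and back to the localized form
    return ['ace' => $ace, 'unicode' => $unicode];
}

// Possible usage once the class has been loaded:
// $result = idn_round_trip(new idna_convert(), 'www.nörgler.com');
// echo $result['ace'], "\n", $result['unicode'], "\n";
```

Passing the converter in as an argument keeps the helper independent of how the instance was created, for example via the constructor or via getInstance().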
- Author: Matthias Sommerfeld <mso@phlylabs.de>
- Copyright: 2004-2011 phlyLabs Berlin, http://phlylabs.de
Synopsis
- // members
- protected $_punycode_prefix;
- protected $_invalid_ucs;
- protected $_max_ucs;
- protected $_base;
- protected $_tmin;
- protected $_tmax;
- protected $_skew;
- protected $_damp;
- protected $_initial_bias;
- protected $_initial_n;
- protected $_sbase;
- protected $_lbase;
- protected $_vbase;
- protected $_tbase;
- protected $_lcount;
- protected $_vcount;
- protected $_tcount;
- protected $_ncount;
- protected $_scount;
- protected $_error;
- protected $_mb_string_overload;
- protected $_api_encoding;
- protected $_allow_overlong;
- protected $_strict_mode;
- protected $_idn_version;
- protected $NP;
- // methods
- public boolean __construct()
- public boolean set_parameter()
- public string decode()
- public string encode()
- public string encode_uri()
- public string get_last_error()
- protected mixed _decode()
- protected mixed _encode()
- protected int _adapt()
- protected string _encode_digit()
- protected int _decode_digit()
- protected void _error()
- protected string _nameprep()
- protected array _hangul_decompose()
- protected array _hangul_compose()
- protected integer _get_combining_class()
- protected array _apply_cannonical_ordering()
- protected array _combine()
- protected string _utf8_to_ucs4()
- protected string _ucs4_to_utf8()
- protected string _ucs4_to_ucs4_string()
- protected array _ucs4_string_to_ucs4()
- protected static integer byteLength()
- public idna_convert getInstance()
- public singleton()
Members
protected
- $NP (holds all relevant mapping tables; see RFC 3454 for details)
- $_allow_overlong
- $_api_encoding
- $_base
- $_damp
- $_error
- $_idn_version
- $_initial_bias
- $_initial_n
- $_invalid_ucs
- $_lbase
- $_lcount
- $_max_ucs
- $_mb_string_overload
- $_ncount
- $_punycode_prefix
- $_sbase
- $_scount
- $_skew
- $_strict_mode
- $_tbase
- $_tcount
- $_tmax
- $_tmin
- $_vbase
- $_vcount
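Several of the members above ($_base, $_tmin, $_tmax, $_skew, $_damp, $_initial_bias, $_initial_n) share their names with the Punycode parameters of RFC 3492. The sketch below shows the bias adaptation and digit mapping from that RFC using the parameter values given there. That the class members hold exactly these values, and that _adapt(), _encode_digit() and _decode_digit() behave this way, is an assumption; the function and constant names are hypothetical.

```php
<?php
// Parameter values as given in RFC 3492, section 5.
const PUNY_BASE = 36;
const PUNY_TMIN = 1;
const PUNY_TMAX = 26;
const PUNY_SKEW = 38;
const PUNY_DAMP = 700;
const PUNY_INITIAL_BIAS = 72;
const PUNY_INITIAL_N = 0x80;

/** Bias adaptation as described in RFC 3492, section 6.1. */
function punycode_adapt(int $delta, int $numPoints, bool $firstTime): int
{
    // The first delta is damped more strongly than later ones.
    $delta = $firstTime ? intdiv($delta, PUNY_DAMP) : intdiv($delta, 2);
    $delta += intdiv($delta, $numPoints);

    $k = 0;
    while ($delta > intdiv((PUNY_BASE - PUNY_TMIN) * PUNY_TMAX, 2)) {
        $delta = intdiv($delta, PUNY_BASE - PUNY_TMIN);
        $k += PUNY_BASE;
    }
    return $k + intdiv((PUNY_BASE - PUNY_TMIN + 1) * $delta, $delta + PUNY_SKEW);
}

/** Map a digit value 0..35 to 'a'..'z' followed by '0'..'9'. */
function punycode_encode_digit(int $d): string
{
    return chr($d + 22 + ($d < 26 ? 75 : 0));
}

/** Map a basic code point back to its digit value; PUNY_BASE means invalid. */
function punycode_decode_digit(string $c): int
{
    $cp = ord($c);
    if ($cp >= 48 && $cp <= 57) {
        return $cp - 22;          // '0'..'9' map to 26..35
    }
    if ($cp >= 65 && $cp <= 90) {
        return $cp - 65;          // 'A'..'Z' map to 0..25
    }
    if ($cp >= 97 && $cp <= 122) {
        return $cp - 97;          // 'a'..'z' map to 0..25
    }
    return PUNY_BASE;
}
```

The two digit functions are inverses of each other for the values 0 to 35, which is what lets a decoder recover the deltas an encoder wrote.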
Methods
protected
- _adapt() — Adapt the bias according to the current code point and position
- _apply_cannonical_ordering() — Applies the canonical ordering to a decomposed UCS4 sequence
- _combine() — Do composition of a sequence of starter and non-starter characters
- _decode() — The actual decoding algorithm
- _decode_digit() — Decode a certain digit
- _encode() — The actual encoding algorithm
- _encode_digit() — Encode a certain digit
- _error() — Internal error handling method
- _get_combining_class() — Returns the combining class of a certain wide char
- _hangul_compose() — Composes a Hangul syllable (see http://www.unicode.org/unicode/reports/tr15/#Hangul)
- _hangul_decompose() — Decomposes a Hangul syllable (see http://www.unicode.org/unicode/reports/tr15/#Hangul)
- _nameprep() — Do Nameprep according to RFC3491 and RFC3454
- _ucs4_string_to_ucs4() — Convert UCS-4 string into UCS-4 array
- _ucs4_to_ucs4_string() — Convert UCS-4 array into UCS-4 string
- _ucs4_to_utf8() — Convert UCS-4 string into UTF-8 string. See _utf8_to_ucs4() for details.
- _utf8_to_ucs4() — Converts a UTF-8 encoded string to its UCS-4 representation. UCS-4 "strings" here are arrays of 32-bit integers, one per character, because PHP strings cannot use a bit depth other than 8. This also applies to the reverse method, _ucs4_to_utf8().
- byteLength() — Gets the length of a string in bytes even if mbstring function overloading is turned on
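The description of _utf8_to_ucs4() above treats a UCS-4 "string" as an array of integers, one per character. The following sketch illustrates that idea with a plain UTF-8 decoder and its reverse. It is a simplified stand-in, not the class's own implementation: the function names are hypothetical, and overlong encodings and surrogate code points are not rejected here.

```php
<?php
/** Decode a UTF-8 byte string into an array of integer code points. */
function utf8_to_codepoints(string $input): array
{
    $out = [];
    // strlen() counts bytes here; this assumes mbstring function
    // overloading is not active.
    $len = strlen($input);
    $i = 0;
    while ($i < $len) {
        $byte = ord($input[$i]);
        if ($byte < 0x80) {
            $cp = $byte;
            $extra = 0;
        } elseif (($byte & 0xE0) === 0xC0) {
            $cp = $byte & 0x1F;
            $extra = 1;
        } elseif (($byte & 0xF0) === 0xE0) {
            $cp = $byte & 0x0F;
            $extra = 2;
        } elseif (($byte & 0xF8) === 0xF0) {
            $cp = $byte & 0x07;
            $extra = 3;
        } else {
            throw new InvalidArgumentException("Invalid lead byte at offset $i");
        }
        if ($i + $extra > $len - 1) {
            throw new InvalidArgumentException("Truncated sequence at offset $i");
        }
        // Each continuation byte contributes its low six bits.
        for ($j = 1; $j <= $extra; $j++) {
            $next = ord($input[$i + $j]);
            if (($next & 0xC0) !== 0x80) {
                throw new InvalidArgumentException("Invalid continuation byte");
            }
            $cp = ($cp << 6) | ($next & 0x3F);
        }
        $out[] = $cp;
        $i += $extra + 1;
    }
    return $out;
}

/** Encode an array of integer code points as a UTF-8 byte string. */
function codepoints_to_utf8(array $codepoints): string
{
    $out = '';
    foreach ($codepoints as $cp) {
        if ($cp < 0x80) {
            $out .= chr($cp);
        } elseif ($cp < 0x800) {
            $out .= chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
        } elseif ($cp < 0x10000) {
            $out .= chr(0xE0 | ($cp >> 12)) . chr(0x80 | (($cp >> 6) & 0x3F))
                  . chr(0x80 | ($cp & 0x3F));
        } elseif ($cp <= 0x10FFFF) {
            $out .= chr(0xF0 | ($cp >> 18)) . chr(0x80 | (($cp >> 12) & 0x3F))
                  . chr(0x80 | (($cp >> 6) & 0x3F)) . chr(0x80 | ($cp & 0x3F));
        } else {
            throw new InvalidArgumentException("Code point out of range: $cp");
        }
    }
    return $out;
}
```

Working on integer arrays makes later steps such as Nameprep mapping or Punycode delta calculation simple array operations, at the cost of one conversion on the way in and one on the way out.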
public
- __construct() — The constructor
- decode() — Decode a given ACE domain name
- encode() — Encode a given UTF-8 domain name
- encode_uri() — Removes a weakness of encode(), which cannot properly handle URIs but instead encodes their path or query components, too.
- getInstance() — Attempts to return a concrete IDNA instance.
- get_last_error() — Use this method to get the last error that occurred
- set_parameter() — Sets a new option value. Available options and values:
  - encoding: the input format, one of 'utf8' (UTF-8), 'ucs4_string' (UCS4 as string) or 'ucs4_array' (UCS4 as array). The output is always UTF-8.
  - overlong: Unicode does not allow unnecessarily long encodings of chars. Set this parameter to true to allow them, otherwise to false; the default is false.
  - strict: true selects strict mode, good for registration purposes, which causes errors on failures; false selects loose mode, ideal for "wildlife" applications, which silently ignores errors and returns the original input instead.
- singleton() — Attempts to return a concrete IDNA instance for either php4 or php5, only creating a new instance if no IDNA instance with the same parameters currently exists.
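The entry for encode_uri() above notes that encode() would also touch the path and query of a URI. One way to avoid that, sketched here under assumptions, is to split the URI with PHP's parse_url(), encode only the host, and reassemble the rest unchanged. The function encode_host_only() and its callable parameter are hypothetical and do not describe how encode_uri() is actually implemented.

```php
<?php
/**
 * Apply a host encoder to the host component of a URI only.
 *
 * $encodeHost is any callable taking and returning a string, for example
 * [$idn, 'encode'] with an idna_convert instance (assumed usage).
 */
function encode_host_only(string $uri, callable $encodeHost): string
{
    $parts = parse_url($uri);
    if ($parts === false || !isset($parts['host'])) {
        return $uri; // no host found, so leave the input untouched
    }

    // Scheme-relative URIs ("//host/path") have a host but no scheme.
    $out = isset($parts['scheme']) ? $parts['scheme'] . '://' : '//';
    if (isset($parts['user'])) {
        $out .= $parts['user'];
        if (isset($parts['pass'])) {
            $out .= ':' . $parts['pass'];
        }
        $out .= '@';
    }
    $out .= $encodeHost($parts['host']);
    if (isset($parts['port'])) {
        $out .= ':' . $parts['port'];
    }
    $out .= $parts['path'] ?? '';
    if (isset($parts['query'])) {
        $out .= '?' . $parts['query'];
    }
    if (isset($parts['fragment'])) {
        $out .= '#' . $parts['fragment'];
    }
    return $out;
}

// Possible usage once the class has been loaded:
// $idn = new idna_convert();
// echo encode_host_only('http://www.nörgler.com/some/path?q=1', [$idn, 'encode']);
```

Taking the encoder as a callable keeps the sketch testable without the class and makes it clear that only the host ever passes through the encoder.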