time-to-botec

Benchmark sampling in different programming languages

repl.txt (926B)



{{alias}}( str )
    Converts a UTF-16 encoded string to an array of integers using UTF-8
    encoding.

    The following byte sequences are used to represent a character. The sequence
    depends on the code point:

        0x00000000 - 0x0000007F:
            0xxxxxxx

        0x00000080 - 0x000007FF:
            110xxxxx 10xxxxxx

        0x00000800 - 0x0000FFFF:
            1110xxxx 10xxxxxx 10xxxxxx

        0x00010000 - 0x001FFFFF:
            11110xxx 10xxxxxx 10xxxxxx 10xxxxxx

    The `x` bit positions correspond to code point bits.

    Only the shortest possible multi-byte sequence which can represent a code
    point is used.

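    The byte layout described above can be sketched roughly as follows. This is
    an illustrative sketch only, not the implementation behind {{alias}}: the
    name `utf16ToUTF8Sketch` is hypothetical, and the handling of unpaired
    surrogates (encoded as-is in three bytes) is an assumption, since the text
    above does not say how they are treated.

```javascript
'use strict';

// Hypothetical helper (not the documented function): converts a JavaScript
// (UTF-16) string to an array of UTF-8 byte values.
function utf16ToUTF8Sketch( str ) {
    var code;
    var low;
    var out;
    var i;

    out = [];
    for ( i = 0; i < str.length; i++ ) {
        code = str.charCodeAt( i );

        // Combine a high surrogate and a following low surrogate into a
        // single code point in the range 0x10000 - 0x10FFFF:
        if ( code >= 0xD800 && code <= 0xDBFF && i + 1 < str.length ) {
            low = str.charCodeAt( i + 1 );
            if ( low >= 0xDC00 && low <= 0xDFFF ) {
                code = 0x10000 + ( ( code - 0xD800 ) << 10 ) + ( low - 0xDC00 );
                i += 1;
            }
        }
        // Assumption: an unpaired surrogate falls through and is encoded
        // like any other 16-bit value (three bytes).
        if ( code < 0x80 ) {
            // 0xxxxxxx
            out.push( code );
        } else if ( code < 0x800 ) {
            // 110xxxxx 10xxxxxx
            out.push( 0xC0 | ( code >> 6 ) );
            out.push( 0x80 | ( code & 0x3F ) );
        } else if ( code < 0x10000 ) {
            // 1110xxxx 10xxxxxx 10xxxxxx
            out.push( 0xE0 | ( code >> 12 ) );
            out.push( 0x80 | ( ( code >> 6 ) & 0x3F ) );
            out.push( 0x80 | ( code & 0x3F ) );
        } else {
            // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            out.push( 0xF0 | ( code >> 18 ) );
            out.push( 0x80 | ( ( code >> 12 ) & 0x3F ) );
            out.push( 0x80 | ( ( code >> 6 ) & 0x3F ) );
            out.push( 0x80 | ( code & 0x3F ) );
        }
    }
    return out;
}
```

    Because each branch is selected by the smallest range containing the code
    point, the sketch always emits the shortest sequence. For instance,
    `utf16ToUTF8Sketch( 'é' )` yields `[ 195, 169 ]`, and a surrogate pair such
    as `'\uD83D\uDE00'` (U+1F600) yields `[ 240, 159, 152, 128 ]`.
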
    Parameters
    ----------
    str: string
        UTF-16 encoded string.

    Returns
    -------
    out: Array
        Array of integers.

    Examples
    --------
    > var str = '☃';
    > var out = {{alias}}( str )
    [ 226, 152, 131 ]

    See Also
    --------
