Struct Unicode
Provides utility methods for UTF-8, UTF-16, UCS-4 (a.k.a. UTF-32), and WTF-8.
Namespace: Unity.Collections
Assembly: Unity.Collections.dll
Syntax
public struct Unicode
Fields
kMaximumValidCodePoint
The maximum value of a valid Unicode code point (0x10FFFF).
Declaration
public const int kMaximumValidCodePoint = 1114111
Field Value
Type | Description |
---|---|
int | The maximum value of a valid Unicode code point, 1114111 (0x10FFFF). |
Properties
BadRune
The null rune value.
Declaration
public static Unicode.Rune BadRune { get; }
Property Value
Type | Description |
---|---|
Unicode.Rune | The null rune value. |
Remarks
In this package, the "bad rune" is used as a null character. It represents no valid code point.
ReplacementCharacter
The Unicode character �.
Declaration
public static Unicode.Rune ReplacementCharacter { get; }
Property Value
Type | Description |
---|---|
Unicode.Rune | The Unicode character �. |
Remarks
This character is used to stand in for characters that can't be rendered.
Methods
IsValidCodePoint(int)
Returns true if a code point is valid.
Declaration
public static bool IsValidCodePoint(int codepoint)
Parameters
Type | Name | Description |
---|---|---|
int | codepoint | A code point. |
Returns
Type | Description |
---|---|
bool | True if a code point is valid. |
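A minimal sketch of guarding a conversion with IsValidCodePoint. The helper name `ToRuneOrBad` is hypothetical, and the sketch assumes that Unicode.Rune exposes a public int field named `value`, which this page does not show.

```csharp
using Unity.Collections;

public static class CodePointCheckExample
{
    // Hypothetical helper: wraps a code point in a rune, or returns
    // Unicode.BadRune when the code point is not valid.
    public static Unicode.Rune ToRuneOrBad(int codepoint)
    {
        if (!Unicode.IsValidCodePoint(codepoint))
            return Unicode.BadRune;

        // Assumption: Rune has a public int field `value`.
        return new Unicode.Rune { value = codepoint };
    }
}
```

For example, `ToRuneOrBad(Unicode.kMaximumValidCodePoint + 1)` would be expected to return BadRune, because the argument lies above the maximum valid code point.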
NotTrailer(byte)
Returns true if the byte is not a trailing (continuation) byte of a UTF-8 character, that is, if its two highest bits are not 10.
Declaration
public static bool NotTrailer(byte b)
Parameters
Type | Name | Description |
---|---|---|
byte | b | The byte. |
Returns
Type | Description |
---|---|
bool | True if the byte is not a trailing (continuation) byte of a UTF-8 character. |
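The sketch below illustrates this kind of check, under the assumption that a "trailer" means a UTF-8 continuation byte (bit pattern 10xxxxxx). It is a self-contained stand-in and not Unity's source. The names `IsNotContinuationByte` and `CountCharacters` are hypothetical.

```csharp
public static class Utf8TrailerSketch
{
    // Hypothetical stand-in for the check: UTF-8 continuation bytes
    // always have the bit pattern 10xxxxxx.
    public static bool IsNotContinuationByte(byte b) => (b & 0xC0) != 0x80;

    // Counts the characters in valid UTF-8 data by counting the bytes
    // that begin a character (every byte that is not a continuation byte).
    public static int CountCharacters(byte[] utf8)
    {
        int count = 0;
        foreach (byte b in utf8)
        {
            if (IsNotContinuationByte(b))
                count++;
        }
        return count;
    }
}
```

A check like this is useful for finding character boundaries, for example when truncating a UTF-8 buffer without splitting a multi-byte character.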
UcsToUtf16(char*, ref int, int, Rune)
Writes a rune to a buffer as a UTF-16 encoded character.
Declaration
public static ConversionError UcsToUtf16(char* buffer, ref int index, int capacity, Unicode.Rune rune)
Parameters
Type | Name | Description |
---|---|---|
char* | buffer | The buffer of chars to write to. |
int | index | Reference to a char index into the buffer. If the write succeeds, index is incremented by the size in chars of the character written. If the write fails, index is not incremented. |
int | capacity | The size in chars of the buffer. Used to check that the write is in bounds. |
Unicode.Rune | rune | The rune to encode. |
Returns
Type | Description |
---|---|
ConversionError | None if the write succeeds. Otherwise, returns CodePoint, Overflow, or Encoding. |
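A short sketch of writing one rune as UTF-16. The helper `EncodeUtf16` is hypothetical, and the sketch assumes that Unicode.Rune has a public int field `value` and that ConversionError lives in the Unity.Collections namespace. Unsafe code must be enabled for the assembly.

```csharp
using Unity.Collections;

public static class EncodeUtf16Example
{
    // Hypothetical helper: encodes one code point into `buffer` and
    // returns the number of chars written, or -1 if the write failed.
    public static unsafe int EncodeUtf16(int codepoint, char* buffer, int capacity)
    {
        // Assumption: Rune has a public int field `value`.
        var rune = new Unicode.Rune { value = codepoint };

        int index = 0; // advanced by UcsToUtf16 only when the write succeeds
        ConversionError error = Unicode.UcsToUtf16(buffer, ref index, capacity, rune);
        return error == ConversionError.None ? index : -1;
    }
}
```

Code points above U+FFFF need a surrogate pair in UTF-16, so a buffer of two chars is enough for any single rune.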
UcsToUtf8(byte*, ref int, int, Rune)
Writes a rune to a buffer as a UTF-8 encoded character.
Declaration
public static ConversionError UcsToUtf8(byte* buffer, ref int index, int capacity, Unicode.Rune rune)
Parameters
Type | Name | Description |
---|---|---|
byte* | buffer | The buffer to write to. |
int | index | Reference to a byte index into the buffer. If the write succeeds, index is incremented by the size in bytes of the character written. If the write fails, index is not incremented. |
int | capacity | The size in bytes of the buffer. Used to check that the write is in bounds. |
Unicode.Rune | rune | The rune to encode. |
Returns
Type | Description |
---|---|
ConversionError | None if the write succeeds. Otherwise, returns CodePoint, Overflow, or Encoding. |
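A short sketch of writing one rune as UTF-8 and reporting how many bytes it took. The helper `EncodeUtf8` is hypothetical, and the sketch assumes that Unicode.Rune has a public int field `value`.

```csharp
using Unity.Collections;

public static class EncodeUtf8Example
{
    // Hypothetical helper: encodes one code point into `buffer` and
    // returns the number of bytes written, or -1 if the write failed
    // (for example, when the buffer is too small).
    public static unsafe int EncodeUtf8(int codepoint, byte* buffer, int capacity)
    {
        // Assumption: Rune has a public int field `value`.
        var rune = new Unicode.Rune { value = codepoint };

        int index = 0; // advanced by UcsToUtf8 only when the write succeeds
        ConversionError error = Unicode.UcsToUtf8(buffer, ref index, capacity, rune);
        return error == ConversionError.None ? index : -1;
    }
}
```

A UTF-8 character occupies at most four bytes, so a four-byte buffer is enough for any single valid rune.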
Utf16ToUcs(out Rune, char*, ref int, int)
Reads a UTF-16 encoded character from a buffer.
Declaration
public static ConversionError Utf16ToUcs(out Unicode.Rune rune, char* buffer, ref int index, int capacity)
Parameters
Type | Name | Description |
---|---|---|
Unicode.Rune | rune | Outputs the character read. If the read fails, rune is not set. |
char* | buffer | The buffer of chars to read. |
int | index | Reference to a char index into the buffer. If the read succeeds, index is incremented by the size in chars of the character read. If the read fails, index is not incremented. |
int | capacity | The size in chars of the buffer. Used to check that the read is in bounds. |
Returns
Type | Description |
---|---|
ConversionError |
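As an illustration, the following sketch walks a managed string rune by rune. The helper `CountRunes` is hypothetical. Because this page does not list the possible error values for this method, the loop stops on any result other than None, and also stops if the index fails to advance.

```csharp
using Unity.Collections;

public static class Utf16ReadExample
{
    // Hypothetical helper: counts the runes (code points) in a string.
    // A surrogate pair occupies two chars but counts as a single rune.
    public static unsafe int CountRunes(string text)
    {
        int count = 0;
        fixed (char* chars = text)
        {
            int index = 0;
            while (index < text.Length)
            {
                int before = index;
                ConversionError error =
                    Unicode.Utf16ToUcs(out Unicode.Rune rune, chars, ref index, text.Length);

                // Defensive: stop on any error or if no progress was made.
                if (error != ConversionError.None || index == before)
                    break;

                count++;
            }
        }
        return count;
    }
}
```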
Utf16ToUtf8(char*, int, byte*, out int, int)
Copies UTF-16 characters from one buffer to another buffer as UTF-8.
Declaration
public static ConversionError Utf16ToUtf8(char* utf16Buffer, int utf16Length, byte* utf8Buffer, out int utf8Length, int utf8Capacity)
Parameters
Type | Name | Description |
---|---|---|
char* | utf16Buffer | The source buffer. |
int | utf16Length | The number of chars to read from the source. |
byte* | utf8Buffer | The destination buffer. |
int | utf8Length | Outputs the number of bytes written to the destination. |
int | utf8Capacity | The size in bytes of the destination buffer. |
Returns
Type | Description |
---|---|
ConversionError | None if the copy fully completes. Otherwise, returns Overflow. |
Remarks
Assumes the source data is valid UTF-16.
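A minimal sketch of converting a managed string to UTF-8 bytes with this method. The helper `ToUtf8` is hypothetical. The destination is sized at three bytes per UTF-16 char, which is the worst case: a char in the Basic Multilingual Plane needs at most three bytes, and a surrogate pair (two chars) needs four.

```csharp
using System;
using Unity.Collections;

public static class Utf16ToUtf8Example
{
    // Hypothetical helper: returns the UTF-8 encoding of `text`.
    // As noted in the remarks, the source is assumed to be valid UTF-16.
    public static unsafe byte[] ToUtf8(string text)
    {
        if (text.Length == 0)
            return Array.Empty<byte>();

        int capacity = text.Length * 3; // worst-case size in bytes
        var bytes = new byte[capacity];
        int written;

        fixed (char* src = text)
        fixed (byte* dest = bytes)
        {
            ConversionError error =
                Unicode.Utf16ToUtf8(src, text.Length, dest, out written, capacity);
            if (error != ConversionError.None)
                throw new InvalidOperationException("UTF-16 to UTF-8 copy failed: " + error);
        }

        Array.Resize(ref bytes, written); // trim to the bytes actually written
        return bytes;
    }
}
```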
Utf8ToUcs(out Rune, byte*, ref int, int)
Reads a UTF-8 encoded character from a buffer.
Declaration
public static ConversionError Utf8ToUcs(out Unicode.Rune rune, byte* buffer, ref int index, int capacity)
Parameters
Type | Name | Description |
---|---|---|
Unicode.Rune | rune | Outputs the character read. If the read fails, outputs ReplacementCharacter. |
byte* | buffer | The buffer of bytes to read. |
int | index | Reference to a byte index into the buffer. If the read succeeds, index is incremented by the size in bytes of the character read. If the read fails, index is incremented by 1. |
int | capacity | The size in bytes of the buffer. Used to check that the read is in bounds. |
Returns
Type | Description |
---|---|
ConversionError | None if the read succeeds. Otherwise, returns Overflow or Encoding. |
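The read-and-advance behavior described above lends itself to a simple decode loop. The sketch below is one possible shape; the helper `CountRunes` and its `invalidCount` output are hypothetical. It relies on the documented behavior that a failed read still advances index by 1 and outputs ReplacementCharacter.

```csharp
using Unity.Collections;

public static class Utf8ReadExample
{
    // Hypothetical helper: counts the runes in a UTF-8 buffer and reports
    // how many reads failed (each failed read yields ReplacementCharacter).
    public static unsafe int CountRunes(byte* buffer, int length, out int invalidCount)
    {
        int count = 0;
        invalidCount = 0;
        int index = 0;

        while (index < length)
        {
            int before = index;
            ConversionError error =
                Unicode.Utf8ToUcs(out Unicode.Rune rune, buffer, ref index, length);

            if (error != ConversionError.None)
                invalidCount++;

            // Defensive guard against an endless loop if index did not move.
            if (index == before)
                break;

            count++;
        }
        return count;
    }
}
```

Counting failed reads separately lets the caller decide whether to accept data that contains replacement characters or to reject it.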
Utf8ToUtf16(byte*, int, char*, out int, int)
Copies UTF-8 characters from one buffer to another as UTF-16.
Declaration
public static ConversionError Utf8ToUtf16(byte* utf8Buffer, int utf8Length, char* utf16Buffer, out int utf16Length, int utf16Capacity)
Parameters
Type | Name | Description |
---|---|---|
byte* | utf8Buffer | The source buffer. |
int | utf8Length | The number of bytes to read from the source. |
char* | utf16Buffer | The destination buffer. |
int | utf16Length | Outputs the number of chars written to the destination. |
int | utf16Capacity | The size in chars of the destination buffer. |
Returns
Type | Description |
---|---|
ConversionError | None if the copy fully completes. Otherwise, returns Overflow. |
Remarks
Assumes the source data is valid UTF-8.
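A minimal sketch of turning UTF-8 bytes into a managed string with this method. The helper `FromUtf8` is hypothetical. The destination is sized at one char per source byte, which is always enough because the UTF-16 form of a character never uses more chars than its UTF-8 form uses bytes.

```csharp
using System;
using Unity.Collections;

public static class Utf8ToUtf16Example
{
    // Hypothetical helper: decodes `utf8` into a string.
    // As noted in the remarks, the source is assumed to be valid UTF-8.
    public static unsafe string FromUtf8(byte[] utf8)
    {
        if (utf8.Length == 0)
            return string.Empty;

        int capacity = utf8.Length; // upper bound on the number of chars
        var chars = new char[capacity];
        int written;

        fixed (byte* src = utf8)
        fixed (char* dest = chars)
        {
            ConversionError error =
                Unicode.Utf8ToUtf16(src, utf8.Length, dest, out written, capacity);
            if (error != ConversionError.None)
                throw new InvalidOperationException("UTF-8 to UTF-16 copy failed: " + error);
        }

        return new string(chars, 0, written);
    }
}
```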
Utf8ToUtf8(byte*, int, byte*, out int, int)
Copies UTF-8 characters from one buffer to another.
Declaration
public static ConversionError Utf8ToUtf8(byte* srcBuffer, int srcLength, byte* destBuffer, out int destLength, int destCapacity)
Parameters
Type | Name | Description |
---|---|---|
byte* | srcBuffer | The source buffer. |
int | srcLength | The number of bytes to read from the source. |
byte* | destBuffer | The destination buffer. |
int | destLength | Outputs the number of bytes written to the destination. |
int | destCapacity | The size in bytes of the destination buffer. |
Returns
Type | Description |
---|---|
ConversionError | None if the copy fully completes. Otherwise, returns Overflow. |
Remarks
Assumes the source data is valid UTF-8.
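A small sketch of checking the result of the copy. The helper `TryCopyUtf8` is hypothetical; it only maps the documented None and Overflow results to a bool.

```csharp
using Unity.Collections;

public static class Utf8CopyExample
{
    // Hypothetical helper: copies UTF-8 data and returns true only when
    // the whole source fit into the destination. `written` receives the
    // number of bytes that were written to the destination.
    public static unsafe bool TryCopyUtf8(
        byte* src, int srcLength, byte* dest, int destCapacity, out int written)
    {
        ConversionError error =
            Unicode.Utf8ToUtf8(src, srcLength, dest, out written, destCapacity);
        return error == ConversionError.None;
    }
}
```

When the method reports Overflow, the caller can retry with a destination of at least srcLength bytes, since a UTF-8 to UTF-8 copy never grows the data.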