Interface IEncoding
Describes the result of a tokenization pipeline execution.
Namespace: Unity.InferenceEngine.Tokenization
Assembly: Unity.InferenceEngine.Tokenization.dll
Syntax
public interface IEncoding
Properties
Length
The number of tokens.
Declaration
int Length { get; }
Property Value
| Type | Description |
|---|---|
| int | The number of tokens. |
Overflow
If the tokenization pipeline produces more tokens than the expected size, the excess tokens are stored in another IEncoding instance. That overflow can in turn define its own overflow, forming a chain similar to a linked list.
Declaration
IEncoding Overflow { get; }
Property Value
| Type | Description |
|---|---|
| IEncoding | The encoding that stores the tokens exceeding the expected size. |
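The linked-list shape described above can be walked with a simple loop. The following is a minimal sketch, not the package's implementation: `IOverflowView`, `FakeOverflowEncoding` and `OverflowWalker` are hypothetical names introduced only so the snippet compiles on its own, and the sketch assumes that the last encoding in the chain returns `null` from `Overflow`, which this page does not confirm.

```csharp
using System;

// Hypothetical stand-in mirroring only the two IEncoding members used here
// (Length and Overflow). The real interface declares more members.
public interface IOverflowView
{
    int Length { get; }
    IOverflowView Overflow { get; }
}

// Hypothetical test double, not part of the package.
public sealed class FakeOverflowEncoding : IOverflowView
{
    public FakeOverflowEncoding(int length, IOverflowView overflow = null)
    {
        Length = length;
        Overflow = overflow;
    }

    public int Length { get; }
    public IOverflowView Overflow { get; }
}

public static class OverflowWalker
{
    // Sums Length over the main encoding and every chained overflow.
    // Assumption: the end of the chain is signalled by a null Overflow.
    public static int TotalLength(IOverflowView encoding)
    {
        var total = 0;
        for (var current = encoding; current != null; current = current.Overflow)
            total += current.Length;
        return total;
    }
}
```

With the real interface, the same loop would take an `IEncoding` parameter instead of the stand-in.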
Methods
GetAttentionMask()
Gets the attention mask. When tokenization requires truncation and padding, this mask indicates which tokens are the most relevant.
Declaration
IReadOnlyList<int> GetAttentionMask()
Returns
| Type | Description |
|---|---|
| IReadOnlyList<int> | The attention mask. |
GetAttentionMask(ICollection<int>)
Copies the attention mask into the given container. When tokenization requires truncation and padding, this mask indicates which tokens are the most relevant.
Declaration
int GetAttentionMask(ICollection<int> output)
Parameters
| Type | Name | Description |
|---|---|---|
| ICollection<int> | output | The target container of attention state. |
Returns
| Type | Description |
|---|---|
| int | The number of available tokens. |
GetEncodings()
Enumerates the encodings one by one, starting with the main encoding, followed by its overflowing sequences.
Declaration
IEnumerable<IEncoding> GetEncodings()
Returns
| Type | Description |
|---|---|
| IEnumerable<IEncoding> | Main encoding followed by its overflowing sequences. |
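As an illustration of the enumeration order, the sketch below collects the token count of each encoding in sequence. It is a hedged example rather than the package's code: `IChunkedEncoding`, `FakeChunk` and `EncodingReport` are hypothetical names, and the fake's `GetEncodings()` body only reflects the behaviour described on this page (main encoding first, then each overflow), assuming a `null` `Overflow` ends the chain.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in mirroring only Length, Overflow and GetEncodings().
public interface IChunkedEncoding
{
    int Length { get; }
    IChunkedEncoding Overflow { get; }
    IEnumerable<IChunkedEncoding> GetEncodings();
}

// Hypothetical test double, not part of the package.
public sealed class FakeChunk : IChunkedEncoding
{
    public FakeChunk(int length, IChunkedEncoding overflow = null)
    {
        Length = length;
        Overflow = overflow;
    }

    public int Length { get; }
    public IChunkedEncoding Overflow { get; }

    // Assumed behaviour: yield this encoding, then each overflow in chain order.
    public IEnumerable<IChunkedEncoding> GetEncodings()
    {
        for (IChunkedEncoding current = this; current != null; current = current.Overflow)
            yield return current;
    }
}

public static class EncodingReport
{
    // Returns the token count of the main encoding and of each overflow, in order.
    public static List<int> ChunkLengths(IChunkedEncoding encoding)
    {
        return encoding.GetEncodings().Select(e => e.Length).ToList();
    }
}
```

Iterating with `GetEncodings()` avoids writing the `Overflow` traversal by hand at each call site.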
GetIds()
The list of token ids.
Declaration
IReadOnlyList<int> GetIds()
Returns
| Type | Description |
|---|---|
| IReadOnlyList<int> | The list of token ids. |
GetIds(ICollection<int>)
The list of token ids.
Declaration
int GetIds(ICollection<int> output)
Parameters
| Type | Name | Description |
|---|---|---|
| ICollection<int> | output | The target container of ids. |
Returns
| Type | Description |
|---|---|
| int | The number of available tokens. |
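Each accessor on this interface comes as a pair: one overload returns a read-only list, the other writes into a caller-supplied container and returns a count. The sketch below shows how the second form might be used to reuse one buffer across calls. `IIdSource`, `FakeIdSource` and `IdBufferExample` are hypothetical names, and the fake assumes the overload appends to `output` without clearing it first, which this page does not specify.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in mirroring only the two GetIds overloads.
public interface IIdSource
{
    IReadOnlyList<int> GetIds();
    int GetIds(ICollection<int> output);
}

// Hypothetical test double, not part of the package.
public sealed class FakeIdSource : IIdSource
{
    private readonly int[] _ids;

    public FakeIdSource(params int[] ids)
    {
        _ids = ids;
    }

    public IReadOnlyList<int> GetIds() => _ids;

    // Assumed behaviour: append every id to output and return how many were added.
    public int GetIds(ICollection<int> output)
    {
        foreach (var id in _ids)
            output.Add(id);
        return _ids.Length;
    }
}

public static class IdBufferExample
{
    // Refills a caller-owned buffer instead of allocating a new list per call.
    // Clearing first guards against the append behaviour assumed above.
    public static int Refill(IIdSource source, List<int> buffer)
    {
        buffer.Clear();
        return source.GetIds(buffer);
    }
}
```

The same pattern would apply to the other paired accessors (`GetTokens`, `GetOffsets`, and so on), with the element type changed accordingly.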
GetOffsets()
The token offsets.
Declaration
IReadOnlyList<Range> GetOffsets()
Returns
| Type | Description |
|---|---|
| IReadOnlyList<Range> | The token offsets. |
GetOffsets(ICollection<Range>)
The token offsets.
Declaration
int GetOffsets(ICollection<Range> output)
Parameters
| Type | Name | Description |
|---|---|---|
| ICollection<Range> | output | The target container of offsets. |
Returns
| Type | Description |
|---|---|
| int | The number of available tokens. |
GetSpecialMask()
The special tokens mask.
Declaration
IReadOnlyList<int> GetSpecialMask()
Returns
| Type | Description |
|---|---|
| IReadOnlyList<int> | The special tokens mask. |
GetSpecialMask(ICollection<int>)
The special tokens mask.
Declaration
int GetSpecialMask(ICollection<int> output)
Parameters
| Type | Name | Description |
|---|---|---|
| ICollection<int> | output | The target container of special states. |
Returns
| Type | Description |
|---|---|
| int | The number of available tokens. |
GetTokens()
The list of tokens.
Declaration
IReadOnlyList<Token> GetTokens()
Returns
| Type | Description |
|---|---|
| IReadOnlyList<Token> | The list of tokens. |
GetTokens(ICollection<Token>)
The list of tokens.
Declaration
int GetTokens(ICollection<Token> output)
Parameters
| Type | Name | Description |
|---|---|---|
| ICollection<Token> | output | The target container of tokens. |
Returns
| Type | Description |
|---|---|
| int | The number of available tokens. |
GetTypeIds()
The type ids.
Declaration
IReadOnlyList<int> GetTypeIds()
Returns
| Type | Description |
|---|---|
| IReadOnlyList<int> | The type ids. |
GetTypeIds(ICollection<int>)
The type ids.
Declaration
int GetTypeIds(ICollection<int> output)
Parameters
| Type | Name | Description |
|---|---|---|
| ICollection<int> | output | The target container of type ids. |
Returns
| Type | Description |
|---|---|
| int | The number of available tokens. |
GetValues()
The list of token values.
Declaration
IReadOnlyList<string> GetValues()
Returns
| Type | Description |
|---|---|
| IReadOnlyList<string> | The list of token values. |
GetValues(ICollection<string>)
The list of token values.
Declaration
int GetValues(ICollection<string> output)
Parameters
| Type | Name | Description |
|---|---|---|
| ICollection<string> | output | The target container of values. |
Returns
| Type | Description |
|---|---|
| int | The number of available tokens. |