Class StripNormalizer
Normalizes text by removing leading and/or trailing whitespace characters.
Implements
INormalizer
Namespace: Unity.InferenceEngine.Tokenization.Normalizers
Assembly: Unity.InferenceEngine.Tokenization.dll
Syntax
public class StripNormalizer : INormalizer
Remarks
This normalizer strips whitespace from the start and/or end of the input SubString, depending on the configured options. Whitespace is detected with char.IsWhiteSpace(char), which covers spaces, tabs, newlines, carriage returns, and other Unicode whitespace characters. Whitespace inside the content (between non-whitespace characters) is preserved.
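The behavior described above can be sketched on a plain string. This is only an illustration under stated assumptions: `StripSketch` and `Strip` are hypothetical names that are not part of Unity.InferenceEngine, and the sketch works on `string` rather than `SubString`. It is not the library's actual implementation.

```csharp
using System;

// Hypothetical helper for illustration only; not part of Unity.InferenceEngine.
public static class StripSketch
{
    // Mirrors the documented rules: trim leading and/or trailing characters
    // for which char.IsWhiteSpace returns true, and keep inner whitespace.
    public static string Strip(string input, bool left = true, bool right = true)
    {
        int start = 0;
        int end = input.Length; // exclusive upper bound

        if (left)
        {
            // Advance past leading whitespace.
            while (start < end && char.IsWhiteSpace(input[start]))
                start++;
        }

        if (right)
        {
            // Walk back over trailing whitespace.
            while (end > start && char.IsWhiteSpace(input[end - 1]))
                end--;
        }

        return input.Substring(start, end - start);
    }
}
```

Under these assumptions, `StripSketch.Strip("  hello world \t\n")` returns `"hello world"` with the inner space intact, `StripSketch.Strip(" hi ", left: true, right: false)` returns `"hi "`, and passing `false` for both flags returns the input unchanged.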
Examples
// Strip both sides (default)
var normalizer = new StripNormalizer();
var result = normalizer.Normalize(new SubString(" hello "));
// result: "hello"
// Strip left only
var leftNormalizer = new StripNormalizer(left: true, right: false);
var leftResult = leftNormalizer.Normalize(new SubString(" hello "));
// leftResult: "hello "
// Strip right only
var rightNormalizer = new StripNormalizer(left: false, right: true);
var rightResult = rightNormalizer.Normalize(new SubString(" hello "));
// rightResult: " hello"
Constructors
StripNormalizer(bool, bool)
Initializes a new instance of the StripNormalizer class.
Declaration
public StripNormalizer(bool left = true, bool right = true)
Parameters
| Type | Name | Description |
|---|---|---|
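| bool | left | If true, strips whitespace from the start of the input. Defaults to true. |
| bool | right | If true, strips whitespace from the end of the input. Defaults to true. |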
Remarks
Setting both parameters to false will result in no normalization being performed.
Methods
Normalize(SubString)
Applies transformations to the input string before pre-tokenization.
Declaration
public SubString Normalize(SubString input)
Parameters
| Type | Name | Description |
|---|---|---|
| SubString | input | The string to transform. |
Returns
| Type | Description |
|---|---|
| SubString | The resulting string. |