| Allegro.pas 5.2.0 | Unit al5strings
Description
Functions to integrate the Pascal String type with Allegro's AL_STR. Also implements Allegro's Unicode support.
About string manipulation
 By default, the Delphi RTL defines STRING as UNICODESTRING. Since Allegro expects ANSISTRING, you would have to use conversion functions such as UTF8ToString and UTF8Encode for things to work properly, which makes such code incompatible with Free Pascal. This unit defines a collection of functions and procedures that work like the RTL string manipulation ones (i.e. the SysUtils and Strings units) but use the AL_STR type, so your code works with both Delphi and Free Pascal without changes. It also includes a few conversion functions in case you need them.
About UTF-8 string routines
 Some parts of the Allegro API, such as the font routines, expect Unicode strings encoded in UTF-8. The basic UTF-8 routines are provided to help you work with such strings, but you are not required to use them.
 Briefly, Unicode is a standard consisting of a large character set of over 100,000 characters, and rules, such as how to sort strings. A code point is the integer value of a character, but not all code points are characters, as some code points have other uses. Unlike legacy character sets, the set of code points is open ended and more are assigned with time.
 Clearly it is impossible to represent each code point with an 8-bit byte (limited to 256 code points) or even a 16-bit integer (limited to 65536 code points). It is possible to store code points in 32-bit integers, but that is space inefficient and not actually that useful (at least, when handling the full complexity of Unicode; Allegro only does the very basics). There exist different Unicode Transformation Formats for encoding code points into smaller code units. The most important transformation formats are UTF-8 and UTF-16.
 UTF-8 is a variable-length encoding which encodes each code point to between one and four 8-bit bytes each. UTF-8 has many nice properties, but the main advantages are that it is backwards compatible with C strings, and ASCII characters (code points in the range 0-127) are encoded in UTF-8 exactly as they would be in ASCII.
 UTF-16 is another variable-length encoding, but encodes each code point to one or two 16-bit words each. It is, of course, not compatible with traditional C strings. Allegro does not generally use UTF-16 strings.
 Here is a diagram of the representation of the word "ål", with a NUL terminator, in both UTF-8 and UTF-16.
 
  
    | String | å | l | NUL |  
    | Code points | U+00E5 (229) | U+006C (108) | U+0000 (0) |  
    | UTF-8 bytes | 0xC3, 0xA5 | 0x6C | 0x00 |  
    | UTF-16LE bytes | 0xE5, 0x00 | 0x6C, 0x00 | 0x00, 0x00 |  
 You can see the aforementioned properties of UTF-8. The first code point U+00E5 ("å") is outside of the ASCII range (0-127) so is encoded to multiple code units – it requires two bytes. U+006C ("l") and U+0000 (NUL) both exist in the ASCII range so take exactly one byte each, as in a pure ASCII string. A zero byte never appears except to represent the NUL character, so many functions which expect C-style strings will work with UTF-8 strings without modification.
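 The UTF-8 bytes in the table can be reproduced by hand. The following is a minimal sketch in plain Free Pascal that does not use Allegro at all; EncodeUtf8 and ToHex are hypothetical helpers written for this illustration and are not part of al5strings.

```pascal
PROGRAM Utf8EncodeDemo;
{$mode objfpc}{$H+}

USES
  SysUtils;

{ Hypothetical helper (not part of al5strings): returns the UTF-8 bytes for
  one code point, or an empty string if the value cannot be encoded. }
FUNCTION EncodeUtf8 (aCodePoint: LongWord): AnsiString;
BEGIN
  IF aCodePoint < $80 THEN
    { ASCII range: a single byte, identical to ASCII. }
    Result := Chr (Byte (aCodePoint))
  ELSE IF aCodePoint < $800 THEN
    Result := Chr (Byte ($C0 OR (aCodePoint SHR 6)))
            + Chr (Byte ($80 OR (aCodePoint AND $3F)))
  ELSE IF (aCodePoint >= $D800) AND (aCodePoint <= $DFFF) THEN
    { Surrogates are reserved for UTF-16 and must not be encoded. }
    Result := ''
  ELSE IF aCodePoint < $10000 THEN
    Result := Chr (Byte ($E0 OR (aCodePoint SHR 12)))
            + Chr (Byte ($80 OR ((aCodePoint SHR 6) AND $3F)))
            + Chr (Byte ($80 OR (aCodePoint AND $3F)))
  ELSE IF aCodePoint <= $10FFFF THEN
    Result := Chr (Byte ($F0 OR (aCodePoint SHR 18)))
            + Chr (Byte ($80 OR ((aCodePoint SHR 12) AND $3F)))
            + Chr (Byte ($80 OR ((aCodePoint SHR 6) AND $3F)))
            + Chr (Byte ($80 OR (aCodePoint AND $3F)))
  ELSE
    Result := ''
END;

{ Hypothetical helper: formats a byte string as space-separated hex pairs. }
FUNCTION ToHex (CONST aBytes: AnsiString): AnsiString;
VAR
  i: Integer;
BEGIN
  Result := '';
  FOR i := 1 TO Length (aBytes) DO
  BEGIN
    IF i > 1 THEN Result := Result + ' ';
    Result := Result + IntToHex (Ord (aBytes[i]), 2)
  END
END;

BEGIN
  WriteLn ('U+00E5 -> ', ToHex (EncodeUtf8 ($E5)));  { C3 A5 }
  WriteLn ('U+006C -> ', ToHex (EncodeUtf8 ($6C)))   { 6C }
END.
```

 The two-byte result for U+00E5 and the one-byte result for U+006C match the "UTF-8 bytes" row of the table.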
 On the other hand, UTF-16 represents each code point by either one or two 16-bit code units (two or four bytes). The representation of each 16-bit code unit depends on the byte order; here we have demonstrated little endian.
 Both UTF-8 and UTF-16 are self-synchronising. Starting from any offset within a string, it is efficient to find the beginning of the previous or next code point.
 Not all sequences of bytes or 16-bit words are valid UTF-8 and UTF-16 strings respectively. UTF-8 also has an additional problem of overlong forms, where a code point value is encoded using more bytes than is strictly necessary. This is invalid and needs to be guarded against.
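 To make the overlong problem concrete, here is a minimal sketch in plain Free Pascal that decodes a two-byte sequence and rejects it when the same code point would fit in a single byte. MinWidth and DecodeTwoBytes are hypothetical helpers for this illustration only; this is not how Allegro itself is implemented.

```pascal
PROGRAM OverlongDemo;
{$mode objfpc}{$H+}

{ Hypothetical helper: the smallest number of UTF-8 bytes that can hold
  the code point, or 0 if it is beyond U+10FFFF. }
FUNCTION MinWidth (aCodePoint: LongWord): Integer;
BEGIN
  IF aCodePoint < $80 THEN Result := 1
  ELSE IF aCodePoint < $800 THEN Result := 2
  ELSE IF aCodePoint < $10000 THEN Result := 3
  ELSE IF aCodePoint <= $10FFFF THEN Result := 4
  ELSE Result := 0
END;

{ Hypothetical helper: decodes a two-byte UTF-8 sequence.  Returns False if
  the bytes are malformed or if the encoding is overlong. }
FUNCTION DecodeTwoBytes (b1, b2: Byte; OUT aCodePoint: LongWord): Boolean;
BEGIN
  aCodePoint := 0;
  Result := False;
  { The lead byte must look like 110xxxxx, the second like 10xxxxxx. }
  IF ((b1 AND $E0) <> $C0) OR ((b2 AND $C0) <> $80) THEN Exit;
  aCodePoint := (LongWord (b1 AND $1F) SHL 6) OR LongWord (b2 AND $3F);
  { Overlong check: a two-byte form is only valid if two bytes are needed. }
  Result := MinWidth (aCodePoint) = 2
END;

VAR
  cp: LongWord;
BEGIN
  { $C3 $A5 is the proper encoding of U+00E5. }
  WriteLn ('C3 A5 valid: ', DecodeTwoBytes ($C3, $A5, cp));
  { $C1 $AC carries the value U+006C, which fits in one byte: overlong. }
  WriteLn ('C1 AC valid: ', DecodeTwoBytes ($C1, $AC, cp))
END.
```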
 In the "ustr" functions, check carefully whether a function takes code unit (byte) indices or code point indices. In general, all position parameters are code unit offsets. This may be surprising, but if you think about it, it is required for good performance. (It also means some functions will work even if the strings do not contain UTF-8, since those functions only care about storing bytes, so you may actually store arbitrary data in ALLEGRO_USTRs.)
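 The difference between byte counts, code point counts and byte offsets can be seen with the string "ål". This is a sketch only: it assumes Free Pascal, that the Allegro shared library can be found at run time, and that these calls may be used without any other Allegro initialisation. It uses only functions documented in this unit.

```pascal
PROGRAM UstrIndices;
{$mode objfpc}{$H+}

USES
  al5strings;

VAR
  us: ALLEGRO_USTRptr;
BEGIN
  { "ål" written as explicit UTF-8 bytes, so that the encoding of the
    source file does not matter (assumption: Free Pascal passes them raw). }
  us := al_ustr_new (#$C3#$A5'l');
  TRY
    { "å" takes two bytes and "l" takes one. }
    WriteLn ('Size in bytes:         ', al_ustr_size (us));      { 3 }
    WriteLn ('Length in code points: ', al_ustr_length (us));    { 2 }
    { Code point index 1 ("l") starts after the two bytes of "å". }
    WriteLn ('Byte offset of index 1: ', al_ustr_offset (us, 1)) { 2 }
  FINALLY
    al_ustr_free (us)
  END
END.
```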
 For actual text processing, where you want to specify positions with code point indices, you should use al_ustr_offset to find the code unit offset position. However, most of the time you would probably just work with byte offsets.
Functions and Procedures
| function al_string_to_str (const aString: ShortString): AL_STR; overload; inline; |  |  |  
| function al_string_to_str (const aString: AnsiString): AL_STR; overload; inline; |  |  |  
| function al_string_to_str (const aString: UnicodeString): AL_STR; overload; inline; |  | 
Converts Pascal strings to AL_STR. |  
| function al_str_to_string (const aString: AL_STR): String; overload; inline; |  |  |  
| function al_str_to_string (const aString: AL_STRptr): String; overload; inline; |  | 
Converts AL_STR or AL_STRptr to a STRING. |  
| function al_str_to_shortstring (const aString: AL_STR): ShortString; overload; inline; |  |  |  
| function al_str_to_shortstring (const aString: AL_STRptr): ShortString; overload; inline; |  | 
Converts AL_STR or AL_STRptr to a SHORTSTRING. |  
| function al_str_to_ansistring (const aString: AL_STR): AnsiString; overload; inline; |  |  |  
| function al_str_to_ansistring (const aString: AL_STRptr): AnsiString; overload; inline; |  | 
Converts AL_STR or AL_STRptr to an ANSISTRING. |  
| function al_str_to_unicodestring (const aString: AL_STR): UnicodeString; overload; inline; |  |  |  
| function al_str_to_unicodestring (const aString: AL_STRptr): UnicodeString; overload; inline; |  | 
Converts AL_STR or AL_STRptr to a UNICODESTRING. |  
| function al_str_format (const aFmt: AL_STR; const aArgs : array of const) : AL_STR; |  | 
Formats a string with given arguments.
 It works exactly like the RTL's SysUtils.Format but uses AL_STR instead of STRING. |  
| function al_ustr_new (const s: AL_STR): ALLEGRO_USTRptr; CDECL; external ALLEGRO_LIB_NAME; |  | 
Creates a new string containing a copy of the C-style string s. The string must eventually be freed with al_ustr_free.
 See also
  al_ustr_new_from_buffer: Creates a new string containing a copy of the buffer pointed to by s of the given size in bytes.
  al_ustr_assign: Overwrites the string us1 with another string us2.
  al_ustr_dup: Returns a duplicate copy of a string. |  
| function al_ustr_new_from_buffer (const s: AL_STRptr; size: AL_SIZE_T): ALLEGRO_USTRptr; CDECL; external ALLEGRO_LIB_NAME; |  | 
Creates a new string containing a copy of the buffer pointed to by s of the given size in bytes. The string must eventually be freed with al_ustr_free.
 See also
  al_ustr_new: Creates a new string containing a copy of the C-style string s. |  
| procedure al_ustr_free (us: ALLEGRO_USTRptr); CDECL; external ALLEGRO_LIB_NAME; |  | 
Frees a previously allocated string. Does nothing if the argument is Nil.
 See also
  al_ustr_new: Creates a new string containing a copy of the C-style string s.
  al_ustr_new_from_buffer: Creates a new string containing a copy of the buffer pointed to by s of the given size in bytes. |  
| function al_cstr (const us: ALLEGRO_USTRptr): AL_STRptr; CDECL; external ALLEGRO_LIB_NAME; |  | 
Gets an AL_STRptr pointer to the data in a string. This pointer will only be valid while the ALLEGRO_USTR object is not modified and not destroyed. The pointer may be passed to functions expecting C-style strings, with the following caveats:
  ALLEGRO_USTRs are allowed to contain embedded NUL ($00) bytes. That means al_ustr_size (u) and Length (al_cstr (u)) may not agree.
  An ALLEGRO_USTR may be created in such a way that it is not NUL terminated. A string which is dynamically allocated will always be NUL terminated, but a string which references the middle of another string or region of memory will not be NUL terminated.
  If the ALLEGRO_USTR references another string, the returned C string will point into the referenced string. Again, no NUL terminator will be added to the referenced string.
 See also
  al_ustr_to_buffer: Writes the contents of the string into a pre-allocated buffer of the given size in bytes.
  al_cstr_dup: Creates a NUL ($00) terminated copy of the string.
  al_ustr_assign_cstr: Overwrites the string us1 with the contents of the string s. |  
| procedure al_ustr_to_buffer (const us: ALLEGRO_USTRptr; buffer: AL_STRptr; size: AL_INT); CDECL; external ALLEGRO_LIB_NAME; |  | 
Writes the contents of the string into a pre-allocated buffer of the given size in bytes. The result will always be NUL terminated, so a maximum of size - 1 bytes will be copied.
 See also
  al_cstr: Gets an AL_STRptr pointer to the data in a string.
  al_cstr_dup: Creates a NUL ($00) terminated copy of the string. |  
| function al_cstr_dup (const us: ALLEGRO_USTRptr): AL_STRptr; CDECL; external ALLEGRO_LIB_NAME; |  | 
Creates a NUL ($00) terminated copy of the string. Any embedded NUL bytes will still be present in the returned string. The new string must eventually be freed with al_free. If an error occurs, Nil is returned.
 See also
  al_cstr: Gets an AL_STRptr pointer to the data in a string.
  al_ustr_to_buffer: Writes the contents of the string into a pre-allocated buffer of the given size in bytes.
  al_free: Like FreeMem, releases the memory occupied by pointer p. |  
| function al_ustr_dup (const us: ALLEGRO_USTRptr): ALLEGRO_USTRptr; CDECL; external ALLEGRO_LIB_NAME; |  | 
Returns a duplicate copy of a string. The new string will need to be freed with al_ustr_free.
 See also
  al_ustr_dup_substr: Returns a new copy of a string, containing its contents in the byte interval [start_pos, end_pos).
  al_ustr_free: Frees a previously allocated string. |  
| function al_ustr_dup_substr (const us: ALLEGRO_USTRptr; start_pos, end_pos: AL_INT): ALLEGRO_USTRptr; CDECL; external ALLEGRO_LIB_NAME; |  | 
Returns a new copy of a string, containing its contents in the byte interval [start_pos, end_pos). The new string will be NUL terminated and will need to be freed with al_ustr_free. If necessary, use al_ustr_offset to find the byte offsets for a given code point that you are interested in.
 Note
 This function exists because of the way the C language works. I have not tested whether Pascal needs it. Future versions of Allegro.pas might not include this function, so don't use it unless you really need to (and tell me if you do, so this warning can be removed from the documentation).
 See also
  al_ustr_dup: Returns a duplicate copy of a string.
  al_ustr_free: Frees a previously allocated string. |  
| function al_ustr_empty_string: ALLEGRO_USTRptr; CDECL; external ALLEGRO_LIB_NAME; |  | 
Returns a pointer to a static empty string. The string is read only and must not be freed. |  
| function al_ref_cstr (out info: ALLEGRO_USTR_INFO; const s: AL_STR): ALLEGRO_USTRptr; CDECL; external ALLEGRO_LIB_NAME; |  | 
Creates a string that references the storage of a C-style string. The information about the string (e.g. its size) is stored in the info parameter. The string will not have any other storage allocated of its own, so if you allocate the info structure on the stack then no explicit "free" operation is required. The string is valid until the underlying C string disappears.
 Example:  
VAR
  Info: ALLEGRO_USTR_INFO;
  us: ALLEGRO_USTRptr;
BEGIN
  us := al_ref_cstr (Info, 'my string')
END;
 See also
  al_ref_buffer: Creates a string that references the storage of an underlying buffer.
  al_ref_ustr: Creates a read-only string that references the storage of another ALLEGRO_USTR string. |  
| function al_ref_buffer (out info: ALLEGRO_USTR_INFO; const s: AL_STRptr; size: AL_SIZE_T): ALLEGRO_USTRptr; CDECL; external ALLEGRO_LIB_NAME; |  | 
Creates a string that references the storage of an underlying buffer. The size of the buffer is given in bytes. You can use it to reference only part of a string or an arbitrary region of memory.
 The string is valid while the underlying memory buffer is valid.
 See also
  al_ref_cstr: Creates a string that references the storage of a C-style string.
  al_ref_ustr: Creates a read-only string that references the storage of another ALLEGRO_USTR string. |  
| function al_ref_ustr (out info: ALLEGRO_USTR_INFO; const us: ALLEGRO_USTRptr; star_pos, end_pos: AL_INT): ALLEGRO_USTRptr; CDECL; external ALLEGRO_LIB_NAME; |  | 
Creates a read-only string that references the storage of another ALLEGRO_USTR string. The information about the string (e.g. its size) is stored in the structure pointed to by the info parameter. The new string will not have any other storage allocated of its own, so if you allocate the info structure on the stack then no explicit "free" operation is required. The referenced interval is [start_pos, end_pos). Both are byte offsets. The string is valid until the underlying string is modified or destroyed.
 If you need a range of code points instead of bytes, use al_ustr_offset to find the byte offsets.
 See also
  al_ref_cstr: Creates a string that references the storage of a C-style string.
  al_ref_buffer: Creates a string that references the storage of an underlying buffer. |  
| function al_ustr_size (const us: ALLEGRO_USTRptr): AL_SIZE_T; CDECL; external ALLEGRO_LIB_NAME; |  | 
Returns the size of the string in bytes. This is equal to the number of code points in the string if the string is empty or contains only 7-bit ASCII characters.
 See also
  al_ustr_length: Returns the number of code points in the string. |  
| function al_ustr_length (const us: ALLEGRO_USTRptr): AL_SIZE_T; CDECL; external ALLEGRO_LIB_NAME; |  | 
Returns the number of code points in the string.
 See also
  al_ustr_size: Returns the size of the string in bytes.
  al_ustr_offset: Returns the byte offset (from the start of the string) of the code point at the specified index in the string. |  
| function al_ustr_offset (const us: ALLEGRO_USTRptr; index: AL_INT): AL_INT; CDECL; external ALLEGRO_LIB_NAME; |  | 
Returns the byte offset (from the start of the string) of the code point at the specified index in the string. A zero index will return the offset of the first code point of the string. If index is negative, it counts backward from the end of the string, so an index of -1 will return the offset of the last code point. If index is past the end of the string, returns the offset of the end of the string.
 See also
  al_ustr_length: Returns the number of code points in the string. |  
| function al_ustr_next (const us: ALLEGRO_USTRptr; var aPos: AL_INT): AL_BOOL; CDECL; external ALLEGRO_LIB_NAME; |  | 
Finds the byte offset of the next code point in string, beginning at aPos. aPos does not have to be at the beginning of a code point. This function just looks for an appropriate byte; it doesn't check whether the found offset is the beginning of a valid code point. If you are working with possibly invalid UTF-8 strings then it could skip over some invalid bytes.
 Returns True on success, and aPos will be updated to the found offset. Otherwise returns False if aPos was already at the end of the string, and aPos is unmodified.
 See also
  al_ustr_prev: Finds the byte offset of the previous code point in string, before aPos. |  
| function al_ustr_prev (const us: ALLEGRO_USTRptr; var aPos: AL_INT): AL_BOOL; CDECL; external ALLEGRO_LIB_NAME; |  | 
Finds the byte offset of the previous code point in string, before aPos. aPos does not have to be at the beginning of a code point. This function just looks for an appropriate byte; it doesn't check whether the found offset is the beginning of a valid code point. If you are working with possibly invalid UTF-8 strings then it could skip over some invalid bytes.
 Returns True on success, and aPos will be updated to the found offset. Otherwise returns False if aPos was already at the beginning of the string, and aPos is unmodified.
 See also
  al_ustr_next: Finds the byte offset of the next code point in string, beginning at aPos. |  
| function al_ustr_insert_chr (us: ALLEGRO_USTRptr; aPos: AL_INT; c: AL_INT32) : AL_SIZE_T; CDECL; external ALLEGRO_LIB_NAME; |  | 
Inserts a code point into us beginning at byte offset aPos. aPos cannot be less than 0. If aPos is past the end of us then the space between the end of the string and aPos will be padded with NUL ($00) bytes.
 Returns the number of bytes inserted, or 0 on error.
 See also
  al_ustr_offset: Returns the byte offset (from the start of the string) of the code point at the specified index in the string.
  al_ustr_remove_chr: Removes the code point beginning at byte offset apos. |  
| function al_ustr_remove_chr (us: ALLEGRO_USTRptr; apos: AL_INT): AL_BOOL; CDECL; external ALLEGRO_LIB_NAME; |  | 
Removes the code point beginning at byte offset apos.
 Use al_ustr_offset to find the byte offset for a code point index.
 Returns True on success. If apos is out of range or apos is not the beginning of a valid code point, returns False leaving the string unmodified.
 See also
  al_ustr_offset: Returns the byte offset (from the start of the string) of the code point at the specified index in the string.
  al_ustr_insert_chr: Inserts a code point into us beginning at byte offset aPos. |  
| function al_ustr_assign (us1: ALLEGRO_USTRptr; const us2: ALLEGRO_USTRptr): AL_BOOL; CDECL; external ALLEGRO_LIB_NAME; |  | 
Overwrites the string us1 with another string us2. Returns True on success, False on error.
 See also
  al_ustr_assign_cstr: Overwrites the string us1 with the contents of the string s. |  
| function al_ustr_assign_cstr (us1: ALLEGRO_USTRptr; const s: AL_STR): AL_BOOL; CDECL; external ALLEGRO_LIB_NAME; |  | 
Overwrites the string us1 with the contents of the string s. Returns True on success, False on error.
 See also
  al_ustr_assign: Overwrites the string us1 with another string us2. |  
| function al_ustr_equal (const us1, us2: ALLEGRO_USTRptr): AL_BOOL; CDECL; external ALLEGRO_LIB_NAME; |  | 
Returns True if the two strings are equal. This function is more efficient than al_ustr_compare so is preferable if ordering is not important.
 See also
  al_ustr_compare: This function compares u and v by code point values. |  
| function al_ustr_compare (const u, v: ALLEGRO_USTRptr): AL_INT; CDECL; external ALLEGRO_LIB_NAME; |  | 
This function compares u and v by code point values. It returns zero if the strings are equal, a positive number if u comes after v, else a negative number. This does not take into account locale-specific sorting rules. For that you will need to use another library.
 See also
  al_ustr_ncompare: This function compares u and v by code point values.
  al_ustr_equal: Returns True if the two strings are equal. |  
| function al_ustr_ncompare (const u, v: ALLEGRO_USTRptr): AL_INT; CDECL; external ALLEGRO_LIB_NAME; |  | 
This function compares u and v by code point values. It returns zero if the strings are equal, a positive number if u comes after v, else a negative number. This does not take into account locale-specific sorting rules. For that you will need to use another library.
 See also
  al_ustr_compare: This function compares u and v by code point values.
  al_ustr_equal: Returns True if the two strings are equal. |  
| function al_utf8_width (c: AL_INT32): AL_SIZE_T; CDECL; external ALLEGRO_LIB_NAME; |  | 
Returns the number of bytes that would be occupied by the specified code point when encoded in UTF-8. This is between 1 and 4 bytes for legal code point values. Otherwise returns 0. |  
Types
| ALLEGRO_USTR = _al_tagbstring; |  | 
An opaque type representing a string. ALLEGRO_USTRs normally contain UTF-8 encoded strings, but they may be used to hold any byte sequences, including NUL ($00) bytes. |  
| ALLEGRO_USTR_INFO = _al_tagbstring; |  | 
A type that holds additional information for an ALLEGRO_USTR that references an external memory buffer.
 See also
  al_ref_cstr: Creates a string that references the storage of a C-style string.
  al_ref_buffer: Creates a string that references the storage of an underlying buffer.
  al_ref_ustr: Creates a read-only string that references the storage of another ALLEGRO_USTR string. |  
Generated by PasDoc 0.15.0. Generated on 2024-11-10 15:15:06.
 |