ReadStringFormat

ReadStringFormat()

Syntax

Result = ReadStringFormat(#File)

Description

Checks whether a BOM (Byte Order Mark) is present at the current file position and uses it to identify the string encoding of the file.

Parameters

#File The file to use.

Return value

Returns one of the following values:
  #PB_Ascii  : No BOM detected. This usually means a plain text file.
  #PB_UTF8   : UTF-8 BOM detected.
  #PB_Unicode: UTF-16 (little endian) BOM detected.
  #PB_UTF16BE: UTF-16 (big endian) BOM detected.
  #PB_UTF32  : UTF-32 (little endian) BOM detected.
  #PB_UTF32BE: UTF-32 (big endian) BOM detected.

The #PB_Ascii, #PB_UTF8 and #PB_Unicode results may be passed directly to ReadString() in further calls to read the file. The other results represent string formats that cannot be read directly with PureBasic string functions. They are included for completeness, so that an application can display a proper error message.
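
Example

The following is a minimal sketch of how the return value might be used, not a definitive implementation. The file name "example.txt" and the file number 0 are assumptions chosen for illustration. The detected format is passed on to ReadString() when it is one of the three directly readable formats; otherwise an error message is shown.

```purebasic
; Hypothetical file name, used for illustration only.
FileName$ = "example.txt"

If ReadFile(0, FileName$)
  ; Detect the encoding from the BOM (if any) at the start of the file.
  Format = ReadStringFormat(0)

  Select Format
    Case #PB_Ascii, #PB_UTF8, #PB_Unicode
      ; These results can be passed directly to ReadString().
      While Eof(0) = 0
        Debug ReadString(0, Format)
      Wend

    Default
      ; #PB_UTF16BE, #PB_UTF32 or #PB_UTF32BE:
      ; these formats cannot be read with ReadString().
      Debug "Unsupported string encoding in " + FileName$
  EndSelect

  CloseFile(0)
Else
  Debug "Could not open " + FileName$
EndIf
```

Because a #PB_Ascii result only means that no BOM was found, an application that expects BOM-less UTF-8 input may prefer to pass #PB_UTF8 to ReadString() in that case.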

Remarks

If a BOM is detected, the file pointer will be placed at the end of the BOM. If no BOM is detected, the file pointer remains unchanged.

The Byte Order Mark is a common way to indicate the encoding of a text file. It is usually placed at the beginning of the file. Its use is optional, however: it is a widespread practice, not a requirement. If no BOM is detected at the start of a file, that does not necessarily mean the file is a plain text file; the program that created the file may simply not have written a BOM. WriteStringFormat() may be used to place such a BOM in a file.
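
As a rough illustration of the two remarks above, the sketch below writes a file with a UTF-8 BOM using WriteStringFormat(), then reopens it and checks the file position after ReadStringFormat() with Loc(). The file name "bom_test.txt", the use of the temporary directory and the file number 0 are assumptions made for this example.

```purebasic
; Hypothetical file in the temporary directory, used for illustration only.
FileName$ = GetTemporaryDirectory() + "bom_test.txt"

; Write a small file that starts with a UTF-8 BOM.
If CreateFile(0, FileName$)
  WriteStringFormat(0, #PB_UTF8)       ; places the BOM at the start of the file
  WriteStringN(0, "Hello", #PB_UTF8)   ; one line of text after the BOM
  CloseFile(0)
EndIf

; Read the file back and inspect the position after BOM detection.
If ReadFile(0, FileName$)
  If ReadStringFormat(0) = #PB_UTF8
    ; The file pointer now sits just past the BOM.
    Debug "UTF-8 BOM found, file position: " + Str(Loc(0))
  EndIf
  Debug ReadString(0, #PB_UTF8)
  CloseFile(0)
EndIf
```

Since the pointer is left at the end of the BOM, the first ReadString() call returns the text without the BOM bytes.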

For more information, see the Wikipedia article on the byte order mark.
More information about using Unicode in a PureBasic program can be found in the Unicode chapter of the PureBasic manual.

See Also

WriteStringFormat(), ReadString(), OpenFile(), ReadFile()

Supported OS

All
