Unicode Format and Options


Unicode Format

The Unicode Format listbox offers three options.


Unicode (UTF-16)
This is the native format of Unicode text files on Windows (UTF-16, little-endian). Select this option if you are converting plain-text files.

Unicode Big-Endian (UTF-16)
This is the native format on Mac OS and other big-endian systems.

UTF-8
This format is mainly for HTML files. Select this option if you are converting HTML-based files.

Please refer to the Unicode Format topic for details.
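
To make the difference between the three formats concrete, the following minimal Python sketch encodes the same short string in each of them and prints the resulting bytes. It only illustrates the encodings themselves and is not Unifier's own code; the sample string is an arbitrary choice.

```python
# Sample text: Latin 'A', accented 'é' (U+00E9), and CJK '中' (U+4E2D).
text = "A\u00e9\u4e2d"

utf16_le = text.encode("utf-16-le")  # "Unicode (UTF-16)", little-endian
utf16_be = text.encode("utf-16-be")  # "Unicode Big-Endian (UTF-16)"
utf8 = text.encode("utf-8")          # "UTF-8"

# Print each encoding as space-separated hex bytes (no BOM is included here).
print(utf16_le.hex(" "))  # 41 00 e9 00 2d 4e
print(utf16_be.hex(" "))  # 00 41 00 e9 4e 2d
print(utf8.hex(" "))      # 41 c3 a9 e4 b8 ad
```

The two UTF-16 variants contain the same 16-bit units with the byte order swapped, while UTF-8 uses one to three bytes per character for this sample.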

Options

Add UTF-8 charset Meta Tag to HTML files

This option applies only when converting to UTF-8. When it is enabled, Unifier analyzes the content of HTML files and updates the charset meta tag. By default, only HTML files with the following file extensions are updated.

.htm
.php
.html
.php3
.shtml
.phtml
.asp
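
As a rough illustration of what updating the charset meta tag involves, here is a minimal Python sketch. It is an assumption about the general technique, not Unifier's actual implementation: the names `HTML_EXTENSIONS`, `is_html_file`, and `set_utf8_meta` are hypothetical, and the regular expressions cover only the common forms of the tag.

```python
import re
from pathlib import PurePath

# Default extensions listed above (hypothetical constant name).
HTML_EXTENSIONS = {".htm", ".php", ".html", ".php3", ".shtml", ".phtml", ".asp"}

UTF8_META = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'

# Matches a <meta> tag that declares a charset, in either the http-equiv
# form or the short <meta charset="..."> form.
_META_CHARSET = re.compile(r"<meta\b[^>]*\bcharset\s*=[^>]*>", re.IGNORECASE)
_HEAD_OPEN = re.compile(r"<head\b[^>]*>", re.IGNORECASE)


def is_html_file(filename: str) -> bool:
    """Return True if the file extension is in the HTML extension list."""
    return PurePath(filename).suffix.lower() in HTML_EXTENSIONS


def set_utf8_meta(html_text: str) -> str:
    """Replace the first charset meta tag, or add one right after <head>."""
    if _META_CHARSET.search(html_text):
        return _META_CHARSET.sub(UTF8_META, html_text, count=1)
    head = _HEAD_OPEN.search(html_text)
    if head:
        return html_text[:head.end()] + "\n" + UTF8_META + html_text[head.end():]
    return html_text  # no <head> found: leave the document unchanged


page = '<html><head><meta http-equiv="Content-Type" content="text/html; charset=big5"></head></html>'
print(set_utf8_meta(page))
```

A real converter would likely need an HTML parser to handle comments, scripts, and unusual attribute quoting; the regular expressions here are kept simple on purpose.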

 

   
Tips:

You can change which file extensions Unifier treats as HTML with the File | Preferences command. Please read the Preferences section for details.


Convert HTML Character Entity to Raw Unicode


When this option is checked and an HTML file is converted, Unifier tries to convert HTML character entity references into raw Unicode characters. For example, &copy; represents the © symbol in HTML; Unifier searches for such tokens and replaces each one with the corresponding raw Unicode character. This option affects HTML files only (.htm, .html, .shtml, .asp, .php, .php3, .phtml). Please read HTML Character Entity Reference in the HTML section for additional information.

   
Tips:

You can change which file extensions Unifier treats as HTML with the File | Preferences command. Please read the Preferences section for details.


Add Byte-Order Mark

This option specifies whether a Byte-Order Mark (BOM) is added at the beginning of the converted file. In most cases, a BOM should be present to identify the file as Unicode and to indicate which Unicode Transformation Format (UTF) it uses. If you are not sure whether you need a BOM, keep this option checked.
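
For reference, the BOM is a short, fixed byte sequence that differs for each format. The minimal Python sketch below shows the three sequences and how a converter might prepend one. The helper name `encode_text` and its `add_bom` parameter are hypothetical and simply mirror the checkbox; this is not Unifier's own code.

```python
import codecs

# BOM byte sequences for the three formats offered in the listbox.
BOMS = {
    "utf-16-le": codecs.BOM_UTF16_LE,  # FF FE
    "utf-16-be": codecs.BOM_UTF16_BE,  # FE FF
    "utf-8": codecs.BOM_UTF8,          # EF BB BF
}


def encode_text(text: str, encoding: str, add_bom: bool = True) -> bytes:
    """Encode text and optionally prepend the BOM for that encoding."""
    data = text.encode(encoding)
    return BOMS[encoding] + data if add_bom else data


for name in BOMS:
    print(name, encode_text("A", name).hex(" "))
# utf-16-le ff fe 41 00
# utf-16-be fe ff 00 41
# utf-8 ef bb bf 41
```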


See Also
Select Source Files
Setting Output Option
How does Unifier handle charset Meta Tag in HTML files?
Unicode Format
Preferences