Unicode Utilities 2.24 C/C++ script

    Specification

  • Version: 2.24
  • File size: 0 KB
  • File name: unidesc.tgz
  • Last update:
  • Platform: Linux / BSD
  • Language: C/C++
  • Price: GPL
  • Company: Bill Poser

Publisher review:
The Unicode Utilities package consists of a set of programs for manipulating and analyzing Unicode text. The analysis utilities are useful when working with Unicode files when one doesn't know the writing system, doesn't have the necessary font, needs to inspect invisible characters, needs to find out whether characters have been combined or in what order they occur, or needs statistics on which characters occur.

uniname defaults to printing the character offset of each character, its byte offset, its hex code value, its encoding, the glyph itself, and its name. Command line options allow undesired information to be suppressed and the Unicode range to be added. Other options permit a specified number of bytes or characters to be skipped.
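As a rough illustration of the per-character listing described above, here is a minimal Python sketch. It is not the package's C implementation; the function name `describe` and the row layout are assumptions made for this example, and the names come from Python's standard `unicodedata` module rather than from uniname itself.

```python
import unicodedata

def describe(text):
    """Return one row per character:
    (char offset, byte offset, hex code value, UTF-8 bytes, glyph, name)."""
    rows = []
    byte_offset = 0
    for char_offset, ch in enumerate(text):
        encoded = ch.encode("utf-8")
        rows.append((
            char_offset,
            byte_offset,
            f"{ord(ch):06X}",
            " ".join(f"{b:02X}" for b in encoded),
            ch,
            unicodedata.name(ch, "<unnamed>"),  # control characters have no name
        ))
        # Byte offsets advance by the encoded length, not by one per character.
        byte_offset += len(encoded)
    return rows

if __name__ == "__main__":
    for row in describe("a\u00e9"):
        print(*row, sep="\t")
```

The character offset and byte offset diverge as soon as a multi-byte character appears, which is why a listing of this kind reports both.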

unidesc reports the character ranges to which different portions of the text belong. It can also be used to identify Unicode encodings (e.g. UTF-16be) flagged by magic numbers.
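Identifying an encoding by its magic number amounts to comparing the first bytes of the input against the known byte order marks. The sketch below shows the idea in Python using the constants from the standard `codecs` module; the function name `detect_bom` and the returned labels are assumptions for this example, not unidesc's actual output.

```python
import codecs

# Order matters: the UTF-32 little-endian mark begins with the same two
# bytes as the UTF-16 little-endian mark, so it must be tested first.
_BOMS = [
    (codecs.BOM_UTF32_BE, "UTF-32be"),
    (codecs.BOM_UTF32_LE, "UTF-32le"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_BE, "UTF-16be"),
    (codecs.BOM_UTF16_LE, "UTF-16le"),
]

def detect_bom(data):
    """Return the encoding announced by a leading byte order mark, or None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None

if __name__ == "__main__":
    print(detect_bom(b"\xfe\xff\x00a"))
```

Input without a byte order mark yields `None`; in that case the encoding has to be determined some other way.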

unihist generates a histogram of the characters in its input, which must be encoded in UTF-8 Unicode. By default, for each character it prints the frequency of the character as a percentage of the total, the absolute number of tokens in the input, the UTF-32 code in hexadecimal, and, if the character is displayable, the glyph itself as UTF-8 Unicode. Command line flags allow unwanted information to be suppressed. In particular, note that by suppressing the percentages and counts it is possible to generate a list of the unique characters in the input.
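The histogram described above can be sketched in a few lines of Python. This is an illustration only: the function name `histogram`, the code-point sort order, and the row layout are assumptions, and unihist's own output format may differ.

```python
from collections import Counter

def histogram(text):
    """Return rows of (percentage, count, code in hex, glyph or '')."""
    counts = Counter(text)
    total = sum(counts.values())
    rows = []
    # Sorting by code point is an assumption made for this sketch.
    for ch, n in sorted(counts.items(), key=lambda kv: ord(kv[0])):
        glyph = ch if ch.isprintable() else ""
        rows.append((100.0 * n / total, n, f"0x{ord(ch):06X}", glyph))
    return rows

if __name__ == "__main__":
    for pct, n, code, glyph in histogram("abba"):
        print(f"{pct:7.3f}\t{n}\t{code}\t{glyph}")
```

Dropping the first two columns of each row leaves one entry per distinct character, which corresponds to the list of unique characters mentioned above.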

ExplicateUTF8 is intended for debugging or for learning about Unicode. It determines and explains the validity of a sequence of bytes as a UTF-8 encoding.
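To show what explaining validity can involve, the following Python sketch checks a byte sequence meant to encode a single character against the UTF-8 rules: lead-byte pattern, number of continuation bytes, overlong forms, surrogates, and the upper limit of U+10FFFF. The function name `explain_utf8` and the wording of the messages are hypothetical and do not reproduce ExplicateUTF8's output.

```python
def explain_utf8(data):
    """Return (is_valid, explanation) for bytes meant to encode ONE character."""
    if not data:
        return False, "empty input"
    lead = data[0]
    # The lead byte determines the sequence length and the smallest code
    # point that legitimately needs that many bytes.
    if lead < 0x80:
        length, value, minimum = 1, lead, 0x0
    elif lead < 0xC0:
        return False, f"byte 0x{lead:02X} is a continuation byte, not a lead byte"
    elif lead < 0xE0:
        length, value, minimum = 2, lead & 0x1F, 0x80
    elif lead < 0xF0:
        length, value, minimum = 3, lead & 0x0F, 0x800
    elif lead < 0xF8:
        length, value, minimum = 4, lead & 0x07, 0x10000
    else:
        return False, f"byte 0x{lead:02X} cannot start a UTF-8 sequence"
    if len(data) != length:
        return False, (f"lead byte 0x{lead:02X} announces {length} byte(s) "
                       f"but {len(data)} were given")
    for i, b in enumerate(data[1:], start=1):
        if b & 0xC0 != 0x80:
            return False, f"byte {i} (0x{b:02X}) does not match 10xxxxxx"
        value = (value << 6) | (b & 0x3F)  # append six payload bits
    if value < minimum:
        return False, f"overlong encoding of U+{value:04X}"
    if 0xD800 <= value <= 0xDFFF:
        return False, f"U+{value:04X} is a surrogate, not a character"
    if value > 0x10FFFF:
        return False, "value exceeds U+10FFFF"
    return True, f"valid {length}-byte encoding of U+{value:04X}"

if __name__ == "__main__":
    print(explain_utf8(b"\xc3\xa9"))
    print(explain_utf8(b"\xc0\xaf"))
```

The overlong check is the step most often missed in hand-written decoders: the bytes C0 AF follow the two-byte pattern but encode a value that fits in one byte, so the sequence is rejected.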

Utf8lookup is a shell script which invokes uniname to provide an easy way to look up the character name corresponding to a codepoint from the command line.
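The same kind of lookup can be approximated with Python's standard `unicodedata` module. This sketch does not call uniname; the function name `lookup_name` and the accepted input forms (bare hexadecimal, or prefixed with `U+` or `0x`) are assumptions made for the example.

```python
import unicodedata

def lookup_name(codepoint_hex):
    """Return the Unicode character name for a hexadecimal code point string."""
    text = codepoint_hex.strip()
    if text.upper().startswith(("U+", "0X")):
        text = text[2:]
    # Characters without a name (e.g. control characters) get a placeholder.
    return unicodedata.name(chr(int(text, 16)), "<unnamed>")

if __name__ == "__main__":
    print(lookup_name("U+00E9"))
```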

Unirev is a filter that reverses UTF-8 strings character-by-character (as opposed to byte-by-byte). This is useful when dealing with text that is not encoded in the order in which you want to display it or analyze it. For example, if you want to display Arabic on a terminal window that does not support bidi text, Unirev will put it into the normal display order.
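The difference between character-wise and byte-wise reversal is easy to see in a short Python sketch. The function name `reverse_utf8` is hypothetical, and the sketch reverses one string at a time by code point; how Unirev treats line boundaries is not covered here.

```python
def reverse_utf8(data):
    """Reverse UTF-8 encoded bytes character by character."""
    return data.decode("utf-8")[::-1].encode("utf-8")

if __name__ == "__main__":
    original = "a\u00f1b".encode("utf-8")   # 'a', 'n with tilde', 'b'
    print(reverse_utf8(original))           # still valid UTF-8
    print(original[::-1])                   # byte-wise: multi-byte char is broken
```

Reversing the raw bytes swaps the two bytes of the multi-byte character and produces invalid UTF-8, whereas reversing decoded characters keeps each encoded character intact. A reversal by code point does move a combining mark in front of its base character, so text with combining sequences needs extra care.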
Unicode Utilities 2.24 is a set of C/C++ programs by Bill Poser. It runs on the following operating systems: Linux / BSD.

Operating system:
Linux / BSD


User Rating


Rating: 2.2 out of 5
Based on 13 ratings. 13 user reviews.
