AnyBook4Less.com | Order from a Major Online Bookstore |
Title: The Unicode Standard, Version 4.0 by The Unicode Consortium, Joan Aliprand, Julie Allen, Joe Becker, Mark Davis, Michael Everson, Asmus Freytag, John Jenkins, Mike Ksar, Rick McGowan ISBN: 0-321-18578-1 Publisher: Addison-Wesley Pub Co Pub. Date: 29 August, 2003 Format: Hardcover Volumes: 1 List Price(USD): $74.99
Average Customer Rating: 4.75 (4 reviews)
Rating: 5
Summary: All the Languages of Man
Comment: Anyone dealing with XML or Java soon runs into Unicode because this is the standard for representing characters in electronic form in those computer languages. Java, for instance, was designed from its inception to use Unicode. Earlier computer languages like C and C++ can have routines added to handle it, while C# uses XML and hence Unicode.
But chances are, when you deal with Unicode, you only deal with a subset. Often only a small subset at that, unless you are using Chinese/Japanese. Typically you work with ASCII and the codes for your spoken language if that is not a Western European language. Very few of us deal with much more than this.
Which illustrates the appeal of the book. The Big Picture. ALL of Unicode. The breadth is stunning. It shows the written form of every major spoken language and many minor ones. It has the pictograms for Chinese [of course]. But also the symbols for Khmer, Canadian Aboriginal, Tamil, Syriac, et cetera, et cetera. Thumbing through this, you may encounter languages that you did not even know existed. It is one thing to say that we live in a multilingual world. But it is another to actually see it expressed comprehensively at the most basic level.
There are two audiences for this book. The first is any computer person who has to deal with issues of internationalisation.
But another audience is every Department of Languages or Cultural Anthropology in a university. If this describes your background, then you should know that you do not need facility in computing to appreciate the significance of this book. You can use it as a standard reference, akin to the Oxford English Dictionary vis-a-vis the English language. Look, ignore the computer stuff in the text. Yes, you can do this. The book groups related languages into common chapters. The explanatory text is lucid and the graphics for the languages let you easily cross-compare. Of course, at a higher level of meaning like sentences, you will need specialised texts in those languages. But to understand a language, you need to start at its letters or pictograms.
Think of this book as an index into all the languages of man.
Rating: 4
Summary: New version of one of the most-used standards
Comment: One reason for the wide acceptance of the Unicode standard is that the Unicode consortium has made it so freely available. There's no point in my discussing in detail what is in this volume when you can peruse PDF files of the entire work on the Unicode website (minus only chapter division graphics).
Browse through the book just like you would in a bookstore or library. Print out parts of it or all of it for free if you want. Well, it is free if you don't count the cost of paper (about 1500 sheets or twice that for simplex printing), cost of a binder (or maybe two binders) and the time you would have to spend punching the holes.
If you are mainly or only interested in particular sections of the standard then printing only those sections may be a reasonable thing to do.
On the other hand the price is *very* reasonable for an 8½" × 11" hardbound book with 1,462 pages. If it's the sort of book you know you want for browsing and for reference then it is likely you will want it in this nicely bound copy.
Like the previously published versions of the Unicode standard, this book is a beautiful book that is useful to those who don't need or want to get into the technical details of character properties and rules for bi-directional display and other necessary rules for displaying the characters. But for the actual use of many characters you will have to consult other lists outside the Unicode book or files, e.g. dictionaries and grammars of various languages or explanations of symbols used in various fields of mathematics.
Language and writing systems are messy and inconsistent, and handling them systematically and coherently cannot be made easy. Accordingly the rules and explanations in this standard are by necessity often long and involved and couched in technical language. It can't be avoided that, for example, one must sometimes distinguish carefully between _characters_, _glyphs_, _graphemes_, _grapheme clusters_, _ligatures_ and _digraphs_, and whether one character is a _canonical equivalent_ of another character or sequence of characters, a _compatibility equivalent_ of it, or just similar to it.
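To make the canonical versus compatibility distinction concrete, here is a minimal sketch. It assumes Python and its standard `unicodedata` module, neither of which the book or this review uses, and the sample characters are my own choice. The Unicode data shipped with Python is newer than 4.0, but these particular characters and their decompositions were already in 4.0.

```python
import unicodedata

# Two spellings of "e with acute accent":
precomposed = "\u00e9"   # U+00E9, a single precomposed character
decomposed = "e\u0301"   # U+0065 followed by U+0301 COMBINING ACUTE ACCENT

# They are different code point sequences...
same_code_points = precomposed == decomposed

# ...but canonical equivalents: canonical normalization (NFC) unifies them.
canonically_equal = (
    unicodedata.normalize("NFC", precomposed)
    == unicodedata.normalize("NFC", decomposed)
)

# U+FB01 LATIN SMALL LIGATURE FI is only a compatibility equivalent of "fi".
ligature = "\ufb01"
nfc_keeps_ligature = unicodedata.normalize("NFC", ligature) == ligature
nfkc_folds_ligature = unicodedata.normalize("NFKC", ligature) == "fi"

print(same_code_points, canonically_equal)
print(nfc_keeps_ligature, nfkc_folds_ligature)
```

Canonical normalization leaves the ligature alone because folding it would discard a formatting distinction; only the compatibility forms (NFKC, NFKD) do that.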
The Unicode character set is still a work in progress. Version 4.0 may not even approach the half-way mark in encoding every character that has been used in normal text records by human beings for which a meaning is known. No-one has ever tried to produce a list of characters on this scale before. No-one yet knows how many distinct characters there are.
But 4.0 covers 96,382 characters from *almost* every script currently used for modern languages and from some ancient scripts as well including Ugaritic cuneiform, Cretan Linear B and the ancient Cypriot syllabary. (Sumerian/Akkadian cuneiform is being worked on and Egyptian hieroglyphics will eventually follow.)
Included are a plethora of technical symbol characters including mathematical characters, chess pieces, die faces, characters needed for modern western music notation, characters needed for Byzantine music notation, ornamental dingbats and so much more. All of it is now at the fingertips of every computer user -- that is if fonts that contain the characters are installed.
Finding fonts that display some of these characters is still a problem. :-(
But it would be a worse problem if these characters weren't assigned to a common character set. The past practice of numerous special fonts for various symbols and scripts which disagreed with one another on how the characters were encoded produced a horrible mess.
Large as it is, with 40% more pages than version 3.0, the book doesn't contain the whole standard. Increasingly, as the standard has expanded, tabular material has been dropped from the printed volumes and replaced with references to data files available on the website or on the CD that comes with the book.
The end of section 3.2 specifies six files found as Annexes on the website and on the CD which "are essential parts of version 4.0" including an explanation of the bidirectional algorithm which appeared in the printed text for earlier releases. And there are many mentions in the printed standard of other files available on the CD or website. A binder containing printouts of this material is necessary if you want a truly complete hardcopy of the entire 4.0 standard.
Unfortunately the 4.0 HTML files are carelessly laid down on the CD with external links pointing to files on the Unicode website and not to the corresponding files on the CD. Graphics are sometimes missing though the only file I think this matters with is StandardizedVariants.html which has a number of variant character images. (The data in this short file should have been in the book).
If you work online you probably won't notice anything wrong but you also are likely not to notice that after clicking on a link you are viewing a file from the Unicode website instead of a file on the CD. That may matter in the future if you need to reference a 4.0 file and don't observe that the file you are actually looking at is from the website and is a "latest version" file that has been updated beyond 4.0. If you are working offline you can avoid this, but it is annoying to have to manually search for the file by name because the link fails.
Also, although the Readme.txt file on the CD mentions "mapping tables" and files with "the extension .UNI", these useful conversion tables, which were included on the CDs with previous releases, are missing on the 4.0 CD. But they are available on the website.
This is a minor caveat. I suspect most people will use the website in any case rather than the CD.
Rating: 5
Summary: Essential reference for modern programming
Comment: The Unicode character set is among the most widely used and least known of the international software standards. Java programmers have used it every day for a decade or so, but barely one in ten appears to know anything about it.
The content of ISO standard 10646 (successor to 7-bit ISO 646) goes way beyond just a character set. It contains information critical to the correctness of any program that steps outside the English-language world, i.e. every program on the Internet, and many others sooner or later. This is the basis for correct handling of numerals (there's a lot more than 0 to 9), letters, and text. It's also the explanation for some program behaviors that might otherwise baffle a programmer, or at least a programmer with the wit to be baffled.
More than just crucial, the content of this standard is plain fun. Its snippets of information from every major world language give wonderful insight into how people express themselves. It drives home the delightful diversity of human language and experience. It's also a near-bottomless source of stump-your-friends trivia.
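As an illustration of the point about numerals, here is a small sketch. It assumes Python's standard `unicodedata` module, which the review does not mention, and the sample characters are picked for illustration only.

```python
import unicodedata

# Decimal digits from three scripts that all carry the value 3.
threes = ["3", "\u0663", "\u0969"]  # ASCII, Arabic-Indic, Devanagari
digit_values = [unicodedata.decimal(ch) for ch in threes]

# Python's int() accepts any Unicode decimal digits, not just ASCII ones.
parsed = int("\u0663\u0664")  # Arabic-Indic digits three and four

# Some numerals have a numeric value without being decimal digits.
roman_eight = "\u2167"  # ROMAN NUMERAL EIGHT
one_half = "\u00bd"     # VULGAR FRACTION ONE HALF
roman_value = unicodedata.numeric(roman_eight)
half_value = unicodedata.numeric(one_half)
roman_is_decimal = roman_eight.isdecimal()

print(digit_values, parsed)
print(roman_value, half_value, roman_is_decimal)
```

A check that only accepts the ASCII range 0 to 9 would reject the first two cases, while a check based on "is numeric" would accept the Roman numeral, which cannot be used positionally. That is the kind of behavior that can baffle a programmer who has not read the standard.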
I admit, I'll never use every fact in this incredible assembly. I use a lot of the information, though, and I use it as the point of entry into every discussion of internationalization and localization of software.
Title: Unicode Demystified: A Practical Programmer's Guide to the Encoding Standard by Richard Gillam ISBN: 0201700522 Publisher: Addison-Wesley Pub Co Pub. Date: 16 September, 2002 List Price(USD): $49.99
Title: Unicode: A Primer by Tony Graham ISBN: 0764546252 Publisher: John Wiley & Sons Pub. Date: 22 March, 2000 List Price(USD): $24.99
Title: Developing International Software, Second Edition by Dr. International ISBN: 0735615837 Publisher: Microsoft Press Pub. Date: 09 October, 2002 List Price(USD): $69.99
Title: CJKV Information Processing by Ken Lunde ISBN: 1565922247 Publisher: O'Reilly & Associates Pub. Date: December, 1998 List Price(USD): $69.95
Title: Java Internationalization by David Czarnecki, Andy Deitsch ISBN: 0596000197 Publisher: O'Reilly & Associates Pub. Date: March, 2001 List Price(USD): $39.95