
Borked Unicode: Tips for journalists on writing clean copy

But what does that mean? This article gives journalists the basics they need to know to ensure that every character, word, sentence, and paragraph they intended to write gets correctly saved and reproduced on computer systems, and ultimately online and in print. You’ll learn enough to avoid the dreaded borked Unicode.

Writing clean copy

Clean copy is an established concept in print journalism. Conservatively, the term refers to text that is spelled and punctuated correctly and makes sense. As an editor who was formative in my development, the late Sid Adilman of the Toronto Star, put it, the goal is to write an article that “reads well.” Clean copy is a prerequisite for that.

But I want to expand the definition to encompass character encoding. Your copy can’t be considered “clean” unless and until it is stored and reproduced correctly. Getting character encoding right is an absolute necessity for working print journalists, which is all well and good except for the fact that nobody has ever bothered to teach journalists what character encoding is.

By the time you’re done reading this article, you’ll have gained knowledge of character encoding that leads to confidence that you can write clean copy, plus the muscle memory and good habits to actually do it. The concepts involved in producing clean copy can extend way out to the horizon, but journalists don’t need to worry about expert-level details. Here is the complete list of facts you need to know.

Computers store characters as numbers. That includes visible characters like letters and numerals, whitespace characters like spaces and tabs, and invisible characters like optional hyphens. A system of enumeration of characters is called character encoding. In the past, there were dozens of conflicting sets of numbers used to enumerate characters. Those have all been superseded by a single numbering system, Unicode. Unicode does not include every conceivable character, but every character you, the journalist, will need is available in Unicode.

Corollary: There is no such thing as a “special character.” (Really, there isn’t.)

To write clean copy, everything you write, edit, pass on to someone else, receive from someone else, and publish has to use Unicode from start to finish. Unicode is a large specification that can be expressed in a variety of ways, but in the normal course of events the only variation you need to know about is UTF‑8. You don’t even have to know what that stands for.

In essentially every case, you just type, insert, or paste the actual character you want – just as with characters you find noncontroversial. In rare cases, you will refer to characters by their Unicode number. (That’s called using a character entity or escaping a character.)

For the working hack, it really is that simple. The most important advice is the easiest: Just type the character you want. You may need to learn how to type it, but I’m going to teach you how. You may need to copy and paste it from another source. But the point is to use the character you want right in your document. And do that everywhere – hed, dek, body copy, in RSS, on Twitter.

When the character your system displays can be confused with something else or is simply invisible, as in the case of whitespace or non-breaking hyphen (see below), you need to enter a character entity, which uses a sequence of other characters to escape the character you actually want. In these cases, you’re specifying the character by an agreed-upon name or by its Unicode number. You do that by starting the name or number with an ampersand and ending it with a semicolon. Some examples, purely for illustration purposes:

You have to know this troublesome implementation detail because it is the only way to reliably enter and edit the few characters that demand this approach. Absolutely do not use this method to enter what you think are “special characters” in the day-to-day run of your work as writer or editor.

I want to make sure you know what I’m talking about, so I’m going to show you a few errors of character encoding. After you finish this article, you will be in a position to avoid borked Unicode like this for the rest of your career.

A mismatch in server and browser settings means the browser can’t figure out the character encoding and uses the wrong characters.

In the English language, the giveaway character that can conclusively prove your copy is dirty is this: ‘ (opening single quotation mark). If it’s missing or replaced by ' (neutral apostrophe), that means someone can’t type the character or thinks downstream systems can’t accept it, or those downstream systems changed it behind your back. If it’s replaced by apostrophe (’), it means someone relied on so-called smart quotes and didn’t double-check for errors.
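As an illustration of how a single character turns into borked Unicode, here is a minimal Python sketch. It is not taken from the article; the variable names are made up for the example, and Windows-1252 is only one assumed candidate for the "wrong" encoding a mismatched system might apply.

```python
# Sketch: one character, stored as a number, expressed as UTF-8 bytes,
# then decoded with the wrong encoding. Names here are illustrative only.

OPENING_QUOTE = "\u2018"  # the opening single quotation mark

# Computers store the character as a number (its Unicode code point).
code_point = ord(OPENING_QUOTE)

# UTF-8 expresses that number as a sequence of bytes.
utf8_bytes = OPENING_QUOTE.encode("utf-8")

# Assumption: the receiving system wrongly treats the bytes as Windows-1252.
# The three bytes become three unrelated visible characters.
borked = utf8_bytes.decode("cp1252")

# Decoding with the encoding that was actually used round-trips cleanly.
clean = utf8_bytes.decode("utf-8")

print(hex(code_point), utf8_bytes, repr(borked), repr(clean))
```

The point of the sketch is that nothing is wrong with the stored bytes themselves; the damage happens only when one side of the exchange assumes a different encoding than the other.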

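To make the ampersand-and-semicolon rule concrete, the following Python sketch resolves a named entity and two numeric entities back into the characters they stand for, using the standard library's `html.unescape`. The helper `to_numeric_entity` is hypothetical, written only for this illustration, and the chosen characters (no-break space and non-breaking hyphen) are assumed examples of the invisible or easily confused characters that call for escaping.

```python
import html
import unicodedata

# Named entity: ampersand, agreed-upon name, semicolon.
no_break_space = html.unescape("&nbsp;")

# Numeric entities: ampersand, "#" plus the Unicode number, semicolon.
# The number may be decimal or, with an "x" prefix, hexadecimal.
hyphen_from_decimal = html.unescape("&#8209;")
hyphen_from_hex = html.unescape("&#x2011;")


def to_numeric_entity(ch: str) -> str:
    """Hypothetical helper: escape one character as a hex numeric entity."""
    return f"&#x{ord(ch):X};"


# Show the official Unicode name and the entity for each character.
for ch in (no_break_space, hyphen_from_decimal):
    print(unicodedata.name(ch), to_numeric_entity(ch))
```

Both numeric forms resolve to the same character, which is why the decimal and hexadecimal spellings can be used interchangeably for the few characters that need this treatment.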