Writing Arabic with Latin letters
The most natural way to write Arabic is of course with Arabic letters. But sometimes we, in particular academics, want to write Arabic with Latin characters - e.g. to add an Arabic word or title into an English text - but without sacrificing precision. Since Arabic has more consonants than English, we then need to add some special marks to the letters, such as a dot below (or above) some consonants, or a line above long vowels. We call these marks "diacritics". How do we do this on a Mac or PC?
Well, that depends on what kind of computer and software you use. I have no personal experience with PCs and Windows, but have added below some information from people who use such computers. As for Macintosh users, you can write transliteration both on older and on current Macs, but there are some differences between recent and older software, so you may have to be aware of some details. Let us first, however, look at some of the basics.
What is it?
We tend to use three words a bit confusingly: transliteration means to reproduce an Arabic text precisely in Latin characters, that is, based on how it is written in the Arabic script. Transcription means to reflect, in Latin characters, Arabic as it is spoken. Since Arabic speech is not quite identical to Arabic text, these two terms are not fully synonymous, and phoneticians and those who work with dialects may want a finer distinction and use special alphabets such as the "IPA" system. However, non-linguists often use the two terms interchangeably. For our purpose here, we will actually focus on what is properly called transliteration, basing ourselves on the characters of the Arabic script.
For that purpose, then, we generally use standard English characters which we modify by adding a dot or a line, a diacritic, to them. This allows the specialist to know what its Arabic origin is (a sad, not a sin), while the non-specialist will recognize its approximate pronunciation ("s" in both cases). Exactly what diacritics we use, varies: it is a matter of intense academic debates what kind of transliteration system is "most correct". If your web browser allows it, you can see the difference between khārjah, ḫārǧat and jârŷa, the same name in three different transliteration systems. As a writer, you may choose your own - so you can control how many diacritic characters you need - or a publisher may impose a system on you, in which case, you may need access to more than you normally write.
However that may be, the important point is that the diacritic character, the "s with dot" - emphatic s - is a fully separate character, not just an "s" with an extra. Not just linguistically, but also on the computer: We see it as a kind of s, but for the computer, the "s" and the "s-dot" have no relation to each other at all. It is simply that when the programmer drew the s-dot character, he made it resemble the s, before adding the dot.
How do we get it?
So, we need to add these "diacritic characters", the s-with-dot, the a-with-line, and the others, to our computer's repertoire of characters, in addition to the regular "s" and "a". What does this entail?
In order to type these letters, you need two things: a font that contains the characters, and a way to type them from the keyboard.
Evidently, not every font has the characters we need; the vast majority do not. But on modern computers (Mac and Windows) there will normally be at least one that does. On the most recent Macs, you can get by fine with Times and Helvetica, and all current Macs have a standard font called Lucida Grande with a huge variety of diacritics. On most Windows machines, you will find a font called Arial Unicode MS with even more (and in Vista or Windows 7, the regular Times New Roman as well as half a dozen other fonts contain what we want). However, you may not want to be limited to these fonts, and may want to install more fonts that also contain these diacritics.
You can find a number of such fonts on the Internet, which you can download and install on your computer (see below for a list). You will find that many of them have the term Unicode in their name, as you saw in Arial above. By "Unicode" is normally meant that the font contains more characters than the regular a-z of English and West European accents; it kind of means "Exotic Alternative". Technically, Unicode is just an agreement between computer makers on how their machines should treat non-Western characters. It "codifies" characters in all kinds of living and dead languages and scripts, including such specialist needs as our scientific transliteration characters. That is, it standardizes the computer's identification of each character.
When a font maker calls his font a "Unicode font", he thus means that there are some such exotic characters in his font, but exactly which ones he has put into his particular font is up to him. So not all "Unicode fonts" will have the characters we are looking for.
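To make concrete what "the computer's identification of each character" means, here is a minimal Python sketch. It only uses the standard `unicodedata` module; the helper name `describe` is my own invention for illustration. It shows that the plain s and the s with dot below each have their own number in Unicode.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the Unicode code point and official name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    # s, s with dot below, a, a with macron (written as escapes so the
    # example does not depend on how this page itself is encoded)
    for ch in "s\u1e63a\u0101":
        print(ch, describe(ch))
```

Any font that calls itself "Unicode" agrees on these numbers; what differs from font to font is only whether it contains a drawing for each of them.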
A further step would be if all the most common fonts on all our machines actually included our characters, or at least that the fonts with the same name had the same characters on Mac and PC. That is not always true. Sometimes you can send a document with transliteration to a colleague, using e.g. Times New Roman, or another common font, and get a frantic message back saying that the "diacritics have disappeared". Tell him not to worry, the characters are there, but his machine has a different version of the same font, and that version does not contain these characters. If he sets the text in another font, the diacritics will magically reappear.
How to type them
Well, so much for the first point, the fonts themselves. But they are not much good if we cannot type our "s with dot under". Clearly, there is no such "s-dot" key on our keyboard. So we must try to trick our keyboard into "producing" the diacritics. The problem is just that there are only around 50 keys on the keyboard in front of you. By using the Shift key, the Option (or "Alt") key, and both of these together, we can quadruple that to about 200. But Lucida Grande has 2,000 different characters, and a really big font from 20,000 to 50,000. No way we can type all those with a regular keyboard.
One way to get at the special characters is to display on-screen a huge "palette" of all characters in a font, and click on the one you want, so it is inserted into the text. Both Mac and Windows have such tools ("Character palette" and "Character map", respectively). But that is cumbersome. It is better to be able to type them directly with the keyboard, a bit like how we type upper-case letters by typing s with Shift down to get "S". Why not in the same way make Alt and s together produce "s-dot"?
That we can do if we allow our computer to switch between different keyboard layouts. Layouts are software settings that link the physical keys we press to particular characters on the screen. We are not stuck with just what is painted onto the plastic. Mac people know this: if you press the Alt key (or "Option" key as we call it) with e.g. "a", you get an "å". Change the layout setting, and the Option-a combination will link to a different character in the font, such as ā - a long a. You can have many such alternative keyboard layouts on your computer; they are listed in a "keyboard" menu - often called the "flag" menu as it displays small flags. Adding one of our own, geared towards our purposes, will let Option/Alt combinations type the diacritic characters we need. If, of course, the font we have selected contains such characters.
A Mac has pre-installed one or two such special layouts that allow access to some diacritics (Windows does not, but you can add one). But these are not geared particularly to Arabic transliteration. We may want something that gives easy access to our particular characters. Such exist, but here Mac and Windows must part ways. This is a Mac website, so we will spend most time on a Mac, before summing up the solutions I know of for the Windows people.
DIACRITICS ON A MAC
Writing transliteration on the Mac does not require anything extra if:
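(a) Your computer system was bought or upgraded after May 2005.
(b) Your word processor is Microsoft Office 2004 or newer, or another program recent enough to use these fonts.
(c) You select the built-in USA Extended keyboard layout.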
That was the short answer. It may require some expansion:
Re: (a): In 2005, Apple introduced new versions of the standard fonts Times and Helvetica that contain [most of] the characters we need, as part of their operating system 10.4. If your Mac is slightly older, you will probably want to add some freeware or other fonts that have these characters. Read below for more about adding fonts for transliteration.
Re: (b): In 2004, Microsoft introduced its "Office 2004" package which could use these fonts. Older versions of Word could not. If you use an older Word, you have to check the solution below I have called "The Traditional Way". -- If you are not a Microsoft user (and why should you be?), the same applies, new programs and versions can most often use the modern fonts, older versions may not. Again, further detail below.
Re: (c): The USA Extended layout is built-in, so you can start using it now; you just have to select it. But it may not be the most convenient layout for our purpose, so you may rather want to get a slightly more handy one. Yes, it too is explained below.
Forward On: The Modern Way of Unicode
With the new versions of Times and Helvetica that came in system 10.4, we thus got vowels with line above, dot below consonants for emphatics, carons (small "v's") above g, s and z; d and t with line below, etc. Helvetica also has "proper" ‘ayn and hamza as separate characters; Times does not, but many people use single quote marks ‘ and ’ for those anyway.
Most other fonts on the Mac do not have these characters, but Times and Helvetica are so widely used that this was a huge step forward. Previously you had to use the more stocky font called Lucida Grande for the purpose. This makes it possible for us to see these characters in documents or web pages. The next step is to find out the best way for you to use them yourself, to get into the documents you are writing.
A palette, or two ways to type diacritics
The USA Extended keyboard
USA Extended gives access to many kinds of diacritics and special characters: To type u with line above, you first type Option-a, and then u. Option-a is the general key for "line above next character". For dot below d, type Option-x and then d. Of course, your text must be in Times, Helvetica or another font that has these characters; the keyboard can only produce what the font actually contains. If you are typing in e.g. "Times New Roman", the variant of Times that does not have diacritics, the program will most often switch to another font that has the character (e.g. Lucida Grande), or it may, in some cases, just display an empty box or an empty space.
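The "line above next character" idea has a close parallel in Unicode itself, where a base letter followed by a combining mark can be merged into one precomposed character. The Python sketch below illustrates that parallel with the standard `unicodedata` module. It is not a description of how Apple's layout works internally, and the names `DEAD_KEYS` and `compose` are made up for the example.

```python
import unicodedata

# Hypothetical "dead key" names mapped to real Unicode combining marks.
DEAD_KEYS = {
    "line_above": "\u0304",  # COMBINING MACRON
    "dot_below": "\u0323",   # COMBINING DOT BELOW
}


def compose(dead_key: str, base: str) -> str:
    """Combine a base letter with a mark and return the single merged character."""
    return unicodedata.normalize("NFC", base + DEAD_KEYS[dead_key])


if __name__ == "__main__":
    print(compose("line_above", "u"))  # u with line above
    print(compose("dot_below", "d"))   # d with dot below
```

As with the keyboard, the result is only visible if the font in use actually contains the merged character.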
The "Diacs" keyboard, especially for Arabists
This keyboard layout is an alternative which focuses on our keys: You get "d with dot under" by typing Option-d (uppercase with Shift), long u by Option-u, etc. Hamza is Option-j, and ‘ayn Option-l. One keystroke for every character, which should provide for faster and more convenient typing. The package contains various national variants: if you are a US user, American Diacs is your choice; if you are British, French, Italian, German or Scandinavian, you will find a layout suited to your national keyboard. After installing it and choosing it in the "International" System Preference, you select it in the same way as mentioned above from the "flag" menu.
What? Is my Mac from April 2005 (or was it 2002 or 2003?) obsolete already?
Do not worry. If your Mac just missed out on May 2005, but is otherwise from within the last decade, all the above still applies, it is just that the only installed font that has our characters is Lucida Grande, maybe also Courier and Monaco. But these may not be the most beautiful fonts for printing. So, you would want to add some fonts for that purpose, fonts that e.g. look like Times, or other printable fonts, but have added our types of characters. Even those on the most recent machines may want to add some such fonts, to extend their repertoire of diacritics-enabled fonts.
The JaghbUni font for diacritics
There are several such Times look-alikes or other useful fonts available on the Net, and we provide another on our server here, it is called JaghbUni (or "Jaghbub Unicode"), and is particularly geared for Middle Eastern transliteration - actually, the package contains three fonts, looking like Times, Helvetica and Palatino. Unlike Apple's Times, JaghbUni contains true ‘ayn and hamza.
At the bottom of this page is a list of fonts you can download from the Internet, of various shapes and usage, which contain the basic diacritics for our purposes.
Can I print in these fonts, send documents to my colleagues, or use in email?
Print, certainly. These are just regular fonts.
Send documents with these letters as attachments to colleagues; that depends on the colleague. The point is that these characters belong to particular fonts, as you understand from what we have said above. But when you send the document by email or otherwise to someone else, the font is not included, only the command "this document should be displayed in the Gentium font" (if that is the font you used). If the other guy has Gentium on his PC, fine, the PC will comply and display the text - and thus your diacritics - in his Gentium.
For sending to another Mac user, the same is basically true, but the Mac will often (not always) help him part of the way. If it senses that some characters do not exist in the font of the document, it will automatically display those characters in Lucida Grande, the Mac's basic "all strange characters" font. They will stick out like a sore thumb, but he will see them. Some programs, however, will not allow this automatic replacement, and there the Mac like the PC requires the user to manually choose Lucida Grande or another Unicode font for the text.
As people upgrade from the still very common Windows XP to Vista or higher (and Macs to 10.5 or higher), these problems will be reduced, as newer computers tend to have better fonts for diacritics. But it will still be a few years before we can assume that anyone we write to has full fonts for transliteration.
In email, the same applies. Almost all current email software supports Unicode and should display your diacritics like any other text. But there may be people still who swear by older software that has served them well, but does not support Unicode, or may display the email in default fonts without diacritics. (Eudora is a case in point: excellent program still, but no Unicode and thus no diacritics). Mostly, however, Unicode-based diacritics should work today.
I want to make an index - how can I sort this text alphabetically?
That depends mostly on what software you use. Many programs will sort the actual diacritic characters correctly: s/dot under is placed together with s, a/line over with a, etc. That is the case with Microsoft Office (Word/Excel), NisusWriter, OpenOffice, NeoOffice, Jedit X, BBEdit, FileMaker, EndNote, and the "list manager" iData (the other "list managers" iList and EagleData do not sort diacritics correctly).
However, in all of them the separate letters ‘ayn and hamza are placed after z, not ignored in sorting as we want, thus ‘Ali will appear after Zubayr.
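If you are able to run a small script over your list, the sorting we actually want (diacritics filed with their base letter, ‘ayn and hamza ignored) can be sketched in Python as below. This is only one possible approach under my own assumptions: the function name `sort_key`, the sample names, and the choice of which characters count as ‘ayn and hamza are all illustrative, and none of the programs listed above is claimed to work this way.

```python
import unicodedata

# Characters treated as 'ayn / hamza and skipped when sorting (an assumption:
# the curly single quotes plus the two "half ring" modifier letters).
IGNORED = "\u2018\u2019\u02bf\u02be"


def sort_key(name: str) -> str:
    """Build a key that files diacritic letters under their base letter."""
    stripped = name.translate({ord(c): None for c in IGNORED})
    decomposed = unicodedata.normalize("NFD", stripped)
    # Drop the combining marks (dots, macrons) left behind by decomposition.
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    return base.casefold()


if __name__ == "__main__":
    names = ["Zubayr", "\u2018Al\u012b", "S\u0101lim", "\u1e62\u0101li\u1e25"]
    for name in sorted(names, key=sort_key):
        print(name)
```

With this key, ‘Alī is filed under A rather than after Zubayr. Note that two names differing only in their diacritics get the same key, so they stay in whatever order they had before sorting.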
The Problem with the many Times
The most common font in use is probably Times, both on Mac and Windows. But that can create much confusion when diacritics are concerned, for there are many different versions of the font "Times", and some have no diacritics, some have a few, some have all. And, you may not even know which version is installed on your machine. This is important since so many use this font, so we should go through the options.
Basically, we are talking of two different fonts: Times that comes with the Mac and Times New Roman ("TNR") which comes with Windows, but is also installed on Macs. Each of these comes in different versions that have the same name, but different content of diacritics. They are, in descending order of usefulness:
Now, for the tricky part: You may have several versions of each font on your machine, some new and some old, and not know which you are actually using. The thing is: As you may know, on the Mac, fonts can be placed in several different folders on your machine. For example, there is a "Fonts" folder on your hard disk (Macintosh HD / Library / Fonts), and another "Fonts" folder inside your own user account (Macintosh HD / Joe / Library / Fonts), and the Mac can see and use both of these sets of fonts. But if a font with the same name exists in both folders, it is the User account folder (Joe / etc.) that takes priority, the other font becomes invisible to the system and is not used ("Times" and "Times New Roman" are two different names, so these are seen as separate fonts; both appear in your Fonts menu, but only one of each).
Thus, installing a new font may not mean that it replaces an older font. When the Mac system installs fonts in a system upgrade, it only changes the fonts in the first folder, the HD / Library / Fonts folder. But Microsoft installs its fonts in the User accounts folder. That is therefore the active version, if you have a "Times New Roman" font file in both folders. So, if you years ago installed Office 2004, and then later installed or upgraded your Mac to Leopard or Snow Leopard with a newer and better TNR font, it is still the older Microsoft version that is the active one; the newer font is ignored. That is generally not a good idea, because this Microsoft Office version not only lacks diacritics, it also may create problems for Safari and other applications when they try to display Arabic text. You should normally avoid the Microsoft 2004 version of Times New Roman even if the alternative is an older Apple version, and certainly if you have a newer one available. (Incidentally, from Lion onwards, the above mentioned "Library" folders have all been made invisible to the user. Not useful. To see them, either hold the Option key while selecting the Go menu in Finder - "Library" will then appear - or write "~/Library" (for your account's folder) or "/Library" (for the system-wide HD/Library folder) in the "Go To Folder" menu. See here for other ways to access it.)
So, it is useful to look inside these two Fonts folders and see if you can spot identically named fonts in both (or you can use the FontBook application, where duplicates are marked with a bullet or yellow triangle). If you find any, drag the older font out of its Fonts folder to make it inactive. While you are at it, do that for all such duplicates; it is generally sound advice to have only one version of a font available: even though the system is able to juggle them, they sometimes cause problems. And in general, the newer (or larger, check the file size) version is normally the better.
The Traditional Way: Classic and contemporary, but private
That takes care of (a) and (c) above. Unfortunately, none of this is any good if the actual programs (applications) you want to use cannot handle these new fonts - all of them are based on the Unicode system we mentioned, and you only have to go back to 2003 before Microsoft's programs on the Mac were unable to deal with these types of fonts.
So, if your favourite word processing software is Word for X, AppleWorks, or other that do not work with the Unicode system described above, your options are more limited. You can still write in diacritics - we have been doing so on the Mac since the 1980s - but not in these Unicode standard fonts.
You must instead type in older, non-standard fonts, what we now call a "legacy" (that is, old-style) font that contains the diacritics you want. There is no lack of such fonts; over the years many academics created all kinds of fonts with special characters for their own private or shared usage. The problem with these fonts is that they are private, they do not follow any agreement with other computers on how these characters are to be displayed. On your own machine, that does not matter, you can type, edit, print with these private fonts in any program, old or new. But if you want to share your documents with others, colleagues, or in particular: with your publisher, the diacritics will disappear or get "transformed" into something else. So that will demand some trickery on either side. However, before Unicode this was inevitable, so we had to learn to live with it.
I myself made one such "private" font, called Jaghbub, on which the newer "JaghbUni" font is based, of course. You are free to use Jaghbub, if you like. You cannot use the "American Diacs" keyboard with this font, because that is a Unicode keyboard, and Jaghbub is not a Unicode font. Instead, you use a parallel keyboard layout for the old-style font, called US Diacs. It is all in the package below (this also includes a Palatino and a Helvetica-based font).
What if I do not want to install anything? Tough. There is no built-in way for the old software to display our diacritics. Lucida Grande does not work for diacritics, nor does USA Extended in these programs.
And what about sharing my old documents, written in old-style transliteration fonts? You can of course ask your correspondent to install the same private fonts as you used when you wrote the document. But often that is not feasible. And anyway, you may want to "update" your old documents yourself to the Unicode system, because the publisher asks you to; or because you want to continue working on them, e.g. to integrate them in recent work written in Unicode fonts.
The answer then is to "convert" the documents to the Unicode system - which means nothing more dramatic than to take the old "s with dot under" in your old font and replace with "s with dot under" in Unicode, and so on, as a series of "find and replace all"'s. The trick is just to figure out these characters; and of course it is tedious to do umpteen "finds" for each document.
For this reason, we have put here on this site some tools that automate the process. They are "macros" that convert from old-style to Unicode Arabic transliteration characters, from some 35 to 40 different old-style private transliteration fonts that I have found information about. The macros are for the programs NisusWriter Pro and Microsoft Word 2004. Check the separate page that lists the fonts included, with download and instruction information.
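The series of "find and replace all" steps described above amounts to applying one conversion table to the text. Here is a minimal Python sketch of that idea. The table is entirely hypothetical: it invents a legacy font that reuses the slots of ß, å and î for our characters, and does not describe Jaghbub or any other real font, nor the macros offered on this site.

```python
# Hypothetical legacy font: which ordinary character slot each
# transliteration character was drawn into. Invented for illustration.
HYPOTHETICAL_LEGACY_MAP = str.maketrans({
    "\u00df": "\u1e63",  # slot of sharp s  -> s with dot below
    "\u00e5": "\u0101",  # slot of a-ring   -> a with macron
    "\u00ee": "\u012b",  # slot of i-circumflex -> i with macron
})


def legacy_to_unicode(text: str) -> str:
    """Replace every legacy stand-in character with its Unicode equivalent."""
    return text.translate(HYPOTHETICAL_LEGACY_MAP)


if __name__ == "__main__":
    old = "\u00dfal\u00e5t"  # looks like nonsense outside the legacy font
    print(legacy_to_unicode(old))
```

One caution with any such table: it should only be applied to text that was actually set in the legacy font, since a genuine ß or å elsewhere in the document would otherwise be converted as well.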
Should I go "Modern" or "Traditional"?
If you have a choice, there is no question: Unicode fonts are what to use if you possibly can. People who have been using legacy transliteration fonts and have emotional investment in them (like me!) will find both that co-operation with others is much simpler, and that the greater choice of fonts you now get opens up many new possibilities. So, maintain or upgrade your old documents, but make the change to Unicode for anything new you start, is my advice.
If you are new to transliteration, certainly the only reason to go with legacy fonts is if you for some reason must use software that does not support Unicode. There is not much that does not, but there are some cases:
If you work in the "Classic" environment (older Macs only), Unicode is not available, all Classic programs follow the "Traditional" route.
But some OS X applications also belong in this group, particularly older programs. From Microsoft that includes the version called "Word for X", which came in 2001. Eudora does not support Unicode. FileMaker added Unicode from version 7, as did Dreamweaver, also from version 7. A Unicode version of Quark XPress came in 2006 (also this version 7, as it happened). AppleWorks is not being updated, and does not support Unicode.
If your software version is from after ca. 2005, however, it is most likely to support Unicode. Only a few elderly holdouts such as MarinerWrite, Panorama, Mailsmith and a few others lack Unicode today. You can check the web page here on Arabic software, which lists what relevant software does and does not support Unicode (look under the "No"s for the no-shows.) Any program that supports Unicode and lets you choose a font can use Unicode diacritics fonts, so even "No" programs for Arabic will do diacritics if they support Unicode.
AND WHAT ABOUT A PC?
Since I do not use one, I can only present second-hand information. But all current Windows machines accept "modern" Unicode fonts. Windows XP or higher has Unicode built in, as had its predecessor Windows 2000. Only very old systems like Windows 95 / 98 / ME, which supported Unicode indirectly, will present any problems here.
Windows Vista and Windows 7 install a new version of Times New Roman that has a full range of Unicode diacritics. However, many are still using Windows XP, which uses an older and less complete version of this font; XP is the last Windows system that does not include a full Unicode font as standard. But those who have got a regular installation of Microsoft Office 2000 or higher, will most likely have at least one font that can display all the characters we need, Arial Unicode MS, one of the most complete Unicode fonts around. (Lucida Sans Unicode is another font that comes with some installations, but it does not contain all our characters, in particular not ‘ayn and hamza).
Windows programs will sometimes not "substitute" missing characters from a default font, the PC user must manually choose a suitable font for display. So, a PC user who receives a document online will often have to remember to put the text in one of the fonts on his machine that has the required characters - Arial Unicode if nothing else - to see them, in particular the emphatics that the XP Times New Roman lacks. The characters are still "there", he just needs to apply a useful font to see them.
Then, how to get the diacritics into the document you are writing on a PC? In Word, you can create your own keyboard shortcuts for such special characters: Choose Insert : Symbol. Then select a Unicode font (one that contains the characters you want), and scroll until you find one of the relevant diacritic characters. With this selected, e.g. s with dot under, click on the button "Shortcut Key" and tell it "Alt + s". This should allow you in the future to type s/dot under by typing Alt and s together. You should then go through the relevant characters, and define shortcuts for each of them. Once you have done so, you should be able to type diacritics as easily on the PC as on a Mac.
However, that shortcut procedure only works in Word, not e.g. in Notepad or other programs. But there are other alternatives. One, somewhat cumbersome, is to use the tool "Character Map" which is included with Windows (in Accessories: System Tools); you must scroll a bit to find your desired character there, which you can then paste into your document.
More practical is to install a system keyboard layout that lets you type diacritic characters directly, in any program that supports Unicode, like the Mac way sketched above. I see that the University of Chicago Mamluk site recommends the layout Alt Latin (actually a "port" from the Mac). Installation and everything is also pretty similar to what I have described for the Mac: you go to a Settings panel and enable it, and then choose the Alt Latin keyboard from a Keyboards menu when you want it. Usage is a bit more cumbersome, in that it is not specifically geared for the Middle East: you need two keystrokes rather than one to type "u/line above": First "Alt-a" [for the line] and then "u". But it works and is probably worth looking into for Windows users. See a fuller description at the University of Chicago site. (If this keyboard does not suit you, it is also possible to download a Keyboard Layout editor from Microsoft and create your own.)
If you use a Unicode font both on Windows and on the Mac, then you can easily transfer or send documents with transliterated text back and forth between them. Only remember that the computer may not remember the font choice, in particular if your correspondent on the "other side" does not use the same font as you. So, he may have to re-choose the correct font for the text, according to what he has available. As long as both are Unicode fonts, among those mentioned here, the diacritics will then appear correctly.
Can we use diacritics in Web pages?
Actually yes. There is no guarantee that everybody who reads the page will be able to see them, but it is more and more common e.g. from library sites to transliterate titles with Unicode characters. Most current web browsers can display these Unicode characters. On the Mac, all current web browsers will (Internet Explorer will not, but is now obsolete). On Windows, we can also probably assume that most current machines can display diacritics in Unicode.
However, if we want to create such a page, how do we do it? There are two ways. One, and the easiest, is to tell the browser that you are writing in Unicode by inserting a "charset" header. The web editing program you are using may do this for you, if not, go to the code and add this line in the header of your page:
The other option, which was preferred earlier, is to convert all non-regular characters to special codes, "HTML entities". Then, the charset header is not required. Some web editors prefer this option, although it may be redundant today.
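Both options can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `to_entities` and `utf8_page` are my own, and the `<meta charset="utf-8">` tag used here is the common present-day HTML form of a charset declaration, not necessarily the exact line your web editor inserts.

```python
import html


def to_entities(text: str) -> str:
    """Option two: turn every non-ASCII character into a numeric HTML entity."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def utf8_page(body: str) -> str:
    """Option one: keep the characters as they are and declare the page UTF-8.

    The page must then also be saved as UTF-8, e.g. utf8_page(...).encode("utf-8").
    """
    return (
        '<html><head><meta charset="utf-8"></head>'
        f"<body>{body}</body></html>"
    )


if __name__ == "__main__":
    word = "kh\u0101rjah"
    print(to_entities(word))                  # entity form, plain ASCII
    print(html.unescape(to_entities(word)))   # a browser reverses it like this
    print(utf8_page(word))
```

The entity form is safe even if the charset declaration is missing or wrong, at the price of a source text that is harder to read and edit.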
<font face="Jaghbuni, TimesTL, Gentium, Arial Unicode MS, Lucida Grande"> (diacritic text) </font> (put in the fonts that seem useful to you; the browser will pick the first font on the list that is available on the user's machine).
For relevant html editors, see the Arabic Programs page.
If you are an IPA person, notice that IPA is also part of Unicode and a "full" or "complete" Unicode font will also include IPA characters. JaghbUni does not, but many of the others listed below do. In the JaghbAdd package I have also included some keyboard layouts for accessing these IPA characters.
Unicode fonts for transliteration of Arabic
There are a number of Unicode fonts on the Web that can be used for transliterating Arabic, commercial and non-commercial. Commonly installed on Windows XP machines using MS Office is Arial Unicode MS (also installed on Macs from OS 10.5). It is at 22 MB much more complete than Lucida Grande. (The largest font I have found, however, with some 51,000 characters, is Code 2000, see below.)
Windows computers using Microsoft's Vista system should also have an expanded version of Times New Roman, which includes a full set of transliteration characters; our characters should there also be in Arial/Microsoft Sans Serif, Courier New, Segoe UI, Tahoma and Arabic Typesetting. In Windows 7, the following fonts are added to the list: Calibri, Cambria, Consolas and Deja Vu Sans Light, Mono, and Serif (this in "Home Premium edition").
On the Mac, the number of useful fonts for our purpose also increases with every new system version. This is the list of fonts with transliteration characters that are installed on Macs from Systems 10.3 and up (fonts marked * do not have ‘ayn and hamza, but all have the "IJMES diacritics"; macrons over a, i, and u, and dots under s-d-t-h-z).
(1) Roman only, italics lack ‘ayn and hamza. This is a font from the GNU Freefont package. (The third font in the package, FreeSans does not have our characters.)
Check each page for conditions of use. These are all free (or inexpensive shareware) at least in trial versions that appear to be fully functional.