Arabic

The Arabic Macintosh

  • From DOS Arabic to Mac Arabic

    On the PC side, there have always been many ways of writing Arabic, and each of them has been quite incompatible with the Mac's Arabic, and mostly with each other! Thus, for Mac users to be able to read Arabic documents written on a PC, some form of conversion must take place. That is the task of the tools described on this page, which deal with two of the most common Arabic systems on the PC side: the old DOS Arabic, and the newer and increasingly popular (for PCs) Arabic Windows, a.k.a. WinArabic. Microsoft Word/PC for Arabic, for example, uses WinArabic.

    It is the Mac that most closely follows the international standard (so, if you happen to get Arabic from a Unix machine, it will be directly readable on your Mac). Thus, the onus of conversion should be on the PC side, and Microsoft has enclosed a utility for the purpose, convert.exe, in the WinArabic package. (An independent tool to be used on the PC is also available.) But in many cases users are either not aware of the problem, or the conversion has to be done on the Mac for some other reason.

    For this reason I have made some transliteration (transcoding) filters, based on the published code tables, that convert text in the WinArabic code page (cp1256) and the DOS Arabic code page (cp864) to the Macintosh Arabic system.

    They are stored in two packaged files, one for Arabic Windows, one for DOS. The following are included:

    Arabic Windows to and from Mac

    • Win -> MacArab: Converts from Arabic Windows to Mac Arabic, for Arabic-only text files.
    • WinMix -> MacArab: Converts from Arabic Windows to Mac Arabic, for files with mixed Arabic and English text.
    • MacArabic -> WinArab: Converts from Mac Arabic to Arabic Windows. See further explanation below.
    • Win -> MacArab/rtf: Converts files saved in "Rich Text Format" (RTF) from Arabic Windows to Mac Arabic. For use with any word processor that can save/open RTF files.
    • MacArabic -> WinArabic/rtf: Converts files saved in "Rich Text Format" (RTF) from Mac Arabic to Arabic Windows. For use with any word processor that can save/open RTF files.

    Arabic DOS (Code page 864) to Mac

    • DOS -> MacArab: Converts from DOS Arabic to Mac Arabic, for Arabic-only text files.
    • DOSMix -> MacArab: Converts from DOS Arabic to Mac Arabic, for files with mixed Arabic and English text.
    • DOS -> MacArab/rtf: Converts files saved in "Rich Text Format" (RTF) from DOS Arabic to Mac Arabic. For use with any word processor that can save/open RTF files.

    As can be seen, there are two of each of the PC-to-Mac filters. The first of each, Win-MacArab and Dos-MacArab respectively, is meant to be used on texts that contain Arabic only, i.e. no English or other Roman text. They convert not only the Arabic letters, but also the punctuation marks, spacing etc. to create a completely Arabized text.

    Since Windows and DOS Arabic do not distinguish between Roman and Arabic punctuation marks on the code level, I have set up two other filters, the WinMix and DosMix files, for use on text files that contain mixed English/Roman and Arabic text. These will only convert those letters and punctuation marks that have specific Arabic characters on the Windows or DOS side, but not those which are shared between English and Arabic. Thus commas and semicolons are transcoded, but not periods and colons.
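
    The difference between the full and the mixed-text filters can be sketched in Python. This is only an illustration, not the tables the filters actually use: it assumes that Python's built-in cp1256, cp864 and mac_arabic codecs agree with the published code tables, and the names build_mac_table and transcode are made up for the example. The sketch relies on Mac Arabic having two codes for shared marks such as the period and the parentheses, a Roman one and an Arabic one; the full mode picks the Arabic code, the mixed mode the Roman one.

```python
def build_mac_table(full: bool) -> dict:
    """Map each Unicode character to a single Mac Arabic byte.

    Several Mac Arabic bytes decode to the same character (for example the
    Roman and the Arabic period). full=True keeps the higher, Arabic code;
    full=False keeps the lower, Roman one.
    """
    codes = {}
    for byte in range(256):
        ch = bytes([byte]).decode("mac_arabic", errors="ignore")
        if len(ch) == 1:
            codes.setdefault(ch, []).append(byte)
    pick = max if full else min
    return {ch: pick(found) for ch, found in codes.items()}


def transcode(data: bytes, source: str = "cp1256", full: bool = True) -> bytes:
    """Transcode PC Arabic bytes (cp1256 or cp864) to Mac Arabic bytes."""
    table = build_mac_table(full)
    text = data.decode(source, errors="replace")
    # Characters that have no Mac Arabic code become "?".
    return bytes(table.get(ch, ord("?")) for ch in text)
```

    Under these assumptions, transcode(data, full=False) leaves a parenthesis or period at its Roman code, while full=True moves it to the Arabic one. The digits 0-9 keep their codes in both modes.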

    These filters are thus to be used on the Mac side, i.e. for files received from a DOS/Windows user. They are, like my other converters on the server, in the form of "paradoids": drag-and-drop programlets. Drag a text-only file in the DOS format and drop it on the appropriate paradoid icon, and a new text file with the same creator as the original appears with transcoded characters. The original file is left untouched.
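
    The same "new file, original untouched" behaviour can be imitated in a few lines of Python. This is a rough sketch and not how the paradoids themselves are built: the function name convert_file and the ".mac" suffix are invented for the example, and it assumes that Python's cp1256 and mac_arabic codecs are close enough to the published tables. How Python's codec chooses between the Roman and Arabic codes for shared punctuation is not examined here.

```python
from pathlib import Path


def convert_file(path, source="cp1256", target="mac_arabic", suffix=".mac"):
    """Write a transcoded copy next to the original and return its path."""
    original = Path(path)
    text = original.read_bytes().decode(source, errors="replace")
    converted = original.with_name(original.name + suffix)
    # The original is only read; the result goes to a new file.
    converted.write_bytes(text.encode(target, errors="replace"))
    return converted
```

    Calling convert_file("letter.txt") would leave letter.txt as it was and write letter.txt.mac beside it.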

    The paradoids work only on text-only files. However, the Mac considers any file coming from a DOS computer to be a text-only file (since DOS files do not have the "creator signatures" the Mac uses), so even a formatted file, e.g. one from Arabic Word for Windows, can be transcoded. The result will be a text file where the formatting information has been lost, but from which the textual content can often quite easily be read and extracted into the Mac program of one's choice. (This will of course vary according to what DOS program was originally used, and how much formatting info it has put into the text itself. In my test files, which I believe to have been created in Arabic Word, I only had to remove a few lines of formatting gibberish from each file.)

    If you have access to the original DOS/Windows computer, you may save the file in so-called "Rich Text Format" (RTF), which many word processors understand and which is basically an intermediary between different programs. If so, you can use the converters marked /rtf. These replace, not the characters themselves, but their RTF codes. In this way, you can also convert formatted text, not just raw text. Notice, however, that some RTF programs have a problem with the Arabic "t" character, which shares its code with the "non-breaking space", and remove it. Thus, not all RTF-based conversions may be successful.
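
    To illustrate what replacing the RTF codes means: RTF writes a character above the ASCII range as a backslash, an apostrophe and two hexadecimal digits, so a converter can rewrite those escapes and leave the formatting commands alone. The following Python sketch shows the idea under some assumptions: the name transcode_rtf is invented, Python's cp1256 and mac_arabic codecs stand in for the published tables, and details a real RTF file would need (font and character-set declarations, a literal backslash in front of an apostrophe) are ignored.

```python
import re

# Matches RTF hex escapes such as \'e1
HEX_ESCAPE = re.compile(r"\\'([0-9a-fA-F]{2})")


def transcode_rtf(rtf: str, source: str = "cp1256", target: str = "mac_arabic") -> str:
    """Rewrite each hex escape from the source code page to the target one."""

    def swap(match):
        char = bytes.fromhex(match.group(1)).decode(source, errors="replace")
        code = char.encode(target, errors="replace")
        return "".join("\\'%02x" % b for b in code)

    return HEX_ESCAPE.sub(swap, rtf)
```

    Everything that is not a hex escape, such as \rtf1 or \par, passes through unchanged.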

    Problems in transcoding

    I have not had any Arabic DOS files to play around with, but I have tested the WinArabic filters in a "real-life" situation. I have discovered that some of the differences between the Mac and Windows/DOS ways of doing things make some cleaning necessary after each transcoding. (For the following, it is important to keep the concept of dominant text orientation on the Mac clearly in mind: this refers, not to the direction of the typed text, or to the alignment of paragraphs, but to how blocks of English and Arabic text are arranged along the line: from left to right or from right to left. See the Introduction to the Arabic Mac page on this server for details.)

    Parentheses

    Using the most extensive paradoids, Win-MacArab and Dos-MacArab, creates no problems. If you use the mixed-text paradoids, however, parentheses will remain in European form (because WinArabic does not have separate Arabic ones). If your dominant orientation is Left-to-Right, the parentheses themselves will appear to be correct (like this) but the text before the parentheses will be to the left of them, and the text after to the right of them.

    If you reverse orientation to Right-to-Left, the text elements will organize themselves correctly from the right to the left, but the parentheses will appear mirror-imaged, )like this( .

    To resolve this, you must find/replace the parentheses in the Arabic text, reversing the )s to (s and vice versa.

    The same applies to square and curly brackets.
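
    The find/replace step can be done in one pass, so that a bracket already swapped is not swapped back. A minimal Python sketch, with the invented name mirror_brackets; it swaps every bracket in the string it is given, so it should only be applied to the Arabic passages:

```python
# Each opening bracket maps to its closing partner and vice versa.
SWAP_BRACKETS = str.maketrans("()[]{}", ")(][}{")


def mirror_brackets(text: str) -> str:
    """Exchange every opening bracket with its closing partner and vice versa."""
    return text.translate(SWAP_BRACKETS)
```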

    Periods, colons

    The punctuation marks that have a common code value for Arabic and Roman on the DOS side will also create problems in mixed-text files. If the text is in Left-to-Right dominance, they will force the elements of the line to break up at the punctuation and be arranged from left to right. Thus, such mixed texts must be read only under Right-to-Left dominant text orientation.

    Using the full Win-Mac Arabic and DOS-Mac Arabic paradoids will of course avoid these problems, as punctuation marks are here transcoded to unequivocal Arabic ones.

    Numerals

    Numbers create their own problems for the WinArabic filters. Basically, they cannot be transcoded. In WinArabic, numbers are entered "logically", 1995 as 1-9-9-5. On the Mac, they are entered as letters, right to left: 5-9-9-1. Transcoding the numerals would cause every group of numbers to be reversed. Thus, the WinArabic filters do not transcode numerals. However, this need not affect display. If you turn on "al-Shakl al-'arabiyya lil-arqam" in the Arabic control panel, all numerals put in an Arabic font will appear in Arabic shapes whatever their code positions, so there will be no noticeable difference in the text. But do notice that the Arabic numerals actually have the European code positions, not the Arabic ones.

    The full filter is thus like the mixed-text one in not transcoding numerals, so the same applies to both. Since numerals are "English" in spite of their appearance, they will again force you to use Right-to-Left orientation, to avoid the elements of the line being incorrectly arranged. However, notice one detail for the mixed-text filter: if the numerals come before a punctuation mark, as in "wa-mata fi 1990.", the full element "1990." will be considered one Roman block, and the period will come "after", i.e. to the right of, the 0 in 1990. You must hunt such instances out and replace them with "real" Arabic punctuation marks in the resultant text file.
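
    Hunting out such instances can be helped by a small search. The Python sketch below, with the invented name find_numeral_punctuation, only lists the places where a run of European-coded digits is directly followed by a period or colon; deciding what to replace is left to the reader.

```python
import re

# A run of European-coded digits directly followed by a period or colon.
DIGITS_THEN_MARK = re.compile(r"[0-9]+[.:]")


def find_numeral_punctuation(text: str):
    """Return (line number, matched text) for every numeral followed by . or :"""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in DIGITS_THEN_MARK.finditer(line):
            hits.append((number, match.group()))
    return hits
```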

    DOS Arabic does have separate Arabic numerals in addition to the Roman ones, so numerals are transcoded like other Arabic characters in the DOS-Mac and DOSMix filters. As I do not have anything to test with, I do not currently know whether numerals will appear reversed or not. I await comments from those who may test this.
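
    Should the numerals turn out reversed, each group of digits could be flipped back in a separate pass. The Python sketch below is untested against real DOS Arabic files and its name, reverse_digit_runs, is invented; it works on text that has already been decoded to Unicode and treats both European digits and the Arabic-Indic digits U+0660 to U+0669 as numerals.

```python
import re

# Runs of European digits or Arabic-Indic digits (U+0660 to U+0669).
DIGIT_RUN = re.compile("[0-9\u0660-\u0669]+")


def reverse_digit_runs(text: str) -> str:
    """Reverse the order of the digits inside every group of numerals."""
    return DIGIT_RUN.sub(lambda m: m.group()[::-1], text)
```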

    Notice, incidentally, that DOS Arabic does not have short vowels.

    As mentioned, I have not had occasion to test these very widely yet (since most people I know use Mac Arabic!), so if users find errors or have suggestions for improvement, please let me know.

    Knut S. Vikør
    23.3.98


    Responsible for this Web page is Knut S. Vikør.
    Last updated Thursday, 04-Nov-2010 09:55:20 CET