
[SOLVED] Arabic characters gives ASCII code 63

  1. #1

    [SOLVED] Arabic characters gives ASCII code 63

    Dear all,
    I have imported some Arabic text into Excel. When I convert the
    characters to ASCII codes, I get the value 63 for every character. What
    happened? Do I need to install any language packs?

    Regards,
    Julian


  2. #2
    Gareth
    Guest

    Re: Arabic characters gives ASCII code 63

    You shouldn't need an Arabic language pack to *handle* Arabic in my
    experience - you do if you want to read it and have the glyphs behave
    correctly.

    How did you import them? A text file? Was it Unicode? Or single byte
    (ISO-8859-6 maybe)?
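
    For readers unsure what "single byte" versus "Unicode" means here, a
    minimal Python sketch (Python is used only for illustration and is not
    part of the original exchange) encodes one Arabic letter both ways. The
    sample letter is an arbitrary choice.

```python
# Illustration only: the sample letter is an arbitrary choice.
letter = "\u0628"  # ARABIC LETTER BEH

# ISO-8859-6 is a single-byte character set: one byte per Arabic letter.
single_byte = letter.encode("iso8859_6")

# UTF-16 (little-endian, no byte-order mark) uses two bytes for this letter.
utf16 = letter.encode("utf-16-le")

print(len(single_byte), single_byte.hex())  # 1 c8
print(len(utf16), utf16.hex())              # 2 2806
```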

    Gareth



  3. #3

    Re: Arabic characters gives ASCII code 63

    Hi,
    I imported the text via XML files. I don't know what Unicode or single
    byte means.

    Can you help?

    Regards,
    Julian





  4. #4

    Re: Arabic characters gives ASCII code 63

    I just found out. I think it's using xml version="1.0" encoding="utf-8".
    Hope this info helps.

    Regards,
    Julian


  5. #5
    Gareth
    Guest

    Re: Arabic characters gives ASCII code 63

    Hi Julian

    I've never used UTF-8, I'm afraid. UTF-8 is a variable-width encoding:
    ASCII characters take one byte, but Arabic letters take two bytes AFAIK.
    Excel supports standard Unicode as UTF-16.

    from http://czyborra.com/utf/#UTF-8
    <<UTF-8 consumes two bytes for all non-Latin (Greek, Cyrillic, Arabic,
    etc.) letters that have traditionally been stored in one byte and three
    bytes for all symbols, syllabics and ideographs that have traditionally
    only needed a double byte. This can be considered a waste of space and
    bandwidth which is even tripled when the 8bit form is MIME-encoded as
    quoted-printable ("=C3=A4" is 6 bytes for the one character ä). SCSU
    aims to solve the compression problem.>>

    Sounds a bit strange to me but it makes sense to use 2 bytes - you just
    can't get all of the Arabic char set easily into 1 byte. ISO-8859-6
    tries but misses out some less commonly used letters/glyphs.

    You get char 63, which is 3F in hex. That is the code for "?", so I
    suspect the Arabic characters are being replaced by question marks
    during a conversion rather than being read as they are stored.
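
    As a quick check of the byte counts quoted above, and of whether code 63
    could be part of a multi-byte UTF-8 sequence, here is a minimal Python
    sketch. Python is used only for illustration, and the sample characters
    are arbitrary choices.

```python
# Illustration only: sample characters are arbitrary choices.
samples = {
    "Latin a": "a",
    "Arabic beh": "\u0628",
    "Euro sign": "\u20ac",
}

# Number of bytes each character takes in UTF-8.
utf8_lengths = {name: len(ch.encode("utf-8")) for name, ch in samples.items()}
print(utf8_lengths)  # {'Latin a': 1, 'Arabic beh': 2, 'Euro sign': 3}

# Code 63 (hex 3F) is the ASCII question mark.
print(chr(63), hex(63))  # ? 0x3f

# Every byte of a multi-byte UTF-8 sequence is 0x80 or higher,
# so 0x3F cannot be the lead byte of a two-byte character.
arabic_bytes = "\u0628".encode("utf-8")
print(arabic_bytes.hex(), all(b >= 0x80 for b in arabic_bytes))  # d8a8 True
```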

    My questions:

    (a) Where is the file from / how is it created?

    (b) Are you sure the Arabic is in there correctly in the first place?
    (Try opening it up in IE to check or even in notepad.)

    If you find it *is* in there correctly feel free to send me a copy of
    the file and I'll take a look at it. Send it to the name excelvba with a
    domain name of garhoo and put a com at the end of it.

    HTH,
    Gareth





  6. #6

    Re: Arabic characters gives ASCII code 63

    Hi Gareth,
    I mailed you the file. If you didn't receive it, please email me at
    [email protected] or [email protected] and I will resend the file to you.

    Thank you for your time.

    regards,
    julian




  7. #7
    Gareth
    Guest

    Re: Arabic characters gives ASCII code 63

    Hi Julian

    I emailed you a response. For the benefit of other NG readers - since I
    haven't seen this issue around much and someone else may be interested -
    I'll write a note here too.

    I see what you're trying to do and your approach makes sense. The only
    problem is that you are looking at the ASCII/ANSI values, i.e. assuming
    that each character is represented as a number between 0 and 255. This
    isn't the case: VBA handles strings internally as Unicode, i.e. two
    bytes per character. This is hidden from the developer: Len of a
    5-character string is still 5, but the string occupies 10 bytes (LenB
    returns 10).

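    Thus, all you need to do is get the Unicode value for each character
    rather than the ANSI number. Use AscW in place of Asc to read the value,
    and ChrW rather than Chr when writing it back. Asc converts the
    character to the system ANSI code page first, and any character that
    code page cannot represent comes back as "?" (code 63).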
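
    The difference between the two lookups can be sketched outside Excel.
    The Python below is only a rough analogy, not VBA: asc_like and
    ascw_like are hypothetical stand-ins for Asc and AscW, and cp1252 is an
    assumed system ANSI code page (a machine with an Arabic code page would
    behave differently).

```python
# Rough Python analogy, not VBA. asc_like and ascw_like are hypothetical
# stand-ins; cp1252 is an assumed ANSI code page.

def asc_like(ch: str, ansi_codepage: str = "cp1252") -> int:
    """Convert to the ANSI code page first; unmappable characters become '?'."""
    return ch.encode(ansi_codepage, errors="replace")[0]

def ascw_like(ch: str) -> int:
    """Return the UTF-16 code unit of one Basic Multilingual Plane character."""
    return int.from_bytes(ch.encode("utf-16-le")[:2], "little")

text = "\u0633\u0644\u0627\u0645"  # four Arabic letters: seen, lam, alef, meem

print([asc_like(ch) for ch in text])   # [63, 63, 63, 63]
print([ascw_like(ch) for ch in text])  # [1587, 1604, 1575, 1605]

# Two bytes per character in UTF-16, although the length is still 4.
print(len(text), len(text.encode("utf-16-le")))  # 4 8
```

    With the assumed code page, every Arabic letter collapses to 63 in the
    ANSI-style lookup, while the wide lookup returns distinct values.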

    I've pasted a demo function below to show how Excel handles strings as
    Unicode, even without having to use StrConv(x, vbUnicode) etc.

    HTH,
    Gareth

    Function CopyUnicodeToCellByCharacter(rng As Range) As String
        'Just an experiment to make sure my methodology works.
        Dim i As Long
        Dim CellValue As String
        Dim NewValue As String
        Dim UnicodeChar As Long

        'Get the string from the cell. Although it may not look like it,
        'this is in fact Unicode. It's kinda hidden from you.
        CellValue = rng.Value

        'Go through the string character by character (note that
        'each character is 2 bytes - you just don't see it).
        For i = 1 To Len(CellValue)
            'Get the Unicode value for this character. AscW returns a
            'signed Integer, so mask it into the range 0-65535.
            UnicodeChar = AscW(Mid$(CellValue, i, 1)) And &HFFFF&
            'Append this to our string - as Unicode.
            NewValue = NewValue & ChrW$(UnicodeChar)
        Next i

        'Return our string to the calling cell.
        'Again, this is Unicode (no conversion necessary).
        CopyUnicodeToCellByCharacter = NewValue
    End Function



