Best way to find/replace Unicode characters

  1. #1
    Forum Contributor
    Join Date
    06-09-2009
    Location
    Greece
    MS-Off Ver
    Office 365
    Posts
    133

    Best way to find/replace Unicode characters

    I want to replace a number of Unicode characters (about 250) with different ones.

    For example, these
    ἐἑἒἓἔἕὲέἢἣἤἥἦ

    with these
    εεεεεεεεηηηηη

    What is the best way to do it with a macro? I know this way, but it seems too repetitive...

    [The poster's code is not included in this copy of the thread.]

  2. #2
    Forum Moderator - RIP Richard Buttrey's Avatar
    Join Date
    01-14-2008
    Location
    Stockton Heath, Cheshire, UK
    MS-Off Ver
    Office 365, Excel for Windows 2010 & Excel for Mac
    Posts
    29,464

    Re: Best way to find/replace Unicode characters

    Hi,

    Just create two arrays, one for each set of characters, then create a For...Next loop and use the loop counter as the index into both arrays.
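    A minimal sketch of that suggestion, assuming it runs in Word VBA against the active document. The macro name and the four sample pairs are illustrative only and are not taken from the thread; the real arrays would hold all 250 or so pairs.

```vba
Sub ReplaceWithArrays()
    ' Hypothetical example: only four sample pairs are listed here.
    Dim FindChars As Variant, ReplChars As Variant
    Dim i As Long

    ' Element i of FindChars is replaced by element i of ReplChars.
    FindChars = Array(ChrW(&H1F10), ChrW(&H1F11), ChrW(&H1F72), ChrW(&H1F22))
    ReplChars = Array(ChrW(&H3B5), ChrW(&H3B5), ChrW(&H3B5), ChrW(&H3B7))

    With ActiveDocument.Range.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Format = False
        .Forward = True
        .Wrap = wdFindContinue
        .MatchWildcards = False
        .MatchCase = True   ' keep lower-case finds from touching capitals
        For i = LBound(FindChars) To UBound(FindChars)
            .Text = FindChars(i)
            .Replacement.Text = ReplChars(i)
            .Execute Replace:=wdReplaceAll
        Next i
    End With
End Sub
```

    This runs one Find/Replace per character, so it is simple to follow but does a separate pass over the document for every pair.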
    Richard Buttrey

    RIP - d. 06/10/2022


  3. #3
    Forum Expert macropod's Avatar
    Join Date
    12-22-2011
    Location
    Canberra, Australia
    MS-Off Ver
    Word, Excel & Powerpoint 2003 & 2010
    Posts
    3,726

    Re: Best way to find/replace Unicode characters

    You could use a macro like:
    [The macro is not included in this copy of the thread.]
    So far, the code will replace all accented lower-case α, ε & η characters with their unaccented versions. I'll leave you to add the rest and/or exclude any you don't want converted.

    Each line beginning with StrFR has two parts: the wildcard Find string and a Replacement character. There is one such line per replacement character. Each Find string is bounded by the '[' and ']' characters, whilst a '-' character denotes a contiguous character range. The '|' characters separate each Find string from its Replacement character, and each Find/Replacement pair from the next, so they delimit every part of the StrFR 'array'. The '&H????' strings are the hexadecimal Unicode values of the characters, which you can find via the Insert|Symbol dialogue.

    Although it looks complicated, the code as posted does all of the α, ε & η replacements with just three Find/Replace executions - one for each letter.
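    For reference, a sketch of what such a macro might look like, pieced together from the description above and the fragments quoted later in this thread. It is not the macro as originally posted. The ε line is the one quoted in post #5; the α and η lines, the macro name and the Find settings are assumptions, and the α line covers only some of the accented forms.

```vba
Sub ReplaceAccentedGreek()
    ' Sketch only: reconstructed from the thread's description, not the original code.
    Dim StrFR As String, Parts() As String, i As Long

    ' Format: "[find set]|replacement|" repeated, one line per replacement letter.
    ' alpha (assumed; the opening "[" & ChrW(&H3AC) & ChrW(&H1F00) ... ChrW(&H1F07) is quoted in post #4)
    StrFR = "[" & ChrW(&H3AC) & ChrW(&H1F00) & "-" & ChrW(&H1F07) & ChrW(&H1F70) & ChrW(&H1F71) & "]|" & ChrW(&H3B1) & "|"
    ' epsilon (as quoted in post #5)
    StrFR = StrFR & "[" & ChrW(&H3AD) & ChrW(&H1F10) & "-" & ChrW(&H1F15) & ChrW(&H1F72) & ChrW(&H1F73) & "]|" & ChrW(&H3B5) & "|"
    ' eta (assumed, by analogy with the epsilon line)
    StrFR = StrFR & "[" & ChrW(&H3AE) & ChrW(&H1F20) & "-" & ChrW(&H1F27) & ChrW(&H1F74) & ChrW(&H1F75) & "]|" & ChrW(&H3B7) & "|"

    ' Even elements are Find strings, odd elements are Replacement characters.
    ' The trailing "|" leaves one empty element at the end, which the loop skips.
    Parts = Split(StrFR, "|")

    With ActiveDocument.Range.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Format = False
        .Forward = True
        .Wrap = wdFindContinue
        .MatchWildcards = True
        For i = 0 To UBound(Parts) - 1 Step 2
            .Text = Parts(i)
            .Replacement.Text = Parts(i + 1)
            .Execute Replace:=wdReplaceAll
        Next i
    End With
End Sub
```

    Adding another letter should then only need one more 'StrFR = StrFR &' line; the loop itself stays the same.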
    Last edited by macropod; 03-16-2014 at 07:41 PM. Reason: Code mods, additional comments
    Cheers,
    Paul Edstein
    [Fmr MS MVP - Word]

  4. #4
    Forum Contributor
    Join Date
    06-09-2009
    Location
    Greece
    MS-Off Ver
    Office 365
    Posts
    133

    Re: Best way to find/replace Unicode characters

    Paul, wicked

    Just trying to understand how it works.

    I see 'StrFR =' in line 3, then 'StrFR = StrFR' in lines 5 and 6. Why is that?

    Then I see at the beginning of a line

    "[" & ChrW(&H3AC) & ChrW(&H1F00)

    But further along, the next character is separated by & "-" & ChrW(&H1F07)

    You said these characters denote a contiguous character range, but how do I figure out which of the characters I want to replace are contiguous?

    By the way, I use this to find the Unicode values of the characters (UTF-16 code units):
    http://rishida.net/tools/conversion/

  5. #5
    Forum Expert macropod's Avatar
    Join Date
    12-22-2011
    Location
    Canberra, Australia
    MS-Off Ver
    Word, Excel & Powerpoint 2003 & 2010
    Posts
    3,726

    Re: Best way to find/replace Unicode characters

    The first 'StrFR = ' line starts building the StrFR string. The next two, which use 'StrFR = StrFR &', add to that string. As I've coded it, each StrFR line caters for one replacement letter; it doesn't have to be that way, but I thought it would make the code a tad easier to follow.

    If we look at the second StrFR line:
    StrFR = StrFR & "[" & ChrW(&H3AD) & ChrW(&H1F10) & "-" & ChrW(&H1F15) & ChrW(&H1F72) & ChrW(&H1F73) & "]|" & ChrW(&H3B5) & "|"
    the first thing you'll notice is that it's adding to the existing string.
    From the '[' to the ']' is all one wildcard Find expression. These square brackets tell Word to find any single instance of whatever's between them. Usually, you might see something like '[A-Z0-9]', which would tell Word to find any capital letter or digit. In this case, because we're looking for Unicode characters that you can't just type into the VBE, we have to tell Word what character values to look for.

    The 'ChrW(&H3AD) & ChrW(&H1F10)' part is simply two adjacent characters that do not form a range with each other, just as 'Z0' in '[A-Z0-9]' are two unrelated characters. 'ChrW(&H1F10) & "-" & ChrW(&H1F15)', on the other hand, tells Word to look for anything in that range of character values, just as 'A-Z' is a range of character values.
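    To check what a bracket expression matches before replacing anything, one option is a find-only pass that highlights each hit. A sketch, assuming Word VBA and the active document; the macro name is made up, and the expression is the ε one from the StrFR line quoted above.

```vba
Sub HighlightAccentedEpsilon()
    ' Hypothetical helper: highlights matches instead of replacing them.
    Dim StrFind As String

    ' Two single characters, one range (&H1F10 to &H1F15), then two more single characters.
    StrFind = "[" & ChrW(&H3AD) & ChrW(&H1F10) & "-" & ChrW(&H1F15) & _
              ChrW(&H1F72) & ChrW(&H1F73) & "]"

    With ActiveDocument.Range
        With .Find
            .ClearFormatting
            .Format = False
            .Forward = True
            .Wrap = wdFindStop
            .MatchWildcards = True
            .Text = StrFind
            .Execute
        End With
        Do While .Find.Found
            .HighlightColorIndex = wdYellow   ' mark the matched character
            .Collapse wdCollapseEnd           ' continue searching after it
            .Find.Execute
        Loop
    End With
End Sub
```

    Anything left unhighlighted after running this is a character the expression does not yet cover.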

    As for finding whether a series of characters occupies a contiguous value range, that's easily seen in the Insert|Symbol dialogue. If you see a bunch of accented α characters there, one after the other, it's a contiguous range. The dialogue also shows what values you need to use for them.
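    If working through the dialogue is tedious for 250 characters, the values can also be listed with a short macro. A sketch, assuming Word VBA: select the characters in the document first, run the macro, then read the output in the VBE's Immediate window (Ctrl+G). The macro name is made up, and the characters themselves are not printed because the VBE may not display them.

```vba
Sub ListSelectedCodePoints()
    ' Hypothetical helper: prints the hex value of each selected character
    ' and flags values that directly follow the previous one.
    Dim i As Long, Code As Long, Prev As Long
    Dim Note As String

    Prev = -2   ' sentinel so the first character is never flagged
    For i = 1 To Selection.Characters.Count
        ' AscW returns a signed Integer; the mask converts it to 0-65535.
        Code = AscW(Selection.Characters(i).Text) And &HFFFF&
        If Code = Prev + 1 Then
            Note = "  <- follows previous value"
        Else
            Note = ""
        End If
        Debug.Print i; " &H" & Hex(Code) & Note
        Prev = Code
    Next i
End Sub
```

    A run of flagged lines is a contiguous range, so only its first and last values need to go either side of the '-' in the Find expression. The flags are only meaningful if the selected characters are in value order.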
