Excel's handling of UTF-8 is a frequent annoyance for me. I don't know if it's possible to re-read and correct strings once they've been loaded into Excel. I am very interested in that solution.
Another lead:
My current work-around when I encounter this problem is to open the file in a code editor; in my case, Vim. From there it's trivial (if you know the proper incantations) to convert line endings from "dos" to "unix", set the encoding to UTF-8, and add a byte-order mark (BOM).
I save the file with a .txt extension, and Excel's Text Import Wizard then correctly recognizes the encoding and uses Windows code page 65001 (which is UTF-8). If I forget to add the BOM, I have to set the "File origin" manually on the first page of the import wizard.
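For the record, the Vim incantations are roughly `:set fileformat=unix fileencoding=utf-8 bomb` followed by `:w newname.txt`. The same transformation can be sketched in Python; the `cp1252` source encoding and the function name are my assumptions, not anything Excel or Vim mandates:

```python
def to_utf8_bom(src: str, dst: str, source_encoding: str = "cp1252") -> None:
    """Re-encode a text file as UTF-8 with a BOM and unix line endings.

    source_encoding is an assumption -- set it to whatever the file
    actually uses (chardet or similar can help guess).
    """
    # newline="" suppresses universal-newline translation so the
    # "dos" -> "unix" conversion below is explicit, not implicit
    with open(src, encoding=source_encoding, newline="") as f:
        text = f.read()
    text = text.replace("\r\n", "\n")  # "dos" -> "unix" line endings
    # The "utf-8-sig" codec prepends the BOM (EF BB BF) that Excel's
    # import wizard uses to detect the encoding
    with open(dst, "w", encoding="utf-8-sig", newline="") as f:
        f.write(text)
```

Saving the result with a .txt extension then triggers the import wizard as described above.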
I should note that I rarely work with non-Latin scripts: Greek, Cyrillic, Japanese Kanji, Korean Hangul, Vietnamese, or Chinese characters in any of their encodings.
- Most text editors choke on large files. 32-bit Vim can handle files up to 2 GB (2^31 - 1 bytes); 64-bit Vim should be able to handle sizes into the exabyte range.
- Good text/code editors can have atrociously steep learning curves (I'm looking at Vim and Emacs, especially), or extremely heavy resource requirements, as with IDEs such as Eclipse.
- UTF-8 encoding does not require a BOM. Microsoft's software chokes when the BOM is missing, while other software may choke when the BOM is included.
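The BOM at issue is just three bytes at the start of the file, which a few lines of Python make visible. Python's `codecs` module exposes the constant, and its `utf-8-sig` codec is the tolerant middle ground: it writes the BOM on output and strips it on input whether or not it is present:

```python
import codecs

# The UTF-8 BOM is the three-byte sequence EF BB BF.
print(codecs.BOM_UTF8)  # b'\xef\xbb\xbf'

# "utf-8-sig" writes the BOM; plain "utf-8" does not.
with_bom = "id,name\n".encode("utf-8-sig")
without_bom = "id,name\n".encode("utf-8")
print(with_bom[:3])     # b'\xef\xbb\xbf'
print(without_bom[:3])  # b'id,'

# Decoding with "utf-8-sig" tolerates both cases, stripping the BOM
# if present -- useful when you don't control the producer.
print(with_bom.decode("utf-8-sig") == without_bom.decode("utf-8-sig"))  # True
```

This is why `utf-8-sig` is a reasonable default for files destined for Excel, even though software on the other side of the fence may object to the extra bytes.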
- Microsoft documentation is not helpful when it implies that Unicode equals UTF-16.
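The conflation is easy to illustrate: a Unicode code point is one abstract number, and UTF-8 and UTF-16 are merely two different ways of turning it into bytes. A short Python session makes the distinction concrete:

```python
c = "é"  # a single Unicode code point, U+00E9

# The code point is an abstract number...
print(hex(ord(c)))            # 0xe9

# ...and the bytes depend entirely on the chosen encoding.
print(c.encode("utf-8"))      # b'\xc3\xa9'  -- two bytes
print(c.encode("utf-16-le"))  # b'\xe9\x00'  -- two different bytes

# The bare "utf-16" codec prepends a BOM; byte order is platform-dependent.
print(c.encode("utf-16"))
```

Calling UTF-16 "Unicode", as Microsoft's documentation and Excel's save dialogs do, blurs exactly the distinction that causes these import headaches.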
Bookmarks