+ Reply to Thread
Results 1 to 10 of 10

Loading in UTF-8, not ASCII

  1. #1
    Registered User
    Join Date
    01-04-2013
    Location
    United Kingdom
    MS-Off Ver
    Office 2013
    Posts
    40

    Question Loading in UTF-8, not ASCII

    Hi there,

    I have a very large text file (some 400,000 lines of text) that I'm loading into memory with VBA, I then use VBA to analyse and sort the data, before putting it into Excel. I use the folling code to load the text file into memory...

    Please Login or Register  to view this content.
    This takes a few seconds to run and has been working fine, up until some special characters have appeared in the text file, and so when I load in the file they get messed up. So for example "Sion" becomes "SiA'n" (missing some accents there). It turns out that the program that creates the text file can only output in UTF-8 encoding, and seemingly VBA only really deals with ASCII, hence the text getting screwed up. I have tried the following code to allow me to load in UTF-8, but it is painfully slow to the extent I cannot use it for my application. So my question is, does anyone have any other options I can try that allows me to load in UTF-8 encoded text, but that's also fast?

    Please Login or Register  to view this content.

    Chris

  2. #2
    Forum Expert Olly's Avatar
    Join Date
    09-10-2013
    Location
    Darlington, UK
    MS-Off Ver
    Excel 2016, 2019, 365
    Posts
    6,284

    Re: Loading in UTF-8, not ASCII

    Tried Power Query? Text.FromBinary allows you to specify the encoding.
    let Source = #table({"Question","Thread", "User"},{{"Answered","Mark Solved", "Add Reputation"}}) in Source

    If I give you Power Query (Get & Transform Data) code, and you don't know what to do with it, then CLICK HERE

    Walking the tightrope between genius and eejit...

  3. #3
    Registered User
    Join Date
    01-04-2013
    Location
    United Kingdom
    MS-Off Ver
    Office 2013
    Posts
    40

    Re: Loading in UTF-8, not ASCII

    Thanks for your reply. At this stage I'd rather not go down the road of installing anything, it's on a network with many different users using the file so that would need to be a last resort really.

  4. #4
    Forum Expert
    Join Date
    11-24-2013
    Location
    Paris, France
    MS-Off Ver
    Excel 2003 / 2010
    Posts
    9,831

    Re: Loading in UTF-8, not ASCII


    Hi !

    As Excel can read txt file encoded in utf-8, just see within the import assistant …

  5. #5
    Registered User
    Join Date
    01-04-2013
    Location
    United Kingdom
    MS-Off Ver
    Office 2013
    Posts
    40

    Re: Loading in UTF-8, not ASCII

    All needs to be done entirely within VBA as it's an automated process.

  6. #6
    Forum Expert
    Join Date
    11-24-2013
    Location
    Paris, France
    MS-Off Ver
    Excel 2003 / 2010
    Posts
    9,831

    Re: Loading in UTF-8, not ASCII


    No problemo !

    As everyone can try manually and activate the Macro Recorder …

  7. #7
    Registered User
    Join Date
    01-04-2013
    Location
    United Kingdom
    MS-Off Ver
    Office 2013
    Posts
    40

    Re: Loading in UTF-8, not ASCII

    That certainly gets me the data into Excel, but it's not ideal for getting it into memory in one long string. I use Shell() to access the program that delivers me the UTF-8 text file, but I'm thinking I can use Windows powershell to convert the encoding, which you can do via Shell(). It's not the fastest of options, takes around 5 to 10 seconds to do the conversion depending on the file size, but then that allows me to load in the text file as before. That could be the best way, I shall do some testing.

  8. #8
    Forum Expert
    Join Date
    11-24-2013
    Location
    Paris, France
    MS-Off Ver
    Excel 2003 / 2010
    Posts
    9,831

    Re: Loading in UTF-8, not ASCII


    Yes but within a temp worksheet you can apply a filter, a sort, … and as Excel internal features it's pretty fast in VBA.

    The other way I use too is ADODB.Stream …

  9. #9
    Registered User
    Join Date
    01-04-2013
    Location
    United Kingdom
    MS-Off Ver
    Office 2013
    Posts
    40

    Re: Loading in UTF-8, not ASCII

    Ok I'm going to do a comparison test between Workbook.Open to load in UTF-8, and the powershell conversion, which I then need to load in. Whichever comes in fastest I shall use!

    ADODB.Stream is far too slow for the size of text files I'm dealing with sadly.

    Thanks!

  10. #10
    Registered User
    Join Date
    01-04-2013
    Location
    United Kingdom
    MS-Off Ver
    Office 2013
    Posts
    40

    Re: Loading in UTF-8, not ASCII

    Just to update, I've gone down the powershell route. The files I'm working with are actually XML files, but I need to read them in as text, Workbook.OpenText doesn't like that, and just tries to load the file as XML. So the faff of having to rename the file to .txt, load in as text, then get the text into memory from the worksheet and use Join() to decipher the XML was too long winded. Using powershell you can use the [io.file]::ReadAsText command, and then WriteAsText to output as ASCII, this is pretty much instant (with ~20MB files anyway), and then I just have to load this file in as normal. As this process has almost no overhead it's the obvious choice for me.

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. ASCII file
    By Issacraj in forum Excel Formulas & Functions
    Replies: 5
    Last Post: 07-22-2012, 04:13 PM
  2. Save/Export ASCII to ASCII file
    By acrobaticgod in forum Excel General
    Replies: 1
    Last Post: 02-07-2012, 05:40 PM
  3. ASCII Format
    By hmenke in forum Excel General
    Replies: 1
    Last Post: 08-16-2007, 02:25 AM
  4. [SOLVED] Ascii symbols
    By Percy in forum Excel General
    Replies: 5
    Last Post: 07-21-2006, 10:00 PM
  5. VBA - Hex To ASCII
    By ajocius in forum Excel Programming / VBA / Macros
    Replies: 3
    Last Post: 05-22-2006, 11:20 AM
  6. ASCII format
    By jimbob in forum Excel General
    Replies: 1
    Last Post: 03-18-2006, 09:30 AM
  7. importing ASCII
    By tt in forum Excel General
    Replies: 1
    Last Post: 03-02-2006, 11:35 PM
  8. [SOLVED] Convert ASCII to Hex
    By Scott Sanford in forum Excel General
    Replies: 3
    Last Post: 04-26-2005, 06:06 PM

Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1