I have data files that get extracted from an online system and saved in .csv format.
The row separator is a tab and the column separator is a pipe (|).
Currently the .csv files are extremely big (300 to 330 MB). When I convert them to Excel they're still quite big (150 to 190 MB).
There are roughly 33 million cells of data (about 950 000 rows and 35 columns).
The files contain no formulas, conditional formatting or anything except the converted data. I did check the used range, and there are no extra empty cells after the last row or column.
I do realize the sheer amount of data could be the reason for the file size.
I cannot just open the .csv file in Excel, because it treats the comma as the separator and I lose data on certain rows, since one of the columns contains commas in its values.
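To illustrate the problem (with a made-up sample row, not my actual data): splitting on commas breaks any row where a field contains a comma, while parsing with the pipe delimiter keeps every column intact.

```python
import csv
import io

# Hypothetical sample row in the export's pipe-delimited layout;
# the second column contains an embedded comma.
row = "1001|Smith, John|2024-01-05"

# Naive comma split (what Excel does for a .csv): the embedded
# comma produces the wrong field boundaries.
print(row.split(","))   # the name column is torn in two

# Parsing with delimiter="|" recovers all three columns correctly.
parsed = next(csv.reader(io.StringIO(row), delimiter="|"))
print(parsed)           # ['1001', 'Smith, John', '2024-01-05']
```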
Methods I've tried that work:
- Import the .csv file in Excel and set the delimiter to pipe.
- Open the .csv file in Notepad, save it as .txt, and then import it.
- Open the .csv file in Notepad, copy the data, paste it into a clean Excel workbook, then run Text to Columns with the delimiter set to pipe.
- I'm currently trying Power Query to see whether it's an easier and quicker option.
I will have to extract and convert these files to Excel files on a daily basis, which is time-consuming: the process is tedious, and the big files also take long to open.
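Since this has to happen daily, one option I'm considering is scripting the conversion instead of doing it by hand. A minimal sketch (file paths are placeholders, and I'm assuming UTF-8 encoding): rewrite the pipe-delimited export as a standard comma-separated CSV with quoting, which Excel then opens directly without the import wizard.

```python
import csv

def convert(src_path: str, dst_path: str) -> int:
    """Rewrite a pipe-delimited file as a comma-delimited, quoted CSV.
    Returns the number of rows written."""
    rows = 0
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src, delimiter="|")
        # QUOTE_MINIMAL wraps only fields that contain commas or
        # quotes, so values with embedded commas survive intact.
        writer = csv.writer(dst, quoting=csv.QUOTE_MINIMAL)
        for row in reader:
            writer.writerow(row)
            rows += 1
    return rows
```

This could be wrapped in a loop over the day's export folder and scheduled, so the only manual step left is opening the result.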
So my questions are:
- Is it normal for the file sizes to be this big?
- Is there an easier way to do this conversion on a daily basis?
- Will Power Query work if I load these files into a Data Model? (The aim is to use the data on a weekly/monthly basis to create 20 different graphs showing weekly or monthly comparisons.)
If there are any other options, I am open to them. I can do some VBA coding if that is needed.