+ Reply to Thread
Results 1 to 11 of 11

Performance Issues on Large Spreadsheet

  1. #1
    Forum Contributor
    Join Date
    09-16-2009
    Location
    Dallas, TX
    MS-Off Ver
    Office 2013
    Posts
    107

    Performance Issues on Large Spreadsheet

    I have a spreadsheet I use at work to keep track of metrics for a group of vendors, whose job it is to repair watches and jewelry. The vendors submit an excel file each month, which consists of 3 tabs:

    Tab 1: Summary information about the vendor and the scorecard
    Tab 2: Repair Listing--one row for each Repair Order and lots of associated info
    Tab 3: SKU listing--one row for each service/SKU that the vendor performed (and thus will be paid for). Each SKU is linked to its Repair Order by the Repair #, so there is a one-to-many relationship between repair orders and repair SKUs.

    All the scorecards are brought into one workbook and then one page aggregates all the metrics onto one page. Then those metrics are boiled down further to a cover page that contains our monthly published metrics.

    The problem is that the worksheet that pulls in all the data from the scorecards takes a long time to run. I've removed most of the vendors in the attached version for easier analysis but it's still very slow and is much slower with three times as many vendors. I need help optimizing the formulas to be faster--a lot faster, if possible.

    I'd be open to VBA, but it makes the workbook less maintainable for my analysts, so I'd rather stay away from it if there's a good formula solution.

    On the attached example, I've tried to remove personal identifying information wherever possible, but if you run into some please ignore it. Thanks.

    Thank you in advance for your time; I've been working on this a while and could really use some help.

    Thanks,

    Dan
    Attached Files Attached Files

  2. #2
    Forum Guru TMS's Avatar
    Join Date
    07-15-2010
    Location
    The Great City of Manchester, NW England ;-)
    MS-Off Ver
    MSO 2007,2010,365
    Posts
    44,419

    Re: Performance Issues on Large Spreadsheet

    I suggest that you go to each sheet and press Ctrl-End. You have a lot of rows on several sheets but, on a few of them, Excel thinks the last used row is much further down than it should be.

    Where that is the case, you need to delete all rows below the last "real" row and all columns to the right of the last "real" column using the Delete icon on the ribbon, NOT the delete key on the keyboard.


    You use INDIRECT a lot. That is a volatile function and will recalculate whenever a change is made.

    It might be better t define the data as Tables then Excel will manage the ranges for you.


    Regards, TMS
    Trevor Shuttleworth - Retired Excel/VBA Consultant

    I dream of a better world where chickens can cross the road without having their motives questioned

    'Being unapologetic means never having to say you're sorry' John Cooper Clarke


  3. #3
    Forum Contributor
    Join Date
    09-16-2009
    Location
    Dallas, TX
    MS-Off Ver
    Office 2013
    Posts
    107

    Re: Performance Issues on Large Spreadsheet

    Thanks, TMS. The data actually comes from our vendors, and there are lots and lots of scorecards, so it's impractical to manually change every scorecard during the import process (i.e. deleting rows, defining tables, etc.). My changes pretty much need to be to the summary page that aggregates the metric.

    Is there a way to take advantage of your suggestions without needing to modify the original vendor submissions? I know Indirect is inefficient but I'm not sure how to get around it and retain the same flexibility.

    Thanks again for taking a look.

  4. #4
    Forum Guru TMS's Avatar
    Join Date
    07-15-2010
    Location
    The Great City of Manchester, NW England ;-)
    MS-Off Ver
    MSO 2007,2010,365
    Posts
    44,419

    Re: Performance Issues on Large Spreadsheet

    These formulae evaluate to SUMPRODUCT across full column references:

    Formula: copy to clipboard
    Please Login or Register  to view this content.


    Formula: copy to clipboard
    Please Login or Register  to view this content.


    Formula: copy to clipboard
    Please Login or Register  to view this content.


    Formula: copy to clipboard
    Please Login or Register  to view this content.


    Formula: copy to clipboard
    Please Login or Register  to view this content.


    Formula: copy to clipboard
    Please Login or Register  to view this content.



    As, with Excel 2007 and above, there are over a million rows in a worksheet. So, in each case, you are making a making a matrix of 1,000,000+ times however many columns. Guaranteed to be slow.

    I think you would be better making those references fixed and use an arbitrary high number of rows, say 30,000, 50,000 or 100,000 depending on how many transactions you are likely to need. Bit of a risk but a calculated one.

    As you are already preparing helper columns with row numbers, it would make sense if you did that for the transactions sheets. And better to do those calculations once and "build" the range as you have done and then refer to it as required. Then you could accurately calculate the range to use in your INDIRECTs.

    Regards, TMS

  5. #5
    Forum Expert newdoverman's Avatar
    Join Date
    02-07-2013
    Location
    Port Dover, Ontario, Canada
    MS-Off Ver
    2010
    Posts
    10,330

    Re: Performance Issues on Large Spreadsheet

    Is it possible to have the calculations made on each worksheet in designated cells (the same on each worksheet for the same value calculated) then on the 14May worksheet have simple formulae addressing those pre-calculated cells.

    Example:

    On worksheet 1W14May!AA1 have a formula like =COUNTIF(A:A,">0") Then just reference this cell in 14May!D7 with =INDIRECT(B7&"!AA1")

    If you could do this kind of thing, then the calculations on 14May worksheet would be greatly simplified.
    <---------If you like someone's answer, click the star to the left of one of their posts to give them a reputation point for that answer.
    Ron W

  6. #6
    Forum Guru TMS's Avatar
    Join Date
    07-15-2010
    Location
    The Great City of Manchester, NW England ;-)
    MS-Off Ver
    MSO 2007,2010,365
    Posts
    44,419

    Re: Performance Issues on Large Spreadsheet

    @newdoverman

    so it's impractical to manually change every scorecard during the import process
    If I understand correctly, calculation on the imported sheets would be good but maybe not possible

  7. #7
    Forum Expert newdoverman's Avatar
    Join Date
    02-07-2013
    Location
    Port Dover, Ontario, Canada
    MS-Off Ver
    2010
    Posts
    10,330

    Re: Performance Issues on Large Spreadsheet

    @TMS

    If the worksheets imported were created from templates with the formulae in place the formulae on the 14May worksheet would be dealing with retrieving constants for the most part. I think that what we don't see is probably more important than what we do see. With nearly 20K rows on some worksheets, it makes me wonder how many worksheets there are that have this amount or more data. Combine that with the full column references and that will "kill" most computer's performance.

    Maybe Excel isn't the best tool for this.

  8. #8
    Forum Guru TMS's Avatar
    Join Date
    07-15-2010
    Location
    The Great City of Manchester, NW England ;-)
    MS-Off Ver
    MSO 2007,2010,365
    Posts
    44,419

    Re: Performance Issues on Large Spreadsheet

    Maybe Excel isn't the best tool for this.
    I think we need some feedback from the OP.

    I think it is possible that, if those column references are adjusted, either to a fixed range or to a calculated range (somewhat less than a million rows), then performance might improve.

    If there are a lot of suppliers, it might be a pain to get them all to start using an updated template ... but f=good thought.

    Regards, TMS

  9. #9
    Forum Expert newdoverman's Avatar
    Join Date
    02-07-2013
    Location
    Port Dover, Ontario, Canada
    MS-Off Ver
    2010
    Posts
    10,330

    Re: Performance Issues on Large Spreadsheet

    I have taken some time to add tables to each of the worksheets and then used table nomenclature to make the calculations as much as I could. The formulae on the 14May worksheet have been changed to reference the calculated cells on the worksheets. I also ran an excess formats cleaner to rid the worksheet of cells not involved in the calculations.

    This was done using my slowest computer and I notice an increase in performance. This workbook is never going to be fast due to the massive number of calculations involved.
    Attached Files Attached Files

  10. #10
    Forum Contributor
    Join Date
    09-16-2009
    Location
    Dallas, TX
    MS-Off Ver
    Office 2013
    Posts
    107

    Re: Performance Issues on Large Spreadsheet

    Thanks for all the thoughts and advice. There are about 15 vendors, so about 3x as many as on the sample sheet. The average size of their submissions is probably consistent with the sample data.

    I've tried shrinking the reference ranges from the full column to detecting the end, the difference in performance has been undetectable and only serves to make the formulae more difficult to read. I suspect that's because Excel is aware of its last used row and doesn't calculate beyond that.

    I think Newdoverman hit the nail on the head here... maybe Excel just isn't the right tool for the job. We've discussed moving this into Access where the lookups would be dramatically faster, and I think we might be able to do so without switching templates. The only things is that Access is so much pickier about bad data than Excel, and our vendors aren't the most sophisticated businesses in the world.

    Thanks everyone for weighing in; I know most of you far exceed my Excel ability so I was hoping there was some magical optimization to my formulae that would improve processing time by 90%.

  11. #11
    Forum Expert newdoverman's Avatar
    Join Date
    02-07-2013
    Location
    Port Dover, Ontario, Canada
    MS-Off Ver
    2010
    Posts
    10,330

    Re: Performance Issues on Large Spreadsheet

    To shrink the ranges, I used the Clear Excess Formatting tool from Microsoft which is an add-on for Excel.
    http://xsformatcleaner.codeplex.com/releases/view/98007


    Did you try the method that I created for you using tables and calculations on the worksheets themselves and then just addressing the cells with the results. This takes the "load of calculation" off of the 14May worksheet?

    A database solution I think is the key for you especially when the number of rows being calculated must be over 100k.

    Good luck with the project.
    Last edited by newdoverman; 06-18-2014 at 09:30 AM.

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. Performance issues with LOOKUP
    By johannes121 in forum Excel Formulas & Functions
    Replies: 0
    Last Post: 03-25-2013, 06:47 AM
  2. XL 2010 performance issues
    By Stevef8 in forum Excel General
    Replies: 0
    Last Post: 05-31-2012, 03:29 PM
  3. Performance and efficiency issues
    By jacobleaps in forum Excel General
    Replies: 2
    Last Post: 08-10-2011, 11:14 PM
  4. Performance issue on large spreadsheet
    By edwar368 in forum Excel Programming / VBA / Macros
    Replies: 2
    Last Post: 06-14-2011, 12:44 PM
  5. VLOOKUP performance issues
    By jbernhard in forum Excel General
    Replies: 3
    Last Post: 01-11-2010, 06:50 PM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1