+ Reply to Thread
Page 2 of 2 FirstFirst 12
Results 16 to 29 of 29
  1. #16
    Forum Moderator shg's Avatar
    Join Date
    06-21-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2007
    Posts
    25,134

    Re: Excel 2007 CPU Usage

    The last full calc of the old sheet ended up taking ~165mins...just timed the new one at 28 seconds!
    Whoa! good job!

    EDIT: And for our listeners, thats a 300X improvemnt. Take heed.
    Last edited by shg; 03-12-2010 at 01:15 AM.
    Microsoft MVP - Excel
    Entia non sunt multiplicanda sine necessitate

  2. #17
    Forum Guru
    Join Date
    01-03-2006
    Location
    Basingstoke, England
    MS-Off Ver
    2007 (work & home)
    Posts
    2,240

    Re: Excel 2007 CPU Usage

    Fantastic improvement!

    If you're happy with the current improvement can you please mark the thread as Solved (see rules for instructions)?

    However, I'm still gunning for more improvement - you can't steal the teaser of an optimisation brain teaser that easily

    It's a little late in the thread for this..., but, because I haven't dealt with many large data sets, I still think 28 seconds is a long time - although, I stand to be over-ruled!
    (see http://www.decisionmodels.com/optspeed.htm).

    - Are all "repetitive calculations" (mentioned by DO) split into different columns, so they are only calculated once with the result cell then being referenced in subsequent formulae?
    - I may be way off the mark here, but with your use of conditional summing etc, would a pivot table approach (see Jon's site) be possible?
    (even if it is just as an intermediate step)

    Rob
    Rob Brockett
    Kiwi in the UK
    Always learning & the best way to learn is to experience...

  3. #18
    Forum Moderator DonkeyOte's Avatar
    Join Date
    10-22-2008
    Location
    Suffolk, UK
    MS-Off Ver
    2002, 2007 & 2010
    Posts
    21,379

    Re: Excel 2007 CPU Usage

    Echoing the others - top effort - and good on you for resolving under your own steam (even if we did miss out )

    Quote Originally Posted by JonnyB
    it ain't pretty and uses up a lot more cells than the original, but its lightning quick comparatively
    sad but true ... elegance <> efficiency

    Quote Originally Posted by JonnyB
    ...as well as many dynamic named range
    remember that though they

    a) create reduced range references

    b) improve readability (shorten syntax)

    Dynamic Named Ranges do involve overheads in so far as the calculations used to generate them are recalculated each time they are used.

    Note also that if they DNR's are created with OFFSET / INDIRECT etc they make all formulae utilising them Volatile by association.

    If we add to the above the fact that the greater majority of formulae use only the Used Range of whatever range is specified [edit: at least that is my understanding], eg:

    =SUMIF(A:A,A1,B:B)

    will not process all the rows of A & B - only the used range intersect of A:B

    then I would say that sometimes (ie where calculation speed is of paramount importance) the performance "cost" of using DNR's can actually outweigh the benefits previously outlined.
    The most obvious exceptions to the "Used Range" rule are SUMPRODUCT and Arrays (and I suspect SUMIFS / COUNTIFS though perhaps to a lesser extent) ... in such instances the use of non-volatile Named Ranges is often a no-brainer.

    Needless to say it all boils down to the context in which they're being used ie:

    1. which formulae they're being used with (might not be necessary)
    2. whether they're Volatile (created with OFFSET or INDEX for ex.)

    If you feel your model might include some of the above you could perhaps create an alternative version without them (where appropriate) and see if performance improves any further.
    Last edited by DonkeyOte; 03-12-2010 at 07:03 AM.

  4. #19
    Forum Moderator teylyn's Avatar
    Join Date
    10-28-2008
    Location
    New Zealand
    MS-Off Ver
    2003 & 2010
    Posts
    10,042

    Re: Excel 2007 CPU Usage

    This one should be thread of the month.

    ... now there's an idea! ... off to the Water Cooler ...
    teylyn
    Microsoft MVP - Excel
    At Excelforum, you can say "Thank you!" by clicking the icon below the post.

    Avoid pie charts with more than two data points. Why? See here (pdf, 559 kb). The only acceptable pie chart is here.

  5. #20
    Registered User
    Join Date
    02-12-2010
    Location
    Little Rock, Arkansas
    MS-Off Ver
    Excel 2007
    Posts
    29

    Re: Excel 2007 CPU Usage

    - Are all "repetitive calculations" (mentioned by DO) split into different columns, so they are only calculated once with the result cell then being referenced in subsequent formulae?
    Yes, I do think I caught all of these. And thanks for the pivot table suggestion, I could potentially use it in the results summary, but I don't think it would apply where the conditional formulas were originally doing the heavy lifting in the sheet.

    Dynamic Named Ranges do involve overheads in so far as the calculations used to generate them are recalculated each time they are used.

    Note also that if they DNR's are created with OFFSET / INDIRECT etc they make all formulae utilising them Volatile by association.
    Yep, this was one part of the reconstruction that gave me pause. The DNRs use OFFSET(). The amount of data input into the sheet will vary wildly...say from 10k rows to 150k or more. So the choice was either to make the named ranges dynamic using OFFSET() or make them static with more rows than I ever think I'll need, say 500k. Actually now that I think about it, I may have made the wrong choice. Just about the only function that is referencing these named ranges is INDEX() so I don't think having extraneous rows included would hurt performance at all (unless it came from increased memory demands). So by going with static named ranges, the sheet would be free of anything volatile. The obvious downside to going static of course is that if i ever exceed 500k rows, I'll need to redo all the ranges (if i remember, that is...will need to put a big red alarm cell somewhere).

    Anyway, I need to sleep on this potential change. Thank you all once again. If I don't make the change tomorrow and report back with the performance difference I'll be sure to have this marked as SOLVED.
    Last edited by DonkeyOte; 03-13-2010 at 04:21 AM. Reason: corrected tag

  6. #21
    Forum Moderator DonkeyOte's Avatar
    Join Date
    10-22-2008
    Location
    Suffolk, UK
    MS-Off Ver
    2002, 2007 & 2010
    Posts
    21,379

    Re: Excel 2007 CPU Usage

    Quote Originally Posted by JonnyB
    So the choice was either to make the named ranges dynamic using OFFSET() or make them static with more rows than I ever think I'll need, say 500k
    It is possible to create DNR's without them being Volatile c/o INDEX.

    Consider (for sake of example):

    Data in A:C
    Data type in A is Numeric
    Last row of data in Col A dictates the height of the 3 Column Range being created.

    Using OFFSET (all functions utilising the below DNR become Volatile by association):

    =OFFSET($A$1,,,COUNTA($A:$A),3)

    Using INDEX (semi-volatile - ie only on file open not thereafter)

    =$A$1:INDEX($C:$C,MATCH(9.99E+307,$A:$A))
    the latter will work even with non-contiguous data sets
    (if the data type is not consistent there are alternatives to cater for that)

    Whether or not a Volatile DNR will have a significant impact depends largely on the functions utilising it - if they are lightweight and efficient then the impact is lessened ... obviously if the opposite is true then avoiding the Volatility in the DNR becomes something of a no-brainer IMO.

    To reiterate though - if the formula are relatively efficient then hard wiring a range may not be a bad idea - best way would be to test of course.

    Let us know how you get on - I for one am quite interested in the performance impact (if any - and what it amounts to).

  7. #22
    Forum Guru
    Join Date
    01-03-2006
    Location
    Basingstoke, England
    MS-Off Ver
    2007 (work & home)
    Posts
    2,240

    Re: Excel 2007 CPU Usage

    hi all,

    Quote Originally Posted by DonkeyOte View Post
    Originally Posted by JonnyB
    So the choice was either to make the named ranges dynamic using OFFSET() or make them static with more rows than I ever think I'll need, say 500k
    It is possible to create DNR's without them being Volatile c/o INDEX.
    ...
    Let us know how you get on - I for one am quite interested in the performance impact (if any - and what it amounts to).
    I'm interested too :-)

    If you have a lot of columns that you'd like to create "index based" DNR's for you may find the below macro useful. Note, the constants will need to be changed to match your file layout & the number of rows in the DNR's on the sheet are all based the last row in one column (Colno). This makes the created DNR's less flexible than DonkeyOte's version, but on the plus side, it is much quicker to set up a lot of them for a specific sheet.

    DO/Shg/others,
    Feel free to make comments for improving my modified version of Roger's code or if there are other major weaknesses. I've been tempted to put it in its own thread but it works "as is" so I haven't bothered.
    (one day I might throw in an inputbox to get user interaction... + change the IIF structure based on comments in another thread & check the contents of the specchararr)


    Code:
    Option Explicit
    Sub CreateNames()
    ' written by Roger Govier, Technology4U
    'RB: 14/10/2009, modified after being sourced from http://www.contextures.com/xlNames03.html
    Dim WB As Workbook, ws As Worksheet
    Dim lrow As Long, lcol As Long, i As Long
    Dim myName As String, Start As String
        '26/02/2010, RB: Define a sheetname prefix to prevent confusion with existing dynamic ranges on other sheets.
    Const wsAbbrev As String = "RT_"
        ' set the row number where headings are held as a constant
        ' change this to the row number required if not row 1
    Const Rowno As Long = 5
        ' set the Offset as the number of rows below Rowno, where the
        ' data begins
    Const ROffset As Long = 1
        ' set the starting column for the data, in this case 1
        ' change if the data does not start in column A
    Const Colno As Long = 1
        ' Set an Offset from the starting column, for the column number that
        ' will always have data entered, and will therefore be used in calculating lrow
    Const COffset As Long = 0  ' in this case, the first column will always contain data.
        On Error GoTo CreateNames_Error
        Set WB = ActiveWorkbook
        Set ws = ActiveSheet
        ' count the number of columns used in the row designated to
        ' have the header names
        With ws
            lcol = .Cells(Rowno, .Columns.Count).End(xlToLeft).Column
            lrow = .Cells(.Rows.Count, Colno).End(xlUp).Row
            Start = .Cells(Rowno, Colno).Address
        End With
        With WB.Names
            .Add Name:=wsAbbrev & "lcol", RefersTo:="=COUNTA($" & Rowno & ":$" & Rowno & ")"
            .Add Name:=wsAbbrev & "lrow", RefersToR1C1:="=COUNTA(R" & Rowno & "C" & Colno + COffset & ":R" & Rows.Count & "C" & Colno + COffset & ")"
            .Add Name:=wsAbbrev & "DynSourceDataForPT", RefersTo:= _
                 "=" & Start & ":INDEX($" & Rowno & ":$" & Rows.Count & "," & wsAbbrev & "lrow," & wsAbbrev & "Lcol)"
        End With
        For i = Colno To lcol
            ' if a column header contains spaces, replace the space with an underscore
            ' spaces are not allowed in range names.
            'RB: originally        myName = Replace(Cells(Rowno, i).Value, " ", "_")
            'modified to...
            myName = wsAbbrev & CleanStr(Cells(Rowno, i).Value)
            '26/02/2010, RB: the next IF statement should not bew needed due to the addition of "wsAbbrev & "
            If IsNumeric(Left(myName, 1)) Then myName = "_" & myName
            If myName = "" Then
                ' if column header is blank, warn the user and stop the macro at that point
                ' names will only be created for those cells with text in them.
                'MsgBox "Missing Name in column " & i & vbCrLf _
                 & "Please Enter a value in the Header row and run macro again"
                'Exit Sub
            Else
                WB.Names.Add Name:=myName, RefersToR1C1:="=R" & Rowno + ROffset & "C" & i & ":INDEX(R" & Rowno + ROffset & "C" & i & ":R" & Rows.Count & "C" & i & "," & wsAbbrev & "lrow - 1)"
            End If
        Next i
        On Error GoTo 0
        MsgBox "All dynamic Named ranges have been created"
        Set ws = Nothing
        Set WB = Nothing
        Exit Sub
    CreateNames_Error:
        MsgBox "Error " & Err.Number & " (" & Err.Description & _
               ") in procedure CreateNames of Module Technology4U"
        Stop    'to allow for investigation while the macro is still "running"
        Set ws = Nothing
        Set WB = Nothing
    End Sub
    Private Function CleanStr(ByVal Str As String) As String
    '14/10/2009 modified from "CleanFolderName" to meet the needs of Excel Names ie no spaces or &'s
    ' 14/11/2008, sourced from: _
      http://www.dailydoseofexcel.com/archives/2006/05/24/creating-folders-with-mkdir/
    'I changed it from the posted version of "For i = 1 To Len(Str) _
       Select Case Mid$(Str, i, 1) _
       Case "/", "\", ":", "*", "?", "", "|" _ ..." to use "specchararr"
    Dim i As Long
    Dim sTemp As String
    Dim SpecCharArr As Variant
        SpecCharArr = Array("/", "\", ":", "*", "?", "", "|", " ", "&", "'", "(", ")", "/", "-")
        For i = LBound(SpecCharArr) To UBound(SpecCharArr)
            sTemp = Replace(IIf(i = LBound(SpecCharArr), Str, sTemp), SpecCharArr(i), "_")
        Next i
        CleanStr = sTemp
    End Function

    hth
    Rob
    Last edited by broro183; 03-13-2010 at 09:20 AM.
    Rob Brockett
    Kiwi in the UK
    Always learning & the best way to learn is to experience...

  8. #23
    Registered User
    Join Date
    02-12-2010
    Location
    Little Rock, Arkansas
    MS-Off Ver
    Excel 2007
    Posts
    29

    Re: Excel 2007 CPU Usage

    Hi Everyone,

    My apologies for being unattentive to this thread, I came down with a virus and didn't do much for the better part of a week. I'm finally just getting caught up on my work now.

    As far as the changes to the DNRs are concerned (removing OFFSET())...unfortunately, there was no material performance difference. However, I did restructure the sheet and add some VBA code which further cut the recalc time from 28 secs to 18 secs. Which is great, but I'm noticing that of those 18 seconds, 10 of them are coming from one column. So I think I can get it down even further (I'm finally making good on providing an optimization challenge).

    The attached sheet demonstrates the problem (Column C). The signal IDs in column A represent a unique pair (i.e. there are two #1s, two #2s, etc.). On the second occurrence of a signal ID (reading top-to-bottom), Column C aims to return the value from Column B when the signal ID from the current row first occurred (that is rather confusing to read - it should be made clear in the example). As you'll see the calc time is approximately 10 seconds (actually only around 8 in this sample sheet). I could live with this, but the real problem is that filling down and deleting the formulas takes forever.

    [I cannot sort the order of signal IDs and have everything running on manual calc]

    Would greatly appreciate any ideas -- even if the calc time doesn't improve, something that will delete faster would help a lot.

    -JonnyB
    Attached Files Attached Files

  9. #24
    Forum Moderator DonkeyOte's Avatar
    Join Date
    10-22-2008
    Location
    Suffolk, UK
    MS-Off Ver
    2002, 2007 & 2010
    Posts
    21,379

    Re: Excel 2007 CPU Usage

    If you can't sort the data you will not really be in a position to improve performance with formulae I'm afraid - sorting the data allows for binary search which on large data sets is hugely beneficial (only a tiny subset of the data is ever reviewed).

    You could perhaps adopt a VBA based solution - ie something along the lines of:

    Code:
    Sub Example()
    Dim rngCopy As Range
    Application.ScreenUpdating = False
    With Range(Cells(6, "A"), Cells(Rows.Count, "A").End(xlUp)).Resize(, 2)
        .Copy .Offset(, 4)
        With .Offset(, 6).Resize(, 1)
            .FormulaR1C1 = "=ROW(RC)"
            .Value = .Value
        End With
        Set rngCopy = .Offset(, 4).Resize(, 3)
        With rngCopy
            .Sort Key1:=.Columns(1), Order1:=xlAscending, Key2:=.Columns(3), order2:=xlDescending
        End With
        With .Offset(, 2).Resize(, 1)
            .FormulaR1C1 = "=IF(VLOOKUP(RC1,C5:C7,3)=ROW(RC),"""",VLOOKUP(RC1,C5:C6,2))"
            .Value = .Value
        End With
        rngCopy.Clear
        Set rngCopy = Nothing
    End With
    Application.ScreenUpdating = True
    End Sub
    I'm sure the VBA guys can improve on the above - but on your data set the above would be pretty instantaneous.

    In essence it copies the existing data set (Cols A:B) to a new temp table, sorts it, applies a formula (col C) which utilises binary search against the sorted table and then subsequently overrides the formulae with results and removes the temp table.
    Last edited by DonkeyOte; 03-27-2010 at 05:01 AM. Reason: modified formula to account for 0 being first value

  10. #25
    Registered User
    Join Date
    02-12-2010
    Location
    Little Rock, Arkansas
    MS-Off Ver
    Excel 2007
    Posts
    29

    Re: Excel 2007 CPU Usage

    Thanks for taking the time with your suggestion DonkeyOte. That would work nicely as you displayed...the only issue is that Column B is actually formula driven as well, so using VBA code wouldn't quite fit. I was able to come up with a non-VBA solution, and to be honest I'm not quite sure why it works so much better, but it deletes instantly. I know part of it is because the if statement reduces by half the number of times the formula is called, but they still doesn't explain all of it. In case its of interest to anyone, I've attached a sheet with the solution (had to delete a bunch of rows to get the file size down, but the gist of it is there). Thanks again.
    Attached Files Attached Files

  11. #26
    Forum Moderator DonkeyOte's Avatar
    Join Date
    10-22-2008
    Location
    Suffolk, UK
    MS-Off Ver
    2002, 2007 & 2010
    Posts
    21,379

    Re: Excel 2007 CPU Usage

    Quote Originally Posted by JonnyB
    That would work nicely as you displayed...the only issue is that Column B is actually formula driven as well, so using VBA code wouldn't quite fit.
    I'm not really sure why the use of formulae in B precludes VBA - you would want to avoid using .Copy in the form presented but you could paste values only to the temp table either via pastespecial or by simply using range.value = range.value approach (avoiding clipboard).

    Quote Originally Posted by JonnyB
    I was able to come up with a non-VBA solution, and to be honest I'm not quite sure why it works so much better, but it deletes instantly.
    As you say the pre-emptive IF will have a large impact - esp. with 20k rows.

    OFFSET is regarded as one of the quicker functions (albeit Volatile) and INDEX one of slower functions (albeit non-Volatile) though I have to say using your sample file and the following approaches:

    =IF(A6=1,"",OFFSET(C5,$D$2-ROW()+MATCH(B6,$B$5:B5,0),))
    =IF(A6=1,"",VLOOKUP(B6,$B$5:$C5,2,0))
    =IF(A6=1,"",INDEX($C$5:$C5,MATCH($B6,$B$5:$B5,0)))

    all calculated in around 2 seconds over the 14k rows... the VLOOKUP was the slowest, INDEX second slowest and OFFSET was quickest (but being Volatile were the formulae to persist they would calculate more often than perhaps necessary).

    It might be worth outlining what you're doing exactly - given you mention deleting etc ...

  12. #27
    Registered User
    Join Date
    02-12-2010
    Location
    Little Rock, Arkansas
    MS-Off Ver
    Excel 2007
    Posts
    29

    Re: Excel 2007 CPU Usage

    DonkeyOte- The problem I face with a VBA solution and sorting the entire range is that there are formula dependencies in Column B on previous values of Column C. So I cannot assume that all Column B values are known and then just sort and lookup for the entirety of Column C. I'm not sure how to get around this...I would guess that a loop would take a while. I'm not as encouraged as I first was about the OFFSET() solution...I tested on 35k rows and it was considerably slower (20 seconds).
    Last edited by JonnyB; 03-29-2010 at 01:04 AM.

  13. #28
    Forum Moderator DonkeyOte's Avatar
    Join Date
    10-22-2008
    Location
    Suffolk, UK
    MS-Off Ver
    2002, 2007 & 2010
    Posts
    21,379

    Re: Excel 2007 CPU Usage

    Quote Originally Posted by JonnyB
    The problem I face with a VBA solution and sorting the entire range is that there are formula dependencies in Column B on previous values of Column C. So I cannot assume that all Column B values are known and then just sort and lookup for the entirety of Column C. I'm not sure how to get around this...
    I confess you've lost me now I'm afraid.

    Your prior file(s) implied that PrevSize was derived from Size based on first instance of SignalID ... you're now implying that PrevSize is itself derived from Size - ie a circular reference of sorts.
    (edit: are you perhaps saying that Size is derived from PriorSize in prior rows ?)

    To be clear though - the suggestion I posted previously does not in any way affect the ordering of the original table.

    The code:

    i) copied the original table to a new temporary location
    (given we now know formulae reside in original we would revise to a values based approach)

    ii) sorted the temp table

    The sorting of the data set thus permits use of Binary Search based Lookups to calculate PrevSize which will obviously have significance gain when compared to the 20k exact match tests.

    Once PrevSize was calculated the code removed the formulae (references to the temp table) and the temp table itself was purged.

    It would seem however that I'm missing something pretty fundamental in the process so I'll keep quiet until such time as the "penny drops" (if and when).
    (edit: perhaps post an example which illustrates the relationship between Size and PrevSize - ie formulae in place re: Size)
    Last edited by DonkeyOte; 03-29-2010 at 04:35 AM.

  14. #29
    Registered User
    Join Date
    02-12-2010
    Location
    Little Rock, Arkansas
    MS-Off Ver
    Excel 2007
    Posts
    29

    Re: Excel 2007 CPU Usage

    Okay, I realize I have done a poor job demonstrating this. I hope this sheet will illustrate what I'm saying. Basically, one of the columns that you want to include in the table for binary search depends on previous values from the column that we are aiming to populate. The column titled "Exit Size" in this sheet is the one of interest.

    What I have in here is the best solution I was able to come up with. The values and order of signal IDs do not change during the scenario runs, so I initially set up the sheet by having VBA calculate the offset numbers in Column C and then setting them equal to values so that deleting won't be an issue later. From there, the 'Exit Size' column uses those helper cells to do an offset. What do you think? The recalc time is now excellent ~1 second...but the set up of the sheet with VBA takes 2-3 mins...not great, but acceptable.
    Attached Files Attached Files

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.2.0