Hello,
I am using the below formula to distinctly count the number of customers that match the criteria that I have in Cells C7 and B10. The data is in a separate worksheet, that I am showing Named Detail of which will be changing on a monthly basis, so a pivot table does not want to be used. The detail data ranges from row 7-40,000, and the file is currently 8610KB's, and can potentially grow.
=SUM(IF(FREQUENCY(IF(Detail!$A$7:$A$40000=C$7,IF(Detail!$B$7:$B$40000=$B10,IF(Detail!$D$7:$D$40000<> "",MATCH(Detail!$D$7:$D$40000,Detail!$D$7:$D$40000,0)))),ROW(Cust)-MIN(ROW(Detail!$D$7:$D$40000))+1),1))
This formula works but takes an excessive amount of time for one caluclation, and I need this for multiple column and row critera. So, can this calculation be changed in order to get the same result with faster calculation time? I am using Excel 2003.
Thank You
Last edited by janetp3; 02-19-2010 at 10:58 AM.
Hi,
That's an array formula you have there, and these are notoriously time consuming even for relatively small numbers of rows. So I'm not surprised that 40000 rows is taking an inordinate time.
I don't see why you can't use a pivot table. Just create a dynamic range name that covers the columns of the raw data so that each time you import new data the range name will adjust accordingly. then all you need do is specify the range name in the pivot table set up.
If by any chance the data contains blank lines or other superfluous data just sort it first and delete any relevant rows - a macro can do that for you.
Regards
Richard Buttrey
If this was useful then please rate it appropriately.
Click the small star iconat the bottom left of my post.
I didn't think I could do a distict count in a pivot table.
Not sure what you mean by a 'distict count' but PTs are pretty powerful and I'd be surprised if you can't achieve what you want. You can get either counts of data or sums of values and can use calculated fields that don't exist in the raw data.
Once you've got a straightforward 2 dimensional table of data the world is your oyster with PTs.
Rgds
Richard Buttrey
If this was useful then please rate it appropriately.
Click the small star iconat the bottom left of my post.
Okay, I resorted to trying the pivot table. The problem I am having comes when I try to add a criteria. I can get my Distinct count of customers, but I need to know how many buy a particular size. I don't have enough room on the worksheet to put all the sizes for each distinct customer on the pivot table.
I have a huge database...with a column named Size and one for Customer. I want to count every unique customer with a size that I am referring to on another worksheet
For example.
I have two sheets Detail and Summary. The summary sheet has a list of all the sizes with no customers. The detail contains multiple customers and sizes. I did the following as an array, and get the correct count, but it takes entirely too long. I have 40,000 rows: eek:
=SUM(IF(Size=B2,IF(MATCH(Customer,Customer,0)>=(ROW(Customer)-MIN(ROW(Customer))+1),1,0)))
Hope that helps explain my problem, and thanks.
XL is not really a purpose built data warehouse... using a purpose built database for this volume of data would be advisable in my opinion (FWIW).
However, if the intention is to keep everything within XL you should do two things to aid calculation efficiency
1 - sort your data on detail sheet - by customer, size
2 - add a 1/0 key adjacent to each transaction
You can then use SUMIF etc to calculate as necessary.
To elaborate, if data is sorted by Customer & Size you can then add a further column.
I will for sake of demo assume Customers reside in B7:B40000 and Size in C7:C40000
Detail!E7:
=--AND(B7<>B6,C7<>C6)
copied down to E40000
Your Array thus becomes:
Summary!?
=SUMIF($C$7:$C$40000,B2,$E$7:$E$40000)
which should be more efficient.
Now, it goes without saying that depending on "other" calculations you may be looking to do (other than the above) it is almost inevitable that the above approach will need to be enhanced to to account for additional complexities.
Invariably however the underlying premise of sorting data (and using lightweight helper calcs) leads to greater efficiencies and the greater the volume of data being processed the greater the savings.
Last edited by DonkeyOte; 02-18-2010 at 04:20 PM.
My Recommended Reading:
Volatility
Sumproduct & Arrays
Pivot Intro
Email from XL - VBA & Outlook VBA
Function Dictionary & Function Translations
Dynamic Named Ranges
Many thanks. I will use that. Thank you very much.
There are currently 1 users browsing this thread. (0 members and 1 guests)
Bookmarks