Effectively listing data from a table/array

**Drew Goldberg** · 05-07-2009, 05:53 PM

Hello:

I'm currently looking to develop a few formulas that can review individual lines in a table, identify specific values, associate a tag for that value, and then count the number of instances the unique value occurs.

I've had sucess with VLOOKUP and COUNT IF for the back end but don't know where to start on the front end.

In addition I'm unsure whether this is better suited to a Macro since the number of rows created will vary with the data table/array being reviewed.

To keep the rambling at a minimum, I've attached an example with the desired results.

Thank you in advance for any assistance.

**DonkeyOte** · 05-08-2009, 05:06 AM

A VBA solution may be preferable - I'm not sure you've illustrated all of the various nuances of the required solution, for ex. what happens when a given incident re-occurs for a given brand - is this to be listed once or twice ? Does the range expand horizontally as well as vertically ?

I've attached a quick approach that utilises one set of array formulae to outline those cells requiring review with the results being in a table O1:S12.

For the sake of reference the formulae used can be outlined as follows:

O1: =SUMPRODUCT(--(B2:I4<>1),--(A2:H4<>B2:I4))
This calculates how many Down Times have occurred and are thus to be listed

O2: =IF(ROWS(O$2:O2)>$O$1,"",SMALL(IF((($B$2:$I$4<>1)*($B$2:$I$4<>$A$2:$H$4)),ROW($B$2:$I$4)+(COLUMN($B$2:$I$4)/1000)),ROWS(N$2:N2)))
committed with CTRL + SHIFT + ENTER
copied down to O12

This is one set of array formulae that permit us to easily identify the cells that mark the commencement of any given downtime.

Name:
P2: =IF(N($O2),INDEX($A$1:$A$4,INT($O2)),"")
copied down

Code:
Q2: =IF(N($O2),INDEX($A$1:$I$4,INT($O2),MOD($O2,1)*1000),"")
copied down

Description:
R2: =IF(N($O2),VLOOKUP($Q2,$K$2:$L$7,2,0),"")
copied down

Hours:
S2: =IF(N($O2),COUNTIF(INDEX($A$1:$H$4,INT(O2),MOD(O2,1)*1000):INDEX($A$1:$I$4,INT($O2),IF(INT($O2)=INT(N($O3)),(MOD($O3,1)*1000)-1,9)),$Q2),"")
copied down

I trust the above / attached should be sufficient to get you going ?

**Drew Goldberg** · 05-11-2009, 07:15 PM

Thanks for the assistance!

This will get the process started.

**Drew Goldberg** · 05-12-2009, 11:52 PM

Still noodling with this trying to learn how this works...

I'll be sure to post a follow-up.

Thanks,

Drew

**Drew Goldberg** · 05-27-2009, 12:10 AM

DonkeyOte:

Thank you for your assistance. All the formulas worked as designed.

I have never used the majority of the key formulas used in your solutions so I'm going through the excercise of trying to learn their function:

Small - returns the smallest value of a data set
Index - return the value of a cell at the intersection of a particular cell & row
Mod - returns the remainder of a number by a divisor
INT -rounds a number down to the nearest integer
Row - returns the number of rows in a reference and array
Column - returns the number of colums in a reference or array
N ...unable to find what this formula is
--
<>

I'm using the recommended readings to work through these and appreciate the suggestions.

If I may, I'd like to ask for one additional request for advice.

I am looking to change the time related cells from hour 1, 2, 3 to a 10 minute intervals such as 12:00, 12:10, 12:30, 12:40, 12:50, 13:00.

How would I modify the last formula to reflect how many minutes a given downtime condition exists?

**Drew Goldberg** · 05-27-2009, 12:34 AM

As I look at the formula, how the time is displayed looks irrelevant. I need to adjust the formula to recognize that each unit or position represents 10 min.

**Drew Goldberg** · 05-27-2009, 12:41 AM

I think I figured it out. I changed the value 9 to the number of 10 minute intervals +1 and then multiplied the value by 10...

**DonkeyOte** · 05-27-2009, 03:44 AM

Originally Posted by Drew Goldberg

I have never used the majority of the key formulas used in your solutions so I'm going through the excercise of trying to learn their function

Good for you - most would not bother so kudos.

To help out where I can I will, using the file I uploaded, go through a few formulae to try and explain the mechanics...

Please Login or Register  to view this content.

What this is doing is counting the downtime instances... these are denoted by:

Please Login or Register  to view this content.

So using the example data of Brand A in Row 2 such that B2:I2:

Please Login or Register  to view this content.

Here we have 3 instances ... denoted as follows:

Please Login or Register  to view this content.

We use O1 to enable us to restrict the amount of calculations we perform... ie we need only calculate x number of rows where x = number of downtimes to be listed.

For more detailed explanation of Sumproduct see Bob Phillips' white paper on the Function - see SUMPRODUCT link in my sig.

Please Login or Register  to view this content.

When faced with lengthy formulae it's best to split out into component parts... we can see in short the function follows the basic construct of:

Please Login or Register  to view this content.

So let's first look at the test in the IF:

Please Login or Register  to view this content.

This is where we're establishing as to whether or not we need conduct calcs based on number of instances to be reported... if the Rows processed thus far in our table exceed the value of O1 we know there are no more instances to report so if this test returns TRUE we simply return a Null value to O, if FALSE we will continue and calculate the SMALL function. We can establish ROWS processed by seeing how many rows are included from the first row of the results table (O2) to the current row ... in the case of O2 the answer is obviously 1 (ie ROWS(O$2:O2) - 1 row in range), whereas as the formula is copied down the latter range will increment... so by the time we get to O8 we have 7 rows (ROWS(O$2:O8)), this would exceed O1 (6) so we return a Null.

That's the simple bit over...

Please Login or Register  to view this content.

What we're trying to achieve in the above is to return a unique number which will give us all of the info we need to determine the location of a given downtime which we need to report... in essence cells can be located by their row & column positions... ie row 10 column 4 is D10 as row 1 column 1 is A1, so what we're trying to do is build a number which gives us the row & column information.

I decided to do this using a number convention of:

Please Login or Register  to view this content.

where 10.004 would be D10, 1.001 would be A1, 3.027 would be AA3 etc...

What the formula does is to first identify the appropriate cells which mark the commencement of a downtime (using similar logic to the Sumproduct as used in O1), ie:

Please Login or Register  to view this content.

and the number is generated using

Please Login or Register  to view this content.

ie take the ROW of the cell and add to that the COLUMN of the cell divided by 1000 (such that the COLUMN value becomes a decimal value and does not affect therefore affect the ROW value)

So put together this becomes:

Please Login or Register  to view this content.

We put this into a SMALL function such that we can generate an Array of all the values representing all the instances of downtimes:

Please Login or Register  to view this content.

The key then becomes the value of k ...

SMALL as you will have worked out works along the lines of:

Please Login or Register  to view this content.

eg

Please Login or Register  to view this content.

returns the 3rd smallest value in the array of values, ie 3

In our case k is determined by how many rows we've processed in our results table... if we're on the first row of our results table then we want the smallest value, if we're on the 6th row of our results table we want the 6th smallest value (ie the largest given we have only 6 values in our array (O1))

So this as you can see is the same logic as used previously to establish whether or not to return a Null... ie use ROWS(N$2:N2) to establish k... as you can see I've used N here - there is no significance to the Column ... it could be A, ZZ etc... what is important is the row value .. the first row reference ($2) is absolute (first row in table) whereas the latter row is relative such that as we progress the formula down the table the count of rows will increment.

Please Login or Register  to view this content.

So using this Array formula in O we generate a list of unique numbers and Nulls, using the sample file we end up with:

Please Login or Register  to view this content.

From these values we can now retrieve information given we know thesee values translate to:

Please Login or Register  to view this content.

I will do the remainder of formulae in a subsequent post...

**DonkeyOte** · 05-27-2009, 05:11 AM

Time for another coffee perhaps...

Part Two: using the ROW.COLUMN value (Column O) to retrieve info.

It's best first to explain the INDEX function as used in this context, namely to retrieve a value from a range using specified row & column position:

Please Login or Register  to view this content.

So for ex.

Please Login or Register  to view this content.

Where our Range is a Vector (ie one row or one column) we need only specify the row/column position (as appropriate) not both, eg:

Please Login or Register  to view this content.

Given the above we can look at each of the retrieval formulae for your results table...

Column P: Retrieving Name

Please Login or Register  to view this content.

Re: N function: Returns a value converted to a number
So if O2 held Text N(O2) would return 0, if O2 held 2 it would return 2

In XL only the value 0 equates to FALSE all other values are TRUE. Given this fact we can use the value (as determined by N($O2)) to determine our TRUE/FALSE in our IF test.

So we say if O2 is anything other than 0 then perform the INDEX else return a Null.

In our INDEX we want to use only the Integer portion of our unique number in O, the column portion (the decimal) is irrelevant as our Range is a Vector (1 column, A1:A4).

Using the example of O2: 2.005 - the INT is 2, so our INDEX becomes:

Please Login or Register  to view this content.

Column Q: Retrieving Downtime Code

In this case we will be using INDEX with a Matrix rather than a Vector so we must specify both row & column... we have this information remember as denoted in O

Please Login or Register  to view this content.

So in the above the ROW is again determined by the Integer value of O2, the COLUMN is determined by the Decimal remainder of O2.

Again using O2 example of 2.005

Please Login or Register  to view this content.

Obviously 0.005 isn't a valid Column reference... we need to multiply the decimal remainder by 1000 to get back to a true Column number

Please Login or Register  to view this content.

So our INDEX becomes:

Please Login or Register  to view this content.

Column R: Retrieving Downtime Description

Please Login or Register  to view this content.

This is a standard VLOOKUP based on value as returned in Q - I'll not insult you by explaining this

Downtime to follow...

**DonkeyOte** · 05-27-2009, 05:12 AM

I couldn't get the Downtime section into 1 post so have added in another post here...

Column S: Retrieving Downtime Hours

Please Login or Register  to view this content.

This is obviously the most complex of all the formulae used in the results table (outside of O). Here we are trying to ascertain how many hours downtime occurred for a given downtime code...

Ignoring the IF(N etc which we've already discussed we're left with:

Please Login or Register  to view this content.

Simplified this is obviously a standard:

Please Login or Register  to view this content.

The criteria is determined by the value in $Q2 (ie the downtime code)

So essentially we're trying to count how many times the downtime code has appeared within our specified range.

It is of course the Range that is complex to deduce given we could have:

Please Login or Register  to view this content.

which would equate to 2 different downtimes... a simple COUNTIF(row,31) would generate spurious downtime hours... the first downtime should be 2 hours and the 2nd only 1 ... whereas the simple COUNTIF(row,31) would generate 3 hours for both downtimes which would be wrong (plainly).

So what we need to do is ensure that the Range we create for use in the COUNTIF is specific to each downtime and does not overlap the next.

It is here that we begin to use INDEX in a different manner to before...

Please Login or Register  to view this content.

INDEX is a very powerful function as we can use it to return a Value OR even a Range. In the above context we're using it to create a Range for use in our COUNTIF (Reference Form) and not return a value (Array Form) as we have used previously.

The above can perhaps be simplified into a basic example of:

Please Login or Register  to view this content.

If we look at each INDEX separately we get

Please Login or Register  to view this content.

so this becomes

Please Login or Register  to view this content.

In our case things are a little more complex, however, we know the starting point of our range given the value in O...

Please Login or Register  to view this content.

This is the same as before - ie Row is determined by Integer of O and Column is decimal remainder of O * 1000

The potentially difficult part is establishing the end point of our range

Please Login or Register  to view this content.

We know the ROW remains constant so we just continue to use the INT value of O. The Column is a little trickier because we need to base that partly on where the next downtime occurs... if there is another downtime on the same row as the current downtime then we must establish the column position of the next downtime and not look at that column or any column beyond that point, if on the other than there are no more downtimes beyond the current downtime on a given row we can simply look up to and including the last column of data. This column position we determine with:

Please Login or Register  to view this content.

So by comparing the Integer value of O2 to O3 we can see if the next Downtime (O3) occurs on the same row as the current.

Please Login or Register  to view this content.

If the Integer values are the same we must establish the column position of the next downtime (the decimal remainder of O3 * 1000) and subtract one from it so as to ensure our Range does not overlap downtimes.

Using the example file we had:

Please Login or Register  to view this content.

So here we can see the INT values of both O2 & O3 are the same (2) - ie they both appear on Row 2.
This means that when processing O2 we must use the Column position of the next downtime (2.007) to establish the last column to look at when trying to identify hours related to downtime of O2 ... in this case it would be up to and including column 6 given column 7 contains the next downtime, ie:

MOD(2.007,1) --> .007 --> * 1000 --> 7 --> -1 --> 6

If the Integer values of O2 and O3 were not the same we can just use the column number of the last column in our INDEX range (in the example this was 9 - ie last data was in I)

So going back to the example of

Please Login or Register  to view this content.

this:

Please Login or Register  to view this content.

becomes

Please Login or Register  to view this content.

which is equivalent to

Please Login or Register  to view this content.

We then use this range in our COUNTIF, ie:

Please Login or Register  to view this content.

so this simply counts how many times the downtime code as specified in Q2 (31) occurs in the range E2:F2... given each column represented 1 hour this total equated to total hours downtime for that given incident.

I hope that helps...

**franciz** · 05-30-2009, 02:06 PM

Hey Donkey

That's a great explanation on the formulas you used here.
I for one, am sure to benefit from this.

Effectively listing data from a table/array

LinkBack

Thread Tools

Rate This Thread

Display

Effectively listing data from a table/array

Re: Effectively listing data from a table/array

Re: Effectively listing data from a table/array

Re: Effectively listing data from a table/array

Re: Effectively listing data from a table/array

Re: Effectively listing data from a table/array

Re: Effectively listing data from a table/array

Re: Effectively listing data from a table/array

Re: Effectively listing data from a table/array

Re: Effectively listing data from a table/array

Re: Effectively listing data from a table/array

Thread Information

Users Browsing this Thread

Tags for this Thread

Bookmarks

Bookmarks

Posting Permissions