# Finding/counting consecutive # of values in a row - frequency function?

1. ## Finding/counting consecutive # of values in a row - frequency function?

Hello all,

I have a column with daily price data of stock price changes. These are positive or negative. The magnitude doesn't matter, so it can be formatted to binary data (0 or 1).

I'd like to count the number of consecutive positive days in a row. For example:

2 positive days in a row = 300
3 positive days in a row = 200
4 positive days in a row = 100
...
10 positive days in a row = 2 (happened only two times)

As an example (figures are random).

Then, I'd like to do the same for negative days in a row.

I'll attach a sample sheet of what the result table could look like, but I'm at a loss with regards to how to calculate it. I imagine the Frequency function might be suitable?

Elijah

2. ## Re: Finding/counting consecutive # of values in a row - frequency function?

Hi,

I'd do it with a couple of helper columns to count the number of consecutive runs and count those - see attached.

3. ## Re: Finding/counting consecutive # of values in a row - frequency function?

Try these array formulas**:

Entered in G7 and copied down:

=SUM(IF(FREQUENCY(IF(C\$7:C\$1594>=0,ROW(C\$7:C\$1594)),IF(C\$7:C\$1594<0,ROW(C\$7:C\$1594)))=F7,1))

Entered in J7 and copied down:

=SUM(IF(FREQUENCY(IF(C\$7:C\$1594<0,ROW(C\$7:C\$1594)),IF(C\$7:C\$1594>=0,ROW(C\$7:C\$1594)))=I7,1))

** array formulas need to be entered using the key
combination of CTRL,SHIFT,ENTER (not just ENTER).
Hold down both the CTRL key and the SHIFT key
then hit ENTER.

You may have to replace the commas with semi-colons.

4. ## Re: Finding/counting consecutive # of values in a row - frequency function?

Originally Posted by Richard Buttrey
Hi,

I'd do it with a couple of helper columns to count the number of consecutive runs and count those - see attached.
Thanks a lot, Richard!

This seems to be correct, yes?

Would you mind explaining these helper columns?

If I read the sheet correctly, the runs are not "double counted", i.e., the # of 2 day runs are not included in the # of 3 day runs.

Originally Posted by Tony Valko
Try these array formulas**:

Entered in G7 and copied down:

=SUM(IF(FREQUENCY(IF(C\$7:C\$1594>=0,ROW(C\$7:C\$1594)),IF(C\$7:C\$1594<0,ROW(C\$7:C\$1594)))=F7,1))

Entered in J7 and copied down:

=SUM(IF(FREQUENCY(IF(C\$7:C\$1594<0,ROW(C\$7:C\$1594)),IF(C\$7:C\$1594>=0,ROW(C\$7:C\$1594)))=I7,1))

** array formulas need to be entered using the key
combination of CTRL,SHIFT,ENTER (not just ENTER).
Hold down both the CTRL key and the SHIFT key
then hit ENTER.

You may have to replace the commas with semi-colons.
Thank a lot, Tony!

However, I'm sad to say that the formula simply returns a zero. I replcaed all commas with semi-colons. Any chance I botched up elsewhere?

Regards.

5. ## Re: Finding/counting consecutive # of values in a row - frequency function?

Originally Posted by Elijah
Thank a lot, Tony!

However, I'm sad to say that the formula simply returns a zero. I replcaed all commas with semi-colons. Any chance I botched up elsewhere?
Here's your file with the formulas implemented.

6. ## Re: Finding/counting consecutive # of values in a row - frequency function?

Originally Posted by Elijah
Thanks a lot, Richard!

This seems to be correct, yes?

Would you mind explaining these helper columns?

If I read the sheet correctly, the runs are not "double counted", i.e., the # of 2 day runs are not included in the # of 3 day runs.

.
Correct. There is no double counting. e.g. The runs of 2 are not included in the runs of 3 or 4...etc.

The helper columns first test whether the binary is a 1 or zero since you want to count these separately.
Then the second part of the helper column checks whether the binary value is the same as the binary value in the preceding column D cell. If it is it adds one to the previous helper column value.

The column G & J are regular COUNTIF formulae that count the values in the helper columns.

7. ## Re: Finding/counting consecutive # of values in a row - frequency function?

Originally Posted by Tony Valko
Here's your file with the formulas implemented.
Fantastic! Looks great and very neat. Much obliged.

I can now use this set-up also for other data that is binary in nature.

Is it possible to set up something similar for data that is categorical?

Let's say I have a column with variables A,B,C,D, E.

Could I then look at runs for each single variable?

8. ## Re: Finding/counting consecutive # of values in a row - frequency function?

Originally Posted by Richard Buttrey
Correct. There is no double counting. e.g. The runs of 2 are not included in the runs of 3 or 4...etc.

The helper columns first test whether the binary is a 1 or zero since you want to count these separately.
Then the second part of the helper column checks whether the binary value is the same as the binary value in the preceding column D cell. If it is it adds one to the previous helper column value.

The column G & J are regular COUNTIF formulae that count the values in the helper columns.
Thanks, Richard.

If I can be frank, I have to say I appreciate the neatness of Tony's set-up without helper columns.

I'm noticing also that the two tables have different results?

9. ## Re: Finding/counting consecutive # of values in a row - frequency function?

Originally Posted by Elijah
Is it possible to set up something similar for data that is categorical?

Let's say I have a column with variables A,B,C,D, E.

Could I then look at runs for each single variable?
Something like this...

Data Range
 A B C D E 1 ------ ------ ------ ------ ------ 2 D A 1 2 3 B 2 1 4 A 3 1 5 B 6 A 7 C 8 B 9 C 10 A 11 A 12 D 13 A 14 A 15 A

This array formula** entered in E2 and copied down:

=SUM(IF(FREQUENCY(IF(A\$2:A\$15=C\$2,ROW(A\$2:A\$15)),IF(A\$2:A\$15<>C\$2,ROW(A\$2:A\$15)))=D2,1))

** array formulas need to be entered using the key
combination of CTRL,SHIFT,ENTER (not just ENTER).
Hold down both the CTRL key and the SHIFT key
then hit ENTER.

You may have replace the commas with semi-colons.

10. ## Re: Finding/counting consecutive # of values in a row - frequency function?

Originally Posted by Elijah
Thanks, Richard.

I'm noticing also that the two tables have different results?
That's not surprising since the two tables are counting different things. Runs of 1s and runs of 0s, or the positives and negatives you mentioned in your original.

Array formulas allow you to avoid helper columns as you've found and with small data sets you see little difference in speed. When data runs into thousands of rows and changes are made then they will slow down. Inevitably as with a lot of aspects of Excel it's a trade off depending on particular circumstances.

11. ## Re: Finding/counting consecutive # of values in a row - frequency function?

Originally Posted by Richard Buttrey
That's not surprising since the two tables are counting different things. Runs of 1s and runs of 0s, or the positives and negatives you mentioned in your original.

Array formulas allow you to avoid helper columns as you've found and with small data sets you see little difference in speed. When data runs into thousands of rows and changes are made then they will slow down. Inevitably as with a lot of aspects of Excel it's a trade off depending on particular circumstances.
To clarify, and apologies if it's my english that is off, but I were referring to the difference between the results in your Excel file and the one from Tony.

You were both calculating the same thing, right?

12. ## Re: Finding/counting consecutive # of values in a row - frequency function?

Originally Posted by Tony Valko
Something like this...

Data Range
 A B C D E 1 ------ ------ ------ ------ ------ 2 D A 1 2 3 B 2 1 4 A 3 1 5 B 6 A 7 C 8 B 9 C 10 A 11 A 12 D 13 A 14 A 15 A

This array formula** entered in E2 and copied down:

=SUM(IF(FREQUENCY(IF(A\$2:A\$15=C\$2,ROW(A\$2:A\$15)),IF(A\$2:A\$15<>C\$2,ROW(A\$2:A\$15)))=D2,1))

** array formulas need to be entered using the key
combination of CTRL,SHIFT,ENTER (not just ENTER).
Hold down both the CTRL key and the SHIFT key
then hit ENTER.

You may have replace the commas with semi-colons.
Thanks a bunch! I think I understand it.

I will try implementing it later today.

13. ## Re: Finding/counting consecutive # of values in a row - frequency function?

Originally Posted by Elijah
To clarify, and apologies if it's my english that is off, but I were referring to the difference between the results in your Excel file and the one from Tony.

You were both calculating the same thing, right?
Hi,

Mea Culpa. I should have said that there is double counting not that there isn't in post #6.
To avoid double counting the G7 formula would need modifying to

Formula:
`Please Login or Register  to view this content.`

and similarly for J7 but with an additinal -1 introduced.

Formula:
`Please Login or Register  to view this content.`

14. ## Re: Finding/counting consecutive # of values in a row - frequency function?

Thanks a lot, Richard. That makes sense!

15. ## Re: Finding/counting consecutive # of values in a row - frequency function?

Okay, so I'm trying to get a sense of this data and understand/read it correctly, but I'm not sure if get this correctly.

1. I summarized the total # of runs both for positive and negative days. Interestingly, both summarized to the same number; 402.

Beats me. Any reason why?

2. I interpret the run = 1 for positive days to mean that these are the days that are squeezed in between two negative days: - + -

Ain't that correct?

3. When combining both positive and negative runs for the value 1, I get 399. I don't consider the value "1" a run since the next and prior day was opposite.

The total number of days in the sample is 1588. Is it then correct to conclude that (1588-399)/1588 = 75% of the time in this sample there is a run in motion, either up or down?

From the 399 1-run days, is it correct to conclude that for any given day where the preceding day is the opposite direction, there's a 25% chance the next day will be in the opposite direction?

3. I also made a percentage statistic for each # run as a percentage of total runs in that direction.

So, looking at for example 3 consecutive positive runs:

For all positive runs - 14,18% were of 3 days in a row (57 occurences).

Or is it more correct to look at the number of days instead?

There were 57 occurences of a 3-day positive run --> That's 57*3 = 171 days of the entire sample or for the positive days. So, one could also say that 171/871=19,63% of the positive days were contained in a 3 day run?

Thanks in advance for any pointers. I'm not a statistician, but trying to get a grasp of all this.

Regards

16. ## Re: Finding/counting consecutive # of values in a row - frequency function?

Originally Posted by Elijah
Okay, so I'm trying to get a sense of this data and understand/read it correctly, but I'm not sure if get this correctly.

1. I summarized the total # of runs both for positive and negative days. Interestingly, both summarized to the same number; 402.

Beats me. Any reason why?

Thanks in advance for any pointers. I'm not a statistician, but trying to get a grasp of all this.

Regards
It's not surprising that the number of runs for + & - are the same. Take a data series consisting solely of 9 + and 1 -.
i.e. a run of 9 and 1 run of negative. Now change one of the series of 9 +s, say the third item to a zero. Now you have 1 run of 2 positives and 1 run of 6 positives,(2 runs in total) and 2 runs of 1 negative (again 2 runs in total). However you split the data you always change the number of +/- runs by the same factor.

If you can explain what sort of information you're trying to draw from your data, maybe we can suggest a significant statistical measure.

17. ## Re: Finding/counting consecutive # of values in a row - frequency function?

Originally Posted by Richard Buttrey
It's not surprising that the number of runs for + & - are the same. Take a data series consisting solely of 9 + and 1 -.
i.e. a run of 9 and 1 run of negative. Now change one of the series of 9 +s, say the third item to a zero. Now you have 1 run of 2 positives and 1 run of 6 positives,(2 runs in total) and 2 runs of 1 negative (again 2 runs in total). However you split the data you always change the number of +/- runs by the same factor.
That's a very lucid explanation. Thanks!

Originally Posted by Richard Buttrey
If you can explain what sort of information you're trying to draw from your data, maybe we can suggest a significant statistical measure.
Well, I've learned a few things already. Such that the longest run of positive days are of the count 8 and it happened only 3 times.

I'm not expecting to be able to make any predictions on this alone, but I just wanted to become more familiar with runs in my data set (stock market) and how common it is for example that the market goes up 3 or 4 days in a row.

Expressing it in percentages usually makes it more easy to visualize and understand. I also wanted to make sure that I did in fact make the correct conclusions and read the data correctly.

18. ## Re: Finding/counting consecutive # of values in a row - frequency function?

Originally Posted by Elijah
Okay, so I'm trying to get a sense of this data and understand/read it correctly, but I'm not sure if get this correctly.

1. I interpret the run = 1 for positive days to mean that these are the days that are squeezed in between two negative days: - + -

Ain't that correct?

2. When combining both positive and negative runs for the value 1, I get 399. I don't consider the value "1" a run since the next and prior day was opposite.

The total number of days in the sample is 1588. Is it then correct to conclude that (1588-399)/1588 = 75% of the time in this sample there is a run in motion, either up or down?

Given the 399 1-run days, is it correct to conclude that for any given day where the preceding day is the opposite direction, there's a 25% chance the next day will be in the opposite direction?

3. I also made a percentage statistic for each # run as a percentage of total runs in that direction.

So, looking at for example 3 consecutive positive runs:

For all positive runs - 14,18% were of 3 days in a row (57 occurences).

Or is it more correct to look at the number of days instead?

There were 57 occurences of a 3-day positive run --> That's 57*3 = 171 days of the entire sample or for the positive days. So, one could also say that 171/871=19,63% of the positive days were contained in a 3 day run?

Thanks in advance for any pointers. I'm not a statistician, but trying to get a grasp of all this.

Regards
Bumping this!

Would be very happy if anyone could confirm if I'm thinking correctly on this or reading the data correctly.

19. ## Re: Finding/counting consecutive # of values in a row - frequency function?

I'm not a statistician either so I can't offer any opinion on the analysis of the results.

20. ## Re: Finding/counting consecutive # of values in a row - frequency function?

Before anyone can comment with any sort of authority we need to know what the data is, whether it's truly random and if not what factors are driving the degree of movement up or down each time and additionally - if relevant, what additional factors might trigger a reversal of direction.

Without knowing this we certainly can't say that just because x% of the preceding days were in the opposite direction there's a y% chance the succeeding day will reverse.

Is this the whole population of data or a sample subset?

21. ## Re: Finding/counting consecutive # of values in a row - frequency function?

Originally Posted by Tony Valko
I'm not a statistician either so I can't offer any opinion on the analysis of the results.
Allright. Appreciate you getting back to me, anyway.

Originally Posted by Richard Buttrey
Before anyone can comment with any sort of authority we need to know what the data is, whether it's truly random and if not what factors are driving the degree of movement up or down each time and additionally - if relevant, what additional factors might trigger a reversal of direction.

Without knowing this we certainly can't say that just because x% of the preceding days were in the opposite direction there's a y% chance the succeeding day will reverse.

Is this the whole population of data or a sample subset?
Well, I think that's a topic of controversy, actually, i.e., are stock market prices random or not?

This is roughly 5 years of daily %Change in an index (S&P 500).

I don't expect to be making any predictions or such based on this alone. I was merely looking to see if I could get a sense of what's "normal behaviour" so to speak with regards to runs in the data.

Or in plain words: How often does the market go up 3 days in a row? 4 days in a row? 8 days in a row? (pretty rare based on the results of the Excel file you guys helped me with here).

Or if someone says: The market's up/down 4 days in a row. It has to reverse omorrow. Then I can look at my data and say: Well, maybe, but in fact it's had runs of 5+ quite a few times as well, so anything can happen.

Based on what I already read from the Excel file, I feel I've kind of accomplished this already, but I was a little uncertain about how to express it in percentages of probabilities.

22. ## Re: Finding/counting consecutive # of values in a row - frequency function?

By the way, Tony!

If I can bother you again...

I have some other metric which is binary, but expressed in text values: LH and HL.

So, I could translate that in binary terms and have LH = 1 and HL = 0.

That's exactly what I did by using simple IF analysis.

I'd like to examine runs in this data as well.

I figured that for this I could simply copy and paste this data into the sheet you already made for me, but upon doing so, I get an error and it beats me why? I tried opening the cell and re-entering the formula using CTRL, SHIFT and ENTER, but it doesn't seem to help.

Any ideas?

Attaching this file below.

23. ## Re: Finding/counting consecutive # of values in a row - frequency function?

There are only 2 variables to count for, 1 or 0.

The formula you're using will always return 0 because you're using these logical tests:

>=0
<0

Since ALL the data is >=0 and NONE of the data is <0 the formula result will be 0.

To count the consecutive 1s:

=SUM(IF(FREQUENCY(IF(A\$7:A\$1594=1,ROW(A\$7:A\$1594)),IF(A\$7:A\$1594<>1,ROW(A\$7:A\$1594)))=C7,1))

To count the consecutive 0s:

=SUM(IF(FREQUENCY(IF(A\$7:A\$1594=0,ROW(A\$7:A\$1594)),IF(A\$7:A\$1594<>0,ROW(A\$7:A\$1594)))=C7,1))

Both still array entered.

24. ## Re: Finding/counting consecutive # of values in a row - frequency function?

Ah. That makes a lot of sense, Tony! I obviously missed that fine detail there.

Thanks again!

I think I managed to get it correctly. Surprisingly though, there were gaps in the runs.

For instance 0 runs at 8, but 1 run at both 9 and 10.

For negative values it were even more gaps in between observations.

I'll attach the file if someone would like to take a peek.

25. ## Re: Finding/counting consecutive # of values in a row - frequency function?

Originally Posted by Elijah
Surprisingly though, there were gaps in the runs.

For instance 0 runs at 8, but 1 run at both 9 and 10.

Data Range
 A 1 1 2 1 3 1 4 1 5 1 6 1 7 1 8 1 9 1 10 1 11 0 12 ------

The formula would evaluate that as 1 instance of 10 consecutive 1s. Nothing else.

26. ## Re: Finding/counting consecutive # of values in a row - frequency function?

Yes. And I'm sure the conclusions are correct if I entered the formula correctly.

I was just surprised myself to find 0 runs in between, but for these rare occurences at the tail of the distribution of runs, it probably ain't that odd anyway.

So, I'm learning stuff from my data set from this and that's very useful.

Thanks again for all help!

27. ## Re: Finding/counting consecutive # of values in a row - frequency function?

You're welcome!

There are currently 1 users browsing this thread. (0 members and 1 guests)

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1