# Statistical Anomalies

1. ## Statistical Anomalies

Lets try this again, submited this post 10 minutes ago and havn't been able to find it. It did prompt me to login when I clicked preview post so maybe it bugged some home. Very happy I was able to regain this post via the back browser button though :P

Hello there, first time reader first time poster. My company has set the task in my hands of comming up with a time managment workbook. So far I've made a workbook for each month of the year. Contained in each of the workbooks is an individual sheet for each week that tracks each employee's hours based on his clock in/clock out time. Also, there are sheets for each day that tracks what time they were called in for work, their load time, and the time they were told to go home.

Taking the information that was inputed we calculate the their pre-trip time, which is the time it takes for them to clock in to the time they are loaded, as well as the cleanup time, the time from when they are told to go home till they actually clock out. At the end of the workbook I have a sheet that processes all this information. Giving you an avg Call in, avg Clock In, Clock in Low, Clock in High, Avg Pre-Trip, Pre-Trip Low, Pre-Trip High, avg In-Yard, avg Clock Out, Clock Out Low, Clock Out High, avg Cleanup, Cleanup Low, Cleanup High.

All of the numbers we input into our system are in 1/100ths of an hour. So 7:30 is actually 7.50, and 6:40 is actually 6.67. So the code for say the Average Call in looks like
``Please Login or Register  to view this content.``
But unforunately the upper management doesn't want to convert each number from 1/100ths to 1/60ths so I used the mid function to determine what the last 2 digits would be and multiplied them by 0.6 to convert them to 1/60ths then did another mid search for the first 2 numbers and added them together.

Now problem number one I'm having comes up when either the last digit is a 0 such as 12.50 or 11.00. Excel doesn't seem to recognize a 0 as a digit when it comes to the mid function and skips right over it thus when I do (mid(A3;3;2)*.6) *where A3 = 12.00* It returns Err:502 because excel won't multiple zeros. Im considering adding a fake value such as .005 to each value so that there is something to mutiply the decimal place by then rounding it up to the nearest 1/100th, and then subtracting that .005 out after the equation is completed. Not sure if there is an easier route to go to find the mid of a list of cells where the average number has a zero in the end.

Next problem with such forumal is quite the reverse, when the first two digits are not two digits. Such as 9.50, since my mid function starts at the first digit which is 9 and counts 2 integers over it gives me 9.5. I've tried to add a leading zero to each cell and again Excel doesn't read zeros.

Now each of those work beautifully if there are two whole digits before and after the decimal place. They convert from 1/100th of an hour to 1/60th of an hour. Again managment doesn't want to convert each of the numbers to 1/60th on their own *lazy bums.*

Next up we have the entire reason for this drawn out thread. The Min and Max values for the data. This is the code for determing the Max value for a drivers Pre-Trip Time and rounding that value to the nearest minute.
``Please Login or Register  to view this content.``
Now what I'm trying to do is to nock out any statistical anomalies. Such as a rogue value of 350 minutes *because of canceled orders, truck broken ect* while the rest of the values are an average of 10-30 minutes. Having that one value of 350 minutes mis-represents his longest time because it is an statistical anomalie. I'm trying to figure out *with no such luck* how to rule out the lowest and the highest values to get a "true average."

Any help would be greatly appreciated in this matter. If you need further explanation please ask and I'd be happy to further explain the workbook.

Thanks,
Matt

2. Take a look at this link on decimal time values, it may help?

http://www.cpearson.com/excel/datetime.htm

3. Thank you for your post Oldchippy. I shall skim that site and see what I can come up with. For now I'm just going to convert everything over to hh:mm format instead of hh:1/100hr. Be easier than trying to struggle thru it at the moment.

The second question in my post was kinda confusing. Found an easier way to explain it and ask for help . How do you find the 2nd highest value in a set of cells. max() will find your highest. Is there a built in function that will find the second highest?

I think i've come up with a way to determine the 2nd highest value. The only problem is that it involves a very long nested if statement. if(a=1;max(2:20));if(a=2;max(1;3:20));if(a=3;max(1:2;4:20)) ect ect. Where a is the max() value of a range of cells. and 1-20 are the cells that contain each variable to be tested in the max() function. Unfortunately that would require a ton of hand coding of each of the if statements, and I have about 10 different max() functions to test for the 2nd highest max value.

Again, any help would be appreciated,
Matt

4. Can I give you this link to read now, it's about ranking data in lists, this should work for your second maximum value

http://www.cpearson.com/excel/rank.htm

5. Again Oldchippy your the man . This should work very well, thank you very much.

~Matt

6. You've got the work to do, I'm only pointing you in the right direction - thanks for the feedback

7. Well after 3 hours of coding its a success

Now that I had the data sorted in a nice fashion, selecting the 2nd highest number was easy. Determining what the 2nd lowest value would be was a little more difficult as the number of values continues to grow as the month goes on.

I first had to do a count() of the unique rank column so that I could determine how many variables there were, aka how many days they've worked. Next up was another offset()
``Please Login or Register  to view this content.``
except this time I had to add w3, which is where my count() is located, to
``Please Login or Register  to view this content.``
. Now for some reason I had to subtract 1 from the 1st driver grouping on each row, then subtract 2 from the 2nd driver grouping on each row, and also had to subtract 3 from the 3rd driver grouping on each row. Now I know why I would have to subtract -1, as that would give me the 2nd lowest number. But I'm unsure why I had to subtract 2 and 3 from the following drivers . But with a little persistance and some trial and error I was able to make it work *which sometimes is all that matters.*

So thank you again Oldchippy. I now know something new ^^

~Matt

8. Hi Matt,

Glad you got it sorted out, but that's what it's all about, a bit of trial and error.

You say your list grows as the months go on, well you probably need a Dynamic Named Range to take account of this. Follow this link for more information.

http://www.ozgrid.com/Excel/DynamicRanges.htm

Don't forget there are also plently of links on the link below oldchippy

There are currently 1 users browsing this thread. (0 members and 1 guests)

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1