by Glenn Crouch

 

Using Statistics in Delphi - Part III

In this Issue we continue our multi-issue look at developing Statistical Routines for use in your Delphi Applications. Where possible, these are designed to use Open Array Parameters, so that you can use them with Standard Arrays or with the new Dynamic Arrays.

Quartiles

Just as the Median - see Part I - measured the middle, thus dividing the Data in half, Quartiles divide the Data in Quarters.

So 2nd Quartile = Median

Now whilst most people agree on how Quartiles are to be calculated when it comes to continuous functions, when we have a collection of Data (also called Discrete Data) there is some debate over the best way.

The First Quartile is any value such that at least 25% of the Data Values are less than or equal to it, and at least 75% of the Data Values are greater than or equal to it. Thus for our sort of Data, the Quartile is not unique but lies within a range.

Similarly, the Third Quartile is any value such that at least 75% of the Data Values are less than or equal to it, and at least 25% of the Data Values are greater than or equal to it.

I am going to stick with the way I was taught, which coincides with the method used by Leonard J Kazmier in Schaum's Outline Series: Theory and Problems of Business Statistics, published by McGraw-Hill. Please note that this gives slightly different answers than those supplied by Microsoft Excel, but it is pretty easy to implement.

For n Data Items, sorted into ascending order and counted from position 1:

Position of First Quartile = n / 4 + 0.5

If this position is an Integer then we use the Data Value at that position. Otherwise, we take the integer portion, I, start with the Data Value at position I, and add to it the fractional portion multiplied by the difference between the Data Values at positions I + 1 and I.

Similarly:

Position of Third Quartile = 3 * n / 4 + 0.5
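
As a worked example of the two formulas (the Data Values here are made up purely for illustration), take n = 8 sorted Data Values:

3, 5, 7, 8, 12, 13, 14, 21

Position of First Quartile = 8 / 4 + 0.5 = 2.5, so I = 2 and the fractional portion is 0.5:

Q1 = 5 + 0.5 * (7 - 5) = 6

Position of Third Quartile = 3 * 8 / 4 + 0.5 = 6.5, so I = 6 and the fractional portion is 0.5:

Q3 = 13 + 0.5 * (14 - 13) = 13.5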

This gives us the following Delphi Procedure to calculate Quartiles, given that we have a sorted array (see discussion in Stats Part I on Median):

procedure GetQuartiles (const SortedX: array of Extended;
  var Q1, Q3: Extended);
// Returns the 1st and 3rd Quartile
// Note: Assumes Array is sorted, starts at 0 and ends at n-1
// Note: Exception is declared in SysUtils
var
  J: Extended; // Quartile position (from 1), then its fractional portion
  I: Integer;  // Integer portion of the position
begin
  if High (SortedX) < 0 then
    raise Exception.Create ('Array is Empty!')
  else if High (SortedX) = 0 then
  begin
    // Only one Data Value, so it serves as both Quartiles
    Q1 := SortedX [0];
    Q3 := SortedX [0];
  end
  else
  begin
    // Calculate 1st Quartile
    J := (High (SortedX) + 1) / 4 + 0.5;
    I := Trunc (J);
    J := Frac (J);
    if I - 1 < High (SortedX) then
      Q1 := SortedX [I - 1] + (SortedX [I]
        - SortedX [I - 1]) * J
    else // Take End Value
      Q1 := SortedX [I - 1];

    // Calculate 3rd Quartile
    J := 3 * (High (SortedX) + 1) / 4 + 0.5;
    I := Trunc (J);
    J := Frac (J);
    if I - 1 < High (SortedX) then
      Q3 := SortedX [I - 1] + (SortedX [I]
        - SortedX [I - 1]) * J
    else // Take End Value
      Q3 := SortedX [I - 1];
  end;
end;
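
As a rough sketch of how the procedure might be called, the routine below passes a small, already sorted constant array to GetQuartiles. The name ShowQuartiles and the Data Values are made up for illustration, and the sketch assumes a Console Application with SysUtils in the uses clause (for Format).

procedure ShowQuartiles;
// Hypothetical demonstration routine, not part of the Stats library
const
  // Illustrative Data, already sorted into ascending order
  Data: array [0..7] of Extended = (3, 5, 7, 8, 12, 13, 14, 21);
var
  Q1, Q3: Extended;
begin
  GetQuartiles (Data, Q1, Q3);
  // Positions 2.5 and 6.5 give Q1 = 6 and Q3 = 13.5
  WriteLn (Format ('Q1 = %g  Q3 = %g', [Q1, Q3]));
end;

Because SortedX is an Open Array Parameter, the same call should work unchanged if Data is declared as a Dynamic Array instead.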

Calculating the Inter-Quartile Range

Since the Range is easily affected by extreme outliers, many people use the IQR, Inter-Quartile Range, as a measure of dispersion, since it spans the middle 50% of the values.

Once we have calculated the First and Third Quartile, the IQR is simply:

IQR := Q3 - Q1;
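
If you need the IQR in several places, one option is to wrap the two steps in a small function. This is only a sketch: the name InterQuartileRange is an assumption, and it relies on the GetQuartiles procedure shown earlier, so the array must already be sorted.

function InterQuartileRange (const SortedX: array of Extended): Extended;
// Hypothetical helper: returns Q3 - Q1 for a sorted array
// Raises the same Exception as GetQuartiles if the array is empty
var
  Q1, Q3: Extended;
begin
  GetQuartiles (SortedX, Q1, Q3);
  Result := Q3 - Q1;
end;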

Calculating the Coefficient of Variation

The Coefficient of Variation gives us information about the Standard Deviation relative to the Mean. Thus it can be thought of as the size of the Standard Deviation expressed as a proportion of the Mean, which lets us compare the spread of Data Sets that have very different Means. This is known as a measurement of Relative Dispersion.

It is calculated as follows:

Coefficient of Variation = Standard Deviation / Mean

Which in Delphi translates to:

if FloatIsZero (Mean) then // See Article on Rounding
  raise Exception.Create ('Value does not Exist')
else
  CoeffVariation := StdDev / Mean;
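
To see why a relative measure is useful, consider two sets of figures invented for illustration:

Set A: Mean = 50, StdDev = 5, so CoeffVariation = 5 / 50 = 0.1
Set B: Mean = 500, StdDev = 25, so CoeffVariation = 25 / 500 = 0.05

Set B has the larger Standard Deviation, yet relative to its Mean it is only half as dispersed as Set A.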

Quartile Coefficient of Variation

Quartile Coefficient of Variation is another common measurement of Relative Dispersion, and is quite easy to calculate once we have the Quartiles. It is calculated as follows:

Quartile Coefficient of Variation = (Q3 - Q1) / (Q3 + Q1)

which in Delphi translates to:

if FloatIsZero (Q3 + Q1) then // See Article on Rounding
  raise Exception.Create ('Value does not Exist')
else
  QCoeffVariation := (Q3 - Q1) / (Q3 + Q1);
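
As a quick check of the arithmetic, suppose (purely for illustration) that Q1 = 6 and Q3 = 13.5:

QCoeffVariation = (13.5 - 6) / (13.5 + 6) = 7.5 / 19.5, or approximately 0.385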

 

Stating your Methodology

Given that different people and different packages calculate Quartiles in different ways - most of which can be justified - you should start to see that it is important to state which methods you are using when you use Mathematical and Statistical routines. We have also seen that a Sample Standard Deviation is calculated differently to a Population Standard Deviation - though some texts just use the Population formula for both.

It is good practice to list all Mathematical Assumptions, Formulae and Techniques used within your Application e.g. make this a section in your Help File.

Conclusion

Next Issue we will continue our look at Statistics as we look at Normal Distributions.

 


 Copyright © 2001 Australian Delphi User Group and respective copyright owners.
All Rights Reserved