Pages

Tuesday, February 1, 2011

Built-in Functions - Aggregate Functions

Views
Aggregate functions return a single value summarizing a given data set. All aggregate functions are deterministic. NOTE: AVG, SUM, STDEV, STDEVP, VAR and VARP functions cannot operate on BIT data types; they can operate on all other numeric data types.

Contents

[hide]

COUNT Function

The COUNT function returns a count of rows based on certain criteria. The syntax is:
1.COUNT(ALL or DISTINCT expression) or COUNT(*)
Keyword ALL is optional and is assumed by default.
If you specify a "*" as the criterion, COUNT returns the total number of rows in a table; for example, the following query counts rows in the FactFinance table of Adventure Works DW database:
1.SELECT COUNT(*)
2.FROM FactFinance
3. 
4.Results:
5.-----------
6.39409

If you join multiple tables then COUNT(*) returns the number of rows satisfying the join criterion, as in the following:
1.SELECT COUNT(*)
2.FROM FactFinance a INNER JOIN DimOrganization b
3.ON a.OrganizationKey = b.OrganizationKey
COUNT(*) cannot be used with DISTINCT; nor can you specify any other parameter - this variation of the function automatically counts every single row in a single or multiple joined tables.
Unlike all other aggregate functions, COUNT does not ignore NULL values.
If you need to find the count of unique items within a column in a table use COUNT (DISTINCT column_name) syntax. For example, the following query counts unique organization keys within the FactFinance table:
1.SELECT COUNT(DISTINCT OrganizationKey)
2.FROM FactFinance
3. 
4.Results:
5.-----------
6.9

COUNT_BIG Function

The COUNT_BIG function is identical to the COUNT function, but returns a BIGINT data type, whereas COUNT returns an INT. The upper limit for INT data type is 2 billion; therefore if you anticipate counting more than two billion rows use the COUNT_BIG function. Any attempt to use the COUNT function when counting over 2 billion rows returns an error.

MAX Function

The MAX function returns the biggest value within a given set. The syntax is:
1.MAX(ALL or DISTINCT expression)

The ALL keyword is optional and is the default unless DISTINCT is specified. The DISTINCT keyword specifies that each unique value should be considered. This keyword really has no use with MAX function since it returns a single value; however the keyword is supported for ANSI compatibility.


For example, the following returns the greatest amount from the FactFinance table:
1.SELECT
2.MAX(amount)
3.FROM FactFinance
4.Results:
5.----------------------
6.4820988

Note that the parameter of MAX can be any valid expression, including string columns, as in the following:
1.SELECT
2.MAX(EnglishProductName)
3.FROM DimProduct
"Women's Tights, S" is returned because it is the last value in the alphabetically ordered list of product names, not because it is the longest product name. You can combine MAX function with LEN function to return the products with the longest names, as follows:
01.SELECT
02.EnglishProductName
03.FROM DimProduct
04.WHERE LEN(EnglishProductName) = (
05. SELECT MAX(LEN(EnglishProductName))
06. FROM DimProduct)    Results:    EnglishProductName
07.--------------------------------------------------
08.ML Mountain Frame-W - Silver, 40
09.ML Mountain Frame-W - Silver, 42
10.ML Mountain Frame-W - Silver, 46
11.ML Mountain Frame-W - Silver, 38

MIN Function

The MIN function returns the biggest value within a given set. The syntax is:
1.MIN(ALL or DISTINCT expression)

The ALL keyword is optional and is the default unless DISTINCT is specified. The DISTINCT keyword specifies that each unique value should be considered. This keyword really has no use with MIN function since it returns a single value; however the keyword is supported for ANSI compatibility.

For example, the following returns the greatest amount from the FactFinance table:
1.SELECT
2.MIN(amount)
3.FROM FactFinance
4.Results:
5.----------------------
6.-1121918

Note that the parameter of MIN can be any valid expression, including string columns, as in the following:
1.SELECT
2.MIN(EnglishProductName)
3.FROM DimProduct   
4. 
5.Results:
6.--------------------------------------------------
7.Adjustable Race

Note that "Adjustable Race" is returned because it is the first value in the alphabetically ordered list of product names, not because it is the shortest product name. You can combine MIN function with LEN function to return the products with the shortest names, as follows:
01.SELECT
02.EnglishProductName
03.FROM DimProduct
04.WHERE LEN(EnglishProductName) = (
05.SELECT MAX(LEN(EnglishProductName))
06.FROM DimProduct)   
07. 
08.Results:   
09.EnglishProductName
10.--------------------------------------------------
11.Stem

AVG Function

The AVG function returns the average of the values within a column. Unlike MIN and MAX, AVG can only accept a numeric expression as a parameter. The syntax is:
1.AVG(ALL or DISTINCT numeric column)

ALL keyword is optional and is assumed by default. You can specify the DISTINCT keyword if you want to take into account only the unique occurrence of each value.
The following query returns the average amount from FactFinance table:
1.SELECT
2.AVG(Amount)
3.FROM FactFinance   
4. 
5.Results:
6.-----------
7.34475.3841178411

The next example uses DISTINCT keyword to return the average of distinct values:
1.SELECT
2.AVG(DISTINCT Amount)
3.FROM FactFinance   
4. 
5.Results:
6.-----------
7.95592.8708615292

The AVG function returns the same (or a similar) data type as the group of values it accepts. So if you pass an integer, expect an integer back; if you examine float values with the AVG function you will get a float value back.


SUM Function

The SUM function returns the sum of all or unique values. Unlike MIN and MAX, SUM can only accept a numeric expression as a parameter. The syntax is:
1.SUM(ALL or DISTINCT numeric column)

ALL keyword is optional and is assumed by default. You can specify the DISTINCT keyword if you want to take into account only the unique occurrence of each value.


The following query returns the sum of amounts from FactFinance table:
1.SELECT
2.SUM(Amount)
3.FROM FactFinance   
4. 
5.Results:
6.-----------
7.1358640412.7

The next example uses DISTINCT keyword to return the sum of distinct values:
1.SELECT
2.SUM(DISTINCT Amount)
3.FROM FactFinance   
4. 
5.Results:
6.-----------
7.1251597458.19

The SUM function returns the same (or a similar) data type as the group of values it accepts. So if you pass an integer, expect an integer back; if you examine float values with the SUM function you will get a float value back.

STDEV Function

The STDEV function calculates the standard deviation for all items in the SELECT statement. This function can only be used with numeric columns. The syntax is:
1.STDEV(ALL or DISTINCT numeric expression)

ALL keyword is optional and is assumed by default. You can specify the DISTINCT keyword if you want to take into account only the unique occurrence of each value. The STDEV function always returns a FLOAT data type value.


The following example returns the standard deviation of amounts from FactFinance table:
1.SELECT
2.STDEV(amount)
3.FROM FactFinance   
4. 
5.Results:
6.----------------------
7.164041.892959458


STDEVP Function

The STDEVP function calculates the standard deviation for the population of items in the SELECT statement. This function can only be used with numeric columns. The syntax is:
1.STDEVP(ALL or DISTINCT numeric expression)

ALL keyword is optional and is assumed by default. You can specify the DISTINCT keyword if you want to take into account only the unique occurrence of each value. The STDEVP function always returns a FLOAT data type value.


The following example returns the standard deviation of population for amounts in the FactFinance table:
1.SELECT
2.STDEVP(amount)
3.FROM FactFinance   
4. 
5.Results:
6.----------------------
7.164039.811671762

VAR Function

The VAR function calculates the statistical variance of all values in the SELECT statement. This function can only be used with numeric columns. The syntax is:
1.VAR(ALL or DISTINCT numeric column)

ALL keyword is optional and is assumed by default. You can specify the DISTINCT keyword if you want to take into account only the unique occurrence of each value. The VAR function always returns a FLOAT data type value.


The following example returns the variance of amounts from FactFinance table:
1.SELECT         VAR(amount)  FROM FactFinance  
2.  
3.Results: 
4.---------------------- 
5.26909742645.7224

VARP Function

The VARP function calculates the statistical variance for the population of values in the SELECT statement. This function can only be used with numeric columns. The syntax is:
1.VARP(ALL or DISTINCT numeric column)

ALL keyword is optional and is assumed by default. You can specify the DISTINCT keyword if you want to take into account only the unique occurrence of each value. The VARP function always returns a FLOAT data type value.


The following example returns the variance for the population of amounts from the FactFinance table:
1.SELECT         VARP(amount)  FROM FactFinance   
2. 
3.Results: 
4.---------------------- 
5.26909059813.3073

CHECKSUM_AGG Function

The CHECKSUM_AGG function can be used to detect data changes in a table. This function can only work on integer data type; otherwise the functionality is very similar to other aggregate functions like SUM or AVG. The syntax is:
1.CHECKSUM_AGG(ALL or DISTINCT integer expression)

ALL keyword is optional and is assumed by default. You can specify the DISTINCT keyword if you want to take into account only the unique occurrence of each value. The CHECKSUM_AGG function always returns a INT data type value.


The following query returns the checksum of amount column in FactFinance table; then it updates some rows in the table; the second checksum is different from the first indicating that the data values have changed:
1.SELECT         CHECKSUM_AGG(CAST(amount AS INT)) AS first_checksum  FROM FactFinance   
2.UPDATE FactFinance  SET amount = Amount * 1.1  WHERE timekey = 1   
3.SELECT         CHECKSUM_AGG(CAST(amount AS INT)) AS second_checksum  FROM FactFinance
Results:
1.first_checksum 
2.-------------- 
3.-5793162   
4. 
5.second_checksum 
6.--------------- 
7.-6063450

CHECKSUM Function

The CHECKSUM function returns the checksum value computed over a row in a given table; alternatively it can return the checksum of the specified list of values. This function is intended for building hash indexes and always returns an INT data type value. The syntax is:
1.CHECKSUM(*) or CHECKSUM(expression1, expression2, รข€¦ expressionN)

The CHECKSUM(*) flavor examines all columns of a given table; it returns an error if the table contains TEXT, NTEXT or IMAGE data type columns. If you specify expressions instead of a table then expressions must be of any data type other than TEXT, NTEXT, IMAGE, CURSOR or sql_variant containing one of the preceding data types.


The following example returns a CHECKSUM of the first row in the DimProduct table:
1.SELECT TOP 1 CHECKSUM(*)  FROM DimProduct   
2. 
3.Results: 
4.----------- 
5.1502641847

The next example returns a CHECKSUM of given values:
1.SELECT  CHECKSUM(222, 333, 222, 999)   
2. 
3.Results: 
4.----------- 
5.828167

To add a hash index to a table you need to add a computed column using CHECKSUM function and then create an index on the computed column, as in the following:
1.ALTER TABLE DimProduct  ADD hash_index_column AS CHECKSUM(EnglishProductName) 
2.GO   
3.CREATE INDEX EnglishProductName_index ON DimProduct(hash_index_column)

GROUPING Function

The GROUPING function returns a value of 1 if the row is added to the result set by either ROLLUP or CUBE clause of the SELECT statement. CUBE and ROLLUP extensions allow generating very simple textual reports in a query window. Majority of business reports will use more sophisticated reporting tools than the Query Analyzer or SQL Server Management Studio, therefore CUBE and ROLLUP extensions, as well as the GROUPING function have limited usage. The syntax of the GROUPING function is:
1.GROUPING (column)

The column is a column referred to in the GROUP BY statement.


The following query returns sum of amounts per organization and total of all organization amounts combined as a separate "grouping row":
01.SELECT OrganizationName,    SUM(Amount) AS total_amount,    GROUPING(OrganizationName) AS 'Grouping Row'  FROM FactFinance a  INNER JOIN DimOrganization b ON a.OrganizationKey = b.OrganizationKey  GROUP BY OrganizationName WITH ROLLUP 
02. 
03.Results: 
04.OrganizationName                                   total_amount           Grouping Row 
05.-------------------------------------------------- ---------------------- ------------ 
06.Australia                                          35553074.99            0 
07.Canadian Division                                  292174782.72           0 
08.Central Division                                   149032081.6            0 
09.France                                             66963799.96            0 
10.Germany                                            30339804.03            0 
11.Northeast Division                                 134003346.4            0 
12.Northwest Division                                 138922888.8            0 
13.Southeast Division                                 279284663.6            0 
14.Southwest Division                                 232365970.6            0 
15.NULL                                               1358640412.7           1

The ROLLUP extension and the GROUPING function are more useful when the report is grouping records based on multiple columns. For example, the following query groups the output by product class, style and English name:
1.SELECT EnglishProductName,   class, style,    SUM(SalesAmount) AS total_amount,    GROUPING(EnglishProductName)  AS 'Grouping Row'  FROM FactResellerSales a  INNER JOIN DimProduct b ON a.ProductKey = b.ProductKey  WHERE class IN ('l')  GROUP BY class, style, EnglishProductName WITH ROLLUP
Note that a grouping row is generated for each class, each class and style combination and the grand total of all sales.

0 comments:

Post a Comment

 

Web Design Company karimnagar, Web Designing warangal, Logo Design Company nizamabad, Indian Website Design Company, maddysoft.co.in