When an aggregate value is been computed upon a group of rows, it is said to be an analytical function. It is not similar to Aggregate functions, as multiple rows are returned for each group. But in analytical function the group of rows are defined by a clause i.e. analytical_clause, this is also knows as a window. For each and every row a sliding window is been defined, to decide the range of rows to calculate. The clause size can be any physical number of rows or can be a logical interval for example time.
Analytical functions can be performed before ORDER BY clause or in the last set of operations in a query.
Note: All WHERE clause, all the join condition, GROUP BY clause and HAVING clauses are completed before processing the analytical functions. Analytical functions can be appeared in select list or order by clause. Mostly analytical functions are used to compute cumulative, moving, centered, and reporting aggregates.
analytic_function([ arguments ]) OVER (analytic_clause)
Figure 1: analytic_function
Description for analytic_function as per the above Figure 1 is as below:
[ query_partition_clause ]
[ order_by_clause [ windowing_clause ] ]
The easy way to understand analytic functions is by looking at aggregate functions.
An aggregate function is that which aggregate data from all rows into one single record or single result set.
Let us take ORDER_ITEM table from OE sample schema which is provided by Oracle. You can also get this schema when we install Oracle 11g on our systems. But for using this we need to login as SYSDBA and unlock its account for further uses.
So if I want to know the average quantity that was ordered for all products, then my query will be like as bellow:
SQL> select avg(quantity) from ORDER_ITEMS;
Figure 2: Aggregate function example for getting average quantity for all orders
We can apply aggregate functions in a GROUP BY clause to get the subsets of the rows, for example, if we want to display the average ordered quantity for each product then my query would be like as below:
SQL> SELECT PRODUCT_ID, AVG(QUANTITY) FROM ORDER_ITEMS GROUP BY product_id ORDER BY PRODUCT_ID;
Figure 3: aggregate function with GROUP BY clause
In the above examples we can see that the aggregate function reduces the number of records returned by the queries.
We can use Analytic functions to operate on the subsets of rows as like aggregate functions in GROUP BY clause used above mentioned queries. But using Analytic functions the rows are not going to be reduced.
But if we want to know the quantity for each product and average quantity for the same product that was ordered , then our query will be like below:
SQL> SELECT product_id, quantity, AVG(QUANTITY) over (partition BY PRODUCT_ID) AS avg_qty_order FROM ORDER_ITEMS;
Figure 4: Example of Analytic function
The arguments from 0-3 can be passed to Analytical functions. These arguments can be numeric or non-numeric datatypes which are implicitly converted to a numeric datatypes.
The arguments are determined by taking the highest numeric at first order and it converts the rest of the arguments to the same datatype along with the return type.
To operate the function on a result of a query is signified by using OVER analytic_caluse. This can be used after FROM, WHERE, GROUP BY, and HAVING clauses, this can also be used in the select list or order by clause. At first the result is textually encapsulated by using analytic_clause in the query and then it filters the result of the nested sub query.
Few things to remember while using analytic_clause:
We can use analytic_clause with user defined or built-in analytic functions.
We cannot textually encapsulate within the function by specifying any analytic function in any part of the analytic_clause.
Figure 5: analytic_clause
Description for analytic_function as per the above Figure 5 is as below:
[ query_partition_clause ]
[ order_by_clause [ windowing_clause ] ]
This clause is used to partition or group the result of the query. If we do not use this clause then the function will treat all the result from the query as a single group.
For example in the below query the partition clause is not been used.
SQL> SELECT product_id , quantity, AVG(quantity) over() AS avg_qty from order_items order by product_id;
Figure 6: without using partition clause
The result of the above query is not that meaningful as shown in Figure 6, here we are getting the average quantity for all the product , but if we want to show more meaningful result like average quantity for each product that was ordered then the query will look like the below and result as shown in Figure 7:
SQL> SELECT product_id , quantity, avg(quantity) over(partition BY product_id) as avg_qty from order_items order by product_id;
Figure 7: Using query_partition_clause
Which give me more depth report on average product ordered based on product ID.
To set the order (ordering the data in sequence) of the data retrieved from a query having analytic function used in it , we can use order_by_clause. But we cannot use it for all analytic functions. For example , using PERCENTILE_CONT and PERCENTILE_DISC , it will only take a single key. We specify multiple order sequence within each function using in a query. Using order_by_clause is very useful when we make use of RANK function for getting rank values which result the tie between the identical values for the first expression.
For example , below query provides the report on RANK for employees (table that is used is taken from taken from demo HR schema) from department number “60” and based on salary and commission:
SQL> SELECT department_id, last_name, salary, commission_pct, RANK() OVER (PARTITION BY department_id ORDER BY salary DESC, commission_pct) "Rank" from employees WHERE department_id = 60;
Figure 8: Using order_by_clause in RANK function
Here in the above query and figure 8 (result from the query) provide a report. In this Figure 8 we can see that the order_by_clause is first used in partition where it selects the records by ordering salary in descending order along with commission.
Another example for order by clause can be use to sort the result sets. Let us use the CUSTOMER table from OE demo schema. If I want to know how many customers are there under “GOLD”, “Platinum” and in “Silver” status for each NLS Territory then my query would be like as below:
SQL> SELECT DISTINCT nls_territory, status , COUNT(CUSTOMER_ID) over (partition BY status) AS avg_product_buyer FROM customers ;
Figure 9: Without using order_by_clause
Now we can use the order by clause at the end of the section like the below query:
SQL> SELECT DISTINCT nls_territory, status , COUNT(CUSTOMER_ID) over (partition BY status ) AS avg_product_buyer from customers order by nls_territory;
Below query uses analytic function and even it will give same result as the above one, here we are using the order by clause inside the query:
SQL> SELECT DISTINCT nls_territory, status , COUNT(CUSTOMER_ID) over (partition BY status order by nls_territory ) AS avg_product_buyer FROM customers ;
Below figure 10 shows the result of order_by_clause example:
Figure 10: Using order_by_clause
Few restrictions on the ORDER BY CLAUSE:
· order_by_clause must take an expression.
· The keyword SIBLINGS is not valid, so cannot be used in order_by_clause.
· Position and column aliases are also not valid.
Few analytic functions allow windowing_clause for listing analytic result at the end, this type if clause is always followed by an asterisk sign (*).
In other words windowing_clause can be said like for further sub-partitioning the result set and applying the aggregate function. For example we can use windowing_clause in aggregate function like sum, avg, count, first_value, RANK etc.
ROWS | RANGE in windowing_clause:
For each physical (ROWS) or logical (RANGE) set of rows ROWS or RANGE keyword is defined for getting the function result and then applying it to all the rows in the window.
BETWEEN … AND in windowing_clause:
To specify a start point and end point for the window BETWEEN … AND are used. Before “AND” keyword the first expression is start point and after “AND” keyword the second expression is the end point for a window clause. In case if we miss the keyword BETWEEN then Oracles treats the end point as a start point and current row as end point.
We can also say windowing clause is an extension to order_by_caluse. These windowing_clause two forms which are based on the above mentioned ROWS|RANGE, between clauses:
· RANGE BETWEEN start_point AND end_point
· ROWS BETWEEN start_point AND end_point
For describing the start point and end pints we can have to understand the below clause for using inside a query.
This keyword indicates the window to start at the first row of the partition and this cannot be used as the ending point.
This keyword indicates the windows end point or the last row of the partition and this cannot be used as the starting point.
This keyword can be used at starting or at ending point of the window at current row or value.
value_expr PRECEDING :
This keyword can be said as a constant or expression for a logical (RANGE) or physical (ROWS) offset. If we are using it with RANGE clause then it can be treated as a interval literal only if the order_by_clause is a DATE column.
value_expr FOLLOWING :
It is also similar to the value_expr PRECEDING but only used after the current row.
Below are few functions where we can use analytic functions, for getting more information of the below mentioned function login to http://www.oracle.com/index.html.
Then go to Oracle Database 11g Documentation and check for Analytical functions.
Here is the direct link to get the details:
· AVG function
· CORR function
· COVAR_POP function
· COVAR_SAMP function
· COUNT function
· FIRST_VALUE function
· LAG function
· LAST function
· LAST_VALUE function
· LEAD function
· MAX function
· MIN function
· NTILE function
· PERCENT_RANK function
· PERCENTILE_CONT function
· PERCENTILE_DISC function
· RANK function
· RATIO_TO_REPORT function
· REGR_ (Linear Regression) Functions
· ROW_NUMBER function
· STDDEV function
· STDDEV_POP function
· STDDEV_SAMP function
· SUM function
· VAR_POP function
· VAR_SAMP function
· VARIANCE function