Friday, October 28, 2005
Average mistake
Mathematics teaches average is sum of elements divided by number of the elements.
I.e.,
Average = SUM(elements) / NUMBER(elements).
Let us try to represent this into a T-SQL statement. SUM() function can be used to find the sum of elements and COUNT(*) function can be used for counting the elements. In the other hand we have AVG() used to find the average.
So, SUM(column) / COUNT(*) should be equal to AVG(column). But the reality is, they are not always equal. Let we try this T-SQL code,
-------- Begining of Code -------
CREATE TABLE #tbl_elements (element int)
GO
INSERT INTO #tbl_elements VALUES (5)
INSERT INTO #tbl_elements VALUES (10)
INSERT INTO #tbl_elements VALUES (NULL)
INSERT INTO #tbl_elements VALUES (15)
GO
SELECT
SUM(element)/COUNT(*) AS [Average through Formula],
AVG(element) AS [Average Through Function]
FROM #tbl_elements
GO
DROP TABLE #tbl_elements
GO
-------- End of Code -------
The result of the T-SQL statement is:
Average through Formula - 7
Average Through Function - 10
Why this difference???
The aggregate functions ignore NULL values when aggregating a column which contains both NULL and non-null values. COUNT(*) aggregate function is an exception. This function includes NULL values when it counts. If we need to exclude NULL values in the counting, specify the column name instead of ‘*’ in the function. That is COUNT(*) will count all elements including NULLs. However, COUNT(column) counts only the non-NULL values.
Now, let us re-write the T-SQL code as,
SELECT
SUM(element)/COUNT(element) AS [Average through Formula],
AVG(element) AS [Average Through Function]
FROM #tbl_elements
GO
We need to give extra care when we deal with NULL values to avoid this kind of mistakes.
I.e.,
Average = SUM(elements) / NUMBER(elements).
Let us try to represent this into a T-SQL statement. SUM() function can be used to find the sum of elements and COUNT(*) function can be used for counting the elements. In the other hand we have AVG() used to find the average.
So, SUM(column) / COUNT(*) should be equal to AVG(column). But the reality is, they are not always equal. Let we try this T-SQL code,
-------- Begining of Code -------
CREATE TABLE #tbl_elements (element int)
GO
INSERT INTO #tbl_elements VALUES (5)
INSERT INTO #tbl_elements VALUES (10)
INSERT INTO #tbl_elements VALUES (NULL)
INSERT INTO #tbl_elements VALUES (15)
GO
SELECT
SUM(element)/COUNT(*) AS [Average through Formula],
AVG(element) AS [Average Through Function]
FROM #tbl_elements
GO
DROP TABLE #tbl_elements
GO
-------- End of Code -------
The result of the T-SQL statement is:
Average through Formula - 7
Average Through Function - 10
Why this difference???
The aggregate functions ignore NULL values when aggregating a column which contains both NULL and non-null values. COUNT(*) aggregate function is an exception. This function includes NULL values when it counts. If we need to exclude NULL values in the counting, specify the column name instead of ‘*’ in the function. That is COUNT(*) will count all elements including NULLs. However, COUNT(column) counts only the non-NULL values.
Now, let us re-write the T-SQL code as,
SELECT
SUM(element)/COUNT(element) AS [Average through Formula],
AVG(element) AS [Average Through Function]
FROM #tbl_elements
GO
We need to give extra care when we deal with NULL values to avoid this kind of mistakes.
Tuesday, October 25, 2005
SQL Server - Application role - II
Continuation of SQL Server - Application role - I...
Application role is one of the undervalued features available in SQL Server 2000.
I think most of the DBAs neglecting to utilize this feature in their SQL Server security setup. Some of the special characteristics of the application roles:
• Unlike other roles (fixed server roles, fixed database roles and user defined roles), application roles do not contain any members in it.
• To activate application role, it requires a password.
• When a user activates application role, his connection (only that session) get associated with the application role. Hence the session loses its all permissions and inherits permission from the application role.
• Application roles are database specific (the application roles are available only to the database they exist).
• Once it is associated with the standard SQL login (NT login or SQL Login), that SQL user session can not access the objects in other database.
• The associated session can access the other database objects only through GUEST granted permissions. Hence, if the guest user account does not exist in a database, the session can not gain access to the database.
• The associated session does not permit to execute “USE DATABASE” T-SQL command. Therefore, the session needs to address the other database object using three part naming convention.
Application role is one of the undervalued features available in SQL Server 2000.
I think most of the DBAs neglecting to utilize this feature in their SQL Server security setup. Some of the special characteristics of the application roles:
• Unlike other roles (fixed server roles, fixed database roles and user defined roles), application roles do not contain any members in it.
• To activate application role, it requires a password.
• When a user activates application role, his connection (only that session) get associated with the application role. Hence the session loses its all permissions and inherits permission from the application role.
• Application roles are database specific (the application roles are available only to the database they exist).
• Once it is associated with the standard SQL login (NT login or SQL Login), that SQL user session can not access the objects in other database.
• The associated session can access the other database objects only through GUEST granted permissions. Hence, if the guest user account does not exist in a database, the session can not gain access to the database.
• The associated session does not permit to execute “USE DATABASE” T-SQL command. Therefore, the session needs to address the other database object using three part naming convention.
Monday, October 17, 2005
Space matters
Further to my blog post "T-SQL funs from community sites", I would like to add some points on that.
SQL Server ignores the trail space in the strings when comparing two string values (char and varchar values). We can check the behavior through this code.
----------------Begining of the script ------------
declare @myStr varchar(10)
declare @urStr varchar(10)
set @myStr = 'String'
set @urStr = 'String '
select len(@myStr) as [MyString Length], len(@urStr) as [UrString Length]
----------------End of the script ------------
The above code will return length 6 for both the strings. Further more, the following code returns “Equal” even though the variable @urStr contains the trail space in it.
----------------Begining of the script ------------
declare @myStr varchar(10)
declare @urStr varchar(10)
set @myStr = 'String'
set @urStr = 'String '
if @myStr = @urStr
print 'Equal'
else
print 'Not Equal'
----------------End of the script ------------
If we are concerned about the trail space and want to know accurate length, DATALENGTH() function comes handy. This function won’t ignore the trail space. This updated code will demonstrate that.
----------------Begining of the script ------------
declare @myStr varchar(10)
declare @urStr varchar(10)
set @myStr = 'String'
set @urStr = 'String '
select datalength(@myStr) as [MyString Length], datalength(@urStr) as [UrString Length]
if @myStr = @urStr and datalength(@myStr) = datalength(@urStr)
select 'Equal'
else
select 'Not Equal'
----------------End of the script ------------
Now the code returns accurate length and comparision. :)
SQL Server ignores the trail space in the strings when comparing two string values (char and varchar values). We can check the behavior through this code.
----------------Begining of the script ------------
declare @myStr varchar(10)
declare @urStr varchar(10)
set @myStr = 'String'
set @urStr = 'String '
select len(@myStr) as [MyString Length], len(@urStr) as [UrString Length]
----------------End of the script ------------
The above code will return length 6 for both the strings. Further more, the following code returns “Equal” even though the variable @urStr contains the trail space in it.
----------------Begining of the script ------------
declare @myStr varchar(10)
declare @urStr varchar(10)
set @myStr = 'String'
set @urStr = 'String '
if @myStr = @urStr
print 'Equal'
else
print 'Not Equal'
----------------End of the script ------------
If we are concerned about the trail space and want to know accurate length, DATALENGTH() function comes handy. This function won’t ignore the trail space. This updated code will demonstrate that.
----------------Begining of the script ------------
declare @myStr varchar(10)
declare @urStr varchar(10)
set @myStr = 'String'
set @urStr = 'String '
select datalength(@myStr) as [MyString Length], datalength(@urStr) as [UrString Length]
if @myStr = @urStr and datalength(@myStr) = datalength(@urStr)
select 'Equal'
else
select 'Not Equal'
----------------End of the script ------------
Now the code returns accurate length and comparision. :)
Tuesday, October 11, 2005
Deferred named resolution - I
Deferred named resolution
Yesterday I have noticed an interesting behavior of SQL Server 2000 when writing a procedure. I came across a situation to define a variable data type based on a condition. Say, if the condition meets true then the variable needs to initiate as char otherwise as varchar. I have written the scriptlet similar to the following.
create proc test1 @type char(1)
as
if @type = 'c'
declare @var1 as char(10)
else
declare @var1 as varchar(10)
When I execute (even parse the query by pressing Ctrl+F5) the query, it thrown an error
Msg 134, Level 15, State 1, Procedure test1, Line 6
The variable name '@var1' has already been declared. Variable names must be unique within a query batch or stored procedure
If we look at the query it is practically not possible to satisfy both the status (true and false), hence the error thrown is unexpected. After referring Khen’s excellent book “Guru’s guide to SQL Server stored procedures, HTML and XML”, I found that this strange behavior is cause of SQL Server 2000’s name resolution. SQL Server won’t resolve the object names when we create procedures. The object name get resolve at run-time not at design time. But this will not apply for variables and temporary tables. So, we can not declare a variable more than one place in a single stored procedure even you declare them mutually exclusive or logically independent blocks.
Yesterday I have noticed an interesting behavior of SQL Server 2000 when writing a procedure. I came across a situation to define a variable data type based on a condition. Say, if the condition meets true then the variable needs to initiate as char otherwise as varchar. I have written the scriptlet similar to the following.
create proc test1 @type char(1)
as
if @type = 'c'
declare @var1 as char(10)
else
declare @var1 as varchar(10)
When I execute (even parse the query by pressing Ctrl+F5) the query, it thrown an error
Msg 134, Level 15, State 1, Procedure test1, Line 6
The variable name '@var1' has already been declared. Variable names must be unique within a query batch or stored procedure
If we look at the query it is practically not possible to satisfy both the status (true and false), hence the error thrown is unexpected. After referring Khen’s excellent book “Guru’s guide to SQL Server stored procedures, HTML and XML”, I found that this strange behavior is cause of SQL Server 2000’s name resolution. SQL Server won’t resolve the object names when we create procedures. The object name get resolve at run-time not at design time. But this will not apply for variables and temporary tables. So, we can not declare a variable more than one place in a single stored procedure even you declare them mutually exclusive or logically independent blocks.