Temporary tables/views go light weight in T-SQL 2005
Well, here I am, once again leaving a trace of what I've learned so far so I can use it as reference later.
Unfortunately for the T-SQL lovers that come from the Oracle world, the T-SQL enhancements
in SQL Server 2005 do not include arrays among the data structures, the workaround for array use is
to get hands on the SQLCLR and use the .NET framework...
Ok, but this post was about Common Table Expressions, what does arrays have to do with it?
Most of the time, T-SQL developers would use temporary tables whenever they could have used arrays,
to somehow keep a set of values for future use inside the batch.
The way of building temporary structures, name it tables, in your sprocs has evolved from sql server 7.0.
Initially you had the temporary tables such as #people:
Local Temporary Tables
CREATE TABLE #people
(
id INT,
name VARCHAR(32)
)
A temporary table is created and populated on disk, in the system database tempdb — with a
session-specific identifier packed onto the name, to differentiate between similarly-named #temp tables
created from other sessions. The data in this #temp table (in fact, the table itself) is visible only to the current scope
(usually a stored procedure, or a set of nested stored procedures). The table gets cleared up automatically when
the current procedure goes out of scope, but you should manually clean up the data when you're done with it:
DROP TABLE #people
Or the
Global Temporary Tables
CREATE TABLE ##people
(
id INT,
name VARCHAR(32)
)
Global temporary tables operate much like local temporary tables; they are created in tempdb and cause less locking and logging than permanent tables. However, they are visible to all sessions, until the creating session goes out of scope (and the global ##temp table is no longer being referenced by other sessions). If two different sessions try the above code, if the first is still active, the second will receive the following:
Server: Msg 2714, Level 16, State 6, Line 1
There is already an object named '##people' in the database.
I have yet to see a valid justification for the use of a global ##temp table. If the data needs to persist to multiple users, then it makes much more sense, at least to me, to use a permanent table. You can make a global ##temp table slightly more permanent by creating it in an autostart procedure, but I still fail to see how this is advantageous over a permanent table. With a permanent table, you can deny permissions; you cannot deny users from a global ##temp table.
In sql server 2000 it was introduced the concept of table variables:
Table Variables
DECLARE @people TABLE
(
id INT,
name VARCHAR(32)
)
A table variable is created in memory, and so performs slightly better than #temp tables (also because there is even less locking and logging in a table variable). A table variable might still perform I/O to tempdb (which is where the performance issues of #temp tables make themselves apparent), though the documentation is not very explicit about this.
Table variables are automatically cleared when the procedure or function goes out of scope, so you don't have to remember to drop or clear the data (which can be a good thing or a bad thing; remember "release early"?). The tempdb transaction log is less impacted than with #temp tables; table variable log activity is truncated immediately, while #temp table log activity persists until the log hits a checkpoint, is manually truncated, or when the server restarts.
Then why not using always tale variables... Performance is the answer. I'm not going to go deep into this now but just post the url for the official white paper:
Limitations of table variables http://support.microsoft.com/default.aspx/kb/305977
And finally SQL Server 2005 Common Table Expressions, I guess I could call them the light weight temporary tables. their syntax is even lighter than
the syntax in table variables.
Pseudocode and Semantics
The recursive CTE structure must contain at least one anchor member and one recursive member. The following pseudocode shows the components of a simple recursive CTE that contains a single anchor member and single recursive member.
WITH cte_name ( column_name [,...n] )
AS
(
CTE_query_definition –- Anchor member is defined.
UNION ALL
CTE_query_definition –- Recursive member is defined referencing cte_name.
)
-- Statement using the CTE
SELECT *
FROM cte_name
The semantics of the recursive execution is as follows:
1. Split the CTE expression into anchor and recursive members.
2. Run the anchor member(s) creating the first invocation or base result set (T0).
3. Run the recursive member(s) with Ti as an input and Ti+1 as an output.
4. Repeat step 3 until an empty set is returned.
5. Return the result set. This is a UNION ALL of T0 to Tn.
A simple CTE declaration:
USE AdventureWorks;
GO
WITH DirReps(ManagerID, DirectReports) AS
(
SELECT ManagerID, COUNT(*)
FROM HumanResources.Employee AS e
WHERE ManagerID IS NOT NULL
GROUP BY ManagerID
)
GO
An even simpler use:
USE AdventureWorks;
GO
SELECT ManagerID, DirectReports
FROM DirReps
ORDER BY ManagerID;
GO
So with CTEs there is no need of formally declaring your column types, only the column names, as the types are assumed from the
query definition. The syntax is for sure lighter. I'm not aware of performance issues related to CTEs, probably some kind soul could ad a comment here and let me know.
We so far cannot pass table variables and CTEs as parameters in stored procedures, this is definitely something that should come in future versions.
Bye for now.
2 Comments:
I have read and read many similar articles on CTEs. I need to be able to find the first occurence of an item with an associated assembly id. The assembly IDs are in the assembly table and each has it's child ID listed. Where/how can this assembly ID be set up as a variable parameter?
Thank you for your comments. This is indeed an interesting subject. Recursion in T-SQL had been achieved before with cursors or implementing the recursive algorithm at the data access layer with ADO.NET for example.
I presume you want to do some recursion based on the parent assemblyID, what is not clear to me and please explain further is the recursion that you are trying to do, as as far as I see you have a simple primary or surrogate key to foreign key, with a 1 to many relationship and two tables???
Or is it a single table with the IDs that have the children IDs as columns?
The novel thing of CTE is that it can refer itself in order to create recursive queries. First on the CTE definition you define an anchor which will be the highest level on the hierarchy of your result set, then define your recursion query which will reference your anchor, the name right after the keyword WITH, and finally use that CTE declaration on a straight query.
If you would explain the hierarchical levels of the table containing the assemblyIDs and the table containing the children assemblyIDs I'll be glad to help...The examples I have seen so far is only one result set (table or view) that you query recursively to display hierarchical data. What is the hierarchy that you want to query is what is not quite clear to me.
Post a Comment
<< Home