TIMESTAMP and ROWVERSION datatypes and the Behavior of the SQL Server @@DBTS Counter

The TIMESTAMP datatype in SQL Server is in no way related to any date/time datatype or to any operation involving dates or times. It is a simple, monotonically incrementing counter that increases by one for every row inserted or updated in a table that has a TIMESTAMP/ROWVERSION column. It is stored as BINARY but can be converted to BIGINT or INT to make the values seem more user-friendly. The IDENTITY property of a table is quite similar in that it is also a monotonically incrementing value, but while the IDENTITY counter is tracked at the table level, the TIMESTAMP counter is tracked at the database level. The current value of the TIMESTAMP counter is returned by the configuration function @@DBTS.

You can think of a TIMESTAMP value as being unique within the scope of a database, somewhat like a GUID. So in theory, no two tables in a database can have the same TIMESTAMP value in any of their rows. In practice, though, it is possible to get around this via the SELECT..INTO syntax used to create a new table, as I’ll demonstrate later in this post.

Because calling it TIMESTAMP has caused confusion among SQL developers, Microsoft introduced ROWVERSION as a new name for it. The name ROWVERSION also emphasizes its utility for implementing optimistic locking in code. TIMESTAMP is deprecated and will be discontinued in a future SQL Server release, so use ROWVERSION in all new development work.
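
The pattern behind optimistic locking is simple: read the row version along with the data, and make the UPDATE conditional on that version being unchanged. Here is a minimal sketch; the Products table and its columns are hypothetical, purely for illustration –

-- Hypothetical table for the optimistic locking sketch.
CREATE TABLE Products (
       ProductId INT PRIMARY KEY,
       Price     MONEY,
       RowVer    ROWVERSION)
GO
INSERT INTO Products (ProductId, Price) VALUES (1, 24.99)

-- 1. Read the row and remember its version.
DECLARE @OriginalRowVer BINARY(8)
SELECT @OriginalRowVer = RowVer
FROM Products
WHERE ProductId = 1

-- 2. Update only if nobody changed the row in the meantime.
UPDATE Products
SET Price = 19.99
WHERE ProductId = 1
  AND RowVer = @OriginalRowVer

-- Zero rows affected means another session changed the row first.
IF @@ROWCOUNT = 0
   PRINT 'Row was changed by another user; reload and retry.'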

It is optional to give the column a name in the DDL when using the TIMESTAMP datatype. If a column name is not specified, SQL Server uses the default name “timestamp” for the column. Helpful, eh?

On the other hand, if using ROWVERSION as the datatype, a column name must be given, just like when creating a column of any other datatype.

A table can have only one column of TIMESTAMP (or ROWVERSION) type.

You cannot insert or update a value explicitly in a column of TIMESTAMP or ROWVERSION datatype; the database manages it internally. If you try, an error is returned –

Msg 273, Level 16, State 1, Line 1

Cannot insert an explicit value into a timestamp column. Use INSERT with a column list to exclude the timestamp column, or insert a DEFAULT into the timestamp column.
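
A quick illustration with a small throwaway table –

CREATE TABLE demo (id INT, rv ROWVERSION)
GO
-- Fails with Msg 273:
INSERT INTO demo (id, rv) VALUES (1, 0x00)

-- Both of these work; SQL Server assigns rv by itself:
INSERT INTO demo (id) VALUES (1)
INSERT INTO demo (id, rv) VALUES (2, DEFAULT)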

The @@DBTS value starts at 2000 in a brand new database, even before it contains any table with a TIMESTAMP/ROWVERSION column.

CREATE DATABASE [TestDB] 
GO
USE [TestDB] -- @@DBTS is tracked per database, so switch context
GO

-- Find the default value of 
-- the TIMESTAMP/ROWVERSION counter 
SELECT @@DBTS AsBinary, 
       CAST(@@DBTS AS BIGINT) AsBigint 

/* Result
AsBinary                AsBigint 
0x00000000000007D0      2000 
*/

Does the value of the counter change after a new table is created with a TIMESTAMP/ROWVERSION column?

No. Just creating a table does not change the counter.

CREATE TABLE a (
       id INT IDENTITY, 
       TIMESTAMP)
GO
SELECT @@DBTS AsBinary, 
       CAST(@@DBTS AS BIGINT) AsBigint 
/* Result
AsBinary               AsBigint 
0x00000000000007D0     2000 
*/

Insert a row into table a and check the counter value. It increases by one.

INSERT INTO a DEFAULT VALUES
GO
SELECT *, 
       CAST(timestamp AS BIGINT) 
FROM a 

SELECT @@DBTS AsBinary, 
       CAST(@@DBTS  AS BIGINT) AsBigint 
/* Result
AsBinary               AsBigint 
0x00000000000007D1     2001 
*/

Create a new table, this time with ROWVERSION, and insert a row. The counter still increases by one.

CREATE TABLE b (
       id INT IDENTITY, 
       col2  ROWVERSION) 
GO
INSERT INTO b DEFAULT VALUES
GO
SELECT *, 
       CAST(col2  AS BIGINT) 
FROM b 

-- Find the value of the counter after 
-- a row is inserted 
SELECT @@DBTS AsBinary, 
       CAST(@@DBTS  AS BIGINT) AsBigint 
/* Result
AsBinary               AsBigint 
0x00000000000007D2     2002 
*/

Do the INSERTs in the same transaction get the same counter value? Let us try.

BEGIN TRAN 
INSERT INTO a DEFAULT VALUES
INSERT INTO b DEFAULT VALUES
COMMIT

SELECT *, 
       CAST(timestamp AS BIGINT) 
FROM a 

SELECT *, 
       CAST(col2 AS BIGINT) 
FROM b 

-- Find the value of the counter after rows 
-- are inserted in multiple tables in a transaction. 
-- Each row still gets a unique counter value.
SELECT @@DBTS AsBinary, 
       CAST(@@DBTS  AS BIGINT) AsBigint 
/* Result
AsBinary               AsBigint 
0x00000000000007D4     2004 
*/

Does the counter increase in the case of a rolled back transaction?

BEGIN TRAN 
INSERT INTO a DEFAULT VALUES
INSERT INTO b DEFAULT VALUES
ROLLBACK

SELECT *, 
       CAST(timestamp AS BIGINT) 
FROM a 

SELECT *, 
       CAST(col2 AS BIGINT) 
FROM b 

-- Find the counter value after the rolled back transaction.
-- Yes. The counter value still increases.
SELECT @@DBTS AsBinary, 
       CAST(@@DBTS AS BIGINT) AsBigint 
/* Result
AsBinary               AsBigint 
0x00000000000007D6     2006 
*/

Is it possible to have duplicate TIMESTAMP values across tables of a database?
Yes. Although values in TIMESTAMP/ROWVERSION columns cannot be explicitly inserted or modified, they can be copied from one table to another using the SELECT..INTO syntax, which in turn can result in duplicate counter values across tables.

SELECT *
INTO c -- New table
FROM a

SELECT *, 
       CAST(timestamp AS BIGINT) 
FROM a -- Source table 

SELECT *, 
       CAST(timestamp AS BIGINT) 
FROM c -- New table 

-- The counter value does not change.
SELECT @@DBTS AsBinary, 
       CAST(@@DBTS AS BIGINT) AsBigint 
/*
AsBinary               AsBigint 
0x00000000000007D6     2006 
*/

Does the counter keep incrementing for insertions into the new table created with SELECT…INTO? Yes, because the new table inherits the column definitions of the source table, including the TIMESTAMP/ROWVERSION column.

INSERT INTO c DEFAULT VALUES
SELECT *, 
       CAST(timestamp AS BIGINT) 
FROM c -- New table 

-- Find the value of the counter after a row 
-- is inserted into the new table.
-- The counter value does change.
SELECT @@DBTS AsBinary, 
       CAST(@@DBTS AS BIGINT) AsBigint 
/*
AsBinary               AsBigint 
0x00000000000007D7     2007 
*/

A final review of the counter values in all three tables –

SELECT *, CAST(timestamp AS BIGINT) 
FROM a 

SELECT *, CAST(col2 AS BIGINT) 
FROM b 

SELECT *, CAST(timestamp AS BIGINT) 
FROM c 

Using a Meaningful Application Name in the OLEDB Connection String

The default connection string for an OLEDB connection looks something like this –

Data Source=(local);Initial Catalog=ReportServer;Provider=SQLNCLI10.1;Integrated Security=SSPI;Auto Translate=False;Application Name=SSIS-DemoOLEDBPackage-{9758C970-4AAC-449E-8390-FA7EAC5CD52A}(local).ReportServer;

Notice the Application Name parameter. It contains the name of the package, a unique Id for the connection, the SQL Server name and the database name. This application name gets exported out to the config file too. It is also what you will see in the Program Name column of SP_WHO2 results when the package is running.

When developers create packages on a development server, the connection string includes the development server name. At the time of migrating the package and its config file to the QA and Production environments, developers usually change the Data Source and Initial Catalog in the connection string but leave the Application Name as it is. Anyone monitoring the Production server with SP_WHO2 would then see a package running with a funny long name including a GUID and the development server name. It is certainly not user-friendly.
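
As an aside, instead of eyeballing the SP_WHO2 output you can query the sessions DMV directly. A small sketch, assuming your packages keep the default “SSIS-” prefix in their Application Name –

-- Find sessions opened by SSIS packages, going by the
-- Application Name set in the connection string.
SELECT session_id,
       program_name,
       login_name,
       status
FROM sys.dm_exec_sessions
WHERE program_name LIKE 'SSIS-%'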

This can be fixed easily with any one of the following methods.

Method 1 – Properties Window

Right click the OLEDB connection and select Properties (or select the connection and press F4). The Properties window will show up. Edit the Connection String property to change the Application Name parameter of the connection in this window.

Method 2 – Config File

Edit the Connection String property to change the Application Name parameter of the connection in the config file. As we know, the values in the config file override the design-time values when the package runs. That said, I prefer making the initial change in the package itself before creating the config file.

Method 3 – The Connection Manager Wizard

  1. Double click the OLEDB Connection.
  2. Click on the All Tab.
  3. Edit the Application Name.

(Screenshot: the Application Name parameter on the connection manager’s All tab)

Now the connection string would look like the following, which is a lot cleaner than the default string –

Data Source=(local);Initial Catalog=ReportServer;Provider=SQLNCLI10.1;Integrated Security=SSPI;Auto Translate=False;Application Name=SSIS-DemoPackage;

I prefer using an underscore convention based on the target environment, e.g. SSIS-DemoPackage_TEST, SSIS-DemoPackage_QA or SSIS-DemoPackage_PROD in the different config files. You can put in other information to make it more meaningful in your context.

I’d love to hear your thoughts on this!

SSIS Variable Naming Convention Based on Origin of Value

There are many schools of thought about naming conventions, not just about SSIS variables, but about naming anything that can be given a name! One point that everyone agrees on is – use a convention that suits your environment, and stick to it.

Related Blog Post – SSIS component naming convention

One issue that I face while looking at packages with many variables is that there is no straightforward way to get a summary or a quick glance of all variables that have expressions defined for them. The BIDS Helper add-in for Visual Studio assists in this regard by adding a triangular icon overlay over the variables with an expression. SQL Server 2012 is also said to have a similar feature but I haven’t tried that yet. On the other hand, if I need to see a summary of variables that have values coming from a config file, I have to use the Configuration Editor to edit the config file and reach its last summary screen.

I’ve started using a prefix-based convention to name my SSIS variables recently based on the origin of the variable value. I distinguish between the variables as having –

  1. a value set from a config file at execution time.
  2. a value calculated in an expression at execution time. The expression could be based on other user variables, system variables, functions or literals.
  3. a static value that will not change from its design time value.
  4. a value assigned in a Script Task, a For Each Loop Container, Row Count etc.

If I know at design time that the value of the variable will be passed from a configuration file, I give it a prefix of “config”, e.g. configEmailDistributionList. If the value would be set via an expression then the variable gets a prefix “expr”, e.g. exprDirectoryForArchive. Other variables do not have any prefix, simply a name that is descriptive enough.

Advantages:

  • At the time of creating a config file, the prefix makes it clear that all variables with a “config” prefix must be included in the configuration. So the config file now has only such variables along with other task or connection manager properties.
  • On opening up a package, I can easily figure out where I should be looking for the value of a variable: the config file or the Expression Editor.
  • Alphabetical ordering in the Variables window groups similar variables together.
  • Helps in development because I don’t have to constantly check a variable’s properties to see whether it has an expression or not.
  • Helps in standardizing the names and expressions for some of the most frequently used variables.
  • Helps in debugging.

Disadvantages:

The argument against this convention is the same as for any prefix-based convention (e.g. Hungarian Notation): if the origin of the variable value changes then the prefix would also need to change, which in turn may require a change at all the places where the variable is used. As an example, if a variable was getting its value from a config file but would now have a static value, the “config” prefix would have to go away and all the expressions, properties etc. using that variable would have to be updated. Right now I’m counting on doing a Find/Replace in the XML code of the package.

I’ll try to see if using variable Namespaces would be a better idea than prefixes and share the experience in a blog post later.

References –

(not specifically for Hungarian Notation, but in the context of any prefix-based convention)

Hungarian notation (Wikipedia)
http://en.wikipedia.org/wiki/Hungarian_notation

Hungarian Notation (MSDN)
http://msdn.microsoft.com/en-us/library/aa260976(v=VS.60).aspx

Hungarian Notation – The Good, The Bad, and The Ugly
http://ootips.org/hungarian-notation.html

TSQL Gotcha – Quotes are optional for a Stored Procedure VARCHAR parameter without spaces

If you want to assign a default value to a parameter of type VARCHAR in a stored procedure and the value does not contain any spaces, then the quotes around the value are optional.

create procedure TestProc
(
@a varchar(20) = 'spaces in string', --must use quotes for a string with spaces
@b varchar(20) = 'nospacesinstring',
@c varchar(20) = nospacesinstring --quotes are optional if no spaces in string
)
as
select @a, @b, @c;
go -- without a batch separator here, the exec would become part of the procedure body
exec TestProc;
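
For reference, the call should return all three default values, whether they were quoted or not –

/* Result
spaces in string     nospacesinstring     nospacesinstring
*/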

TSQL Gotcha – ISNUMERIC Function Results

The ISNUMERIC function (MSDN documentation) is used to validate whether a string is a valid number. In the demonstration that follows, the results for Ids 1 to 5 are what you would expect, but look closely at the results for 6, 7 and 8. Those strings contain multiple commas, and the last one is not even a number!

Well, a note in the MSDN documentation explains the True result for the ‘$’ sign at Id 8 –

ISNUMERIC returns 1 for some characters that are not numbers, such as plus (+), minus (-), and valid currency symbols such as the dollar sign ($). For a complete list of currency symbols, see Using Monetary Data.

For the strings with commas at Id 6 and 7, the function returns True because they can still be implicitly converted to the MONEY datatype (commas are ignored when converting to MONEY), even though a conversion to INT or DECIMAL would fail. Something to remember while writing those data validation routines.

DECLARE @t TABLE
( id int, col1 VARCHAR(10) )

INSERT INTO @t (id, col1)
VALUES (1, '-1001'),
(2, '0'),
(3, '1001.50'),
(4, '1,500'),
(5, 'abc'),
(6, '15,0,1'),
(7, '15,0,1.75'),
(8, '$')

SELECT id,
col1,
Isnumeric(col1) AS [IsNumeric]
FROM @t

(Screenshot: ISNUMERIC results for the sample strings)
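
On SQL Server 2012 and later, TRY_CONVERT is a stricter alternative for such validation routines; it returns NULL instead of a false positive when the string cannot be converted to the specific target type. A quick sketch, run in the same batch as the table variable above –

SELECT id,
       col1,
       Isnumeric(col1) AS [IsNumeric],
       TRY_CONVERT(DECIMAL(18, 2), col1) AS [TryDecimal]
FROM @t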

TSQL Gotcha – Miss a comma

Some minor omissions can go unnoticed and produce completely unexpected results.

The following code example has two seemingly similar SELECT statements, but they produce different results, all because I missed a comma in the second SELECT statement! SQL Server did not raise any error and quietly assumed that middleinitial was meant to be an alias for the firstname column!

CREATE TABLE employee
(
employee_id INT,
firstname VARCHAR(50),
middleinitial VARCHAR(1),
lastname VARCHAR(50)
)
INSERT INTO employee
(employee_id, firstname, middleinitial, lastname) VALUES
(1001, 'Timothy', 'B', 'Parker'),
(1002, 'Henry', 'F', 'Williams'),
(1003, 'Jeff', NULL, 'Smith'),
(1004, 'Mary', 'M', 'Marsh'),
(1005, 'Miles', 'V', NULL)

--Correct statement
SELECT employee_id, firstname, middleinitial, lastname
FROM employee

-- Wrong statement with a comma missing
-- between the firstname and middleinitial columns.
-- SQL Server thinks it is an alias for firstname.
SELECT employee_id, firstname middleinitial, lastname
FROM employee

(Screenshot: results of the correct and the comma-missing SELECT statements)

TSQL Gotcha – Empty string in INTEGER type column

It is important to sanitize data.

If you try to put an empty string into an INTEGER type column (or one of its cousins), SQL Server implicitly converts it to zero. This might not be the expected behavior for the application, so it is good to be aware of it.

declare @v char(1) = '' -- this could have been a varchar too with the same results

declare @t table
(
colbigint  bigint,
colint int,
colsmallint smallint,
coltinyint tinyint
)

/* The wrong way that will change all blank strings to zero */
insert into @t
(
colbigint,
colint,
colsmallint,
coltinyint
)
values
(
@v,
@v,
@v,
@v
)

/* The correct way to sanitize */
insert into @t
(
colbigint,
colint,
colsmallint,
coltinyint
)
values
(
nullif(ltrim(@v), ''),
nullif(ltrim(@v), ''),
nullif(ltrim(@v), ''),
nullif(ltrim(@v), '')
)

select * from @t
go
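
For reference, the final SELECT should return the unsanitized row as zeros and the sanitized row as NULLs –

/* Result
colbigint   colint   colsmallint   coltinyint
0           0        0             0
NULL        NULL     NULL          NULL
*/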

(Screenshot: blank string results in INTEGER type columns)

Parent-Child Table Dependency Script

Requirement:

You need to delete and then insert data into tables which have parent-child relationships defined via primary and foreign keys. The delete and insert commands have to be executed in proper sequence.

The sequence for DELETE is Child-before-Parent.

The sequence for INSERT is Parent-before-Child.
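
The core of the task is reading the parent-child pairs out of the foreign key metadata. A minimal sketch of that starting point (not the actual script, which also works out the execution order) –

-- List child (referencing) and parent (referenced) tables
-- from the foreign key catalog view.
SELECT OBJECT_SCHEMA_NAME(fk.parent_object_id)     AS ChildSchema,
       OBJECT_NAME(fk.parent_object_id)            AS ChildTable,
       OBJECT_SCHEMA_NAME(fk.referenced_object_id) AS ParentSchema,
       OBJECT_NAME(fk.referenced_object_id)        AS ParentTable
FROM sys.foreign_keys AS fk
ORDER BY ParentSchema, ParentTable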

I created a quick script in response to a forum post. The script is not the best out there because –

  • It does not handle circular references in relationships
  • It takes 30 seconds or more if run on a database with hundreds of tables

I’m sharing it for my future reference, and probably others with a smaller number of tables will find it useful too.

Download script at –
http://gallery.technet.microsoft.com/Parent-Child-Table-45fa5268

Further reference:
For a faster script (but without all of these output columns), look at Erland Sommarskog’s response in the forum thread.

Script output would look like the example below –

(Screenshot: sample output of the table dependency report)

Calculate TSQL Stored Procedure Complexity

I had a recent project where we planned to re-factor an existing application’s .Net and TSQL code. I wanted to find the complexity of the stored procedures and arrive at an estimate of the effort required to review all of them. I had in mind that something like the number of lines of code, the number of dependencies and the number of parameters in a stored procedure would give a good starting point for such an estimate.

I patched together a script that produced a report similar to the example below. I used that output and put in some more formulas to assign a complexity level (simple, medium or complex) and the approximate number of minutes required to review that procedure. I have not included those calculations here because they depend entirely on the estimator’s perception of complexity and the developer’s skill level.

The number of lines of code is a subjective metric here because it depends on the developer’s coding style and use of whitespace. Even then, it could be a useful reference point.
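
To give an idea of how such metrics can be collected, here is a minimal sketch (not the full script) that approximates the lines of code and counts the parameters of each stored procedure –

-- Approximate lines of code (by counting line feeds in the module
-- definition) and the number of parameters per stored procedure.
SELECT OBJECT_SCHEMA_NAME(p.object_id) AS SchemaName,
       p.name                          AS ProcedureName,
       LEN(m.definition)
         - LEN(REPLACE(m.definition, CHAR(10), '')) + 1 AS ApproxLinesOfCode,
       (SELECT COUNT(*)
        FROM sys.parameters pr
        WHERE pr.object_id = p.object_id) AS ParameterCount
FROM sys.procedures AS p
JOIN sys.sql_modules AS m
  ON m.object_id = p.object_id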

(Screenshot: sample complexity report output)

If this seems useful, the script to create that report is available for download at Technet.

Download and rate the script at: http://gallery.technet.microsoft.com/Calculate-TSQL-Stored-831b683a

CHANGE HISTORY:
2012/Dec/01 – Version 1
2013/Feb/15 – Version 1.1 Added the CASE statement for complexity.
2014/Apr/29 – Version 1.2 Added Database and Schema name to the report.
Included objects that may not be referring to other objects.
Included views, functions and triggers.

Please feel free to share your feedback in the comments.

SSIS Package Development Effort Estimation

I blogged about a DTS to SSIS conversion effort estimation project that I had a chance to work on previously. I had created an Excel workbook to help with that estimation.

That workbook was not just limited to the DTS to SSIS conversion estimates. It can be used for regular SSIS development project estimations too. I’ve updated the workbook a bit to reflect its generic nature.

The Excel workbook can help in estimating the effort required to create new SSIS packages. The new SSIS package could either be a conversion of an existing DTS package or a fresh requirement. The estimation methodology is similar in both cases.

Benefits of Using this Estimation Methodology

  • Standardize the estimates by using the same rules instead of guessing randomly, which could return widely varying figures depending on whom you talk to.
  • Standardizing also means that the estimation process for small, medium or large projects is the same. The importance of having a standardized process is directly proportional to the project size, to reduce the variance.
  • The estimates can be fine-tuned over time with real-world feedback so that the estimates get better with each new project.
  • Flexible to include the developer skill factor.

Two Steps to Fill the Estimation Workbook

  1. Put in the estimated number of minutes required for –
    a) creating each type of component in SSIS. This time could vary depending on the developer’s experience and skill. Sample values are included in the workbook.
    b) standard time additions for activities like analysis, error handling, logging and documentation. Sample values are included in the workbook.
  2. Update the workbook with the details like the type and number of all tasks, connections, transformations etc. that would be used in the SSIS package.
    a) If the basis is a DTS package then open the DTS and count the tasks, connections etc.
    b) If the basis is a fresh requirement then put the details according to those requirements. This will also help you to plan the package development.

A brief Help section is also included in the workbook.

Download

Download at: http://gallery.technet.microsoft.com/DTS-to-SSIS-Conversion-7153a957