Aalam Rangi, Author at SQLErudition.com

SSIS Components Naming Conventions

May 28, 2014July 19, 2016Aalam Rangi3 Comments

Read a New Version of This Blog Post – SSIS Naming Convention – Updated for SSIS 2016

I wrote earlier about my naming conventions for SSIS variables that let me quickly figure out the source of the variable value very easily. This post continues on to the prefix-based naming conventions I use (or plan to use) for the 120+ components in SSIS packages.

While looking around online for SSIS component naming conventions, most of the suggestions are based on Jamie Thompson’s (Blog | @JamieT) post SSIS: Suggested Best Practices and naming conventions. He has tried to limit the prefixes to 3 or 4 characters and included most of the frequently used components. Using the prefixes as described in his post has helped me a lot in the maintenance phase of a package when my memory has faded about the functionality and the workflow. Some more motivation to adopt a naming convention can be found in 31-days of SSIS series by Jason Strate (Blog | @StrateSQL).

My goal was to make an exhaustive list of components with their prefixes so that –

I do not need to put on my thinking hat every time I start using a component for the first time.
The naming convention can be future-proof and avoid confusion with new components being introduced in SSIS platform.
Remove the guesswork (especially for new developers and maintenance teams who may not be SSIS developers primarily) by avoiding prefixes that sound too similar to another.

I created an Excel sheet of all the components (more than 120) that I found listed in MSDN, i.e. containers, connection managers, sources, destinations, tasks, and transformations. My prefixes are more elaborate than Jamie’s for the reasons mentioned above, but that is just my preference. You can use the Excel as a template for you organization and modify the prefixes to you liking.

The following is an example of one section from the Excel.

Download: Get the full Excel workbook at the TechNet Galleries

Example –

Spring Time!

Aside May 13, 2014June 15, 2016Aalam RangiLeave a comment

Keeping up with the spirit of spring season, I’ve refreshed the look of the blog by using the “Flounder” theme. I was using the “Enterprise” theme earlier.

What I like about Flounder is :

the colour scheme
the overall font choice
a well-defined sidebar
and most of all, its mobile device compatibility. It readjusts from 3-column layout to a 2-column or a 1-column layout as the screen size becomes smaller. That really allows very easy browsing on a tablet or a cell phone, for folks who don’t use the WordPress app, by completely avoiding horizontal scrolling.

I have also changed the blog’s tag line from “Learnings as a SQL Server DBA and Developer” to “Erudition as a SQL DBA and Developer”.

er·u·di·tion [er-yoo-dish-uhn, er-oo-]

noun
knowledge acquired by study, research, etc.; learning; scholarship.

And finally, I’ve added my Twitter handle @TsqlREX to the publicize section of the blog so that a new post is announced on Twitter too. I’m not a big Twitter user right now but I plan to learn more about it in this year.

[jetpack_subscription_form subscribe_button=”Sign Me Up”]

TSQL Gotcha – Putting a Condition in WHERE or ON clause: Does it Matter in TSQL?

May 12, 2014Aalam RangiLeave a comment

A filter condition can be syntactically put in the WHERE clause of the query or the ON clause too. But does it matter?

No, it does not matter for an INNER join. I found this one example of an exception in Martin Smith’s comment on Stack Overflow where using the GROUP BY ALL makes a difference in results. GROUP BY ALL is now a deprecated syntax (Feature ID 169) so probably we can ignore this exception. But it is always good to know.

On the other hand, Yes, it could affect the output in any of the OUTER joins.

This happens because when a filter criteria is made a part of the JOIN condition, it is applied at the time of reading the tables to identify the rows for the JOIN. The rows that do not satisfy the filter criteria are not included for the purpose of the JOIN and that causes NULLs in the OUTER JOINs.

However, with the filter criteria a part of the WHERE clause, it is applied after the query has been processed and the adhoc dataset has been created internally. The final dataset is reduced to just those rows which meet the filter criteria.

I will use a criteria of c.isactiveYN = ‘Y’ for demonstration below. Rows with a value of ‘Y’ are used for the join. The rows with ‘N’ or NULLs don’t match the criteria so they are left out.

As the examples show below, this seemingly minor difference can cause substantial difference in the query output of OUTER JOINs. This effect could cascade on to other subsequent tables in bigger queries that are joined to the original tables.

-- Create demo tables and insert some data
declare @customer table 
(id int, 
cname varchar(20), 
isactiveYN char(1))

declare @order table 
(id int, 
customerid int, 
orderdate date)

insert into @customer 
values
(1, 'Bob',   'Y'),
(2, 'Harry', 'N'),
(3, 'Max',   'Y')

insert into @order
values
(1, 1, '2014-1-1'),
(2, 1, '2014-2-2'),
(3, 2, '2014-3-3'),
(4, 2, '2014-4-4'),
(5, 3, '2014-5-5'),
(6, 3, '2014-6-6')

select * from @customer
select * from @order

Results :

Demonstration: The location of criteria has no effect on the results in INNER JOIN.

-- A simple INNER JOIN without any 
-- conditions to display all data:

-- INNER JOIN with out any conditions
select * 
from 
  @customer as c
inner join 
  @order as o
  on c.id = o.customerid

-- The following two queries differ
-- in the location of the criteria
-- but still show similar results

-- INNER JOIN with filter in WHERE clause
select * 
from 
  @customer as c
inner join 
  @order as o
  on c.id = o.customerid
where
  c.isactiveYN = 'Y'

-- INNER JOIN with filter in ON clause
select * 
from
  @customer as c
inner join 
  @order as o
  on c.id = o.customerid
  and c.isactiveYN = 'Y'

Results : The second and third results are similar.

Demonstration: LEFT OUTER joins with the condition in the WHERE clause and the ON clause

-- LEFT JOIN: Filter in WHERE clause
select * 
from 
  @customer as c
left outer join 
  @order as o
  on c.id = o.customerid
where 
  c.isactiveYN = 'Y'

-- LEFT JOIN: Filter in ON clause
select * 
from 
  @customer as c
left outer join
  @order as o
  on c.id = o.customerid
  and c.isactiveYN = 'Y'

Results : “Harry” is not in the first result-set because the row was filtered out by the WHERE clause. “Harry” is in the second result-set because it comes from the LEFT OUTER table but there is no matching row because it was not used for the JOIN.

Demonstration: RIGHT OUTER joins with the condition in the WHERE clause and the ON clause

-- RIGHT JOIN: Filter in WHERE clause
select * 
from 
  @customer as c
right outer join 
  @order as o
  on c.id = o.customerid
where 
  c.isactiveYN = 'Y'

-- RIGHT JOIN: Filter in ON clause
select * 
from 
  @customer as c
right outer join 
  @order as o
  on c.id = o.customerid
  and c.isactiveYN = 'Y'

Results : “Harry” is not in the first result-set because the row was filtered out by the WHERE clause. “Harry” is not in the second result-set because it has been filtered out and it is also not in the OUTER table.

Summary: When the criteria is in the WHERE clause, The results of the INNER, LEFT OUTER and the RIGHT OUTER queries are the same. Differences show up in the results of LEFT and RIGHT OUTER queries when the criteria is in the ON clause.

Folks, Give Your Eyes a Break

April 24, 2014Aalam RangiLeave a comment

I work long hours looking at computer monitor(s). On certain days, the eye-strain used to get too heavy on my eyes and I couldn’t even look at people straight in the eye!

I needed some kind of reminder to pause, look away from the monitors, and maybe take a water break.

There are many programs that show cute pop-up reminders. But they are on the same screen that I’m trying to look away from in the first place! Also most of them are easy to ignore. I didn’t want a system lock screen because logging in frequently becomes an irritant after some time.

I’ve started using the good old Windows Task Scheduler along with the NirCmd utility by NirSoft. It is a simple EXE file and does not need any installation. While it can do a lot of useful stuff but I was interested in just turning the monitor on/off. Especially good if you work on laptops that don’t have dedicated on/off switch for monitors.

NirCmd.exe concise documentation and download page (scroll to the bottom to see both 32-bit and 64-bit versions).
NirCmd.exe extensive documentation page.

I extracted the 64-bit zip in my Downloads folder and created a simple batch file with the following commands –

CD "C:UsersAalam.RangiDownloadsnircmd-x64"
nircmd.exe speak text "Turning monitor Off in 5 seconds"
timeout 5
nircmd.exe monitor off
timeout 5
nircmd.exe speak text "Turning monitor on in 30 seconds"
timeout 30
nircmd.exe monitor on

Yes, it speaks too! Then I opened the Windows Task Scheduler and scheduled the batch file as a 30 minute job.

There you go! It turns off my monitor every 30 minutes for 30 seconds. I get up from my chair for a mini stretch, take a gulp of water, look at the farthest end of the room or maybe out of the window, while the monitor turns back on.

Now there could be some folks whose workplace enterprise policy won’t let them download an EXE file from Internet. You could use a screensaver instead. E.g. the following line in the batch file will trigger the Mystify screensaver. A reminder enough for the break. Also, if it is just going to be a single command then maybe you don’t even need a batch file. You can put this command in the scheduled task by itself.

@START /WAIT /MIN %windir%system32mystify.scr -s

The downside is that this job keeps running even after I’ve left my office. Might startle the janitor 🙂

TIMESTAMP and ROWVERSION datatypes and the Behavior of the SQL Server @@DBTS Counter

April 9, 2014Aalam RangiLeave a comment

The TIMESTAMP datatype in SQL Server is in no way related to any Date/Time datatype or any operation related to dates or time. It is a simple monotonously incrementing counter that increases by one every time there is an INSERT or UPDATE operation in the database. It is stored as BINARY but can be converted to BIGINT or INT type to make the values seem more user-friendly. The IDENTITY property of a table is quite similar when we think about a monotonously incrementing value. While the IDENTITY counter is tracked at a table level, the TIMESTAMP counter is tracked at a database level. The current value of the TIMESTAMP counter can be returned by using the Configration Function @@DBTS. We can assume a TIMESTAMP to be a GUID in the smaller context of a database. So theoretically no two tables in a database can have the same TIMESTAMP value in any of their rows. But practically it is possible to go around this via SELECT..INTO syntax to create a new table, as I’ll demostrate later in this post.

As calling it a TIMESTAMP has caused confusion among SQL developers, Microsoft introduced ROWVERSION as a new name for it. Calling it ROWVERSION also emphasizes its utility for implementing optimistic locking in the code. TIMESTAMP has been deprecated and will be discontinued in a future SQL Server release so use ROWVERSION in all new development work.

It is optional to give the column a name in the DDL with the TIMESTAMP datatype. If a column name is not specified then SQL Server uses the default name of “TIMESTAMP” for the column. Helpful, eh?

On the other hand, if using the ROWVERSION as datatype, a column name must be given just like creating a column any other datatype.

A table can have only one column of TIMESTAMP (or ROWVERSION) type.

You cannot enter/update a value explicitly into a column of TIMESTAMP or ROWVERSION datatype. The database manages it internally. If you do try then an error is returned –

Msg 273, Level 16, State 1, Line 1

Cannot insert an explicit value into a timestamp column. Use INSERT with a column list to exclude the timestamp column, or insert a DEFAULT into the timestamp column.

The DBTS value starts from 2000 in a brand new database without any table containing a TIMESTAMP/ROWVERSION column.

CREATE DATABASE [TestDB] 
GO

-- Find the default value of 
-- the TIMESTAMP/ROWVERSION counter 
SELECT @@DBTS AsBinary, 
       CAST(@@DBTS AS BIGINT) AsBigint 

/* Result
AsBinary                AsBigint 
0x00000000000007D0      2000 
*/

Does the value of the counter change after new table is created with a TIMESTAMP/ROWVERSION?

No. Just creating a table does not change the counter.

CREATE TABLE a (
       id INT IDENTITY, 
       TIMESTAMP)
GO
SELECT @@DBTS AsBinary, 
       CAST(@@DBTS AS BIGINT) AsBigint 
/* Result
AsBinary               AsBigint 
0x00000000000007D0     2000 
*/

Insert a row in a table and check counter value. It increases by one.

INSERT INTO a DEFAULT VALUES
GO
SELECT *, 
       CAST(timestamp AS BIGINT) 
FROM a 

SELECT @@DBTS AsBinary, 
       CAST(@@DBTS  AS BIGINT) AsBigint 
/* Result
AsBinary               AsBigint 
0x00000000000007D1     2001 
*/

Create a new table, this time with ROWVERSION and insert a row. The counter still increases by one.

CREATE TABLE b (
       id INT IDENTITY, 
       col2  ROWVERSION) 
GO
INSERT INTO b DEFAULT VALUES
GO
SELECT *, 
       CAST(col2  AS BIGINT) 
FROM b 

-- Find the value of the counter after 
-- a row is inserted 
SELECT @@DBTS AsBinary, 
       CAST(@@DBTS  AS BIGINT) AsBigint 
/* Result
AsBinary               AsBigint 
0x00000000000007D2     2002 
*/

Do the INSERTS in the same transaction have the same counter value? Let us try.

BEGIN TRAN 
INSERT INTO a DEFAULT VALUES
INSERT INTO b DEFAULT VALUES
COMMIT

SELECT *, 
       CAST(timestamp AS BIGINT) 
FROM a 

SELECT *, 
       CAST(col2 AS BIGINT) 
FROM b 

-- Find the value of the counter after rows 
-- are inserted in multiple tables in a transaction. 
-- Each row still gets a unique counter value.
SELECT @@DBTS AsBinary, 
       CAST(@@DBTS  AS BIGINT) AsBigint 
/* Result
AsBinary               AsBigint 
0x00000000000007D4     2004 
*/

Does the counter increase in case of a failed transaction?

BEGIN TRAN 
INSERT INTO a DEFAULT VALUES
INSERT INTO b DEFAULT VALUES
ROLLBACK

SELECT *, 
       CAST(timestamp AS BIGINT) 
FROM a 

SELECT *, 
       CAST(col2 AS BIGINT) 
FROM b 

-- Find the counter value after a failed transaction
-- Yes. The counter value still increases.
SELECT @@DBTS AsBinary, 
CAST(@@DBTS AS BIGINT) AsBigint 
/* Result
AsBinary               AsBigint 
0x00000000000007D6     2006 
*/

Is is possible to have duplicate TIMESTAMP values across tables of a database?
Yes. Although values in TIMESTAMP/ROWVERSION columns cannot be explicitly inserted or modified they can be copied from one table to another using the SELECT..INTO syntax, which in turn could result in duplicate counter values across tables

SELECT *
INTO c -- New table
FROM a

SELECT *, 
       CAST(timestamp AS BIGINT) 
FROM a -- Source table 

SELECT *, 
       CAST(timestamp AS BIGINT) 
FROM c -- New table 

-- The counter value does not change.
SELECT @@DBTS AsBinary, 
       CAST(@@DBTS AS BIGINT) AsBigint 
/*
AsBinary               AsBigint 
0x00000000000007D6     2006 
*/

Does the counter continue with insertions into new table that was created with SELECT…INTO? Yes, the counter changes because the new table gets the column definition of the source table including the ROWVERSION column.

INSERT INTO c DEFAULT VALUES
SELECT *, 
       CAST(timestamp AS BIGINT) 
FROM c -- New table 

-- Find the value of the counter after a row 
-- is inserted into the new table.
-- The counter value does change.
SELECT @@DBTS AsBinary, 
       CAST(@@DBTS AS BIGINT) AsBigint 
/*
AsBinary               AsBigint 
0x00000000000007D7     2007 
*/

Final review of the counter values in tables

SELECT *, CAST(timestamp AS BIGINT) 
FROM a 

SELECT *, CAST(col2 AS BIGINT) 
FROM b 

SELECT *, CAST(timestamp AS BIGINT) 
FROM c

Awesome collection of free Microsoft eBooks

October 11, 2013Aalam RangiLeave a comment

Eric Ligman at Microsoft has posted a series of blogs posts with an awesome collection of free ebooks (in multiple formats) related to Microsoft technologies like SQL Server, Azure, SharePoint etc.

I’m posting the links to the original posts with full list of books here.

http://blogs.msdn.com/b/mssmallbiz/archive/2012/07/27/large-collection-of-free-microsoft-ebooks-for-you-including-sharepoint-visual-studio-windows-phone-windows-8-office-365-office-2010-sql-server-2012-azure-and-more.aspx

http://blogs.msdn.com/b/mssmallbiz/archive/2012/07/30/another-large-collection-of-free-microsoft-ebooks-and-resource-kits-for-you-including-sharepoint-2013-office-2013-office-365-duet-2-0-azure-cloud-windows-phone-lync-dynamics-crm-and-more.aspx

http://blogs.msdn.com/b/mssmallbiz/archive/2013/06/18/huge-collection-of-free-microsoft-ebooks-for-you-including-office-office-365-sharepoint-sql-server-system-center-visual-studio-web-development-windows-windows-azure-and-windows-server.aspx

A TechNet wiki page with links to more books. There could be overlaps with the previous links.

http://social.technet.microsoft.com/wiki/contents/articles/11608.e-book-gallery-for-microsoft-technologies.aspx

Another post with links of free Microsoft Office products training –

http://blogs.msdn.com/b/mssmallbiz/archive/2013/09/17/free-microsoft-office-training-resources-and-more-how-to-get-them.aspx

Using a Meaningful Application Name in the OLEDB Connection String

April 14, 2013Aalam Rangi4 Comments

The default connection string for a OLEDB connection looks something like this –

Data Source=(local);Initial Catalog=ReportServer;Provider=SQLNCLI10.1;Integrated Security=SSPI;Auto Translate=False;Application Name=SSIS-DemoOLEDBPackage-{9758C970-4AAC-449E-8390-FA7EAC5CD52A}(local).ReportServer;

Notice the Application Name parameter in bold font. It has the name of the package, a unique Id for the connection, the SQL server name and the database name. This application name will be exported out to the config file too. Also, this is what you will see in the Program Name column of SP_WHO2 results when the package is running.

When the developers create the packages on a development server, the connection string includes the development server name. At the time of migrating the package and its config file to the QA and Production environment, the developers usually change the Data Source and Initial Catalog in the connection string but leave the Application Name as it is. Anyone monitoring the Production server with a SP_WHO2 would see a package running with a funny long name including GUID and the development server name. It is certainly not user-friendly.

This can be fixed easily with any one of the following methods.

Method 1 – Properties Window

Right click the OLEDB connection and select Properties (or select the connection and press F4). The Properties window will show up. Edit the Connection String property to change the Application Name parameter of the connection in this window.

Method 2 – Config File

Edit the Connection String property to change the Application Name parameter of the connection in the config file. As we know, the values in the config file override the design time values when the package is running. Although I prefer making the initial change in the package itself before creating the config file.

Method 3 – The Connection Manager Wizard

Double click the OLEDB Connection.
Click on the All Tab.
Edit the Application Name.

Now the connection string would look like the following, which is a lot cleaner than the default string –

Data Source=(local);Initial Catalog=ReportServer;Provider=SQLNCLI10.1;Integrated Security=SSPI;Auto Translate=False;Application Name=SSIS-DemoPackage;

I prefer using a _ convention based on the target environment e.g. SSIS-DemoPackage_TEST, SSIS-DemoPackage_QA or SSIS-DemoPackage_PROD in the different config files. You can plan to put in other information to make it more meaningful in your context.

I’d love to hear your thoughts on this!

SSIS Variable Naming Convention Based on Origin of Value

April 6, 2013June 24, 2016Aalam Rangi2 Comments

There are many schools of thought about naming conventions, not just about SSIS variables, but about naming anything that can be given a name! One point that everyone agrees on is – use a convention that suits your environment, and stick to it.

Related Blog Post – SSIS component naming convention

One issue that I face while looking at packages with many variables is that there is no straightforward way to get a summary or a quick glance of all variables that have expressions defined for them. The BIDS Helper add-in for Visual Studio assists in this regard by adding a triangular icon overlay over the variables with an expression. SQL Server 2012 is also said to have a similar feature but I haven’t tried that yet. On the other hand, if I need to see a summary of variables that have values coming from a config file, I have to use the Configuration Editor to edit the config file and reach its last summary screen.

I’ve started using a prefix-based convention to name my SSIS variables recently based on the origin of the variable value. I distinguish between the variables as having –

a value set from a config file at execution time.
a value calculated in an expression at execution time. The expression could be based on other user variables, system variables, functions or literals.
a static value that will not change from its design time value.
a value assigned in a Script Task, a For Each Loop Container, Row Count etc.

If I know at design time that the value of the variable will be passed from a configuration file, I give it a prefix of “config”, e.g. configEmailDistributionList. If the value would be set via an expression then the variable gets a prefix “expr”, e.g. exprDirectoryForArchive. Other variables do not have any prefix, simply a name that is descriptive enough.

Advantages:

At the time of creating a config file, the prefix makes it clear that all variables with a “config” prefix must be included in the configuration. So the config file now has only such variables along with other task or connection manager properties.
On opening up a package, I can easily figure out where should I be looking for the value of the variable, the config file or the Expression Editor.
Alphabetical ordering in the Variables window groups similar variables together.
Helps in development because I don’t have to constantly check back variable properties if it has an expression or not.
Helps in standardizing the names and expressions for some of the most frequently used variables.
Helps in debugging.

Disadvantages:

The argument against this convention would be the same as given for any prefix-based convention (e.g. Hungarian Notation) that if the origin of the variable value changes then the prefix would also need to change. Which in turn, may need a change at all those places where the variable is used. As an example, if a variable was getting its value from a config file but would now have a static value, then the “config” prefix would have to go away and all the expressions, properties etc. using that variable would have to be updated. Right now I’m counting on doing a Find/Replace in the XML code of the package.

I’ll try to see if using variable Namespaces would be a better idea than prefixes and share the experience in a blog post later.

References –

(not specifically for Hungarian Notation, but in context of any prefix-based convention)

Hungarian notation (Wikipedia)
http://en.wikipedia.org/wiki/Hungarian_notation

Hungarian Notation (MSDN)
http://msdn.microsoft.com/en-us/library/aa260976(v=VS.60).aspx

Hungarian Notation – The Good, The Bad, and The Ugly
http://ootips.org/hungarian-notation.html

TSQL Gotcha – Quotes are optional for a Stored Procedure VARCHAR parameter without spaces

March 27, 2013Aalam RangiLeave a comment

If you want to assign some default value to a parameter of type VARCHAR in a stored procedure and the parameter value does not have any spaces then the quotes around the value are optional.

create procedure TestProc
(
@a varchar(20) = 'spaces in string', --must use quotes for string with spaces
@b varchar(20) = 'nospacesinstring',
@c varchar(20) = nospacesinstring --quotes are optional if no spaces in string
)
as
select @a, @b, @c;
;
exec TestProc;

TSQL Gotcha – ISNUMERIC Function Results

March 24, 2013Aalam RangiLeave a comment

The ISNUMERIC function (MSDN documentation) is used to validate if a string is a valid number. In the demonstration that follows, you would expect the results from 1 to 5 but look closely at the results 6, 7 and 8. There are multiple commas in the string and the last one is not even a number!

Well, a note at the MSDN documentation explains the True result for the ‘$’ sign at Id 8 –

ISNUMERIC returns 1 for some characters that are not numbers, such as plus (+), minus (-), and valid currency symbols such as the dollar sign ($). For a complete list of currency symbols, see Using Monetary Data.

For the strings at Id 6 and 7 with commas, the function behaves that if the string is convertible to a number then the result will be True. Something to remember while writing those data validation routines.

DECLARE @t TABLE
( id int, col1 VARCHAR(10) )

INSERT INTO @t (id, col1)
VALUES (1, '-1001'),
(2, '0'),
(3, '1001.50'),
(4, '1,500'),
(5, 'abc'),
(6, '15,0,1'),
(7, '15,0,1.75'),
(8, '$')

SELECT id,
col1,
Isnumeric(col1) AS [IsNumeric]
FROM @t

SQLErudition.com

Learning SQL Server

Author: Aalam Rangi