TSQL Gotcha – ISNUMERIC Function Results

The ISNUMERIC function (MSDN documentation) is used to validate if a string is a valid number. In the demonstration that follows, you would expect the results from 1 to 5 but look closely at the results 6, 7 and 8. There are multiple commas in the string and the last one is not even a number!

Well, a note at the MSDN documentation explains the True result for the ‘$’ sign at Id 8 –

ISNUMERIC returns 1 for some characters that are not numbers, such as plus (+), minus (-), and valid currency symbols such as the dollar sign ($). For a complete list of currency symbols, see Using Monetary Data.

For the strings at Id 6 and 7 with  commas, the function behaves that if the string is convertible to a number then the result will be True. Something to remember while writing those data validation routines.

DECLARE @t TABLE
( id int, col1 VARCHAR(10) )

INSERT INTO @t (id, col1)
VALUES (1, '-1001'),
(2, '0'),
(3, '1001.50'),
(4, '1,500'),
(5, 'abc'),
(6, '15,0,1'),
(7, '15,0,1.75'),
(8, '$')

SELECT id,
col1,
Isnumeric(col1) AS [IsNumeric]
FROM @t

SQLGotchas-IsNumeric

TSQL Gotcha – Miss a comma

Some minor omissions can go unnoticed and produce completely unexpected results.

The following code example has two seemingly similar SELECT statements but they produce different results. All because I missed out a comma in the second SELECT statement! SQL Server did not raise any error and quietly assumed that I have tried to use an alias name for the First Name column!

CREATE TABLE employee
(
employee_id INT,
firstname VARCHAR(50),
middleinitial VARCHAR(1),
lastname VARCHAR(50)
)
INSERT INTO employee
(employee_id, firstname, middleinitial, lastname) VALUES
(1001, 'Timothy', 'B', 'Parker'),
(1002, 'Henry', 'F', 'Williams'),
(1003, 'Jeff', NULL, 'Smith'),
(1004, 'Mary', 'M', 'Marsh'),
(1005, 'Miles', 'V', NULL)

--Correct statement
SELECT employee_id, firstname, middleinitial, lastname
FROM employee

-- Wrong statement with a comma missing
-- between the firstname and middleinitial columns.
-- SQL Server thinks it is an alias for firstname.
SELECT employee_id, firstname middleinitial, lastname
FROM employee

sqlgotchas-missingcomma

TSQL Gotcha – Empty string in INTEGER type column

It is important to sanitize data.

If you try to put an empty string in an INTEGER type (or its cousin) columns, SQL Server converts it implicitly into a zero. This might not be the expected behavior for the application so it is good to be aware.

declare @v char(1) = '' -- this could have been a varchar too with the same results

declare @t table
(
colbigint  bigint,
colint int,
colsmallint smallint,
coltinyint tinyint
)

/* The wrong way that will change all blank strings to zero */
insert into @t
(
colbigint,
colint,
colsmallint,
coltinyint
)
values
(
@v,
@v,
@v,
@v
)

/* The correct way to sanitize */
insert into @t
(
colbigint,
colint,
colsmallint,
coltinyint
)
values
(
nullif(ltrim(@v), ''),
nullif(ltrim(@v), ''),
nullif(ltrim(@v), ''),
nullif(ltrim(@v), '')
)

select * from @t
go

SQLGotchas-BlankStringInInteger

Parent-Child Table Dependency Script

Requirement:

You need to delete and then insert data into tables which have parent-child relationships defined via primary and foreign keys. The delete and insert commands have to be executed in proper sequence.

The sequence for DELETE is Child-before-Parent.

The sequence for INSERT is Parent-before-Child.

I created a quick script in response to a forum post. The script is not the best out there because –

  • It does not handle circular references in relationships
  • It takes 30 seconds or more if run on a database with hundreds of tables

I’m sharing it for my future reference and probably others with smaller number of tables would find it useful too.

Download script at –
http://gallery.technet.microsoft.com/Parent-Child-Table-45fa5268

Further reference:
For a faster (but not with all these output columns) script, look at Erland Sommarskog’s response in the forum thread.

Script output would look like the example below –

TableDependencyReport

Calculate TSQL Stored Procedure Complexity

I had a recent project where we planned to re-factor an existing application’s .Net and TSQL code. I wanted to find the complexity of the stored procedures and arrive at an estimate of the effort required to review all them. I had in mind that something like number of lines of code, number of dependencies and parameters in the stored procedure would give a good starting  point to create such an estimate.

I patched together a script that produced a report similar to the example below. I used that output and put in some more formulas to assign a complexity level (simple, medium or complex) and the approximate number of minutes required to review that procedure. I have not included those calculations here because they depend entirely on the estimator’s perception of complexity and the developer’s skill level.

The number of lines of code is a subjective metric here because it depends on the developer’s coding style and use of whitespace. Even then, it could be a useful reference point.

Complexity Report Screenshot

In case this seems useful then the script to create that report is available for download at Technet.

Download and rate the script at: http://gallery.technet.microsoft.com/Calculate-TSQL-Stored-831b683a

CHANGE HISTORY:
2012/Dec/01 – Version 1
2013/Feb/15 – Version 1.1 Added the CASE statement for complexity.
2014/Apr/29 – Version 1.2 Added Database and Schema name to the report.
Included objects that may not be refering to other objects.
Included views, functions and triggers.

Please feel free to share you feedback in the comments.

How to monitor IDENTITY column values in SQL Server

One (lesser known) task of a DBA should be to keep an eye on the values of the IDENTITY columns so that they do not hit the ceiling for their data type and catch you by surprise. Not a pleasant one, of course. This is all the more important if you have tables that grow quickly.

Vyas Kondreddi has a nice script to look for current IDENTITY values in all tables of a database and compare that value against its datatype. It could be useful to automate it as an alert or run it manually once in a while. He has versions of the script for SQL Server 7.0, 2000, 2005 and later but all of them run against one database at a time which might make it a little inconvenient to run against all your 200 databases.

I’ve taken the script compatible with SQL Server 2005 and later, and enhanced it a bit to run it against all databases in one go using the undocumented stored procedure SP_MSFOREACHDB. There goes your excuse to procrastinate this!

Change the value against the TOP keyword to suit your preference. I’ve used TOP 5.

Download my script from the TechNet Gallery link –
http://gallery.technet.microsoft.com/Monitor-IDENTITY-column-fd9c6552

The output would look like the screenshot below.

MonitorIdentityColumns

How to Store (and Retrieve) Non-English Characters (e.g. Hindi, Czech, Arabic etc.) in SQL Server

Store multiple language strings in SQL Server
How to store text in multiple languages in SQL Server.

If you have to store and retrieve characters of any other language besides English in SQL Server, you must do the following –

  1. Use a Unicode compatible data type for the table column. NVACHAR, NCHAR, NTEXT are the datatypes in SQL Server that can be used for storing non-English characters.
  2. Precede the Unicode data values with an N (capital letter) to let the SQL Server know that the following data is from Unicode character set. Without the N prefix, the string is converted to the default code page of the database. This default code page may not recognize certain characters.
  3. The N should be used even in the WHERE clause.

REFERENCE: Microsoft Support KB 239530
You must precede all Unicode strings with a prefix N when you deal with Unicode string constants in SQL Server
http://support.microsoft.com/kb/239530

If the correct data-type is not used or the data is not preceded with an N, SQL Server will save the data to the table as ‘?’ or another garbled character.

The following scripts demonstrate saving and retrieving multi-lingual data to and from SQL Server. I have used Google Translate to get the characters of other languages. I left out far-east languages like Japanese and Chinese from the following example on purpose because those languages have a few other considerations that I’ll save for another blog post.

DROP TABLE dbo.unicodeData;
GO

CREATE TABLE dbo.unicodeData
( languageUsed VARCHAR(50)
, unicodeData NVARCHAR(200)
, nonUnicodeData VARCHAR(200) -- same data in a normal VARCHAR column for comparison
, comments VARCHAR(100)
);
GO

INSERT INTO dbo.unicodeData
( languageUsed
, unicodeData
, nonUnicodeData
, comments)
VALUES
('English'
, N'This is an example'
, N'This is an example'
, NULL)

,('Hindi'
, N'यह एक उदाहरण है.'
, N'यह एक उदाहरण है.'
, 'Using the preceding N in both strings but VARCHAR is still a ?')

,('Hindi'
, 'यह एक उदाहरण है.'
, 'यह एक उदाहरण है.'
, 'Not using the preceding N in both strings so both are a ?')

,('Kannada'
, N'ಈ ಒಂದು ಉದಾಹರಣೆಯಾಗಿದೆ.'
, N'ಈ ಒಂದು ಉದಾಹರಣೆಯಾಗಿದೆ.'
, NULL)

,('Arabic'
, N'هذا مثال على ذلك.'
, N'هذا مثال على ذلك.'
, NULL)

,('Czech'
, N'To je příklad.'
, N'To je příklad.'
, NULL);
GO

SELECT *
FROM dbo.unicodeData;
GO

-- Example of using N' in the WHERE clause
SELECT *
FROM dbo.unicodeData
WHERE unicodeData like N'%एक%';

Unicode Results
Unicode Results

Further Reading:

Comments in SQL and a case for multi-line vs single-line comments

Developers use comments within the code primarily for two purposes –

  • Documentation (for posterity)
  • Debugging (making quick comparisons of the impact of a change without discarding the original code)

It depends on the team’s culture and their coding standards as to how much they choose to comment for documentation. Ideally, there should be a change history included as a header at the top of the code and additional comments to highlight why a particular logic was used. Another developer (or even the original) can look at the code and understand what is it doing, but sometimes it is not apparent why was that particular implementation chosen over others. So it is a good idea to mention the rationale behind using a particular logic. Other interesting paradigms can be checked on the Wikipedia article on Comments in Programming.

TSQL supports two types of comments –

  • Single-line comments: These start with a double hyphen ‘- -‘ and the rest of the code on that line till the new-line (or carriage return) character is considered to be a comment.
  • Multi-line comments: These start with a ‘/*’ and end with a ‘*/’ either on the same line of the code or on another line and everything in between these two is considered to be a comment.

The implementation of comments has not changed much between various editions of SQL Server but there have been few considerations. The MSDN article for comments syntax in SQL Server 2000 mentions that comments cannot span multiple batches. If the batch delimiter keyword GO is within a multi-line comment then the ending */ will not be parsed correctly and cause syntax errors. The SQL Server 2005 or later versions could have issues when the ‘*/’ string is put in a multi-line comment. Try the code below in SQL Server 2005 or later versions to see an example –

DECLARE @s VARCHAR(10)
/*
SET @s = '*/'
*/
SELECT @s

Each comment type has its uses but I personally prefer and recommend using the multi-line syntax, even for short comments that could fit on a single line. My reason for that is that I frequently get requests to review queries and give suggestions on tuning them. Sometimes that query has been captured from DBCC INPUTBUFFER(spid) or the dm_exec_sql_text(sql_handle) DM function. Both of these do not preserve any formatting of the code. As a result I get the entire stored procedures’ code in one single long line!

I usually try to format such code using an excellent online code formatter by the name of SQLInForm. When the formatter comes across a double hyphen, it considers the rest of the line as a comment and leaves it as it is. At that point I have to make a choice either to painfully go through the remaining code and try to guess what is a comment and what is not. At times I send it back to the developer to provided a better fomatted code. Both these approaches increase the turnaround time for the whole tuning process.

If the developer had used the multi-line comment syntax in the first place, then the code formatter could have handled it and life would have been much easier.