SQL Server Business Intelligence

SQL SERVER 2005 & 2008 IS THE MAIN THEME OF THIS BLOG

Tuesday, July 31, 2007

How do we extract latest data from source system

Until SQL Server 2008 is released and implemented we all have to work with various work around to retrieve only latest data from source systems.

In the previous projects I was fortunate in a way that source system was either Oracle or flat files with only changed / new data. Oracle and other major DB vendors had Change Data Capture for a long time.

SQL Server 2008 is going to provide the same feature and functionality. You can find out more about SQL Server 2008 CDC feature here.

Each source table has either CreateDate and ModifyDate column. If these columns are there we extract all the records anything greater than we have extracted last. I store last extract datetime stamp for each table in a control table, which is used within SSIS package to extract only the latest record.

I use a SP similar to the one shown below to update the control table with the last extract datetime stamp.


CREATE PROCEDURE [dbo].[uspUpdateExtractionDate] (@LoadId INT)
AS
DECLARE @VarName VARCHAR(50)
DECLARE @ObjectName VARCHAR(50)
DECLARE @VarValue VARCHAR(50)
DECLARE @Tmp TABLE (VarName VARCHAR(50),ObjectName VARCHAR(50), VarValue VARCHAR(50))
BEGIN TRY
INSERT
INTO @Tmp
SELECT 'strLastExtractDate', 'pkgStgOpportunity',CONVERT(VARCHAR(23),MAX(CreateDate),121)
FROM scbOpportunity
WHERE LoadId = @LoadId
BEGIN TRAN
DECLARE Tmp_Cursor CURSOR FOR
SELECT *
FROM @Tmp
OPEN Tmp_Cursor
FETCH NEXT FROM Tmp_Cursor INTO @VarName, @ObjectName, @VarValue
WHILE @@FETCH_STATUS = 0
BEGIN
UPDATE BISystem.dbo.ConfigVar
SET VarValue = @VarValue
WHERE VarName = @VarName
AND ObjectName = @ObjectName
FETCH NEXT FROM Tmp_Cursor INTO @VarName, @ObjectName, @VarValue
END
CLOSE Tmp_Cursor
DEALLOCATE Tmp_Cursor
COMMIT TRAN
END TRY
BEGIN CATCH
ROLLBACK TRAN
EXEC BISystem.dbo.uspLogErrorInfo
END CATCH

Thanks
Sutha

Friday, July 27, 2007

Execute Package Task Might Fail Intermittently

I have used Execute Package Task before in various projects without any problems.

On my current project I tried to do the same I came across funny problem. Execute Package Task started to fail Intermittently. The only difference in this project to others I have implemented was that we are using ADO.Net Connection Manager. Thats where it seems to have a problem. It doesn't seem to acquire connection for the ADO.Net connection manager time to time.

Therefore I have decided to use script task to execute the package. When I use Script Task it works fine.

I need to investigate this further and maybe even log a call with MS.

Thanks
Sutha

Wednesday, July 25, 2007

Loading Data Into Table via SSIS when IGNORE_DUP_KEY is ON

Probably most of you aware of this issue and most definitely it is not a best practice, but if you wish to use in any unforseen reason please be aware.

For example we have a table with 3 columns. If you set a Unique Non-Clustered Index with all 3 columns and set IGNORE_DUP_KEY = ON, then you have to be aware of the following.

If you are using OLE DB Destination with OLE DB Provider for SQL Server it will fail the package even though it will insert correct number of rows by ignoring duplicate rows. If you want to populate this type of table without failing the package, while using OLE DB Destination, you would need to user Native OLEDB\SQL Native Client.

The Provider to use in this scenario is - SQLNCLI.1
The Provider NOT use in this scenario is - SQLOLEDB

Thanks
Sutha

Monday, July 23, 2007

Loading Date Dimension

I am sure all of you seen this script before. More than anything I am putting this here for my reference in the future.

Last couple of client sites I have worked, their fiscal year is aligned with calendar year and therefore it made life much easier. The SP below can populate the dates as you specify.


USE [Stage]
GO
CREATE PROCEDURE [dbo].[uspPopulateDate] (@LoadId INT)
AS
DECLARE @id INT
DECLARE @date DATETIME
DECLARE @Startdate DATETIME
DECLARE @Enddate DATETIME
SET @Startdate = '1950-01-01'
SET @EndDate = '2100-12-31'
SET @id = 0
SET @date = DATEADD(dd, @id, @startdate)
BEGIN TRY
BEGIN TRAN -- Start of a transaction
WHILE @date <= @enddate
BEGIN
INSERT
INTO scbDate
SELECT @date CalendarDate,
DATEPART(dd, @date) CalendarDayMonth,
DATEPART(dy, @date) CalendarDayYear,
DATEPART(dw, @date) CalendarDayWeek,
DATENAME(dw, @date) CalendarDayName,
LEFT(DATENAME(dw, @date),3) CalendarAbbDayName,
CASE
WHEN DATEPART(dw, @date) IN (1,7) THEN
'Y'
ELSE
'N'
END CalendarWeekday,
DATEPART(ww, @date) CalendarWeek,
'Week ' + RIGHT('0' + DATENAME(ww, @date), 2) CalendarWeekName,
DATEPART(mm, @date) CalendarMonth,
DATENAME(mm, @date) CalendarMonthName,
LEFT(DATENAME(mm, @date),3) CalendarAbbMonthName,
DATEPART(qq, @date) CalendarQuarter,
'Q' + DATENAME(qq, @date) + ' ' + DATENAME(yy, @date) CalendarQuarterName,
DATEPART(yy, @date) CalendarYear,
DATEPART(dw, @date) FiscalDayWeek,
DATEPART(yy, @date) FiscalYear,
DATEPART(mm, @date) FiscalPeriod,
DATEPART(ww, @date) FiscalWeek,
DATEPART(qq, @date) FiscalQuarter,
0 CurrentDay,
@LoadId LoadId,
'STA' DataSource
SET @id = @id + 1
SET @date = DATEADD(dd, @id, @startdate)
END
COMMIT TRAN
END TRY
BEGIN CATCH
ROLLBACK TRAN -- Rollback Transaction
END CATCH

Thanks
Sutha

Building Dynamic Excel Connection String

If you wish to build the Excel connection string dynamically inside SSIS and assign it to Excel Connection Manager you can use the following expression.

"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + @[User::strWorkingFolder] +"\\"+ @[User::strExcelFileName] + ";Extended Properties=\"" + "Excel 8.0;HDR=YES" + "\""+";"

Where strWorkingFolder and strExcelFileName are user variables. For example, strWorkingFolder = C:\Temp and strExcelFileName = Sales.Xls

Please note the above syntax is for Excel 97-2003

Thanks
Sutha