SQL Server Business Intelligence

SQL SERVER 2005 & 2008 IS THE MAIN THEME OF THIS BLOG

Wednesday, August 29, 2007

Moved to http://www.beeii.com/

I have moved my blog to beeii.

Thanks
Sutha

Monday, August 20, 2007

Installing Proclarity Analytics Server

Please note the Sever Authentication should be set to "SQL Server and Windows Authentication Mode" before installing PAS.

Default is set to "Windows Authentication Mode". You can change this by selecting Properties on the server. This setting is under "Security" tab.

Just a reminder for me more than anything else.

Thanks
Sutha

Thursday, August 16, 2007

Performance Point CTP4 Available for Download

Performance Point CTP4 is available for download from MS Connect site.

Thanks
Sutha

Monday, August 06, 2007

How did I build Employee dimension using historical data?

I never ever had to this in the past. On my current project I had to build the employee dimension based on opportunity table in CRM system, not from an employee reference table. Opportunity table holds the following information OpportunityID, AccountManagerID, ActualCloseDate, ClosedDivision and ClosedRegion along with many other fields.

UserInfo table is the table which holds Employee details, but that table only holds current date. Going forward we will be OK as will do the type 2 on Region and Division.

The requirement was from the client is to map the Employee to an opportunity, which is fine, except the employee need to be mapped to the Division and Region which they closed the deal at. That’s why the ClosedDivision and ClosedRegion from an opportunity table comes handy.

I brought the data from source system into Stage database using query similar to the one below
SELECT o.accountmanagerid,
u.username,
o1.closed_region,
o1.closed_division,
MIN(o.actualclose) actualclose
FROM sysdba.opportunity o
INNER JOIN sysdba.opportunity_extra_1 o1
ON o.opportunityid = o1.opportunityid
INNER JOIN sysdba.userinfo u
ON o.accountmanagerid = u.userid
WHERE status LIKE 'Closed - %'
AND actualclose IS NOT NULL
AND closed_division IS NOT NULL
AND closed_region IS NOT NULL
GROUP BY o.accountmanagerid,
u.username,
o1.closed_region,
o1.closed_division

The above query will bring all the AccountManagerID (NK), Name, ClosedDivision and Region along with MINIMUM actual close date of an opportunity.

Source table example

Opp Emp ClosedDate ClosedRegion ClosedDivision
1234 ABC 01/04/1990 London South
2345 ABC 09/04/1990 Mids North
4567 ABC 06/08/2003 Mids North
3456 ABC 05/07/2007 Scotland Scotland

The Stage table will look like this

Emp ClosedDate ClosedRegion ClosedDivision
ABC 01/04/1990 London South
ABC 09/04/1990 Mids North
ABC 05/07/2007 Scotland Scotland

Now I have got the stage table populated, now how can I build the warehouse dimension table.

This package only to run one time, as it is going to build the employee dimension using historical data. This is how I built the dimension.

1. I read the stage table and assigned it to an object variable using the following query
SELECT DISTINCT AccountManagerId, ActualClose
FROM scbEmployeeHistory
ORDER BY 1,2
2. Please note that I am only selecting AccountManagerID (NK) and ActualClose (date).
3. Using Foreach Loop Container I loop through the Foreach ADO Enumerator and mapping the results to 2 variables.
4. Then I will build the dimension usual way, but the key is to pass the AccountManagerID and ActualCloseDate as parameter to the source querry.

If anyone would like to see an example feel free to drop me a mail and I will send it to you.

I would be interested to hear from others if you have done it differently.

Thanks
Sutha

Tuesday, July 31, 2007

How do we extract latest data from source system

Until SQL Server 2008 is released and implemented we all have to work with various work around to retrieve only latest data from source systems.

In the previous projects I was fortunate in a way that source system was either Oracle or flat files with only changed / new data. Oracle and other major DB vendors had Change Data Capture for a long time.

SQL Server 2008 is going to provide the same feature and functionality. You can find out more about SQL Server 2008 CDC feature here.

Each source table has either CreateDate and ModifyDate column. If these columns are there we extract all the records anything greater than we have extracted last. I store last extract datetime stamp for each table in a control table, which is used within SSIS package to extract only the latest record.

I use a SP similar to the one shown below to update the control table with the last extract datetime stamp.


CREATE PROCEDURE [dbo].[uspUpdateExtractionDate] (@LoadId INT)
AS
DECLARE @VarName VARCHAR(50)
DECLARE @ObjectName VARCHAR(50)
DECLARE @VarValue VARCHAR(50)
DECLARE @Tmp TABLE (VarName VARCHAR(50),ObjectName VARCHAR(50), VarValue VARCHAR(50))
BEGIN TRY
INSERT
INTO @Tmp
SELECT 'strLastExtractDate', 'pkgStgOpportunity',CONVERT(VARCHAR(23),MAX(CreateDate),121)
FROM scbOpportunity
WHERE LoadId = @LoadId
BEGIN TRAN
DECLARE Tmp_Cursor CURSOR FOR
SELECT *
FROM @Tmp
OPEN Tmp_Cursor
FETCH NEXT FROM Tmp_Cursor INTO @VarName, @ObjectName, @VarValue
WHILE @@FETCH_STATUS = 0
BEGIN
UPDATE BISystem.dbo.ConfigVar
SET VarValue = @VarValue
WHERE VarName = @VarName
AND ObjectName = @ObjectName
FETCH NEXT FROM Tmp_Cursor INTO @VarName, @ObjectName, @VarValue
END
CLOSE Tmp_Cursor
DEALLOCATE Tmp_Cursor
COMMIT TRAN
END TRY
BEGIN CATCH
ROLLBACK TRAN
EXEC BISystem.dbo.uspLogErrorInfo
END CATCH

Thanks
Sutha

Friday, July 27, 2007

Execute Package Task Might Fail Intermittently

I have used Execute Package Task before in various projects without any problems.

On my current project I tried to do the same I came across funny problem. Execute Package Task started to fail Intermittently. The only difference in this project to others I have implemented was that we are using ADO.Net Connection Manager. Thats where it seems to have a problem. It doesn't seem to acquire connection for the ADO.Net connection manager time to time.

Therefore I have decided to use script task to execute the package. When I use Script Task it works fine.

I need to investigate this further and maybe even log a call with MS.

Thanks
Sutha

Wednesday, July 25, 2007

Loading Data Into Table via SSIS when IGNORE_DUP_KEY is ON

Probably most of you aware of this issue and most definitely it is not a best practice, but if you wish to use in any unforseen reason please be aware.

For example we have a table with 3 columns. If you set a Unique Non-Clustered Index with all 3 columns and set IGNORE_DUP_KEY = ON, then you have to be aware of the following.

If you are using OLE DB Destination with OLE DB Provider for SQL Server it will fail the package even though it will insert correct number of rows by ignoring duplicate rows. If you want to populate this type of table without failing the package, while using OLE DB Destination, you would need to user Native OLEDB\SQL Native Client.

The Provider to use in this scenario is - SQLNCLI.1
The Provider NOT use in this scenario is - SQLOLEDB

Thanks
Sutha