SQL Server Business Intelligence

SQL SERVER 2005 & 2008 IS THE MAIN THEME OF THIS BLOG

Thursday, January 26, 2006

Steps Involved in Loading Text File Into SQL Server Using SSIS

How many steps involved in loading a simple text file straight into a SQL Server table using SSIS?

6 steps involved to complete this process and they are listed below:
1. Read data from the text file – Disk speed is significant.
2. Load data from the text file into the transport (SSIS) – Network speed is significant.
3. Open a transaction on SQL Server – Server contention is significant.
4. Pass data to the SQL Server interface
5. Load the data into SQL Server – Memory resource is significant.
6. Commit the transaction – Disk performance is important.

When loading from text to typed SQL Server columns (int, datetime etc) the data conversion must take place somewhere either step 2 or 5 in this above example. Is it necessary to use SSIS for this? Are the operations simple enough for BCP / BULK Import? With low volume of data the overhead of starting up SSIS package process may offset any performance gain over BCP or Bulk Insert. Where should SSIS run (Server1 or Server2)? SSIS On Server1 – Competes for memory and processor resources with other applications. Will data conversion on server1 reduce or increase the volume of data transferred across the network? But can not use the fast SSIS SQL server Destination. Where should SSIS run (Server1 or Server2)? SSIS On Server2 – Competes for memory and processor resources with SQL Server. Will pulling text over conversion be expensive (also consider transferring the file unparsed to server2 and read it locally from there). You can use the fast SSIS SQL Server Destination. If you were going to run in two servers you would need SQL Server licensee in both servers.

Thanks
Sutha

Sunday, January 15, 2006

SSIS Service

SSIS Windows Service – Monitors and manages running as well as stored packages, via SQL management studio.

It is installed when you install SSIS. Service is not required to design or execute packages. Windows events for service (Service start, service fail to start, package started, package stopped…).

In Management Studio you can see Stored Packages. When you create a folder(s) they are stored in MSDB database. In Management Studio you can see Running Packages.

As default it only shows the name(s) of the package(s) which are running at present. If you click on a “Report” button on the top menu, it will display information such as Package name, Executed by, Execution Instance Id, Execution Started, Execution duration, Package ID and Description. In Running Packages you can select the package and right click and “Stop” the package.

When you connect to SSIS server within Management Studio, you can select the server and Right click on it and select “View Logs”. This is limited Windows events.

Thanks
Sutha

How does SSIS work?

SSIS is a replacement product to SQL 2000 DTS. New archutecture and design. I have tried to give a summary explaining how SSIS works :

  • All data is loaded in tabular form into a memory buffer.
  • Data sources can be diverse, including custom or scripted adapters.
  • Transformation components shape and modify data in many ways.
  • Data is routed by rules or error conditions for cleansing and conforming.
  • Flows can be as complex as your business rules, but highly concurrent.
  • Finally data can be loaded in parallel to many varied destinations. As far as possible no copying of data (even in memory)

All in all more memory you have the better it is for performance of SSIS.

Thanks
Sutha

Wednesday, January 11, 2006

Cannot Set Connection String For Bulk Insert Source Connection

I want to go through a directory in my local machine and get files and assign the Fully Qualified File Name to a User Variable using Foreach Component from SSIS.

I want to use that variable as a connection string within Expression in Bulk Insert Task Source Connection, Data Flow Task and File System Task. It is working fine within Data Flow and File System task. It is NOT working for Bulk Insert.

- Expressions ConnectionString @[User::FileName]

This is the expression I want to use within ConnectionString in Bulk Insert, File System Task and Data Flow Task. Same expression works fine in all except Bulk Insert. You can see the same issue being raised in Forum.

http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=188886&SiteID=1

From the forum I found out that I need to use Script within Foreach to set the connection string and then call Bulk Insert. You can set the connection strings as shown below (thanks to Mark)

------------------------------------------------------
Dim connmgr As ConnectionManager = Dts.Connections("[your connection name]")connmgr.ConnectionString = Dts.Variables("User::filename").Value.ToString()connmgr.AcquireConnection(Nothing)
------------------------------------------------------

I hope this helps
Thanks
Sutha

Friday, January 06, 2006

SSIS Package Template

The project I am working currently needed a template all ETL developers can use as a standard package. This will be the beginning point of each package. I have written a simple package to act as a template.

I wrote the package and called it as “Pkg_Template.dtsx”. Template package(s) need to locate in certain location for it to work. You would need to put this SSIS package in the following location on your machine

%ProgramFiles%\Microsoft Visual Studio 8\Common7\IDE\PrivateAssemblies\ProjectItems\DataTransformationProject\DataTransformationItems

Once you have placed the template package in the above location you can open the template package by Right click on a project and select “Add” and “New Item”. This will bring a screen like the one in the link below :
http://www.smart-bi-solutions.com/BlogStuff/06012005_Pic1.jpg


As you can see from the above screen shot the Pkg_Template does appear within BIDS template section. You can give new name at this point or it will give a default name of “Pkg_Template1.dtsx”.

The screen shot in the link below explains what template package does:
http://www.smart-bi-solutions.com/BlogStuff/06012005_Pic2.jpg

The above template does the following:

  1. Retrieves LoadId from SQL Server using Stored Procedure.
  2. Checks for LoadId Validity.
  3. Configures SSIS auditing using Execute SQL Task (Expressions),
  4. Uses “Environments.dtsconfig” to set the SQL Server connections.
  5. Package is configured to log to Windows Event Viewer if the package errors out.
  6. Event Handler to do auditing, for example, it logs what time task started and completed etc.
  7. I have put some comments for the new developers in the future to see what the package does etc.

I hope this is useful.

Thanks
Sutha

IDENTITY_INSERT

If you have a Dimension table for example with Identity column. You want to insert a default record with the Surrogate Key of -1 with description "Unknown".

Table with the following structure:
CREATE TABLE Dim_Manager
(
ManagerId INT IDENTITY (1,1) NOT NULL,
Name VARCHAR(50) NOT NULL
)

If you create the table with the above statement and would like to insert default dimension record. You can do that by doing the following:

SET IDENTITY_INSERT dbo.Dim_Manager ON
INSERT INTO Dim_Manager (ManagerId, Name)
VALUES('-1','Unknown')
SET IDENTITY_INSERT dbo.Dim_Manager OFF

You need to set the IDENTITY_INSERT to ON before inserting into Identity colum. Once you have inserted the record you need to set IDENTITY_INSERT to OFF as shown above.

Thanks
Sutha