SQL Server Business Intelligence

SQL SERVER 2005 & 2008 IS THE MAIN THEME OF THIS BLOG

Friday, December 30, 2005

Creating Random Data From Given Dimensions

I had to create millions of rows dummy fact table for testing purpose. I have got valid dimensions, but not fact. I had a requirement to create facts using random products (roughly 160 products across whole day). The fact table stored data to the minute. Each shop in average have 2000 transactions / day.

Each day shop in average sells 160 products. This is populated into a Tmp table as shown below:

SELECT TOP 160 productid
INTO TmpProduct
FROM dim_product
ORDER BY NEWID()

I have written couple of SQL statements to achieve this. They produced right result but unfortunately they are slow for my requirement. Therefore we have to do in C#.

Option 1
---------
SELECT TOP 2000 RANK() OVER (order by newid()) as RankRec,
dateid,
timeid,
shopid,
managerid,
TmpProduct.productid,
1 as Qty,
(timeid*rand()) as Amount
FROM dim_date
CROSS JOIN dim_shop
CROSS JOIN dim_time
CROSS JOIN dim_manager
CROSS JOIN dim_product
INNER JOIN TmpProduct
ON dim_product.productid = TmpProduct.productid
WHERE shopid = 39
AND dateid = 1
AND dim_shop.shopno = dim_manager.shopno


Option 2
---------

SELECT TOP 2000
dateid,
timeid,
shopid,
managerid,
TmpProduct.productid,
1 as Qty,
(timeid*rand()) as Amount
FROM dim_date
CROSS JOIN dim_shop
CROSS JOIN dim_time
CROSS JOIN dim_manager
CROSS JOIN dim_product
INNER JOIN TmpProduct
ON dim_product.productid = TmpProduct.productid
WHERE shopid = 39
AND dateid = 1
AND dim_shop.shopno = dim_manager.shopno
order by newid()

I am posting this for me as it will come handy during future projects.

Thanks
Sutha

0 Comments:

Post a Comment

<< Home