Creating Random Data From Given Dimensions
I had to create millions of rows dummy fact table for testing purpose. I have got valid dimensions, but not fact. I had a requirement to create facts using random products (roughly 160 products across whole day). The fact table stored data to the minute. Each shop in average have 2000 transactions / day.
Each day shop in average sells 160 products. This is populated into a Tmp table as shown below:
SELECT TOP 160 productid
INTO TmpProduct
FROM dim_product
ORDER BY NEWID()
I have written couple of SQL statements to achieve this. They produced right result but unfortunately they are slow for my requirement. Therefore we have to do in C#.
Option 1
---------
SELECT TOP 2000 RANK() OVER (order by newid()) as RankRec,
dateid,
timeid,
shopid,
managerid,
TmpProduct.productid,
1 as Qty,
(timeid*rand()) as Amount
FROM dim_date
CROSS JOIN dim_shop
CROSS JOIN dim_time
CROSS JOIN dim_manager
CROSS JOIN dim_product
INNER JOIN TmpProduct
ON dim_product.productid = TmpProduct.productid
WHERE shopid = 39
AND dateid = 1
AND dim_shop.shopno = dim_manager.shopno
Option 2
---------
SELECT TOP 2000
dateid,
timeid,
shopid,
managerid,
TmpProduct.productid,
1 as Qty,
(timeid*rand()) as Amount
FROM dim_date
CROSS JOIN dim_shop
CROSS JOIN dim_time
CROSS JOIN dim_manager
CROSS JOIN dim_product
INNER JOIN TmpProduct
ON dim_product.productid = TmpProduct.productid
WHERE shopid = 39
AND dateid = 1
AND dim_shop.shopno = dim_manager.shopno
order by newid()
I am posting this for me as it will come handy during future projects.
Thanks
Sutha
Each day shop in average sells 160 products. This is populated into a Tmp table as shown below:
SELECT TOP 160 productid
INTO TmpProduct
FROM dim_product
ORDER BY NEWID()
I have written couple of SQL statements to achieve this. They produced right result but unfortunately they are slow for my requirement. Therefore we have to do in C#.
Option 1
---------
SELECT TOP 2000 RANK() OVER (order by newid()) as RankRec,
dateid,
timeid,
shopid,
managerid,
TmpProduct.productid,
1 as Qty,
(timeid*rand()) as Amount
FROM dim_date
CROSS JOIN dim_shop
CROSS JOIN dim_time
CROSS JOIN dim_manager
CROSS JOIN dim_product
INNER JOIN TmpProduct
ON dim_product.productid = TmpProduct.productid
WHERE shopid = 39
AND dateid = 1
AND dim_shop.shopno = dim_manager.shopno
Option 2
---------
SELECT TOP 2000
dateid,
timeid,
shopid,
managerid,
TmpProduct.productid,
1 as Qty,
(timeid*rand()) as Amount
FROM dim_date
CROSS JOIN dim_shop
CROSS JOIN dim_time
CROSS JOIN dim_manager
CROSS JOIN dim_product
INNER JOIN TmpProduct
ON dim_product.productid = TmpProduct.productid
WHERE shopid = 39
AND dateid = 1
AND dim_shop.shopno = dim_manager.shopno
order by newid()
I am posting this for me as it will come handy during future projects.
Thanks
Sutha
0 Comments:
Post a Comment
<< Home