Fun with SQL: Joining on column name and row field value

Ok, it is that time of the day where I’m starting to get high on caffeine :-)

So this is the scenario. I have an application that is automatically generating a table in my database. It can add columns on the fly(But thankfully not delete).

It could look like this:

CREATE TABLE Cube.AppGenereatedTable
(
x1value float,
x2value float,
ID_Region INT
)

We can now query information about that table like so:

select column_name,ordinal_position,DATA_TYPE , *  from information_schema.columns
where table_name = 'AppGenereatedTable'
AND TABLE_SCHEMA='cube'

The problem is, that right now, I do not know what to do with that data, and when I’m not at the customer anymore, more fields can be added by the application. What I would like, is a place where the customer can add more information about that column, and decide if that column should be included in some logic. To do that, I add a “metadata” table.

CREATE TABLE Dim.SomeMetadata
(
ID_SomeMetadata INT,
ColName nvarchar(100),
friendlyName nvarchar(100),
includeInETLLogig bit
)

Now I want to put it all in a fact table, where I have 1 row / column value. It could look like this:

CREATE TABLE Fact.SomeFactTable
(
Value float,
ID_Metadata INT
)

Ehhh, Ok, so know I will do what the title is saying. To load the data, I need to create the sql automaticly like so:

--Use the colnr to reference all collumns
DECLARE @colnr AS INT
DECLARE @sql AS NVARCHAR(MAX)
SET @colnr = 1
--Do for as many collumns you expect there ever will be. Do some automation here if you like
WHILE (@colnr < 50)
BEGIN
SET  @sql =
'INSERT INTO Fact.SomeFactTable
(Value, ID_Metadata)

--And now we do the SQL with the join. Pay attention 

SELECT '+(select column_name from information_schema.columns
where table_name = 'AppGenereatedTable'
AND TABLE_SCHEMA='cube'
AND ordinal_position=@colnr AND DATA_TYPE = 'float')+' as value
,someMetaData.ID_SomeMetadata
FROM Cube.AppGenereatedTable appGenereatedTable
INNER JOIN Dim.SomeMetadata someMetaData on someMetaData.ColName='''+(select
column_name from information_schema.columns
where table_name = 'AppGenereatedTable'
AND TABLE_SCHEMA='cube'
AND ordinal_position=@colnr AND DATA_TYPE = 'float')+''''

print @colnr
print @sql
BEGIN TRY
exec( @sql )
END TRY
BEGIN CATCH
--Catch error. Do nothing
END CATCH
SET @colnr = @colnr + 1
END

Well, I don’t know if there is other than me, who can use this for something. Next post will be about precisely about what I’m using this to :-)

SSAS Tip -> Format seconds as HHHH:MM:SS

A common way of showing durations in SSAS, is to format them as HH:MM:SS. A common scenario is to store the durations in seconds. To display them in the HH:MM:SS format, first we have to convert the duration to hours, and then put on the format, like so

CREATE MEMBER CURRENTCUBE.[MEASURES].[someMeasure] AS

[Measures].[someMesureInSeconds]/86400,FORMAT_STRING = “HH:MM:SS”,…

However, this only works if the duration is less than 24 hours.

If it is more, then there is no out the box way to show it in the correct format. To get around that problem, you have to create a mearure as a string. First calc the hours, and then add the minutes and seconds formated as above, like so:

CREATE MEMBER CURRENTCUBE.[Measures].[MesureX]
AS
CSTR(INT([Measures].[LongDurationInSec]/(60*60))) + “:”
+ FORMAT(CDATE([Measures].[LongDurationInSec]/86400
- INT([Measures].[LongDurationInSec]/86400)), “mm:ss”)
FORMAT_STRING = “[HH]:MM:SS”, …

SQL TIP ->Other uses of the SQL GO CMD

As most developers know, the SQL GO CMD signals the end of a batch of SQL statements.

However, another use of it, is to use it to execute a SQL statement multiple times. Just add a number after it. For instance:

SELECT GETDATE() as now
GO 5

Which will execute the Select statement 5 times.
Or, something maybe more useful, create a table with integers from 1-5:

CREATE TABLE #numbers(number int)
GO --> Execute the first batch
INSERT INTO #numbers
VALUES((SELECT COALESCE(MAX(number),0) FROM #numbers) +1)
GO 5-->Do 5 times
SELECT * FROM #numbers

Which will give you the result:

Batch execution completed 5 times.
number
-----------
1
2
3
4
5
(5 row(s) affected)

Well, that was it. Another use of the go cmd. Have fun :-)

SQL TIP -> Recursive CTEs

This morning I  had to do some recursive common table experessions. It is acctually relativly simple to do.

A common scenario:

We have a parent child relation in a selfreferencing table. That is common when we have an employee manager relationship or organisation relationship.

Let’s just look at  the  employee manager relationship.

Let’s create an example tmp table like so

CREATE TABLE #Employee (
EmployeeID  INT,
EmployeeName NVARCHAR(50),
ManagerID INT
)

And let’s insert some test data into the table

INSERT INTO #Employee(EmployeeID, EmployeeName,ManagerID) VALUES (1,'Mr. Manager',NULL)
INSERT INTO #Employee(EmployeeID, EmployeeName,ManagerID) VALUES (2,'Jensen',1)
INSERT INTO #Employee(EmployeeID, EmployeeName,ManagerID) VALUES (3,'Hansen',1)
INSERT INTO #Employee(EmployeeID, EmployeeName,ManagerID) VALUES (4,'Clausen',2)

Where, as you can see,  Mr. Manager is at the top, and Clausen is at the bottom.

We now want to use as recursive common table expression(CTE) to show what level a person is on, in the employee hierarchy.

To do that, we start with the highest level(Where managerID is Null). Then we do a union all with a select which is referencing it self, like so:

WITH EmployeeCTE (EmployeeID,EmployeeName, ManagerID , Level) AS
(
SELECT EmployeeID, EmployeeName,ManagerID, 0 AS Level
FROM #Employee WHERE
ManagerID IS NULL
UNION ALL
SELECT e.EmployeeID, e.EmployeeName,e.ManagerID, Level+1
FROM #Employee   e
INNER JOIN EmployeeCTE  cte ON cte.EmployeeID=e.ManagerID
)
SELECT * FROM EmployeeCTE

And the result can be seen below. This SQL shows what level an employee is on.

EmployeeID  EmployeeName ManagerID   Level
1           Mr. Manager NULL        0
2           Jensen      1           1
3           Hansen      1           1
4           Clausen     2           2

If we want to see an employees way to the top, we can select from the same CTE like so:

SELECT  lvl0.EmployeeName as lvl0Name,  lvl1.EmployeeName AS lvl1Name,      lvl2.EmployeeName AS lvl2Name FROM EmployeeCTE lvl2
INNER JOIN EmployeeCTE lvl1 ON lvl1.EmployeeID=lvl2.ManagerID
INNER JOIN EmployeeCTE lvl0 ON lvl0.EmployeeID=lvl1.ManagerID
WHERE lvl2.Level=2

And the result:

lvl0Name     lvl1Name     lvl2Name
Mr. Manager Jensen        Clausen

Now you know how to do recursive CTE :-)

Have fun

Prepopulate a table with data in SSDT

The new database project in SSDT, is truly a great tool. It is not perfect thou, and one of its shortcomings, is that there is no out of the box way, to define some static/initial data in in a project, when you deploy it.

In all the projects I do, I always want to have some static data in the database. That could be meta data about the solution, customer type, product type etc.

Of course there is a workaround to this. One is to define the data in a post deployment script. To do that, right click your project and click “Add script”. Select the post deployment script.

To maintain data in a table, each time we deploy, we use a merge statement. I’ll use my stock table, that I created in a previous post. Then the SQL will look like this:

MERGE INTO [Stock] AS Target
USING (VALUES
('VWS','Vestas Wind System'),
('DANSKE', 'Danske Bank')
)
AS Source (StockSymbol, StockName)
ON Target.StockSymbol= Source.StockSymbol
-- update matched rows
--(If we don't want the application to update the rows)
WHEN MATCHED THEN
UPDATE SET StockName = Source.StockName
-- insert new rows
WHEN NOT MATCHED BY TARGET THEN
INSERT (StockSymbol, StockName)
VALUES (StockSymbol, StockName)
-- delete rows that are in the target but not the source
--(If we don't want the application to insert data in our table)
WHEN NOT MATCHED BY SOURCE THEN
DELETE;

 

Now, just right click your project, and click publish. And badabing, now you have data in your table.

A more pretty way, could be to put the merge statements into SP, and the call the SP’s from the post deployment script. Then the application could also reset the data with a call to a SP.

The conclusion: To prepopulate your solution with data, put the data into a merge statement, and put it into the post deployment script. This is a fine solution as long as you don’t have to many or to big datasets. Hopefully MS will come up with a better solution in the next version of SSDT.

Analysis services -> Clear Cache

Just a short note about analysis services and its cache :-)

One of the reasons for analysis services fast respond time is its cahce. But what if you want to test a query without using the cache? Well, just use this XMLA Script:

<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
 <ClearCache>
 <Object>
 <DatabaseID>Your Database Goes here </DatabaseID>
 <CubeID>And or cube id goes here</CubeID>
 </Object>
 </ClearCache>
 </Batch>

Also, if you do partial processing, the cache, can in some cases also be invalid. Therefore, after a partial processing, you also want to run the above XMLA script.

Database development with SSDT and Azure

One of the new things in the development tool to SQL server 2012 (SSDT) is the database project as shown in the picture below. Before 2012 it was a part of visual studio, but now you can get it for free in the data tools.

I’m guessing that that is one of the reasons for the name change from BIDS :-)

But I’m rambling. Let me start by describing a typically way of database development.

  • You develop your database  in a development environment with management studio
  • When you want to deploy, you create a change script that will be executed on the production server.(There may be servers in between like test, preprod etc., but let’s not add those into the mix)

Well, that’s pretty much it.

The updated database might be used by a lot of other applications. Maybe the database is even the main part of a specific application.

There are a lot of problems in regards to this way of developing. Some of them are

  • There is no test of references in Management studio. If you change a table that is used by for instance a view, you won’t automatically catch it.
  • The change script might be error prone. Especially if there is no connection between develop and production servers, so you can’t compare them with a tool like Redgate or Atlantis.
  • There may be a “drift” in the production database, which mean that a DBA for instance can have added an index to the production server, because he wrongly thinks he knows more about it than you :-) or Maybe someone created a view to be used for reporting etc.
  • You don’t have source control.

That’s just some of the problem you will have with that kind of database development. The above problems and many others are addressed in the SQL server database project.

Let me demonstrate some of it with an example.

Let’s first create the project. If you look at the screenshot below you’ll see that there are 2 SQL server menus. The reason for that, is that I have visual studio 2010 installed along side with data tools. The project that is marked is the new one from data tools, and is of course a newer and better version. Select that one.

Figure 1 – Create the project

When you have created the project, you have an empty project. In my case I want to continue working on the Azure SQL project I started on in a previous blog post. So I just right click the project, click import, and type in the connection to my Azure database, and click start.

Figure 2 – Import the Azure database

And badabim. Now I have my database in my development tool, and I can add it to source control alongside my other projects. All my database objects are now shown in the solution explorer. There are about 100 different object types. Right now, I only have one table, but that is fine for my example.

If I double click the product table, I get the view as shown below.

Figure 3 – The designer

You can edit the table either in the designer at the top, or in SQL at the bottom.

You can build your project, and you will get errors, if you are using keywords that are not available in the target platform or have an invalid reference etc.

And regarding the target platform. You can change it in the properties window. As shown below.

Figure 4 – Change the target platform

If you change the target to Azure, you will get errors if you don’t have a clustered index on all of your tables, since that is a requirement in SQL Azure. Likewise, if you use file groups in Azure, you will get a compile error since that of course make no sense in Azure etc.

In my example, I would like to work with stocks. I delete the product table, and add two new, so I get what is shown below.

Figure 5 – The new tables

I can now from my development tool, do a schema compare and synchronize with the development/production server.

Figure 6 – Schema compare

In my example, I just have the Azure database, so I just synchronize that.

I can look at my azure database and see that it has been synchronized.

Figure 7 – The azure database have been updated. Nice :-)

So the conclusion: I must admit I’m really exited about the update of the database project, and the fact that it is now free to use along with the other data tools.

It addresses a lot of common problems when doing database development that you would normally have.

I can’t imagine myself starting a 2012 database project, and then not use this tool. No matter if it is a DWH, BI or application project I will be working on.

Change ”select top 1000” to ”Select top XXX”

In SQL server management studio, when you right click on a table, there is a shortcut to select top 1000 or edit top 200 rows. Se pic below.

It is an easy way, when you just want to see some of the data in a table, or if you want to edit some values in a small table. If you are using the 2 shortcuts you can change 1000 and 200 to values that make sense to you.

Click options->SQL Server Object Explorer. Then you get the options as shown in the pic below.

Change the values to your needs, and voila, now you have select top yyyy :-)

Unboxing Azure part 1: The new Portal & how to create an Azure SQL database

Not so long ago, Microsoft(MS) announced a lot of new and cool features to their Azure platform at their event “Meet windows Azure”

There is no doubt that MS is expanding fast in the cloud, and with this blog series I will try to give a short overview of the business intelligence(BI) capabilities that are currently available in Azure.

The new Portal

MS have updated there portal, so that it is now primarily is using HTML5. It looks like this.

The new portal is intuitive and easy to use. One of the new exiting things is the virtual machines, where you for instance can create a windows server, and do the stuff that are not currently supported in other services in Azure. In the BI world, that could be installing a SQL server to use the SQL server agent to run SSIS packages or to host analysis services, since these services are not supported other places in Azure right now.

The new portal is still in “Preview”. I don’t know precisely what that means, but if there is something you are missing, you can go to the old Silverlight portal by clicking on the portal link in the top right corner.

SQL Azure

But first things first. When we are talking BI we have to start with a database. Click the SQL databases item, and click create database, and follow the instructions. For information regarding edition and pricing, read here

When you create a SQL azure database, it comes with a firewall that blocks all traffic. To get access to it, you need to open for the IP addresses that need to access it. In the new portal that is hard to find. Click “SQL databases”->”Servers”->Click on the server name-> Clik configure and add the IP addresses. While you’re developing, you can start by adding your own IP.

When you have done that, you can start designing your database. There is a management tool in Azure you can use. Just click the manage database link in the portal.

From the management tool you can design tables(See below), views and SP’s.

You can also create new databases and deploy databases from your database projects in Visual studio, which is the correct way to develop SQL databases. (I’ll cover that in a later blog post)

You can off course also logon to the database from management studio. Get the connection string from the portal, and type it in. See example below.

As before it is very limited what you can do from management studio. You can’t design anything with the designer. You have to do everything in SQL, and there is also something you can’t do with SQL either. (Again the correct way to develop database projects is to use the database project in visual studio.)

But to conclude on this first look. We have created a SQL database using the new portal. From what we have seen up until not, not so much have changed since the last version. How we can do SQL development have however changed, and I’ll cover that in the next blog post in the series.