SSAS translation problems with Excel

I just revisited an old customer, where they had a setup with a SSAS 2008 and are using Excel 2007 as a frontend.

They had a problem when they tried to log on to one of the cubes. They didn’t get any errors, but when they had filled out all the connection information, they just kept getting prompted with the prompt below:

They could see the cube with all their other client tools.

After some debugging, I found that the error was caused by a translation on a measure group.  Apparently Excel 2007 can’t handle measuregroup name translations. I really have to test on another cube, if it that is a general problem, or if it just is a special case

At another consultant job some months  ago, I also experienced problems when working with translations in SSAS 2008 R2 and using Excel 2010 as frontend.

I guess the take away is, that if you have some strange problems  when using the cubes from Excel, but it works normally with other client tools, then the problem properly lies in the translations.

Unboxing MS SQL Server 2012 – The new analytic functions

As mentioned in my previous blot post, SQL server 2012 comes with 8 new analytic functions:

  • LEAD
  • LAG
  • LAST_VALUE
  • FIRST_VALUE
  • PERCENTILE_CONT
  • PERCENTILE_DISC
  • PERCENT_RANK
  • CUME_DIST

All the new analytic functions work with the window functions I described earlier in this blog series, and calculate an aggregate in the given window.
Let’s start by looking at the LAG and LEAD functions. The syntax looks like this

LAG/LEAD (scalar_expression [,offset] [,default])
OVER ( [ partition_by_clause ] order_by_clause )

Let’s add them to the query and dataset I introduced earlier:


SELECT TOP 10
[Date] ,
[Last] ,
LEAD([Last],1) OVER
(PARTITION  BY YEAR(Date), MONTH(Date), Name ORDER BY Date DESC ) AS 'Lead 1' ,
LAG([Last],1) OVER
(PARTITION  BY YEAR(Date), MONTH(Date), Name ORDER BY Date DESC ) AS 'Lag 1' ,
LAG([Last],2) OVER
(PARTITION  BY YEAR(Date), MONTH(Date), Name ORDER BY Date DESC ) AS 'Lag 2'
FROM dbo.Stock ORDER BY DATE DESC

Which yield the result:


Date                    Last                  Lead 1                Lag 1                 Lag 2

----------------------- --------------------- --------------------- --------------------- ---------------------

2012-04-19 00:00:00.000 51,45                 51,65                 NULL                  NULL

2012-04-18 00:00:00.000 51,65                 53,35                 51,45                 NULL

2012-04-17 00:00:00.000 53,35                 55,55                 51,65                 51,45

2012-04-16 00:00:00.000 55,55                 48,95                 53,35                 51,65

2012-04-13 00:00:00.000 48,95                 50,55                 55,55                 53,35

2012-04-12 00:00:00.000 50,55                 50,65                 48,95                 55,55

2012-04-11 00:00:00.000 50,65                 50,20                 50,55                 48,95

2012-04-10 00:00:00.000 50,20                 52,50                 50,65                 50,55

2012-04-09 00:00:00.000 52,50                 52,50                 50,20                 50,65

2012-04-06 00:00:00.000 52,50                 52,50                 52,50                 50,20

The LAG and LEAD functions are pretty self-explaining, but it is a feature that I often need in my work, and this new way of getting those values are a great addition to the SQL Server.

The FIRST/LAST_Value works pretty much the same way, and are just as self-explaining. The syntax looks like this

FIRST/LAST_VALUE ( [scalar_expression )     OVER ( [ partition_by_clause ] order_by_clause rows_range_clause )

Pay attention to the rows_range clause, if you want to make sure that the first/last row function doesn’t stop at the current row. Let’s look at the SQL:

SELECT TOP 50
[Date] ,
[Last] ,
FIRST_VALUE([Last]) OVER
        (PARTITION  BY YEAR(Date), MONTH(Date), Name ORDER BY Date ASC )
        AS 'First value in month' ,
LAST_VALUE([Last]) OVER
        (PARTITION  BY YEAR(Date), MONTH(Date), Name ORDER BY Date ASC
        RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING )
        AS 'Last Value in month'
FROM dbo.Stock
ORDER BY DATE DESC

Result:

Date                    Last                  First value in month  Last Value in month

----------------------- --------------------- --------------------- ---------------------

2012-04-19 00:00:00.000 51,45                 55,30                 51,45

2012-04-18 00:00:00.000 51,65                 55,30                 51,45

2012-04-17 00:00:00.000 53,35                 55,30                 51,45

2012-04-16 00:00:00.000 55,55                 55,30                 51,45

2012-04-13 00:00:00.000 48,95                 55,30                 51,45

2012-04-12 00:00:00.000 50,55                 55,30                 51,45

2012-04-11 00:00:00.000 50,65                 55,30                 51,45

2012-04-10 00:00:00.000 50,20                 55,30                 51,45

2012-04-09 00:00:00.000 52,50                 55,30                 51,45

2012-04-06 00:00:00.000 52,50                 55,30                 51,45

2012-04-05 00:00:00.000 52,50                 55,30                 51,45

2012-04-04 00:00:00.000 52,50                 55,30                 51,45

2012-04-03 00:00:00.000 54,95                 55,30                 51,45

2012-04-02 00:00:00.000 55,30                 55,30                 51,45

2012-03-30 00:00:00.000 56,60                 56,45                 56,60

2012-03-29 00:00:00.000 55,60                 56,45                 56,60

2012-03-28 00:00:00.000 58,20                 56,45                 56,60

2012-03-27 00:00:00.000 58,15                 56,45                 56,60

2012-03-26 00:00:00.000 60,95                 56,45                 56,60

2012-03-23 00:00:00.000 60,00                 56,45                 56,60

2012-03-22 00:00:00.000 58,25                 56,45                 56,60

2012-03-21 00:00:00.000 58,80                 56,45                 56,60

2012-03-20 00:00:00.000 59,30                 56,45                 56,60

2012-03-19 00:00:00.000 62,40                 56,45                 56,60

2012-03-16 00:00:00.000 63,95                 56,45                 56,60

The last 4 functions PERCENTILE_CONT, PERCENTILE_DISC, PERCENT_RANK, CUME_DIST are Window Distribution Functions used to do statistical computations. Coming up with an example about how to use those will take too long, but the syntax is similar to the other functions, and if you know the math behind, they are straightforward to use.

Conclusion

But to sum up. The conclusion is the same as in my previous blog post in this series. The new analytic functions a very nice addition to the new SQL server, and we can now do more advanced analytics in a much simpler way.

Monitor the process of long running SQL queries like DBCC SHRINKDATABASE

When a query runs a little longer than I would expect, I sometimes starts to get nervous. Especially in the cases, where there is no status telling me when it will finish; Like for instance when I run a shrink database cmd.

You can get information about the queries running like this.


SELECT 
T.text, 
R.Status, 
R.Command, 
DatabaseName = db_name(R.database_id) , 
R.cpu_time, 
R.total_elapsed_time, 
R.percent_complete
FROM sys.dm_exec_requests R 
CROSS APPLY sys.dm_exec_sql_text(R.sql_handle) T


Especially the percent_complete is nice.

Unboxing MS SQL Server 2012 – The new window functions

SQL server 2012 comes with a new set of window functions and analytically possibilities, which I will look into in this blog post.

Prior to 2012
Prior to the 2012 version, windows functions were also possible. The syntax looked like this:

Ranking Window Functions
< OVER_CLAUSE > :: =
OVER ( [ PARTITION BY value_expression , … [ n ] ]
<ORDER BY_Clause> )

Aggregate Window Functions
< OVER_CLAUSE > :: =
OVER ( [ PARTITION BY value_expression , … [ n ] ] )

Where the over clause creates a window for the entire result set, and the PARTITION BY divide the window to a lot of smaller distinct windows. This can be used, when you look at a row with a value, for instance, to a product and day, and you want to compare it to the value for the same product to month or year.

To demonstrate both the new and old version I’ll introduce a dataset.

Let’s look at the Vestas share prices. I’ll create a table only with the share prices for Vestas, like so

CREATE TABLE [dbo].[Stock]
(
[StockID] [int] IDENTITY(1,1) NOT NULL,
[Name] [nvarchar](50) NULL,
[Date] [datetime] NULL,
[OpeningPrice] [money] NULL,
[High] [money] NULL,
[Low] [money] NULL,
[Last] [money] NULL,
[Volume] [float] NULL
)

Then I import the data for the last couple of years from http://www.euroinvestor.dk/. Now we have a realistic dataset.

Let’s look at the closing price for the last 100 days like this:

SELECT top 100 StockID, Name, Date, last
FROM dbo.Stock
ORDER BY Date Desc

This yield the result

Using the window functions prior to the 2012 version, we could use the window functions like this

Giving this result set

You can use these values to do more intensive analytics etc.

Of course, the aggregate functions could also be achieved with a subselect, or a join to the same SQL statement grouped on the same as it is partitioned by in window functions, but this would for obvious reasons perform worse and yield a more complicated SQL.

The aggregate functions also provide the same as what will be performed by analysis services, when you are build a data mart with aggregated values.

The Row number is often used in data collection scenarios where you want to filter out duplicates.

The 2012 version

So, that is how window functions worked prior to the 2012 edition. So lets look at some of the things they have improved. First of all, the syntax has been extended as shown below:

OVER (
[ <PARTITION BY clause> ]
[ <ORDER BY clause> ]
[ <ROW or RANGE clause> ]
)

<PARTITION BY clause> ::=
PARTITION BY value_expression , … [ n ]

<ORDER BY clause> ::=
ORDER BY order_by_expression
[ COLLATE collation_name ]
[ ASC | DESC ]
[ ,…n ]

<ROW or RANGE clause> ::=
{ ROWS | RANGE } <window frame extent>

<window frame extent> ::=
{ <window frame preceding>
| <window frame between>
}

<window frame between> ::=
BETWEEN <window frame bound> AND <window frame bound>

<window frame bound> ::=
{ <window frame preceding>
| <window frame following>
}

<window frame preceding> ::=
{
UNBOUNDED PRECEDING
| <unsigned_value_specification> PRECEDING
| CURRENT ROW
}

<window frame following> ::=
{
UNBOUNDED FOLLOWING
| <unsigned_value_specification> FOLLOWING
| CURRENT ROW
}

<unsigned value specification> ::=
{ <unsigned integer literal> }

As you may have noticed, one of the short comings prior to the 2012 version is that you can’t add an ‘order by’ to the aggregate functions, hence you can’t calculate a running AVG or SUM. That has been changed in the new version. Let’s try to add an order by to the previous SQL. And let’s just delete the collumns that makes no sense. So the SQL will look like so:

Then we have a running AVG in each month.

All of my examples probably don’t make much sense to a stock broker. I’m only looking at the SQL functions, and not the data. One thing I do know, is however, that the running AVG won’t make much sense, starting all over from the first each month. Luckily there has also been added a sliding window function, so we can look at the previous/next n rows when doing our calculations.

This can now be done like so.

Where the AVG now is the AVG of the current row, and the 2 previous rows. The extension of the ORDER BY, now overrides the default partitioning.  The result set:

Conclusion

I have looked into some of the new window functions, and it is safe to say, that they have been significantly improved, and can now be used to do more advanced analytics in a simpler way.
A new addition to the window functions is also some new analytic functions that can be used. This is: CUME_DIST, LEAD, FIRST_VALUE, PERCENTILE_CONT, LAG, PERCENTILE_DISC, LAST_VALUE, PERCENT_RANK.

I’ll cover those in the next blog post.

 

Unboxing MS SQL Server 2012 – Installing the database server and the Adventure works databases

The new MS SQL server version 2012 was released not so long ago. With this article series, I will log my first experiences with it, regarding installation and setup, new features, look and feel etc.

First things first. Let’s install it. If you don’t have a developer edition in you company, download a trial version from MS’s website.

Just download the ISO, and install it from there.

Figure 1 – download it

Figure 1 – download it

 

Figure 2 - Double click setup.exe

Figure 2 - Double click setup.exe

 

Figure 3 – installation process looks the same as previous version -> Clik New SQL server stand alone installation

Figure 3 – installation process looks the same as previous version -> Clik New SQL server stand alone installation

 

Figure 4 – If you downloaded the Trial version as me, select the evaluation version

Figure 4 – If you downloaded the Trial version as me, select the evaluation version

 

Figure 5 – Let’s just select SQL server feature installation

Figure 5 – Let’s just select SQL server feature installation

 

After this, just click through with the default setting and start the installation.

After the installation, you can log on to the database server with SQL Server Management Studio (SSMS). Not much to see now, except from a more pretty interface, that looks as though it is part of Visual Studio 2010.

All development is now done visual studio 2010, and there are a lot of improvements, but more of that in a later article. Let’s first get some databases to work with. Let’s Install the adventureworks databases.

Apparently in 2012, there is no install file to install the databases. Instead you can download the mdf files separately from codeplex, and attach them. Do It here: http://msftdbprodsamples.codeplex.com/

Let’s just start with the two relational databases.

Figure 6 – Right click on databases->Select attach database, and select the downloaded mdf files. The OLTP  and the DWH databases.

Figure 6 – Right click on databases->Select attach database, and select the downloaded mdf files. The OLTP and the DWH databases.

 

We are now ready to explore the new SQL features. We’ll start with that in the next blog post.

Search text in SP

*Just as a note to myself*

I’ve just joined a big project with thousands of stored procedures. To get an overview of it, it would be nice if I could get the SP’s which inserted data in to table a, or update column b in table c etc.

That is easily done, like so:

SELECT Name
FROM sys.procedures
WHERE OBJECT_DEFINITION(OBJECT_ID) LIKE ‘%table a%’

and

SELECT Name
FROM sys.procedures
WHERE OBJECT_DEFINITION(OBJECT_ID) LIKE ‘%column b%’
AND OBJECT_DEFINITION(OBJECT_ID) LIKE ‘%table c %’

MS, seriously!!! SSIS and System.Runtime.InteropServices.COMException (0x800A03EC)

Let me describe my problem. Maybe you have the same, if you have googled for the error code, and found this page.

So, I’m downloading an old Excel file from a German webpage. The Excel file is in a spreadsheetML format, which is impossible to read from SSIS.

No worries, I can easily with a script task, convert the file to CSV, using C# and the Excel API. Like so

….
using MS.Internal.Xml;
using Microsoft.Office.Interop.Excel;
…….

Microsoft.Office.Interop.Excel.Workbook wb1 ;
Microsoft.Office.Interop.Excel.Application wb2 =
new Microsoft.Office.Interop.Excel.Application();
wb2.DisplayAlerts = false;
wb1 = (Microsoft.Office.Interop.Excel.Workbook)wb2.Workbooks._Open(filename, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing);

if (wb1.FileFormat == Microsoft.Office.Interop.Excel.XlFileFormat.xlXMLSpreadsheet)
{
wb1.SaveAs(filename.Replace(“.xls”,”.csv”) , Microsoft.Office.Interop.Excel.XlFileFormat.xlCSVMSDOS , Type.Missing, Type.Missing, false, false, Microsoft.Office.Interop.Excel.XlSaveAsAccessMode.xlNoChange,false,false,false,false,false);
}
else
{
wb2.Workbooks.Close();
}
wb2.Workbooks.Close();

Now I can read the data with a SSIS CSV data source. This Works fine, when I run my SSIS package from BIDS. When I put it in to production, and run the package with the SQL server agent, I get the following error:

Service cannot be started. System.Runtime.InteropServices.COMException (0x800A03EC): Microsoft Office Excel cannot access the file ‘c:\XXXX’. There are several possible reasons:
• The file name or path does not exist.
• The file is being used by another program.
• The workbook you are trying to save has the same name as a currently open workbook.

None of the above is my problem. I think I tried everything to find a solution to the problem. It turns out, that it is a bug in windows server 2008. There are some different solutions on the web, which include changing the registry values on the server.

Since I’m a consultant, and hence an expensive resource, I would like not to fuck anything up, playing around with registry values.

I found an easier fix however. Apparently, the folder C:\Windows\SysWOW64\config\systemprofile\Desktop has been removed from Windows server 2008. It was present in windows server 2003.

When I created the folder, everything worked fine. Apparently the desktop folder is necessary, if you want to manipulate Excel files with code from the SQL server agent. Scary.

So, the answer: if you have problems manipulating office files from code executed from the SQL server agent on windows server 2008 create the folder C:\Windows\SysWOW64\config\systemprofile\Desktop (C:\Windows\System32\config\systemprofile\Desktop if it is 32bit)

That was it for now. Bye bye:-)

OLE DB error: OLE DB or ODBC error: Invalid column name ‘x’.; 42S22.

After spending time debugging my analysis services cube for the second time having this error, I guess it will be a good idea to blog about it, so I can remember it to another time 🙂

I got the above error after I had modified a dimension, so used it three fields for the key, instead of one. I made the relationship from my facts, based on the 3 keys, and then I deployed and processed the cube. And then I got the above error.

The reason for the error is, that I had the partitions for the fact table, based on some SQL sentences, and I had forgotten to modify these partitions after I had made the new relation.

So the conclusion: When you get the error “Invalid column name ‘x’.; 42S22” in analysis services, it is a problem related to the partitions design.

Warning: Null value is eliminated by an aggregate or other SET operation

I just had a SSIS package that had been running every day for the last couple of years without any problems, which failed on me with the following error message:

Warning: Null value is eliminated by an aggregate or other SET operation

The SQL source had a group by, where I was taking a minimum on some datetime fields. I needed it to return NULL values because I had some additional logic that took these NULL values into account.

What was even stranger was, that when I ran the package by double clicking it on the server, it completed with success.

So apparently a warning from the SQL server causes an error in SSIS, but only sometimes….. Nice….

The solution is to either change your SQL, so you don’t get NULL values, by for instance use COALESCE or ISNULL, or if you can’t do that, put this statement in front of your SQL : SET ANSI_WARNINGS OFF

Read more about the problem/bug here:
https://connect.microsoft.com/SQLServer/feedback/details/483175/failure-because-of-warning-about-null-aggregation

[rsAggregateOfMixedDataTypes] The Value expression for ‘x’ uses an aggregate function on data of varying data types. Aggregate functions other than First, Last, Previous, Count, and CountDistinct can only aggregate data of a single data type

Yet another irritating error message from reporting services 🙂

You will typically get that error, when you try to create a calculated member in the visual designer like so:

IIF( [Measures].[x]=0 , [Measures].[y] , 0,00)

And then use the SUM or another aggregate function on the new measure.

If you create the same calculated member in pure MDX, and run it in for instance management studio, it will not yield any errors. Like so:

WITH MEMBER [Measures].[z] AS
IIF( [Measures].[x]=0 ,
[Measures].[y] , 0,00)

Just to describe the calculated member in words. If the measure x equals 0, then the new measure z will be the value of the measure y, and if the measure x not equals 0, then the new measure z will take the value of 0,00.
The problem is that, when reporting services SUM the new measure z, it Sums data from y and 0,00, and reporting services will in some cases see the values as different data types. In my example the measure x was a decimal, and reporting services interpreted the 0,00 as an integer.

To avoid this problem, you have to tell reporting services that 0,00 is a decimal like this:

IIF( [Measures].[x]=0 , [Measures].[y] ,  Cdec(  0,00))