I was working with a Data Warehouse Automation (DWA) tool for a little more than a year, and I have to say I loved it. As a BI developer you could focus on the challenges you had in dimensional modelling, what granularity should you have the fact tables and going crazy with the business requirements and everything fast, consistent and tested!
But why is Data Warehouse Automation not used more often and more popular? I’m asking that myself more and more. That’s why I’m writing a series of blog posts all about DWA. In this first blog, I’m trying to find possible reasons behind and also argue for DWA, and why we should use it more often.
Every one needs to make data driven decision faster, why not use a generator which gives you answers in days instead of months..?
What do I mean by that? Many people and therefore many companies fear…
Very long time ago, 1986, 31 years ago to be precise, IBM in Europe created the very first architecture of a data warehouse. And it seems to be a masterpiece as it didn’t change much since. Though how can we improve or bring some innovation into Data Warehouse business in times everyone is talking about big data, data lakes, Internet of Things (IoT), predictive analytics, Data Vault, etc.?
No matter how we want to improve the architecture, it has to be automated as much as possible. Nowadays it became too slow for serving the business needs doing it the traditional way. However, I don’t think DWH will go away anytime soon (see more DWH vs Data Lake). I strongly believe that DWA tools are the future and will boost the Data Warehousing reputation back to earlier years.
That’s why I wrote this blog series all about DWA. You can start with the first blog…
As I’m using a lot of tools on a daily basis which makes my life easier, I’d like to share them with you. Also because lately, I recommending them more regularly to friends and co-workers. Hope you find them useful, please leave a comment if you’re using other cool tools or if you disagree with my opinion.
Some tools I got inspired by the author Tim Ferriss and his guests in his popular podcasts. Find the links at the end of this blog for more information.
Capture the time
Rescue Time – rescuetime.com
This little tool you install on each computer or smartphone you own, and it will keep track of how productive you are. It shows you what program you used, how long and you can compare days, weeks or years. If you like, you’re able to set goals for an application to not use more than X amount of hours.
I can say…
Quote by Leonard Nimoy.
I’ve experienced the truth of this a long time ago during my apprenticeship, without being aware of it. You probably remember those days when you had to prepare for an exam, but due to time constraints, you couldn’t look up everything in the books. So my friend and I compiled summaries of the most important things. Not surprisingly, these summaries quickly became very popular in class, and more people asked us if we could share it with them. I remember that back then I was sceptical about this and was asking myself “Why should I give it to them? We put a lot of work in it and they just want to copy that? No way!” At the end of the day, they got it anyway either through my friend or some other way. Eventually, I just said to myself “Oooookaay, give it to everyone who asks,”…
Continued from Migrate from Oracle to Microsoft (Views) – Part II here part III with the main focus on very interesting topic testing ;-).
When I implemented all about 20 views from Oracle to SQL Server, I had to test the performance and not least important, the content of the views if their still contain the same. And how do you do that the easiest way?
For measuring the performance you simply select alls Views with a SELECT COUNT(*) on both databases. Don’t forget to clear the cache before each run.
To generate all the COUNT-Statements I created a little script, just to make life easier right ;-):
–Generate Queries – Performance Test…
–stuff(c.list,1,1,”) as cols,
‘PRINT ”’ + v.Table_name + ”’ ‘ +
‘SELECT ”’+v.Table_name+”’ as Tbl_name, NULL as cnt UNION ALL ‘ +
‘SELECT ”’+v.Table_name+”’ as Tbl_name, count(*) as cnt FROM [dbo].[‘+v.Table_name +’] ‘–WHERE 1=2 ‘
Continued from Migrate from Oracle to Microsoft (Views) – Part I here part II with more focus on the performance part.
Performance issues recursive queries
When it comes to performance I had one big issue with the built-in function SYS_CONNECT_BY_PATH which is used for rekursiv queries in Oracle. This function is well know by the Oracle optimizer which makes this function unbelievable fast and there is no similar function on SQL Server. Even more complex was, that this queries was used on nested views, so the SQL was quite complex.
I started to translate it to an CTE view which is the pendant in SQL server and the answer to recursive queries in SQL Server. But I ended up having big performance issues because the queries where to complex. I had good performance measures on an easy simple query but invinitive response with more complex.
So what’s the solution to that?
Inline User-Defined Functions. Of course it…
Sometimes you have to migrate from an Oracle database to a Microsoft SQL Server. I’m not going into the reason that could be behind. I had different projects where I had to achieve this goal. The issues that I faced is what I would like to share with you.
When it comes to rewrite code from Ora-SQL to MS-SQL, in my case it were Oracle Views that had to be transfered to Microsoft Views, then the first thing you figure out is, that there’s not the same built in functions on both sides. For example the well know TO_NUMBER() or TO_CHAR() in Oracle dosen’t exist in Microsoft. So what you do?
Below you see a list of the common functions compared between the two technologies:
MS SQL Server
Smallest integer >= n
Max or min number or string in list
Translate NULL to n
Return NULL if two values are equal
str1 + str2
There is no easy way to generate partition for SSAS Cubes by default. So you have to do a SSIS-Package (here is a way you can do it dynamic-cube-partitioning-in-ssas-2008) or write a SQL Script which generates XMLA executables.
As shown in the Statement, you can dynamically generate month partitions. The trick is, to create a linked server, that you can execute XMLA statements with relational T-SQL, so i created a Linked Server first.
Thanks for any advice or improvements in the comment section:
@server = ‘SSAS’
, @srvproduct = ”
, @provider = ‘MSOLAP’
, @datasrc = ‘[DB-NAME]’;
EXEC master.dbo.sp_serveroption @server=’SSAS’, @optname=’rpc out’, @optvalue=’true’;
CREATE SCHEMA ssas;
CREATE PROCEDURE ssas.usp_Execute (
They are not cheap, but they are best! This are the new technology of headphones. They do have an active noice canceling in it. So you will not hear (or almost not) the monotonous sounds of trams, train, flight or even talking from other peoples. If you are traveling a lot or working in a shared room with more than 5 people, you have to buy them, you won’t regret it!!
I have the old one, the Bose QuietComfort 15, but I think the 25 series is even better, but not worser I hope :-).
Try it or ask me in the comment section, I will love to answer your questions.
Link zu Bose – QuietComfort 25bose quietcomfort 15
If yes, there is one best thing I used for the last time. If you listen to music while you working or study, this makes you focus on the work:
It is a music player with neuroscience based music channels, which dosen’t disturb you.
Some advertising from their website:
Stop the wrong music.
50% of us play background music while studying, working or inventing. But…it’s the wrong music! Mainstream music lowers comprehension and creates distraction because it is designed to connect with you intellectually and emotionally.
Use the right music!
Focus@will technology delivers various “Attention Amplifying” music channels scientifically designed to engage with your brain’s limbic system. This soothes the easily distracted fight or flight mechanism increasing attention span and general focus.
The science behind this.
We developed this in partnership with leading neuroscientists Dr. Evian Gordon (www.brainresource.com) and Dr. Stephen Sideroff (UCLA Professor of Psychology). Trials show typical 12-15% positive increase in focus biomarker and up to…