Monday, February 10, 2014

Business Analysts: I SALUTE YOU!

I've been meaning to speak out for those individuals who are buried deep in the confines of their cubicles under mountains of data and hundreds of report requests, and who, in my humble opinion, keep businesses running and innovating: the Business Analyst. Now, the term "Business Analyst" here just signifies any individual out there who satisfies the curiosity of their user base. IT loves to call these people Power Users or subject matter experts (SMEs). That's fine for architect lingo, but business people who just want answers turn to their analysts. Some call them "data analysts" or "data scientists". Recently, I ran across a tweet that pretty much summed it up for me:

It's the "analytical curiosity" that @Office_Jockey is talking about that differentiates a Business Analyst from a Developer. Never being satisfied. Or better yet, chasing after answers until your gut tells you "that's it": that's as good as it's going to get. You've cross-referenced as many data points as possible and feel very confident that your proposed query, report, or data model is proven and sound. What makes it feel even better is when people try to throw stones at your work and they just fall by the wayside, because your logic has either taught them something new, reinforced what they already knew, or answered the very same question they had on some other day when you busted your hump to get that anomaly out of your data.

Business Analysts: I SALUTE YOU!


Tuesday, July 23, 2013

SQL Saturday #222 Sacramento

SQL Saturday #222 Sacramento 2013

If you haven't heard yet, SQL Saturday Sacramento is this weekend on July 27th. This is the 2nd annual SQL Saturday event in Sacramento with a spectacular speaker lineup. The Sacramento SQL User Group has done an outstanding job in getting sponsors and bringing in talented speakers and authors from all over the world.  Whether you are a database professional or a data analyst looking to expand your knowledge in Excel, there is something for everyone at this event. A wide variety of topics will be presented, including Database Administration, Business Intelligence, Big Data, Cloud, and more. Check out the schedule for yourself at http://www.sqlsaturday.com/222/schedule.aspx. I'll be giving a session on SSIS 2012 and its improvements for developers and look forward to catching up with you at the event.

Before Saturday, there is also a fantastic full-day class on Performance Tuning with Indexes on Friday, July 26th with the well-known SQL author Kalen Delaney. Kalen has authored several books on SQL Server and is one of the most eagerly anticipated speakers at conferences internationally. You can get more info on her at her website http://www.sqlserverinternals.com/. If you're interested in taking the class, you can still sign up now and get a seat! It's $149, and you can pay online by going to http://sqlsat222precon.eventbrite.com/.

Last year's event was great, but this one looks to top it pretty quickly. If you haven't signed up, go to http://sqlsaturday.com/222/eventhome.aspx and register today! Lunch will be provided for a small fee, and there will be an after party, which will be a great time to network with other SQL Server professionals in the Sacramento area.

Hope to see you there.

Friday, June 21, 2013

Next Chapter

When you come to the end of a chapter of a good book, turning the page to reveal the next chapter's title always brings anticipation. At least it does for me. It's a combination of feelings: the unknown, predicting what happens next, and the eagerness to learn more. I breathe, mentally reset, and focus my attention on the rest of the story. The same feeling has come over me as I walk into the next phase of my career.

My new chapter begins on Monday as Regional Director of Business Intelligence/SQL for Neudesic in the Pacific Northwest. I'll be building a team of BI consultants whose focus will be to work side-by-side with their clients to come up with the best strategies and business solutions, and to develop the best people. That's everywhere from Northern California up through Washington! I'm very excited and eager to get started. I'll have a team waiting for me on day 1, but I can't wait to see our team's growth in six months, one year, or even five years from now.

These past couple of years being an independent consultant (RDP Streamline) and being part of the leadership at DesignMind have been the most fun I've had in my career to date. I sincerely appreciate the DesignMind organization for its wealth of experience and client base. All the clients and consultants I've worked with took great care of me, and I of them. It always pleasantly surprised me to hear about solutions, tips, or SQL Community work that our team did which continues to help people today. After all, that's what drives me most of the time: helping people. Which reminds me of a conversation I had with my 16-year-old over Father's Day.

Angel Jr and I were discussing college over Father's Day, and when we started to talk about majors, he said, "Papa, I really don't know what I want to do yet. I just like helping people." My response: "Me too, mijo. Me too." I went on to tell him that he had to find what he was passionate about. It may take a couple of tries, but all you can do is let your feet follow the desires of your heart.

Onto the next chapter. 

Tuesday, April 23, 2013

Excel Data Explorer and the Twitter Search API


I had seen Data Explorer for the first time at the PASS Summit last year, but I wasn't fully aware of its capabilities until the recent PASS BA Conference, where I saw demos of how it can load web data into Excel. I had thought Data Explorer was more of a cloud BI solution, but now understand that it's a very powerful query engine/data transformation tool. You can import data from many sources just like you can with PowerPivot, but what's fascinating is its ability to load web data from static HTML pages, tables on web pages, web services, etc. I had a need to search recent activity on Twitter and decided to test drive Data Explorer in Excel.

Twitter's Search API returns JSON-formatted data, and I had heard multiple times that Data Explorer can handle JSON. To start, here's a sample URL string you can send to Twitter anonymously using version 1 of the Search API:

http://search.twitter.com/search.json?q=passbac&rpp=100&until=2013-04-21

Here's an explanation of the parameters used in this URL string:
  • q: The query text. See Using the Twitter Search API for examples and operators.
  • rpp: The number of results returned. We're limited to 100 here, but can get more with the REST API.
  • until: Tweet date is less than this date formatted as YYYY-MM-DD.
The good news is that Data Explorer is available for both Excel 2013 and Excel 2010 SP1. Assuming you've installed Data Explorer fresh, you'll want to be sure to turn on a couple of settings: 1) Enable Advanced Query Editing and 2) Fast Combine. These make the query parameterization and formula copy/pasting we'll do later possible. I don't know why these are optional since you can use the tool with or without them; personally, I think Microsoft should just leave them on. It took me a fair amount of time before realizing these settings were my problem. Nonetheless, to turn on these features, go up to the Data Explorer ribbon, click on the Settings button, select the Advanced Query Editing option, and then click OK. Under Settings is a button for Fast Combine. Click it and accept the prompt to Enable.

Now we're ready.

In the Data Explorer ribbon, choose the button to retrieve data from Web.


Enter the Twitter Search URL string above in the form and select OK.

This will open a Data Explorer Query window with a table of information about the query results. Click on the word "List" and you'll get a list of records. If you expand the Steps dialog on the right, you'll notice that this added a step in your data shaping process. Pretty cool, right?
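
If you open the formula editor at this point (this is where Enable Advanced Query Editing comes in handy), the generated steps should look something like the first two lines of the full script shown later in this post:

    Source = Json.Document(Web.Contents("http://search.twitter.com/search.json?q=passbac&rpp=100&until=2013-04-21")),
    results = Source[results],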


Right-click on the header of this list and select the To Table option. This converts the list object to a table. If you don't see any records by this point, you'll have to change the parameters of the URL string until some records are returned. Get familiar with the limitations of Twitter's v1 Search API here.

Accept the list of defaults in the To Table dialog. Click OK.

The resulting table is very similar to the list, but with one very important exception. In the header of the table, next to the word "Column", is a small two-arrow icon. Click this little button and it will automatically read the columns defined in the JSON file and give you the option to choose the ones you want to bring into your query. Leave them all selected for now. You can always remove columns later, either through the UI or by modifying the code at this step in your data shaping process.

Click OK on the columns you'd like to expand and you'll finally get columns of all the tweets.
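
For reference, this expand operation appears in the final script below as a Table.ExpandRecordColumn step; the friendlier column names on the right-hand side come from the renaming described next:

    #"Expand Column1" = Table.ExpandRecordColumn(ChangedType, "Column1", {"created_at", "from_user", "from_user_name", "geo", "iso_language_code", "text"}, {"CreatedDate", "FromUser", "FromUsername", "Geo", "ISOLanguageCode", "Text"}),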

Now you can rename columns, change data types, or hide columns. OR, you can just copy the code below in the formula window and click refresh.

let

    Source = Json.Document(Web.Contents("http://search.twitter.com/search.json?q=passbac&rpp=100&until=2013-04-21")),

    results = Source[results],

    TableFromList = Table.FromList(results, Splitter.SplitByNothing(), null, null, ExtraValues.Error),

    ChangedType = Table.TransformColumns(TableFromList,{},Value.FromText),

    #"Expand Column1" = Table.ExpandRecordColumn(ChangedType, "Column1", {"created_at", "from_user", "from_user_name", "geo", "iso_language_code", "text"}, {"CreatedDate", "FromUser", "FromUsername", "Geo", "ISOLanguageCode", "Text"}),

    ChangedType1 = Table.TransformColumns(#"Expand Column1",{{"CreatedDate", DateTime.From}})

in

    ChangedType1



Click Done on the query window and the data will be imported into Excel. Yes!

Now let's parameterize this query so that it's easy to update the search results from Excel instead of repeatedly opening Data Explorer to change the URL. Originally, I followed much of Chris Webb's post on how to do this with a parameterized query string in the formula, but for this post, I found it easier to build my Twitter Search URL string with an Excel formula. This handles empty parameters a little better and can still encode the "q" parameter text string using Excel's ENCODEURL function, as Chris' post suggests you do.

Add a couple of tables to another worksheet named Parameters: one to list each parameter's value and one for the Search URL string. It's nice to add a description of each parameter, as shown below, so you can remember each parameter's definition in the string. Then, in the Search URL string table, build the URL with the following formula:

=CONCATENATE("https://search.twitter.com/search.json?q=",ENCODEURL(B2),IF(ISBLANK(B3),"",CONCATENATE("&rpp=",ENCODEURL(B3))),IF(ISBLANK(B4),"",CONCATENATE("&since=",ENCODEURL(B4))))
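
For example, assuming the hypothetical parameter values passbac in B2, 100 in B3, and a blank B4, the formula above evaluates to a URL like this:

https://search.twitter.com/search.json?q=passbac&rpp=100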


Before you continue, make sure to note the name you gave the table containing the parsed URL. You can do this in the Design tab of the Table Tools ribbon. By default, Excel will name your tables Table1, Table2, and so on.

Now go back to the sheet with the Data Explorer query, and you can either click on the Filter & Shape button in the Query tab of the Table Tools ribbon, or click on the Filter & Shape button in the Query Settings dialog at right that appears when you bring up this worksheet.

Now with the query window open, click on the formula button to open up the formula editor. You'll want to replace the URL string in double quotes with Excel.CurrentWorkbook(){[Name="SearchString"]}[Content]{0}[SearchString], where SearchString is the name I assigned to the table containing the URL I constructed on the Parameters worksheet. Hit Enter to exit the formula editor, and click Done in the query editor. Now you have a parameterized query. The resulting formula for the whole spreadsheet will look similar to the one below, with differences wherever you renamed columns or changed data types.

let

    Source = Json.Document(Web.Contents(Excel.CurrentWorkbook(){[Name="SearchString"]}[Content]{0}[SearchString])),

    results = Source[results],

    TableFromList = Table.FromList(results, Splitter.SplitByNothing(), null, null, ExtraValues.Error),

    ChangedType = Table.TransformColumns(TableFromList,{},Value.FromText),

    #"Expand Column1" = Table.ExpandRecordColumn(ChangedType, "Column1", {"created_at", "from_user", "from_user_name", "geo", "iso_language_code", "text"}, {"CreatedDate", "FromUser", "FromUsername", "Geo", "ISOLanguageCode", "Text"}),

    ChangedType1 = Table.TransformColumns(#"Expand Column1",{{"CreatedDate", DateTime.From}})

in

    ChangedType1



There's a lot you can do with this design pattern, so be sure to check out the Twitter documentation for using the Search API. I uploaded the sample spreadsheet, which I created in Excel 2013, here if you want to try out some different query strings for yourself. Important: using a parameter from an Excel workbook is possible because I enabled Fast Combine before doing anything with Data Explorer. If you forget, you'll get an error that says 'Query' is accessing data sources which cannot be used together. If that happens to you, enabling Fast Combine should do the trick.

It's also worth mentioning that version 1 of Twitter's Search API is nice because it doesn't require any special authentication, but this method is deprecated and may be eliminated at any time. The preferred method to search tweets is to use v1.1 of Twitter's Search API which uses OAuth to authenticate yourself before actually pulling any data from Twitter. If you have to do this, then you'll have the extra task of developing a utility with .NET or building a script task in SSIS to pull JSON file(s) one-by-one into a directory. Data Explorer is able to handle multiple files in a single directory, and JSON files should fall right in line with this. If any of you would like to tackle that challenge next, please leave a comment with a link to your blog or published article for all of us to read.
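
If anyone wants a head start, here's a minimal sketch (untested against v1.1, and assuming your utility drops the files into a hypothetical C:\Tweets folder) of how Data Explorer can read a directory of JSON files; the field you expand at the end will depend on the shape of JSON the API version returns:

let
    // list every file in the folder the utility writes to
    Source = Folder.Files("C:\Tweets"),
    // keep only the JSON files
    JsonFiles = Table.SelectRows(Source, each [Extension] = ".json"),
    // parse each file's binary content as a JSON document
    Parsed = Table.AddColumn(JsonFiles, "Json", each Json.Document([Content])),
    // pull the list of tweets out of each document ("results" in v1, "statuses" in v1.1)
    Tweets = Table.AddColumn(Parsed, "Tweets", each [Json][statuses])
in
    Tweets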

More Data Explorer fun coming soon. Thanks for reading!

Sunday, April 14, 2013

PASS BA Conference Keynote Day 1: "FUN has its own ROI"

The first day of the PASS Business Analytics Conference began with an amazing keynote on the PASS organization, BI's impact on businesses, and the many new and exciting analytical tools available in Excel 2013. All of this in front of a packed room of at least 1,000 Business Analysts and Business Intelligence developers ready to dig into presentations on Big Data, Predictive Analytics, and many business cases using Microsoft and open source tools. By the end of it, we all wanted to have FUN for the rest of the conference.

PASS President Bill Graziano opened things up by introducing the many attendees new to the PASS organization to the wealth of resources available to BA/BI professionals across the world, especially our BI/BA Virtual Chapters. The vast majority of attendees at this conference were new to the PASS organization, so in case any of you reading missed it, there are 4 Virtual Chapters that hold monthly meetings. Get plugged in!
Matthew Wolken then took the stage and set the tone for the conference, speaking to how different business analytics has become with the advances in technology, mobile devices, and social media.

Analyzing social networks and how they relate back to businesses is very important to Dell. So important, in fact, that Dell has a Command Center built specifically to manage customer relationships across all social networks. The image below is different from the one he showed in the keynote, but it is used in a Fortune article describing how Dell and companies like it are making huge investments in managing customer relationships through social media. Check it out, and if you missed any of the sessions on social analytics, there will definitely be more of them at Summit 2013 in Charlotte.

http://tech.fortune.cnn.com/2012/10/25/nasa-style-mission-control-centers-for-social-media-are-taking-off/

Then, for the grand finale, Amir Netz, Microsoft Technical Fellow, and Kamal Hathi, Director of Program Management for BI, took the stage to show off three innovative tools that are available in Excel 2013: Power View, Data Explorer, and GeoFlow. Excel is one of the most popular BI tools on the planet, and Microsoft's goal with these new analytical tools is to keep them simple and accessible for everyone, even kids in junior high, including Amir's son, who questioned his dad's intentions when he showed Power View to his closest friends. They were impressed with Amir, as were we, as he led the charge showing off visualizations analyzing music popularity domestically and internationally. Kamal joined in on the FUN by showing how you can organize some of the biggest datasets in the world using Data Explorer, loading JSON formatted files, something that is very difficult to do today without custom code. Then Amir made a dramatic close by showing off GeoFlow on a huge 80-inch Perceptive Pixel screen. For a preview of what GeoFlow can do, check out this YouTube video.

Some of the most memorable quotes from Amir and Kamal:

"When you think deep down, users of BI products have complete control of how they use their day. How much time they spend on BI. When BI is fun, they are using it. FUN has its own ROI. Focus on the element on fun." -Amir Netz


"Big Data is not just size, there's a life-cycle. Interesting insights. Find and combine data. Form Intuitions, analyze, and tune. Take action and operationalize." - Kamal Hathi



One other mention: sitting next to Mark Vaillancourt b|t at the bloggers' table was another first for us. We were both first-timers at PASS Summit 2011, and we were both thrilled to be there tweeting and writing at the first ever PASS Business Analytics Conference. Unbelievable. Mark's live blogging skills were outstanding, and you can find his live blog post for Keynote Day 1 here.

Will send out more blog posts soon. Thanks to PASS, Microsoft, and Dell for a great beginning to what will most likely be the next phenomenon in technical events across the country.

Friday, February 1, 2013

Run SQL Server 2012/SharePoint BI for evaluation in VMware

Here are the steps I use to create a fully loaded Microsoft BI virtual machine for demos and testing using VMware. VMware Player is free for Windows and Linux PCs. For Mac OS X, go out and buy VMware Fusion to get this running.

The first thing to do is download VMware Player and the Microsoft BI image. Here are a few of the features I love about this VM:
  • SQL Server 2012 (DB engine, PowerPivot, Multidimensional and Tabular SSAS instances)
  • SharePoint 2010 (PowerView, Performance Point, Excel Services)
  • MDS/DQS
  • Office 2010
  • Excel Add-Ins for DQS, MDS, and PowerPivot
Once VMware Player installs, you'll want to open the Windows virtual machine. What you'll really end up doing is importing the Virtual Hard Disk (VHD) because, unfortunately, this is a Virtual PC image. Not to worry. That's why I wrote this post. We can import the VHD with VMware Player. So unzip the contents of the download, go into VMware Player, and go to File > Open a Virtual Machine. Navigate to the unzipped contents and find the file SQL Image 2012.vmc.


After you locate the .vmc file, you will be prompted to install VMware vCenter Converter to convert the hard disk from a VHD to a VMDK.

Click OK and don't try to install vCenter Converter separately. Accept the language default and continue with the installation of vCenter Converter Standalone Client. Once this completes, retry opening the VM by going to File > Open a Virtual Machine and navigating to the SQL Image 2012.vmc file. Now you'll get a different prompt asking to Import the Virtual Machine:

Review the name of the VM and the storage path, and then click Import. The Converter utility will now auto-convert the VHD file to the location specified. Once it's done, you will see a machine in VMware Player that's ready to go. Just select the virtual machine name listed on the left-hand side of the player and hit Play virtual machine on the bottom right.


You'll be prompted to login as CONTOSO\Administrator, whose password is pass@word1.

You'll also need a TechNet or MSDN subscription to put in a license key for Windows Server 2008 R2 and Microsoft Office 2010 to be legit. I actually like installing Excel 2013 on this VM to play with the Power View capabilities against some pretty big datasets, but you don't have to, since Power View is already available in the PowerPivot Gallery, which is loaded with plenty of samples on this VM, as you can read in Dan English's blog post on this image.

Hope you enjoy the image as much as I do. It saves the hassle of installing all the components and lets you get right to learning the new goodness of the software.

Thursday, January 10, 2013

Prep for Analysis Services Multidimensional


Before getting into the thick of building a multidimensional cube, here are some questions you should ask yourself:

1. Do I have a project sponsor?
2. Do I have a reasonably sized dimensional model (aka Star Schema)?
3. Can someone validate the answers coming out of my data warehouse?

Please note these questions change if you decide on either Tabular or PowerPivot, as the prep work for those flavors of Analysis Services is different. I won't be going into Multidimensional vs Tabular in this post. For now, I presume you're here because you have SQL Server 2008 R2 or earlier and want to get started right away. Either that, or you're on SQL Server 2012 but your server memory cannot fit all your tabular cubes, or you've discovered you need parallel processing. Have a different reason for using multidimensional? PLEASE LEAVE A COMMENT. Would love to hear from you.

Your Project Sponsor

If you don't have the backing of a senior executive or someone influential who can pull through for you when times get rough (politically, data-wise, money-wise), then hold off until you get one. Either that, or get buy-in from your manager to learn how to build a cube on your own time with their data.

You are about to do a lot of hard work that could revolutionize the way your business sees the world. Believe me, you will want the backing of someone more senior than you on the journey you're about to take. I've seen a lot of good work thrown back in someone's face because a manager, VP, or Director of Operations had no idea where good money was being spent when something was being developed by someone else in some other office under some other direction. Don't put yourself in that situation or under the agony of throwing away so much hard work. If you need help selling the value proposition, reference my prior blog post on making a business case for Analysis Services.

Get a trusted project sponsor who knows your work and believes in you, hopefully someone with signing authority who has your back if things don't work out as planned. Trust me though, if you do this right, you only have a bright future ahead of you.

Dimensional Model, Reasonably sized

The success of your cube building is directly proportional to the quality of your data model. The reason I say "reasonably sized" is to encourage you to work as Agile as possible. Don't bite off more than you can chew. The opposite of Agile is referred to as Waterfall, which doesn't sound too good, does it? BI projects are best done in small chunks anyway. You'll learn things about your data along the way that will make you change your model in ways that only strengthen the dimensional model and cube, making them more relevant and useful for end users.

If you're not familiar with dimensional modeling, go out and get a book. I recommend Kimball's Complete Guide to Dimensional Modeling. It comes with some good examples of common business models for different industries. These real world scenarios illustrate modeling concepts intertwined with the theory behind each dimension and fact table.

Validate with the Experts

"Garbage in, garbage out." "You're only as good as your data." Whoever coined these terms must have gone through validation of a data model for sure. :)

Validate your dimensional model with someone intimately familiar with the process you're modeling. An analyst or production engineer would be great, but isn't necessary; it could be a data entry person, like a shipping/receiving clerk, an admin, or a project manager. Be selective here. Hopefully there's someone knowledgeable enough to pick out a wrong number from your results and lead you to the right answer. Be prepared for frustration to set in here, because it's not always clear who has the right answer. The answer may lie with one or more people in different departments, and you may have to step through the process of a complete transaction (order to shipment to receipt to issue to delivery) in order to get an answer you can live with.

If you are the operations expert in your company, then validate your report delivery methods with your project sponsor and/or consult your ERP/CRM vendor documentation for different reports to validate against. Your project sponsor will want to get their hands on what you've been working on, so don't forget about them. Also, if you are working with an ERP/CRM system, you may have access to a VAR (value-added reseller) that can help you with some pre-built queries. Many times, ERP systems have built-in docs for schema definitions or make them available online. Never guess at what different codes mean in your relational system. Whatever the case, just make sure to do your homework and get feedback.

Alright, let's get on with it

If you've made it through the process of modeling your data and validating it, then congratulations. You've just built your data warehouse! You are 80% of the way to building your first multidimensional cube.

The whole process of dimensionally modeling your data is probably the most time-consuming, but also the most rewarding, part of your BI project. If you've prepared well enough and stayed small (agile), then building your cube will seem quick and easy. Also, remember that at this point, many good clients can perform ad hoc queries and self-service analytics against your data warehouse by itself. Clients like PowerPivot, Tableau, and QlikView work wonderfully against a data warehouse. The potential is there to expand on your initial modeling ideas. The only problem moving forward is the missing semantic layer. Remember those hierarchies, relationships, KPIs, and drill-throughs you promised your project sponsor? That's where Analysis Services comes in.

Until next time, good luck with your models! In my next post, we'll start with a data source view that pulls in the tables from your data warehouse and continue with the process of creating a cube.