Categories
Datawarehouse Power BI SSAS

“Power BI for Large Databases with Composite Mode” at South Florida Power BI User Group

Power BI for Large Databases with new Composite Mode presentation by Hector Villafuerte at the South Florida Power BI User Group.

Today’s enterprise business analytics requires run complex analytic queries against large datasets stored in different sources like Azure SQL Database, Azure Analysis Services, Azure SQL Data Warehouse, Apache Hive, Apache Spark. In this session, BI Architect and Microsoft Certified Professional, Hector Villafuerte focuses upon the design, architecture and best practices that allows Power BI to offer the best functionality and performance combination. Throughout the session, we explore various large datasets and witness how to implement Power BI and different data technologies to achieve high performance and visual capabilities.

DATE: Thursday – October 4th, 2018
TIME: 6:00 pm

– ADDRESS:
Nova Southeastern University
Main Campus – Davie
3301 College Avenue
Davie, FL 33314

Categories
Datawarehouse Power BI SSAS Uncategorized

My presentation for SQLSATURDAY 755 South Florida Conference

This is the presentation for my session at SQL Saturday, South Florida, June 9 2018

Link to download PPT: Download Presentation

Categories
Datawarehouse

SQL Server 2016 On-Premise with Row-Level Security with Active Directory (Part II)

In the previous article, I introduced a simple way to implement Row-Level-Security implementation for Microsoft SQL Server 2016 database with Active Directory.

Here, I’ll show you a more efficient way to implement RLS when you deal with millions of records in your database and thousands of users in your active directory.

Below is the original implementation of a RLS security predicate:

CREATE FUNCTION RLS.GeographyAccessPredicate(@GeographyKey AS INT)

RETURNS TABLE

WITH SCHEMABINDING

AS

RETURN SELECT 1 AS AccessResult

FROM dbo.DimGeography G

WHERE

G.GeographyKey = @GeographyKey AND

(

(

G.CountryRegionCode = 'US' AND IS_MEMBER('DOMAIN\AW United States') = 1

)

OR

(

G.CountryRegionCode = 'GB' AND IS_MEMBER('DOMAIN\AW United Kingdom') = 1

)

OR

(

G.CountryRegionCode = 'DE' AND IS_MEMBER('DOMAIN\AW Germany') = 1

)

OR

(

G.CountryRegionCode = 'FR' AND IS_MEMBER('DOMAIN\AW France') = 1

)

OR

(

G.CountryRegionCode = 'CA' AND IS_MEMBER('DOMAIN\AW Canada') = 1

)

OR

(

G.CountryRegionCode = 'AU' AND IS_MEMBER('DOMAIN\AW Australia') = 1

)

)

There are two performance issues in this filter predicate. First, there is a WHERE condition that slows down the query performance. The GeographyAccessPredicate function contains OR conditions for each country, which it will denied the use of indexes in the tables.

Second, we use a function called IS_MEMBER, which evaluate whether the current user is a member of the specified Active Directory group. This function will execute poorly when we query tables with millions of rows and we have thousands of active directory users and groups in our domain.

In order to improve the performance, instead of calling IS_MEMBER function, I’m going to create a table in the database to keep Active Directory users and groups. This Active Directory user and group table will allow me to join the data with the table and remove the function IS_MEMBER from the security policy predicate. It will simplify and improve the performance of the query execution.

CREATE TABLE [dbo].[DimActiveDirectoryUser](

[LoginName] [varchar](50) NOT NULL,

[GroupName] [varchar](150) NOT NULL,

CONSTRAINT [PK_DimActiveDirectoryUser] PRIMARY KEY CLUSTERED

(

[LoginName] ASC,

[GroupName] ASC

)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

) ON [PRIMARY]

In order to populate this table, we can execute a Power Shell script or create a ADSI linked server. In this article, I’ll use ADSI linked server:

USE [master]

GO

EXEC master.dbo.sp_addlinkedserver @server = N'ADSI', @srvproduct=N'Active Directory Service Interfaces', @provider=N'ADSDSOObject', @datasrc=N'adsdatasource'

EXEC master.dbo.sp_addlinkedsrvlogin @rmtsrvname=N'ADSI',@useself=N'False',@locallogin=NULL,@rmtuser=N'domain\YourUser',@rmtpassword='Your Password'

GO

EXEC master.dbo.sp_serveroption @server=N'ADSI', @optname=N'collation compatible', @optvalue=N'false'

GO

EXEC master.dbo.sp_serveroption @server=N'ADSI', @optname=N'data access', @optvalue=N'true'

GO

EXEC master.dbo.sp_serveroption @server=N'ADSI', @optname=N'dist', @optvalue=N'false'

GO

EXEC master.dbo.sp_serveroption @server=N'ADSI', @optname=N'pub', @optvalue=N'false'

GO

EXEC master.dbo.sp_serveroption @server=N'ADSI', @optname=N'rpc', @optvalue=N'false'

GO

EXEC master.dbo.sp_serveroption @server=N'ADSI', @optname=N'rpc out', @optvalue=N'false'

GO

EXEC master.dbo.sp_serveroption @server=N'ADSI', @optname=N'sub', @optvalue=N'false'

GO

EXEC master.dbo.sp_serveroption @server=N'ADSI', @optname=N'connect timeout', @optvalue=N'0'

GO

EXEC master.dbo.sp_serveroption @server=N'ADSI', @optname=N'collation name', @optvalue=null

GO

EXEC master.dbo.sp_serveroption @server=N'ADSI', @optname=N'lazy schema validation', @optvalue=N'false'

GO

EXEC master.dbo.sp_serveroption @server=N'ADSI', @optname=N'query timeout', @optvalue=N'0'

GO

EXEC master.dbo.sp_serveroption @server=N'ADSI', @optname=N'use remote collation', @optvalue=N'true'

GO

EXEC master.dbo.sp_serveroption @server=N'ADSI', @optname=N'remote proc transaction promotion', @optvalue=N'true'

GO

 

Be sure to replace the values for the following parameters @rmtuser=N’domain\YourUser’,@rmtpassword=’Your Password’.

The following is the script used to synchronize our active directory with the new table DimActiveDirectoryUser in the AdventureWorks database. Notice that I only synchronize the groups that start with the wildcard: “AW*”, since they are the only groups used in my application.

USE AdventureWorksDW2014

GO

TRUNCATE TABLE DimActiveDirectoryUser

GO

DECLARE @GroupName VARCHAR(50)

DECLARE db_cursor CURSOR FOR

SELECT CN

FROM OpenQuery(ADSI,

'SELECT cn, ADsPath FROM ''LDAP://DC=domain,DC=com'' WHERE objectCategory=''group'' AND cn=''AW*'''

)

OPEN db_cursor

FETCH NEXT FROM db_cursor INTO @GroupName

WHILE @@FETCH_STATUS = 0

BEGIN

DECLARE @OPENQUERY nvarchar(4000), @TSQL nvarchar(4000), @LinkedServer nvarchar(4000)

SET @LinkedServer = 'ADSI'

SET @OPENQUERY = 'INSERT INTO DimActiveDirectoryUser(LoginName, GroupName) '

SET @OPENQUERY = @OPENQUERY + ' SELECT sAMAccountName, ''' + @GroupName + ''' FROM OPENQUERY('+ @LinkedServer + ','''

--SET @OPENQUERY = 'SELECT sAMAccountName as Login, CN as Name, GivenName as FirstName,SN as LastName, DisplayName as FullName, UserAccountControl FROM OPENQUERY('+ @LinkedServer + ','''

SET @TSQL = 'SELECT sAMAccountname,givenname,sn,displayname,useraccountcontrol,CN, department FROM ''''LDAP://domain.com/DC=domain,DC=com'''''

SET @TSQL = @TSQL + ' WHERE objectCategory=''''person'''' '

SET @TSQL = @TSQL + ' AND objectClass=''''user'''''

SET @TSQL = @TSQL + ' AND memberOf=''''CN=' + @GroupName + ',OU=AdventureWorks,DC=domain,DC=com'''' '

SET @TSQL = @TSQL + ' '')'

PRINT @TSQL

EXEC (@OPENQUERY+@TSQL)

PRINT @TSQL

FETCH NEXT FROM db_cursor INTO @GroupName

END

CLOSE db_cursor

DEALLOCATE db_cursor

GO

-- Update values to Match Predicate Conditions

UPDATE DimActiveDirectoryUser SET LoginName = 'DOMAIN\' + LoginName, GroupName = REPLACE(GroupName,'AW ','')

 

Notice that the last update to the LoginName and GroupName facilitates a better match in our security predicate.

Finally, the security predicate is simplified. No more OR conditions or IS_MEMBER function in the security predicate. GeographyAccessPredicate is a join between the DimGeography and the DimActiveDirectoryUser tables.

DROP SECURITY POLICY RLS.AWSecurityPolicy

go

DROP FUNCTION RLS.GeographyAccessPredicate

go

CREATE FUNCTION RLS.GeographyAccessPredicate(@GeographyKey AS INT)

RETURNS TABLE

WITH SCHEMABINDING

AS

RETURN SELECT 1 AS AccessResult

FROM dbo.DimGeography G

INNER JOIN dbo.DimActiveDirectoryUser U

ON G.GeographyKey = @GeographyKey AND

U.LoginName = SUSER_NAME()

AND G.EnglishCountryRegionName = U.GroupName

GO

CREATE SECURITY POLICY RLS.AWSecurityPolicy

ADD FILTER PREDICATE RLS.GeographyAccessPredicate(GeographyKey) ON dbo.DimCustomer

The performance was improved 2X compared with initial version of the security predicate. You’ll notice this difference when dealing with millions of records and thousand users in your active directory. It also depends of the SQL Server configuration.

The following is the query execution plan with the RLS security.

-- Show Plan with Security

ALTER SECURITY POLICY RLS.AWSecurityPolicy WITH (STATE=ON)

GO

SET SHOWPLAN_ALL ON

GO

SELECT COUNT(*) FROM DimCustomer

GO

SET SHOWPLAN_ALL OFF

When implementing RLS keep in mind that there is always an overhead in performance when querying a secured table with RLS. The performance of the queries results depends on how you implement the WHERE condition logic and this logic can be optimized using indexes in the joined fields and avoiding complex functions in your RLS security policy.

Categories
Datawarehouse

SQL Server 2016 On-Premise with Row-Level Security with Active Directory (Part I)

In this article, I’ll explain a basic use case of Row-Level-Security implementation for Microsoft SQL Server 2016 database.

The AdventureWorks Database supports a fictitious, multinational manufacturing company called Adventure Works Cycles. This company sells products around the world. Sales are ordered from five different countries like Unites States, Germany, United Kingdom, Canada, France and Australia.

Adventure Works Cycles has Sales Teams in each of these countries. These teams should only access to data from their corresponding country.

In the picture below, we can see that Ginger Howell, a sales representative from the United States Sales Team, is trying to get information from the Customer Dimension table (Dim Customer).

When GHowell send the query: “SELECT * FROM DimCustomer” to the database. The DimCustomer table contains customer records not only from USA, but from other countries. The database should retrieve to GHowell only the records corresponding to United States Customers.

SQL Server 2016 introduced a new feature the database engine: Row-Level-Security. Row Level Security (RLS) is a concept that provides security at the table row level within the database layer, instead of at the application layer. RLS is accomplished by using a function (predicate) and the security policy that should be implemented by a database administrator.

We are going to use Adventure Works Database as a sample. You can download the database from this link: https://msftdbprodsamples.codeplex.com/releases/view/125550

Download and restore the Adventure Works DW 2014 Full Database Backup.zip

The following is a simplified diagram that includes only three tables from the Adventure Works database: DimCustomer, DimGeography and FactInternetSales

Now, we have Active directory groups corresponding to each country’s sales team.

We create a SQL Login for each of these Active Directory groups.

With the SQL Logins created, we can implement our security policy.

First, we create a new schema, just to keep SQL objects separate from the existing objects.

CREATE SCHEMA RLS

We create the function that we’ll use as a filter predicate for the security policy.

The predicate function is an inline table valued function. This function returns 1 when a row in the DimGeography table matches the corresponding CountryRegionCode (G.CountryRegionCode = ‘US’) and the user executing the query belongs to the corresponding Active Directory group (IS_MEMBER(‘DOMAIN\AW United States’) = 1).

CREATE FUNCTION RLS.GeographyAccessPredicate(@GeographyKey AS INT)

RETURNS TABLE

WITH SCHEMABINDING

AS

RETURN SELECT 1 AS AccessResult

FROM dbo.DimGeography G

WHERE

G.GeographyKey = @GeographyKey AND

(
 (
    G.CountryRegionCode = 'US' AND IS_MEMBER('DOMAIN\AW United States') = 1

  )
  OR

 (
    G.CountryRegionCode = 'GB' AND IS_MEMBER('DOMAIN\AW United Kingdom') = 1
 )
  OR
 (
     G.CountryRegionCode = 'DE' AND IS_MEMBER('DOMAIN\AW Germany') = 1
 )
  OR
  (
     G.CountryRegionCode = 'FR' AND IS_MEMBER('DOMAIN\AW France') = 1
   )
  OR
   (
     G.CountryRegionCode = 'CA' AND IS_MEMBER('DOMAIN\AW Canada') = 1
   )
  OR
  (
    G.CountryRegionCode = 'AU' AND IS_MEMBER('DOMAIN\AW Australia') = 1
  )
)
 

Finally, we add the function as filter predicate on the DimCustomer table:

CREATE SECURITY POLICY RLS.AWSecurityPolicy

ADD FILTER PREDICATE RLS.GeographyAccessPredicate(GeographyKey) ON dbo.DimCustomer

After this our Row-Level-Security is implemented!

Now, we’ll test it by running some scripts. We impersonate a user, GHowell, which belongs to the United States Active Directory.

EXECUTE AS LOGIN = 'DOMAIN\GHowell' -- United States Group

SELECT SUSER_NAME()

SELECT FirstName, LastName, EnglishCountryRegionName

FROM dbo.DimCustomer C

INNER JOIN dbo.DimGeography G

ON C.GeographyKey = G.GeographyKey

The query returns only customers from United States region as expected.

Execute the Revert command to return the execution context back to your user.

REVERT;

Now let’s see what happens behind scenes when the query is executed.

-- Show Plan without Security
GRANT SHOWPLAN TO [DOMAIN\Administrator]
GO
ALTER SECURITY POLICY RLS.AWSecurityPolicy WITH (STATE=OFF)
GO
SET SHOWPLAN_ALL ON
GO
SELECT COUNT(*) FROM DimCustomer
GO
SET SHOWPLAN_ALL OFF

The execution plan shows a simple scan to the DimCustomer table, when the security policy is off.

Now, let’s see how is the execution plan when the security policy is applied.

-- Show Plan with Security

ALTER SECURITY POLICY RLS.AWSecurityPolicy WITH (STATE=ON)
GO
SET SHOWPLAN_ALL ON
GO
SELECT COUNT(*) FROM DimCustomer
GO
SET SHOWPLAN_ALL OFF

When the security policy applied, the execution plan shows an extra join. The filter predicate is being applied and joined to the DimCustomer table.

There is an overhead when Row-Level-Security is applied. This impact in the performance of the queries is affected by the implementation of the filter predicate function.

In our filter predicate function GeographyAccessPredicate, there is a WHERE condition that slows down the query performance. The GeographyAccessPredicate function contains OR conditions for each country, which it will denied the use of indexes in the tables.

Second, we use the function called IS_MEMBER, which evaluate whether the current user is a member of the specified Active Directory group. This also will execute poorly when we query tables with millions rows.

In the next article, I will show a better approach for the predicate function, to speed up the query performance.

Categories
Datawarehouse Power BI

SQL Saturday Presentation – Data Security with Power BI, SSAS, SQL Server and Active Directory

Hector Villafuerte speaking on SQL Saturday South Florida

Saturday, Jun 10, 2017 Conference

http://notes.optimumclick.com/wp-content/uploads/sites/3/2017/06/RLS-SQL-Saturday-1.pptx

Categories
Datawarehouse Power BI

Bringing Security where data lives

The following is a series of articles that explain how we can use Row Level Security (RLS) and Active Directory across all BI technologies, database and tools in the corporate environment.

The goal is to create an integrated and a comprehensive data access control across multiple database technologies using different BI Tools or Apps.

SQL Server 2016 and Power BI started to support Row-Level-Security last year. RLS was already supported by SQL Server Analysis Services (SSAS) with Tabular and Multidimensional for some time.

The following picture shows some BI Tools and Databases used in the Microsoft BI Ecosystem.

On the top we have BI Tools used for authoring, publishing and hosting reports like:

  • Excel, number one BI Tool in the world, which can get data from a large variety of data sources and use it in Excel with Power Pivot or Power Query.
  • Report Builder to author printed paginated reports that can be share and published using the in SSRS Portal.
  • Power BI Desktop to author interactive visualizations with analytics capabilities.
  • Power BI Service, which is the portal that host Dashboards and Power BI reports.
  • We also can have Web Applications using Microsoft .NET MVC or Web Forms with Entity Framework that connect to any of these data sources.

On the bottom we have a variety of databases:

  • SQL Server Relational database, which could be an online transactional processing (OLTP) databases or Online analytical processing (OLAP) database, when use it as a data warehouse.
  • SQL Server Analysis Services (SSAS) Tabular Mode
  • SQL Server Analysis Services (SSAS) Multidimensional Mode
  • Power BI Imported Datasets, which are very similar to Tabular databases, but hosted on the cloud (Power BI Service) instead of on premise.

Traditionally, BI applications used to implement all rules regarding the data security within the application. For example we can have a BI App that implement security that allow users or group of users to access only to some specific set of data. The problem with this approach is it does not allow to reuse this security among other BI Tools. For example we can have security implemented in SSRS for folders and report level permissions or connections. Even though this SSRS security cannot be reused by Power BI nor Excel.

The best place to implement the security is right within the database, where the security can be centralized and maintained in only one place. On this way, any BI Tool or any App that uses the database, will also use the implemented security without the need to implement security in every single Web App or Tool.

The following picture shows a comparison of a database and application features. Business Intelligence and Security used to be common responsibilities of and application. Now databases have more powerful features in terms of business intelligence and security and these responsibilities can be implemented now in the database.

In order to implement a centralized and reusable security, we have to use these two security features available in all databases (Relational SQL, SSAS Tabular, SSAS Multidimensional):

  • Row Level Security (RLS) and
  • Active Directory (AD) Groups.

Implementing security in the database using RLS and AD give us the flexibility to use any BI tool that can connect to our database using an Active Directory account and our database will be protected using centralized security rules implemented right in the database.

In the following articles, I will implemented RLS and AD security across all these technologies and all BI tools. In these articles I will be share code samples, I’ll show How to unit test the security and also How to troubleshoot the security.

The following list is the series of articles to be publish related with this topic:

SQL Server 2016 On-Premise with Row-Level Security with Active Directory (Part I)

SQL Server 2016 On-Premise with Row-Level Security with Active Directory (Part II)

SSAS 2016 Tabular On Premise with Row-Level Security and Active Directory.

Client App Security (MVC, Web Forms, Web API) with SQL 2016 RLS and Active Directory

Power BI Security with Active Directory Synch and On-Premise Data Gateway.

Power BI using SSAS Tabular with Row Level Security.

Power BI using SSAS Multidimensional with Row Level Security.

Power BI – Imported Data with Row Level Security.

Power BI – Direct Query with Row Level Security.

Power BI: Integrate Power BI dashboard with a web application.