Microsoft SQL Server and ML Modeling with Python

 

Microsoft SQL Server + Machine Learning: 


Microsoft has always been known as a company that moves with the times, developing remarkable products with extensive features, like Power BI for reporting, SQL Server, SSDT (SQL Server Data Tools), and other tools in the Business Intelligence field. 

Microsoft SQL Server introduced support for Python and R for Data Science and ML modeling in SQL Server 2017, and continues to extend this Python and R support with new features. 

Python & R can be used to provide advanced analytics solution to modern business problems like  

  • Performing advanced ETL that is not possible with basic SQL 

  • Fraud detection for credit cards 

  • Advanced product promotion analysis 

  • Generating Predictive Analysis and Reports for Business 


Prerequisites 

  • SQL Server 2017  
  • Python and R services (these need to be installed while installing SQL Server) 

Let’s Start: 

To start creating ML models in SQL Server, we require Python and R support, which needs to be installed while setting up the SQL Server installation. 

If Python support was not installed, it can be added later as an add-on service in SQL Server. 

To verify that Python is present, run the script below in an SSMS query window. 


--enable external scripts with sp_configure
EXEC sp_configure 'external scripts enabled', 1;
RECONFIGURE WITH OVERRIDE;

--script to test Python on SQL Server
EXECUTE sp_execute_external_script
@language = N'Python',
@script = N'import sys
p = "python version"
print(p, sys.version)
print("hello from SQL Server")';

GO 

 

The output confirms that Python 3.5.2 is available in our SQL Server instance. 

Microsoft SQL Server ships with the Continuum Analytics (Anaconda) Python distribution, which includes several Python modules for Data Science and advanced analytics such as Pandas, Scikit-learn, and NumPy. If a new Python module is required for our analysis, it can be installed with pip. 
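As a quick sketch (the bundled module set and versions may differ by installation), the script below can be run in SSMS to confirm which versions of these modules the SQL Server Python runtime actually has:

--check which versions of the bundled modules are available
EXECUTE sp_execute_external_script
@language = N'Python',
@script = N'
import pandas, numpy, sklearn
print("pandas:", pandas.__version__)
print("numpy:", numpy.__version__)
print("scikit-learn:", sklearn.__version__)';
GO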

 

Note: Make sure the SQL Server Launchpad service is up and running, because the Launchpad service is required to execute external scripts (Python or R). 
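One way to check this from SSMS (instead of the Services console) is the sys.dm_server_services DMV, which should list the Launchpad service alongside the database engine; the exact service name depends on your instance name:

--check that the SQL Server Launchpad service is running
SELECT servicename, status_desc, startup_type_desc
FROM sys.dm_server_services;
--expect a row like 'SQL Server Launchpad (MSSQLSERVER)' with status_desc = 'Running'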

 

 

Step 1:  Creating a Table for Model storage 

First, we need to create a table to store the trained machine learning model, so that a model trained once can be reused multiple times for predictions. 


Note: Complex algorithms can consume a lot of CPU and other resources during training, so it is a best practice to save trained models to save time. 


Code: Create the ML model table in SQL Server. 


CREATE TABLE ML_py_models ( 

  model_name VARCHAR(30) NOT NULL DEFAULT('default model') PRIMARY KEY,
  model VARBINARY(MAX) NOT NULL

); 

GO 


Step 2: SP to Create and Train the ML Model 

Create a custom SP (Stored Procedure) that creates the model and returns a pickled object to be saved in the SQL Server table. 

The ML model we are creating will use Amazon stock data and predict the stock price 10 days into the future. 


Code to create Stored Procedure: 

CREATE PROCEDURE generate_ML_py_model (@trained_model varbinary(max) OUTPUT) 

AS 

BEGIN 

EXECUTE sp_execute_external_script 

@language = N'Python' 

, @script = N' 

from pandas import read_csv, DataFrame 

from sklearn.linear_model import LinearRegression  

#import matplotlib.pyplot as plt 

#import quandl 

from sklearn import preprocessing, cross_validation  # cross_validation became model_selection in newer scikit-learn 

import numpy as np 

import pickle 

data = read_csv("E:\\amazon_data.csv") 

forcast = int(10) 

data = data[["Adj. Close"]] 

data["prediction"] = data[["Adj. Close"]].shift(-forcast) 

feature = np.array(data.drop(["prediction"],1)) 

#feature = np.array(data["Adj. Close"]) 

label = np.array(data["prediction"]) 

X = preprocessing.scale(feature) 

y = label[:-forcast] 

forcast_value = X[-forcast:] 

X= X[:-forcast] 

X_train, X_test, y_train, y_test = cross_validation.train_test_split(X, y, test_size=0.2) 

lr = LinearRegression() 

lr.fit(X_train, y_train) 

#Before saving the model to the DB table, we need to convert it to a binary object 

trained_model = pickle.dumps(lr)' 

, @input_data_1 = N'' 

, @input_data_1_name = N'' 

, @params = N'@trained_model varbinary(max) OUTPUT' 

, @trained_model = @trained_model OUTPUT; 

END; 

GO 

It is also possible to develop custom SPs with different parameters, such as features and labels, according to the dataset, and to create different ML models with several other machine learning algorithms. 

These different models can be saved with pickling and used later. 


Step 3: Saving the trained model into the table 

The Stored Procedure created earlier is used to train the ML model and save it into the table ML_py_models, as shown below. 
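A minimal sketch of this step; the model name 'amazon_stock_lr' is just an illustrative value, and any name up to 30 characters works with the table defined in Step 1:

DECLARE @model VARBINARY(MAX);
EXEC generate_ML_py_model @trained_model = @model OUTPUT;

--save (or refresh) the trained model in the ML_py_models table
DELETE FROM ML_py_models WHERE model_name = 'amazon_stock_lr';
INSERT INTO ML_py_models (model_name, model) VALUES ('amazon_stock_lr', @model);
GO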

 

 

Step 4: Creating a Stored procedure to use the trained model 

We create a custom SP that uses the trained model and returns the expected output. 

Code: The SP below uses a trained model to produce the prediction output. 


CREATE PROCEDURE ML_py_predict (@model varchar(100)) 

AS 

BEGIN 

  DECLARE @py_model varbinary(max) = (select model from ML_py_models where model_name = @model); 

  EXEC sp_execute_external_script 

        @language = N'Python', 

        @script = N' 

import pickle 

from pandas import read_csv, DataFrame 

from sklearn.linear_model import LinearRegression  

from sklearn import preprocessing, cross_validation 

import numpy as np 

 

data = read_csv("E:\\amazon_data.csv") 

forcast = int(10) 

data = data[["Adj. Close"]] 

data["prediction"] = data[["Adj. Close"]].shift(-forcast) 

feature = np.array(data.drop(["prediction"],1)) 

X = preprocessing.scale(feature) 

forcast_value = X[-forcast:] 

X= X[:-forcast] 

lr = pickle.loads(py_model) 

predictions = lr.predict(forcast_value) 

df = DataFrame({"Amazon_Predicted_stock": predictions}) 

OutputDataSet = df 

print(df)' 

, @input_data_1 = N'' 

, @input_data_1_name = N'' 

, @params = N'@py_model varbinary(max)' 

, @py_model = @py_model 

with result sets (("Amazon_Predicted_stock" float)); 

END; 

GO 

 
Step 5: Predicting output by using the trained model 

Now we will call the SP to get our predictions, which you can see in the output. 

As our model was designed to predict stock prices 10 days ahead, we get 10 consecutive values. 
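A sketch of the call, assuming the model was saved under the illustrative name 'amazon_stock_lr' used in Step 3:

EXEC ML_py_predict @model = 'amazon_stock_lr';
GO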

 

 

 

This output result set can be used to generate a predictive report for the business. 
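Because the procedure declares its result shape with WITH RESULT SETS, the predictions can also be captured into a table for reporting; the table name below is illustrative:

--capture the predictions in a table for reporting
CREATE TABLE Amazon_Stock_Predictions (Amazon_Predicted_stock FLOAT);

INSERT INTO Amazon_Stock_Predictions (Amazon_Predicted_stock)
EXEC ML_py_predict @model = 'amazon_stock_lr';

SELECT * FROM Amazon_Stock_Predictions;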

This is a simple example of using SQL Server advanced analytics with Python. We can develop more reusable, custom Stored Procedures for creating and training ML models to solve different business requirements and needs. 

 

Conclusion: 

As Data Science and Machine Learning are emerging technologies in today's era, every business wants to get maximum ROI (Return on Investment) by using advanced analytics. Microsoft is focusing on providing the best solutions for its current products like Power BI, SQL Server, SSRS, SSAS, and SSIS to make BI (Business Intelligence) more efficient and effective. 

Microsoft is continuously enhancing SQL Server's capabilities: in the preview of SQL Server 2019, Microsoft is adding support for HDFS and the Spark framework to simplify big data analytics for SQL Server users. 

 
