As soon as you run this statement, all subsequent computations will take place on the SQL Server computer specified in
the sqlCompute parameter.
2. If you decide that you'd rather run the R code on your workstation, you can switch the compute context back to the local
computer by using the local keyword.
R
rxSetComputeContext("local")
For a list of other keywords supported by this function, type help("rxSetComputeContext") from an R command line.
Note
After you've specified a compute context, it remains active until you change it. However, any R scripts that cannot be run in a
remote server context will be run locally.
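Because the compute context stays active until it is changed, it can help to remember the current one before switching and to restore it afterwards. The following is a minimal sketch, not part of the original walkthrough: it assumes the RevoScaleR package is installed, and the variable names hasRevo and previousContext are hypothetical. Where the package is missing, the block only prints a message.

```r
# Runs only where RevoScaleR is installed; elsewhere it just reports that.
hasRevo <- requireNamespace("RevoScaleR", quietly = TRUE)

if (hasRevo) {
  library(RevoScaleR)

  # Remember whichever compute context is active right now.
  previousContext <- rxGetComputeContext()

  # Work locally for a while, then put the earlier context back.
  rxSetComputeContext("local")
  print(rxGetComputeContext())
  rxSetComputeContext(previousContext)
} else {
  message("RevoScaleR is not installed; skipping the compute context example.")
}
```

Restoring the saved object instead of hard-coding a context name means the same lines work whether the earlier context was local or the SQL Server one.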
The R language provides many summary functions, but rxSummary supports execution on various remote compute
contexts, including SQL Server. For more information about similar functions, see Data Summaries in the ScaleR
reference.
2. When processing is done, you can print the contents of the sumOut variable to the console.
R
sumOut
Note
Don't try to print the results before they have returned from the SQL Server computer, or you might get an error.
Results
Summary Statistics Results for: ~gender + balance + numTrans + numIntlTrans + creditLine
Data: sqlFraudDS (RxSqlServerData Data Source)
Number of valid observations: 10000

Name          Mean       StdDev       Min  Max    ValidObs  MissingObs
balance       4075.0318  3926.558714  0    25626  100000
numTrans      29.1061    26.619923    0    100    10000     0           100000
numIntlTrans  4.0868     8.726757     0    60     10000     0           100000
2. Use the variable ccColInfo that you created earlier to define the columns in the data source.
You'll also add some new computed columns (numTrans, numIntlTrans, and creditLine) to the column collection.
R
# Column metadata for the data source. Factor columns list their stored
# levels; newLevels supplies the replacement labels for those levels.
ccColInfo <- list(
    gender = list(type = "factor",
        levels = c("1", "2"),
        newLevels = c("Male", "Female")),
    cardholder = list(type = "factor",
        levels = c("1", "2"),
        newLevels = c("Principal", "Secondary")),
    state = list(type = "factor",
        levels = as.character(1:51),
        newLevels = stateAbb),
    balance = list(type = "numeric"),
    # Levels for the count columns span the Min to Max range found in sumDF.
    numTrans = list(type = "factor",
        levels = as.character(sumDF[var == "numTrans", "Min"]:sumDF[var == "numTrans", "Max"])),
    numIntlTrans = list(type = "factor",
        levels = as.character(sumDF[var == "numIntlTrans", "Min"]:sumDF[var == "numIntlTrans", "Max"])),
    creditLine = list(type = "numeric")
)
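The level ranges for numTrans and numIntlTrans depend on sumDF and var, which come from an earlier step that is not shown in this section. As a minimal sketch, assuming sumDF is a data frame of the summary statistics with a Name column and var holds that column, the lookup can be tried in plain R. The mock values are the Min and Max figures from the Results output; the names minTrans, maxTrans, numTransLevels, and numIntlTransLevels are hypothetical.

```r
# Mock of the summary data frame, using the Min and Max values printed in the
# Results output. The real sumDF and var are built in earlier steps, so their
# shape here is an assumption.
sumDF <- data.frame(
  Name = c("balance", "numTrans", "numIntlTrans"),
  Min  = c(0, 0, 0),
  Max  = c(25626, 100, 60),
  stringsAsFactors = FALSE
)
var <- sumDF$Name

# Look up one cell by row condition and column name, as the colInfo code does.
minTrans <- sumDF[var == "numTrans", "Min"]
maxTrans <- sumDF[var == "numTrans", "Max"]

# The colon operator expands the range; as.character() turns it into levels.
numTransLevels <- as.character(minTrans:maxTrans)
numIntlTransLevels <- as.character(
  sumDF[var == "numIntlTrans", "Min"]:sumDF[var == "numIntlTrans", "Max"]
)

length(numTransLevels)      # 101 levels, "0" through "100"
length(numIntlTransLevels)  # 61 levels, "0" through "60"
```

Deriving the levels from the summary keeps the factor definition in step with the data, rather than hard-coding a range that could drift if the table changes.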
3. Having updated the column collection, you can apply the following statement to create an updated version of the SQL
Server data source that you defined earlier.
R
# Re-create the data source so it picks up the updated column collection.
sqlFraudDS <- RxSqlServerData(
    connectionString = sqlConnString,
    table = sqlFraudTable,
    colInfo = ccColInfo,
    rowsPerRead = sqlRowsPerRead)
The sqlFraudDS data source now includes the new columns added in ccColInfo.
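Before handing a column collection to RxSqlServerData, it can be worth a quick look at what the list actually contains. The following sketch uses only base R and a reduced, hypothetical copy of the collection named ccColInfoDemo, so it runs without the tutorial's earlier objects. The check that levels and newLevels have equal length reflects how they are paired in this tutorial and is an assumption, not a documented validation rule.

```r
# Reduced, hypothetical copy of the column collection: two factor columns and
# one numeric column.
ccColInfoDemo <- list(
  gender = list(type = "factor",
                levels = c("1", "2"),
                newLevels = c("Male", "Female")),
  balance = list(type = "numeric"),
  numTrans = list(type = "factor",
                  levels = as.character(0:100))
)

# One declared type per column, named by column.
colTypes <- vapply(ccColInfoDemo, function(col) col$type, character(1))
print(colTypes)

# Where newLevels is given, it should line up one to one with levels.
lengthsMatch <- vapply(ccColInfoDemo, function(col) {
  is.null(col$newLevels) || length(col$levels) == length(col$newLevels)
}, logical(1))
stopifnot(all(lengthsMatch))
```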
These modifications affect only the data source object in R; no new data has been written to the database table yet. However,
you can use the data captured in the sumOut variable to create visualizations and summaries. In the next step you'll learn how to
do this while switching compute contexts.
Tip
If you forget which compute context you're using, run rxGetComputeContext(). If the return value is RxLocalSeq Compute
Context, you're using the local compute context.
Next Step
Visualize SQL Server Data using R Data Science Deep Dive
Previous Step
Define and Use Compute Contexts Data Science Deep Dive
2016 Microsoft