RapidMiner tips and tricks #1 How to use SQL Server named instances with RapidMiner Read/Write database operators
July 14, 2013 at 7:29 PM
Hello and welcome to the first of many tips and tricks for RapidMiner. If you are unfamiliar with RapidMiner, it's an open-source, Java-based data mining solution. You can visit the official RapidMiner website by clicking here. My plan is to write short articles that provide solutions to the problems I encounter as I learn more about this awesome application.
RapidMiner and database connectivity
There are many operators in RapidMiner that take input data sets and generate models for prediction and analysis. Often, you will want to write the result set of the model to a database. To do this you use the "Write Database" operator.
I was using RapidMiner for web mining by way of the Crawl Web operator. The example set output of the Crawl Web operator was connected to the input of the Write Database operator. At the time, I was using a SQL Server database that I pay for through my web hosting account. Just like almost everything in RapidMiner, the setup was easy and worked like a charm. However, my hosting plan had a database size quota of 200MB, and it became apparent that I would quickly run out of space. So I decided to use the local SQL Server 2012 Express named instance on my machine. This is where the problem was introduced: I couldn't figure out how to set up the database connection in RapidMiner.
RapidMiner, Named Instances, and Integrated Security
The issues that I encountered when trying to set up my local SQL Server 2012 named instance were as follows:
- If I used the named instance as the server name (localhost\SQLExpress), I was unable to connect. I didn't encounter this problem with my hosting server's database because it used a plain hostname (xxx.sqlserverdb.com). There was no instance name, so the configuration was easy.
- I wasn't sure how to specify integrated security, as this is something you usually specify in the connection string. I didn't encounter this problem with my hosting database server either, because I was given a user name and password to connect to the server.
After some research and banging my head against my laptop, I finally figured out the resolution to my problems and I'm here to save someone else the headache.
For the named instance issue, there is a trick that is not readily apparent. Set the database server name as usual (in my case, localhost). Then, when you specify the database name, append a semicolon (;) followed by instance=<instance name>. So for my local server instance (localhost\sqlexpress), I set the Host value to localhost and the Database scheme value to mydatabasename;instance=sqlexpress.
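To see why the trick works, it helps to look at the JDBC URL that the driver ends up with. jTDS URLs take the form jdbc:jtds:sqlserver://<host>[/<database>][;<property>=<value>], so anything after a semicolon in the database name is read as a driver property. Here is a minimal sketch in Java, assuming RapidMiner simply appends the Database scheme value after the host. The class and method names are made up for illustration, and the exact URL RapidMiner builds (including any port) may differ.

```java
// Sketch only: shows how the Host and "Database scheme" values might combine
// into a jTDS JDBC URL. JtdsUrlSketch and buildUrl are hypothetical names.
class JtdsUrlSketch {

    // jTDS URL shape: jdbc:jtds:sqlserver://<host>[/<database>][;<property>=<value>...]
    // Assumption: the database scheme is appended verbatim after the host.
    static String buildUrl(String host, String databaseScheme) {
        return "jdbc:jtds:sqlserver://" + host + "/" + databaseScheme;
    }

    public static void main(String[] args) {
        // Values taken from the connection settings described above.
        String url = buildUrl("localhost", "mydatabasename;instance=sqlexpress");
        System.out.println(url);
        // prints jdbc:jtds:sqlserver://localhost/mydatabasename;instance=sqlexpress
    }
}
```

The ;instance=sqlexpress suffix is not part of the database name at all. It rides along in the Database scheme field and reaches the driver as the instance property.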
As for the integrated security requirement, all you need to do is make sure that you have the latest jTDS SQL Server driver from here. Once you download the zip file, extract the file jtds-1.3.0-dist.zip\x86\SSO\ntlmauth.dll and place it in your windows\system32 directory. This ensures that the driver is able to use integrated security. Once this file is in place, simply leave the username and password values blank. Here is a screenshot of the Manage Database Connections window in RapidMiner for your reference.
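If you want to check the integrated security setup outside of RapidMiner, a small standalone test can help. The sketch below is my own illustration, not anything RapidMiner runs: it assumes the jTDS jar is on the classpath, that ntlmauth.dll is installed as described above, and that jTDS falls back to Windows single sign-on when no user name or password is supplied. The class name and the tryConnect helper are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Sketch only: attempts a connection with no credentials so that the driver
// has to rely on integrated security. Names here are hypothetical.
class IntegratedSecuritySketch {

    // Opens a connection using only the URL (no user name, no password).
    // Returns false if a SQLException is thrown, for example when no driver
    // is on the classpath or the server cannot be reached.
    static boolean tryConnect(String url) {
        try (Connection connection = DriverManager.getConnection(url)) {
            return !connection.isClosed();
        } catch (SQLException e) {
            System.err.println("Connection failed: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // Same host, database, and instance values as in the RapidMiner settings.
        String url = "jdbc:jtds:sqlserver://localhost/mydatabasename;instance=sqlexpress";
        System.out.println(tryConnect(url) ? "Connected" : "Not connected");
    }
}
```

If this connects from the command line but RapidMiner still fails, the problem is more likely in the connection settings than in the driver or the DLL.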
Well, that about wraps it up. Please leave a comment if you have any questions.
Until next time,