Wednesday, March 14, 2012

Importing a Library in SAS Enterprise Miner OnDemand

Most of the Kaggle datasets have been added to a folder associated with our class on the SAS Enterprise Miner OnDemand servers. As an alternative to uploading data files yourself using the "File Import" node, you can access these datasets by adding the folder as a library to your project. These are the steps to do this.

1) Click on the name of your project in the project tree (upper left of the Enterprise Miner interface) and, in the Properties Panel, click on the "..." button next to "Project Start Code".



2) Add the following piece of SAS code to the project start code window

libname KAGGLE "/courses/u_dit.ie1/i_610146/c_3477/KaggleDatasets";  

and hit "Run Now" a couple of times before hitting OK.
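
If you want to confirm that the library was actually assigned, one option is to add a check like the one below after the libname statement and look at the log when you hit "Run Now". This is only a sketch: it assumes the libname statement above has already run, and it relies on the standard SAS LIBREF function, which returns 0 when a libref is assigned and a non-zero value otherwise.

```sas
/* Assumes the libname statement from step 2 has already been run. */
/* LIBREF returns 0 if the KAGGLE libref is assigned, non-zero if it is not. */
%put NOTE: KAGGLE libref return code = %sysfunc(libref(KAGGLE));
```

A non-zero return code usually means the path in the libname statement is not one your account can reach.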



3) Now you can add a data source to your project as normal, and when it comes to selecting from a library, a new library called Kaggle should be present in the list. The datasets within this library have names that should show an obvious connection back to the Kaggle contests.
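
If you would rather see the dataset names in the log than browse for them in the data source wizard, something along these lines should list the members of the library. It is a minimal sketch that assumes the KAGGLE libref from step 2 is already assigned; running it from the project start code window (or the Program Editor) is my assumption about where it is most convenient.

```sas
/* Assumes the KAGGLE libref from step 2 is already assigned. */
/* Without the NOLIST option, PROC DATASETS writes the library directory to the log. */
proc datasets library=KAGGLE;
quit;
```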



4) Continue as normal.

Some problems are emerging with the bigger datasets, so I will keep working on these. If you are having any problems, or need a dataset that isn't present, just give me a shout (you can also try uploading it yourself using the File Import node).

3 comments:

  1. Some people have been receiving authorisation errors when they have tried to connect to the Kaggle libraries on the SAS server. The exact libname statement you need to use depends on the Enterprise Miner class you downloaded your licence for.

    There are two classes you are likely to be a member of. If you downloaded your licence for "DT286 Machine Learning" then use the following libname statement:

    libname KAGGLE "/courses/u_dit.ie1/i_610146/c_2077/KaggleDatasets";

    Alternatively if you downloaded your licence for "SAS EMiner Certification Revision" then you should use the libname statement in the original post:

    libname KAGGLE "/courses/u_dit.ie1/i_610146/c_3477/KaggleDatasets";

    On another note the credit scoring and cars kicked competitions seem to have the data that is most straightforward to use.

  2. Slight revision to above. The best thing to do is to make sure to register for a licence for the "SAS EMiner Certification Revision" module as there are some problems with data access for the other one.

  3. I am not in your class (not even on the same continent), but I have been struggling for 6+ hours to load my course datasets on SAS Enterprise Miner OnDemand to no avail. Thank you so so much, this was extremely helpful. Arnold.
