The operator “Enrich Data by Webservice” of the RapidMiner Web Mining Extension appears to have trouble connecting to URLs over HTTPS. Please contact RapidMiner for support. If you have found a solution or an alternative, please help us fix this tutorial by letting us know via email or in a comment below. Thank you!
RapidMiner is a great tool for non-programmers to do data mining and text analysis. This is a tutorial on how to do sentiment analysis with RapidMiner. This tutorial uses our free Twinword Sentiment Analysis API.
Requirements
- RapidMiner Studio (download free trial)
- Twinword Sentiment Analysis API Key (get free API key at Mashape)
Step 1) Install Web Mining Extension for RapidMiner
Before going any further, you should already have RapidMiner installed. If not, visit the link above, download and install the full software to start your free trial.
RapidMiner already comes packed with text processing capabilities. In addition, we can use it to connect to third-party APIs to do more work, such as connecting to our Twinword Sentiment Analysis API.
However, before we can do this, we need to install an extension that will allow us to send data to the web and capture the response.
First, start RapidMiner and in the top menu, go to Help > Marketplace (Updates and Extensions)…
Once the Marketplace is opened, search for “Web Mining” and install the extension.
With the Web Mining extension installed, you can now connect to REST APIs to process your text and data.
Step 2) Set Up the Connection to the API
Go to the “Design” page in RapidMiner.
To connect to our web API, you will need to use the Web Mining extension you just downloaded.
On the left “Operators” pane, find the operator called “Enrich Data by Webservice” listed under Web Mining > Services > Enrich Data by Webservice.
Drag it to the center “Main Process” pane and drop it there.
Select the operator we just dropped in the “Process” pane to edit the “Parameters” on the right pane.
We need to set the following parameters:
Parameter | Value
query type | Regular Expression
attribute type | Nominal
regular expression queries |
request method | POST
service method |
body |
url | https://twinword-sentiment-analysis.p.rapidapi.com/analyze/
separator |
delay | 0
request properties |
encoding | SYSTEM
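For readers who want to see what these settings amount to outside RapidMiner, here is a minimal Python sketch of the same kind of POST request. It only builds the request and does not send it. The body format (a form-encoded "text" field) and the header names are assumptions based on the usual RapidAPI convention, not values taken from this tutorial, and "YOUR_API_KEY" is a placeholder.

```python
import urllib.parse
import urllib.request

# URL taken from the "url" parameter above.
API_URL = "https://twinword-sentiment-analysis.p.rapidapi.com/analyze/"


def build_request(text, api_key):
    """Build (but do not send) a POST request similar to the operator's."""
    # Assumption: the API accepts a form-encoded body with a "text" field.
    body = urllib.parse.urlencode({"text": text}).encode("utf-8")
    # Assumption: header names follow the common RapidAPI convention.
    headers = {
        "X-RapidAPI-Key": api_key,
        "X-RapidAPI-Host": "twinword-sentiment-analysis.p.rapidapi.com",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


request = build_request("I love hotdogs.", "YOUR_API_KEY")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

To actually send it, the request object could be passed to urllib.request.urlopen, which requires a valid API key and network access.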
Note: we are using Regular Expression queries to match and grab the four items (“type”, “score”, “ratio”, and “keywords”) we want out of the entire JSON response that we would get back from the API.
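To illustrate the idea of pulling individual values out of a JSON response with regular expressions, here is a small Python sketch. The sample response and the four patterns are made up for illustration; they are not the actual queries used in the operator, and the real API response may contain more fields.

```python
import re

# Hypothetical response text, made up for illustration only.
sample_response = (
    '{"type": "positive", "score": 0.91, "ratio": 1, '
    '"keywords": [{"word": "love", "score": 0.99}]}'
)

# One pattern per attribute; each captures the wanted value in group 1.
# These patterns are assumptions, not the ones from the tutorial.
queries = {
    "type": r'"type":\s*"([^"]*)"',
    # Takes the first "score" in the text, which is the top-level one here.
    "score": r'"score":\s*(-?[0-9.]+)',
    "ratio": r'"ratio":\s*(-?[0-9.]+)',
    "keywords": r'"keywords":\s*(\[.*?\])',
}


def extract(response_text):
    """Return a dict with one string value (or None) per query."""
    row = {}
    for name, pattern in queries.items():
        match = re.search(pattern, response_text)
        row[name] = match.group(1) if match else None
    return row


row = extract(sample_response)
print(row)
```

Every extracted value is a plain string, which is in line with setting the attribute type to Nominal.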
After you’re done, it should look something like this:
Step 3) Set Up the Input Text
Now that we have the right settings to connect with the API, we need text to send.
Before we can start, make sure that you have the “Text Processing” extension installed. If not, go back to the Marketplace (Updates and Extensions) and install it the same way you installed the “Web Mining” extension.
First, let’s create a sample document with sample text. Again, in the left “Operators” pane, find the “Create Document” operator under Text Processing > Create Document.
Drag it to the center “Process” pane and drop it there. Select it so that we can edit the “Parameters” in the right pane.
Then click on “Edit Text…” in the “Parameters” pane to paste in some sample text. For the purpose of this tutorial, we will just type something like the following: I love hotdogs. Hotdogs are the greatest. They are hot and delicious.
Now we have a document. Great! However, the operator (“Enrich Data by Webservice”) we set up to connect to the API expects an input type called “Example Set”, not a “Document”.
So, we need to convert the “Document” type text we just created into an “Example Set”. Luckily, there is another operator right next door called the “Documents to Data” operator. You can find the operator under Text Processing > Documents to Data.
Drag and drop it into our “Process” pane and select it.
In the “Parameters” pane, just type text in the “text attribute” field.
Step 4) Link the Operators Up
You’re almost there! Just connect the operators.
- out of Create Document connects to doc of Documents to Data
- exa of Documents to Data connects to Exa (input) of Enrich Data by Webservice
- Exa (output) of Enrich Data by Webservice connects to res
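The chain of operators can be pictured as a small pipeline. The Python sketch below is only an analogy: the function names are hypothetical stand-ins for the three operators, the service call is replaced by a stub that returns a made-up response, and the response is parsed with the json module instead of regular expressions to keep the example short.

```python
import json


def create_document(text):
    """Stand-in for "Create Document": a document is just a string here."""
    return text


def documents_to_data(document, text_attribute="text"):
    """Stand-in for "Documents to Data": one row per document."""
    return [{text_attribute: document}]


def enrich(example_set, call_service):
    """Stand-in for "Enrich Data by Webservice": add columns to each row."""
    enriched = []
    for row in example_set:
        response = json.loads(call_service(row["text"]))
        new_row = dict(row)
        for name in ("type", "score", "ratio", "keywords"):
            new_row[name] = response.get(name)
        enriched.append(new_row)
    return enriched


def fake_service(text):
    # Stub for the real API: returns a made-up JSON response.
    return json.dumps({"type": "positive", "score": 0.9, "ratio": 1.0, "keywords": []})


# out -> doc, exa -> Exa, Exa -> res
result = enrich(documents_to_data(create_document("I love hotdogs.")), fake_service)
print(result)
```

The result is a single row holding the original text plus the four extra columns, which is the shape the “Results” page shows after the process runs.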
After you’re done, it should look something like this:
Step 5) Run It!
All that’s left now is to click run (the blue play icon at the top).
After running it, you should see the “Results” page showing a single row with several columns, including our “text” about hotdogs and the four items (“score”, “keywords”, “type”, “ratio”) that we used regular expressions to grab out of the JSON response from the Sentiment Analysis API.
Note: If you need more explanation on the meaning of the score and ratio, please read our blog post about Interpreting the Score and Ratio of Sentiment Analysis.
If something goes wrong, you can go back to the “Design” page and make the necessary changes and run it again.
Here is a link to Twinword’s Free Sentiment Analysis API mentioned in this article.
Good luck. If you have any questions or issues, please feel free to contact us at [email protected].