
Are the benefits of RubyMine worth the order of magnitude increase in memory usage over Emacs?

One of my labors of love is a small community blogging site called Yakkstr. There are a few hundred active users, and the site lets them subscribe to posts so they can be notified when a post gets new comments. I’ve wanted to build a collaborative filter that gives users a list of posts they are likely to be interested in and may have missed (i.e. didn’t subscribe to). This is a fairly standard collaborative filtering problem.
Note: The trick with Mahout is choosing the right classes for similarity, neighborhood, and recommender.
I’ll use users’ post subscriptions to determine which other users they are most similar to. The easiest way to get data into Mahout is through a CSV file. The example below has two fields: user_id, post_id
4,1
7,2
4,4
1,4
4,3
8,1
8,3
4,5
4,6
6,6
...
The full data from Yakkstr as of today includes 21k subscriptions. Getting this into Mahout is straightforward; I’ll use the FileDataModel class.
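To make the data shape concrete, here is a minimal plain-Ruby sketch of what loading that CSV amounts to: a map from each user to the set of posts they subscribed to. This is not Mahout's FileDataModel, just an illustration using the sample rows above; `SAMPLE_CSV` and `load_subscriptions` are hypothetical names I made up for the example.

```ruby
require 'set'

# Sample rows from the example above, in user_id,post_id form.
SAMPLE_CSV = <<~CSV
  4,1
  7,2
  4,4
  1,4
  4,3
  8,1
  8,3
  4,5
  4,6
  6,6
CSV

# Build a Hash of user_id => Set of subscribed post_ids.
# load_subscriptions is a hypothetical helper, not part of Mahout.
def load_subscriptions(csv_text)
  subscriptions = Hash.new { |hash, user| hash[user] = Set.new }
  csv_text.each_line do |line|
    line = line.strip
    next if line.empty?
    user_id, post_id = line.split(',').map { |field| Integer(field) }
    subscriptions[user_id] << post_id
  end
  subscriptions
end

subscriptions = load_subscriptions(SAMPLE_CSV)
puts subscriptions[4].to_a.sort.inspect
```

Because the preferences are binary, a set per user is all the model needs to hold; there is no rating value to store.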
Next I need to choose the similarity metric I want to use. Since the data is binary (a user has either subscribed to a post or they haven’t), I’ll use the TanimotoCoefficientSimilarity, which is a good way to measure the similarity between sets. We’ll also set up the neighborhood using NearestNUserNeighborhood, which will allow us to get the N most similar users to a given user.
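For anyone who hasn't met the Tanimoto coefficient before, it is the size of the intersection of two sets divided by the size of their union. The sketch below shows that calculation and a "nearest N users" lookup in plain Ruby. It is an illustration of the idea under my own assumptions, not Mahout's implementation; the sample data and the `tanimoto` and `nearest_users` helpers are hypothetical.

```ruby
require 'set'

# Hypothetical sample data: user_id => Set of subscribed post_ids.
subscriptions = {
  1 => Set[4],
  4 => Set[1, 3, 4, 5, 6],
  6 => Set[6],
  7 => Set[2],
  8 => Set[1, 3]
}

# Tanimoto coefficient: |A & B| / |A | B|, or 0.0 when both sets are empty.
def tanimoto(a, b)
  union_size = (a | b).size
  return 0.0 if union_size.zero?
  (a & b).size.to_f / union_size
end

# Return the n users most similar to target_user as [user_id, similarity] pairs,
# most similar first (ties broken by user id).
# nearest_users is a hypothetical helper, not Mahout's NearestNUserNeighborhood.
def nearest_users(subscriptions, target_user, n)
  target_posts = subscriptions.fetch(target_user)
  subscriptions
    .reject { |user, _| user == target_user }
    .map { |user, posts| [user, tanimoto(target_posts, posts)] }
    .sort_by { |user, similarity| [-similarity, user] }
    .first(n)
end

p nearest_users(subscriptions, 4, 2)
```

Here user 8 shares two of user 4's five posts, so it ranks ahead of users who share only one.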
The final step is to choose the Mahout recommender implementation I want to use. Since I want recommendations for a user based on binary preference data, I’ll use GenericBooleanPrefUserBasedRecommender.
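Conceptually, a user-based recommender over binary data finds a user's nearest neighbors and suggests posts those neighbors subscribed to that the user hasn't. The plain-Ruby sketch below scores each candidate post by summing the similarities of the neighbors who subscribed to it. That scoring rule is my assumption for illustration, and it may not match Mahout's internals exactly; the sample data and the `tanimoto` and `recommend` helpers are hypothetical.

```ruby
require 'set'

# Hypothetical sample data: user_id => Set of subscribed post_ids.
subscriptions = {
  1 => Set[10, 11, 12],
  2 => Set[10, 11, 13],
  3 => Set[10, 12, 13, 14],
  4 => Set[20, 21]
}

# Tanimoto coefficient: |A & B| / |A | B|, or 0.0 when both sets are empty.
def tanimoto(a, b)
  union_size = (a | b).size
  return 0.0 if union_size.zero?
  (a & b).size.to_f / union_size
end

# Recommend up to how_many posts the target user has not subscribed to.
# Each candidate's score is the summed similarity of the neighbors who have it.
# recommend is a hypothetical helper, not Mahout's recommender API.
def recommend(subscriptions, target_user, neighborhood_size, how_many)
  target_posts = subscriptions.fetch(target_user)
  neighbors = subscriptions
    .reject { |user, _| user == target_user }
    .map { |user, posts| [user, tanimoto(target_posts, posts)] }
    .select { |_, similarity| similarity > 0 }
    .sort_by { |user, similarity| [-similarity, user] }
    .first(neighborhood_size)

  scores = Hash.new(0.0)
  neighbors.each do |user, similarity|
    (subscriptions[user] - target_posts).each { |post| scores[post] += similarity }
  end
  scores.sort_by { |post, score| [-score, post] }.first(how_many).map(&:first)
end

p recommend(subscriptions, 1, 2, 5)
```

Post 13 comes out on top for user 1 because both of that user's nearest neighbors subscribed to it.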
That’s it, a collaborative filter using JRuby and Mahout in 10 lines of code (not counting the requires). Full source below.
I’ve long believed that the core value of JRuby, much like a killer app for an OS, will be the killer wrappers of good Java libraries. There are all these amazing Java libraries that are out of sight, out of mind for the average rubyist. In a recent post I discussed Lucene, another awesome Java library for which there isn’t a good MRI equivalent.
If you’re already proficient with elisp and emacs, you can see my config file here. Everything I’m using was installed via package.el from the marmalade repo.
Configuring emacs is getting easier and easier due to package.el and Marmalade.
The first step is to install package.el if you’re using emacs 23 or lower (package.el will be included with emacs 24).
$ cd ~/Library/Preferences/Aquamacs\ Emacs/
$ curl -o package.el http://repo.or.cz/w/emacs.git/blob_plain/1a0a666f941c99882093d7bd08ced15033bc3f0c:/lisp/emacs-lisp/package.el
Next, edit ~/Library/Preferences/Aquamacs Emacs/Preferences.el
Restart Aquamacs. You can now view all packages in marmalade with M-x package-list-packages and install one by clicking on its name, which opens a pop-up, and then clicking on ‘install’.
Here are the packages I have installed.
A quick rundown of my favorite features.
I use anything in place of switch-buffer; try it, it’s great. I’ve also added the anything-git-project function, which will match against all files in the git repo that the current buffer is in. This is like CMD-t in TextMate.
I’ve made ido much more friendly. Seeing is believing, so try it.
I haven’t found the perfect theme, but color-theme-sanityinc-solarized is very close.
First, install leiningen; instructions are here
Install the swank-clojure plugin
$ lein plugin install swank-clojure 1.3.2
Install clojure-mode via package.el (M-x package-list-packages)
Now let’s create a new clojure project with lein
$ lein new test-project
$ cd test-project
$ lein deps
In Aquamacs, open one of the clojure files from the project, then run (this can take a few seconds)
M-x clojure-jack-in
Yay, you’re now in slime with clojure and the classpath for your lein deps set up properly. I’m a big fan of paredit and have a hard time writing lisp without it these days. Give it a try if you aren’t familiar with it, though it will take a day or two to get used to. This cheatsheet will help.
Lastly, you can see my entire emacs configuration here
I LOVE Ruby. The language is fun to use, the community is vibrant with good taste, and the ecosystem is diverse. JRuby is a major part of this diversity, and many of the rubyists I run into don’t take advantage of it. sadface
This is a mistake. Now let’s see how easy it is to use Lucene in JRuby. Lucene is a search engine, but the components of a search engine (tokenizers, term frequencies, etc.) are very useful on their own. Below are some examples. I left out some of the code that sets up the modules in JRuby; you can check out the full runnable source on github.
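If tokenizers and term frequencies are unfamiliar, the idea is simple enough to show in a few lines of plain Ruby. To be clear, this sketch does not use Lucene at all: `tokenize` and `term_frequencies` are hypothetical helpers, and a real Lucene analyzer does much more (stop words, stemming, and so on).

```ruby
# Lowercase the text and split it into word tokens.
# A crude stand-in for what a search engine's tokenizer does.
def tokenize(text)
  text.downcase.scan(/[a-z0-9']+/)
end

# Count how many times each token appears in the text.
def term_frequencies(text)
  tokenize(text).each_with_object(Hash.new(0)) { |term, counts| counts[term] += 1 }
end

counts = term_frequencies("JRuby makes Java libraries feel like Ruby. Ruby is fun.")

# Show the most frequent terms first, breaking ties alphabetically.
p counts.sort_by { |term, count| [-count, term] }.first(3)
```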
Side Note: The Lucene API is awful, and its use is ugly even in JRuby, but it’s not hard to wrap it in a warm blanket of ruby. I once helped do this, but it hasn’t been updated to use the latest version of Lucene. It’s a great library if you don’t care about that, so check it out.
The following passage from The Algorithm Design Manual jumped out at me as I read it.
“The heart of any algorithm is an idea. If your idea is not clearly revealed when you express an algorithm, then you are using too low-level a notation to describe it”
I wonder how much of a productivity hit this adds for each programmer who reads the code, and what effect this has on bug rates, given that there is a measurable relationship between LOC and bugs.
Side Note: This isn’t about static vs dynamic typing. The Scala code for this is virtually identical to the Ruby code.