Mochiweb to Scalaris example
I’ve created a simple HTTP interface with MochiWeb that lets you read and write key/value pairs in Scalaris. The REST-like interface is very simple:
To write, send a request to “http://localhost:8002/scalaris/write” with the parameters key="your_key" and value="your_value".
To read, send a request to “http://localhost:8002/scalaris/read” with the parameter key="your_key".
The code uses a gen_server process that makes an rpc:call to the Scalaris API.
You can check out the code here: mochiweb-scalaris
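As a rough illustration of calling the interface from a client, here is a minimal Ruby sketch. The base URL and the parameter names come from the description above; the helper names (scalaris_uri, scalaris_get) are made up for this example, and it assumes the interface accepts plain GET requests with query-string parameters, which the post does not confirm.

```ruby
require 'uri'
require 'net/http'

# Build the request URI for an action ("read" or "write") and its parameters.
# The default base URL is the one quoted in the post.
def scalaris_uri(action, params, base = "http://localhost:8002/scalaris")
  uri = URI("#{base}/#{action}")
  uri.query = URI.encode_www_form(params)
  uri
end

# Assumption: the interface answers GET requests. This call only works
# while the MochiWeb application is running locally.
def scalaris_get(action, params)
  Net::HTTP.get(scalaris_uri(action, params))
end

puts scalaris_uri("write", { "key" => "name", "value" => "dave" })
puts scalaris_uri("read", { "key" => "name" })
```

Building the URI separately from sending it makes it easy to check the encoded parameters before pointing the code at a live node.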
How Scalaris stores your data
After installing Scalaris, running a few nodes, and playing with the API, I wanted to figure out how it actually stores the data. Using this diagram: supervisor.pdf, I tracked down the source in the file cs_db_otp.erl (the db was the obvious clue).
The database is a gen_server that wraps calls to the underlying storage: an Erlang gb_tree. What this means is that the data is not actually saved to disk, but rather lives in memory. It appears that each chord node has its own database.
When you store a Key - Value pair, the database actually records a structure like this in the gb_tree:
Key,{Value,WriteLock,ReadLock,Version}
On a write, “WriteLock” is set to true and the Version is incremented. On a read, “ReadLock” is set to true. Most of this logic appears to be controlled through the transaction layer.
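To make the record layout concrete, here is a toy Ruby model of the behavior described above. It is not Scalaris code: a Hash stands in for the gb_tree, the function names are invented, and lock release (which the transaction layer appears to handle) is left out. Starting the version at 0 on the first write is an assumption based on what the API reports after a single write.

```ruby
# Each stored value mirrors the {Value, WriteLock, ReadLock, Version} tuple.
def db_write(db, key, value)
  old = db[key]
  db[key] = {
    value: value,
    write_lock: true,                          # set on every write
    read_lock: old ? old[:read_lock] : false,  # untouched by a write
    version: old ? old[:version] + 1 : 0       # assumed: first write is version 0
  }
end

def db_read(db, key)
  entry = db.fetch(key)
  entry[:read_lock] = true                     # set on every read
  [entry[:value], entry[:version]]
end

db = {}
db_write(db, "name", "dave")
db_write(db, "name", "david")
value, version = db_read(db, "name")
puts "#{value} (version #{version})"           # prints "david (version 1)"
```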
Other stuff
- Search? There appears to be the ability to search for keys within a given range via a “get_range”, but I haven’t yet found how to call it from the transaction_api.
- No delete. As I mentioned in my earlier write-up on using the API, there is no way to actually “delete” a key once it’s set. But hey, the code is open-source, and since gb_tree has a delete function, it should be possible to add.
One other thing of note: if you’ve implemented any distributed Erlang, you know it’s not recommended to run a cluster of Erlang nodes across the Internet using the built-in distribution code. Scalaris implements its own TCP layer (the authors mention this in their videos) for the nodes to communicate. Check out the comm_layer in the source for some ideas if you need to write your own.
Getting started with Scalaris
After installing Scalaris earlier today, I couldn’t help but jump into the code to see how things work. The users guide shows how to use the Java API and there are some nice examples. But as far as Erlang goes, I couldn’t find any information. Fortunately there are a few unit tests that provided the clues I needed.
Using the Transaction API
First you need to start Scalaris on a few nodes. See “bin/boot.sh” and the “cs_local” scripts; the README describes this. From within one of the erl shells you created with a “cs_local” script, you can explore the API. It’s really pretty easy (and fast):
%% Write a key
> transstore.transaction_api:single_write("name", "dave").
%% Read it back
> {Value, Version} = transstore.transaction_api:quorum_read("name").
> Value.
"dave"
> Version.
0
If you write the same key again, the version field will automatically increment. And you can store Erlang structures as the value if you’re looking to store more complicated data. But I couldn’t find a way to remove a key, so I’m not sure if a “delete” exists right now.
A PubSub API !
Digging around, I also found a PubSub API that looks pretty interesting. However, it appears fixed on a specific goal: JSONRPC to a URL using the jsonrpc support in Yaws.
Basically you can subscribe to a Topic (Key) with a callback to a URL (Value). The URL must know how to handle the jsonrpc request because the underlying api will send the request to the page on a publish. See this Yaws JSONRPC for more info. Here’s how it appears to work:
%% Subscribe to the topic "ErlangNews". Publish events to the URL.
> pubsub.pubsub_api:subscribe("ErlangNews", "http://localhost:8000/news.yaws").
%% Get the subscribers to the topic
> pubsub.pubsub_api:get_subscribers("ErlangNews").
["http://localhost:8000/news.yaws"]
%% Publish some news...
> pubsub.pubsub_api:publish("ErlangNews", "Welcome to Erlang!").
Underneath the covers, the publish method attempts to make a jsonrpc call on the method “notify” to the URL. Even with this specific goal of JSONRPC, it doesn’t look like it would be too difficult to create your own custom implementation. A web-based chat system might be interesting…
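For a sense of what a subscriber URL might receive, here is a hedged Ruby sketch that builds a JSON-RPC style request body for the “notify” method. Only the method name comes from the source; the layout of params (topic, then content), the id value, and the helper name notify_request are assumptions for illustration, so check the Scalaris pubsub code for the real payload.

```ruby
require 'json'

# Build a JSON-RPC style request body for the "notify" method.
# Assumed layout: params = [topic, content], with a caller-chosen id.
def notify_request(topic, content, id = 1)
  JSON.generate({ "method" => "notify", "params" => [topic, content], "id" => id })
end

puts notify_request("ErlangNews", "Welcome to Erlang!")
```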
Rethinking the database
After all these years of using relational databases, my brain is locked in to their structured approach. Having only key-value pairs to work with seems straightforward. But once I started brainstorming through a few prototypes, I realized mapping all the data into Scalaris will take a little thinking.
First Look at Scalaris on OS X 10.5
I’ve been itching to try out Scalaris since it hit Google code. Well, I finally got the chance today. Why am I so excited about it? Two reasons: 1) I’ve been playing with my own Erlang version of Amazon SQS I built and would like to see if I can take it to the next level with Scalaris. 2) I’m sure there’s something I can learn from the code; IMO studying quality open-source code is one of the best learning resources.
So first I need to get things running. The only difficult part of the install is the rrdtool requirement. I haven’t used rrdtool before (but I may now; it looks useful for some work I’m doing), so I needed to install it on my Mac. I tried building from source but quickly found I was missing a bunch of dependencies, so I decided to try MacPorts. It did the trick and made life a bit easier. Assuming you have MacPorts installed, here’s what it took for me to install rrdtool:
Building on Mac OS X 10.5
- sudo port install gettext
- sudo port install pango
- sudo port install rrdtool
This took awhile but worked! Now the easy part: just follow the README in the scalaris source. The build went without a hitch!
Once things were installed, I followed the instructions and created a scalaris.local.cfg and fired up the boot server and 2 nodes.
Here are a few screenshots:



I ran into a problem on the home screen when I tried to enter a key in the form of a URL. Yaws crashed but of course restarted itself. I’m not sure what all the other screens are telling me yet… so it’s time to dig into the code.
Update…
It’s not a URL that crashes Yaws; it’s simply a data validation issue. If a field is missing when the HTML form is submitted, it causes the error above. So the problem really has nothing to do with Scalaris. I’ve submitted a small patch to the Scalaris team that fixes the problem.
Using ErlyWeb templates with MochiWeb
Here’s a very simple way to add dynamic templates (views) to a MochiWeb based web front-end. I wouldn’t want to build an entire web application this way, but it’s a quick way to add a simple web interface to your Erlang application.
First I pulled the simple template code from ErlyWeb (erltl.erl) and compiled it with my code.
For this example, here’s a template index.et that shows a list of nodes the Erlang application is connected to.
index.et
<html>
<body>
<h2>Available nodes</h2>
Total online: <% integer_to_list(length(Data)) %>
<ul>
<% [index(A) || A <- Data] %>
</ul>
</body>
</html>
<%@ index(N) %>
<li><% atom_to_list(N) %></li>
Now compile it with erltl:compile("index.et") and put the resulting module in the same path as your MochiWeb app (below).
Next, set up the MochiWeb application:
-module(test_templates_web).
-export([start/1, stop/0, loop/2]).

start(Options) ->
    {DocRoot, Options1} = get_option(docroot, Options),
    Loop = fun (Req) ->
                   ?MODULE:loop(Req, DocRoot)
           end,
    mochiweb_http:start([{name, ?MODULE}, {loop, Loop} | Options1]).

stop() ->
    mochiweb_http:stop(?MODULE).

loop(Req, _DocRoot) ->
    "/" ++ _Path = Req:get(path),
    case Req:get(method) of
        'GET' ->
            %% This call returns a list of nodes the Erlang application is connected to -
            %% a list like this: [node1@here.com,node2@here.com]
            NodeData = nodes(),
            %% Pass the data to the template
            Outty = index:render(NodeData),
            %% Send the output (String) back to the browser
            Req:ok({"text/html", Outty});
        _ ->
            Req:not_found()
    end.

%% Helper as generated by the MochiWeb project skeleton (omitted in the original listing):
%% returns the option's value and the remaining options.
get_option(Option, Options) ->
    {proplists:get_value(Option, Options), proplists:delete(Option, Options)}.
Inside the ‘GET’ above, I pass the data to the compiled template, then send it back to the browser.
Obviously I’ve glossed over a few details, but hopefully you get the idea. It may not be as “elegant” as a Rails app, but it’s blazing fast!
Install Erlang on Ubuntu with Capistrano
Here’s a simple task that I use to install Erlang on remote servers using Capistrano and Deprec. Assuming you have an SSH key set up with the remote server, you can run this task from a terminal like this: “cap miceda:erlang:install”. This downloads the source and does the make/make install dance.
Capistrano::Configuration.instance(:must_exist).load do
  # Install Erlang R12B-1 on Ubuntu 7.10 (gutsy)
  namespace :miceda do
    namespace :erlang do
      SRC_PACKAGES[:erlang] = {
        :filename => 'otp_src_R12B-1.tar.gz',
        :dir => 'otp_src_R12B-1',
        :url => "http://www.erlang.org/download/otp_src_R12B-1.tar.gz",
        :unpack => "tar zxf otp_src_R12B-1.tar.gz;",
        :configure => %w(
          ./configure
          ;
        ).reject{|arg| arg.match '#'}.join(' '),
        :make => 'make;',
        :install => 'make install;'
      }

      desc "Install Erlang R12B-1"
      task :install do
        install_deps
        deprec2.download_src(SRC_PACKAGES[:erlang], src_dir)
        deprec2.install_from_src(SRC_PACKAGES[:erlang], src_dir)
      end

      desc "Install deps for Erlang"
      task :install_deps do
        apt.install( {:base => %w(libc6 libncurses5 libncurses5-dev libssl-dev openssl m4 libexpat1-dev)}, :stable )
      end
    end
  end
end
Of course the big benefit is that with Capistrano you can do this across several servers at the same time.
Deprec, Ubuntu 7.04 (Feisty), and your Proxy server
The Problem
You’re trying to deploy a Rails app using Deprec on Ubuntu running behind a proxy server, and you’re getting connection errors.
Possible solution
You need to tell apt-get about your proxy server. Here’s how I did it:
- Create a file “/etc/apt/apt.conf”
- In the file add the following:
Acquire { Retries "0"; HTTP {Proxy "http://YOURPROXY:PORT";};};
Now edit the proxy information in “/etc/wgetrc” by uncommenting “use_proxy” and setting “http_proxy”:
use_proxy = on
http_proxy = http://YOURPROXY:PORT
Finally, add the proxy to “/etc/bash.bashrc” (note that bash does not allow spaces around the “=”):
http_proxy=http://YOURPROXY:PORT
export http_proxy
Once I did this, everything worked great!
Software to detect when words rhyme
The other night my daughter was telling me about some interesting facts she read in one of her school books. One fact in particular caught my interest:
Orange and Silver are the only two words in the (American) English language that do not rhyme with any other word
My first thought was “How the hell did they figure this out? Did someone go through the entire dictionary and test each word?” Then I thought, “you can probably automate this task with software; but how? What makes two words rhyme?” To me, a rhyme is when two words sound the same. But maybe there is an obscure rule in Grammar I can use to programmatically test a set of words to see if they rhyme?
A Google search turned up this information on using phonetics to detect rhyme. The blog talks about using the International Phonetic Alphabet (IPA) to translate words into their phonetic equivalent and then inspecting the words for a match. Check out this example.
Aha! I’m getting close. Now if I could just capture the IPA in code and use it to translate words on the fly I’d have the next killa app. However, there’s just one problem. After grabbing the IPA chart it was obvious the translation is based on how something sounds. Even for a human it appears extremely difficult considering accent, dialect, and other factors.
So it seems (at least right now) it’d be nearly impossible to write software to detect rhyme.
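The comparison step itself is small once the phonetic forms exist; the hard part, as noted, is producing them. Here is a minimal Ruby sketch of that comparison. The pronunciations are hand-entered, ARPAbet-style guesses made up for this example (not output from a real phonetic dictionary), and the rule used, matching everything from the last vowel sound onward, ignores stress, accent, and dialect, so it is only reasonable for one-syllable words.

```ruby
# Hand-entered sample pronunciations (illustrative assumptions only).
def sample_pronunciations
  {
    "cat"  => %w[K AE T],
    "hat"  => %w[HH AE T],
    "moon" => %w[M UW N],
    "june" => %w[JH UW N]
  }
end

# True if the phoneme is one of the ARPAbet vowel symbols.
def vowel?(phoneme)
  %w[AA AE AH AO AW AY EH ER EY IH IY OW OY UH UW].include?(phoneme)
end

# The part of the word that has to match: last vowel sound to the end.
def rhyme_tail(phonemes)
  idx = phonemes.rindex { |p| vowel?(p) }
  idx ? phonemes[idx..-1] : phonemes
end

# Two different words rhyme (in this naive sense) if their tails are equal.
def rhyme?(a, b, dict = sample_pronunciations)
  pa = dict[a]
  pb = dict[b]
  return false if pa.nil? || pb.nil? || a == b
  rhyme_tail(pa) == rhyme_tail(pb)
end

puts rhyme?("cat", "hat")    # true
puts rhyme?("moon", "june")  # true
puts rhyme?("cat", "moon")   # false
```

Any word missing from the table simply returns false, which is exactly where the real problem lives: filling that table for a whole dictionary.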
Create a ShapeFile with Ruby
Here’s a quick snippet on how to install and create a ShapeFile from data in your Model.
The setup
- Download and Install ShapeLib. Make sure to note where the install puts the libshp.so and the shapefil.h (you may need that information later).
- Download ruby-shapelib
- Unzip ruby-shapelib and run: “ruby ./extconfig.rb”. Depending on where the first step put your files, you may need to alter some options you pass to this program, specifically --with-shapelib-dir and --with-shapelib-include
- Once you’ve done that, make sure everything is working right with irb:
$ irb
>> require 'shapelib'
=> true
If you get “true”, you’re good to go.
Simple example
Let’s say we have a table called markers, with the fields lat (float), lng (float), and created_at (datetime). We want to create a shapefile for the points and also collect the time (created_at) as an attribute in the shapefile.
require 'shapelib'

# Create a shapefile from an array of markers
def make_shapefile(markers)
  # Create the shapefile.
  # First argument: the name of the file to create
  # Second: the shapefile type
  # Third: an array of array(s) describing the attributes (name, type, size)
  fp = ShapeLib::ShapeFile::new("test1.shp", :Point, [['date', :String, 32]])
  # Loop over the markers
  markers.each do |m|
    # The 'date' attribute is declared as a String, so convert the timestamp
    point = ShapeLib::Point::new(m.lng, m.lat, {"date" => m.created_at.to_s})
    fp.write point
  end
  fp.close
end

# try it out...
make_shapefile( Marker.find(:all) )
If all is working right, you should end up with 3 files: test1.shp, test1.shx, and test1.dbf
If you don’t have one already, here’s a nice open-source application to tinker with your new shapefiles: qgis
Scrape the Wayback machine with this little script
Here’s a little script I use to scrape archived pages from the Alexa Wayback Machine. Basically, it works like this:
- Query Alexa for an old URL you’re looking for and the Years you’re interested in
- Use Hpricot to look in the results for links to archived pages. The pattern is http://web.archive.org/web/200301../url, where the number is the timestamp and the URL on the end is the old page you’re looking for. Return an array of successful matches
- Loop over the results of above and download the pages locally using curl (you could also use wget)
- Save the pages with the name “archive_timestamp.html”
Here’s the code:
require 'hpricot'
require 'open-uri'

urls = %w[http://sample.com http://sample2.com ...]
years = %w[2002 2003 2004]

# Search Alexa for the following URLs and years, and
# extract the relevant links from the search result pages
def extract_links_from_search(search_urls=[], years=[])
  results = []
  search_urls.each do |u|
    years.each do |y|
      search_alexa = "http://web.archive.org/web/#{y}*/#{u}"
      doc = Hpricot(open(search_alexa))
      (doc/:a).each do |link|
        ul = link.attributes['href']
        # Search result pages have the following url, followed
        # by the timestamp (20030313094512)
        # followed by the search url
        if ul =~ %r{http://web\.archive\.org/web/\d+/http:}
          results << ul
        end
      end
    end
  end
  results
end

def download_and_store_pages(results=[])
  results.each do |url|
    # Create a file name based on the timestamp (the first run of digits in the URL)
    fn = "archive_#{$&}.html" if url =~ /\d+/
    puts "Saving as: #{fn}"
    `curl #{url} -o #{fn}`
  end
end

outp = extract_links_from_search(urls, years)
puts "Getting the data"
download_and_store_pages(outp)
This is quick and dirty and took about 10 minutes to write. It could probably be simplified, but it does the job for me.
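The link pattern from the steps above (http://web.archive.org/web/, then a timestamp, then the original URL) can be checked without touching the network. Here is a small sketch; the helper name wayback_timestamp is made up for this example, and the sample URL just combines the timestamp and placeholder domain used earlier.

```ruby
# Return the timestamp digits from an archived-page URL,
# or nil if the URL does not follow the archived-page pattern.
def wayback_timestamp(url)
  m = url.match(%r{\Ahttp://web\.archive\.org/web/(\d+)/http:})
  m && m[1]
end

sample = "http://web.archive.org/web/20030313094512/http://sample.com"
puts wayback_timestamp(sample)                # prints 20030313094512
puts wayback_timestamp("http://sample.com").inspect  # prints nil
```

Note that the slashes and the dots need care inside a Ruby regex: %r{...} avoids escaping every “/”, and “\d+” (with the backslash) is what matches the run of digits.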