January 17, 2017

Snakes on a Couch! Using Python with CouchDB, Part II-- Where do you want to eat?

  • October 14, 2010
  • By Akkana Peck

When we left off in Part 1, we had a little CouchDB database of restaurants along with the dates we last visited them.

How can you turn that into an interesting application?

The problem

How often do you stand around with your friends asking "Where do you want to eat?" "I dunno, where do you want to eat?" "I dunno."

Wouldn't it be handy if you had a program to suggest a place you haven't been to in a while?

From Part 1, you already have a view that sorts restaurants by date:

date_mapfn = """function(doc) {
  if (doc.type == 'restaurant') {
    emit(doc.last_visited, doc.name);
  }
}"""

In Python, retrieve the restaurants in that view like this:

import couchdb
server = couchdb.client.Server('http://localhost:5984/')
db = server['restaurants']

restaurant_list = db.view('bydate/get_restaurants_by_date')
for r in restaurant_list:
    print(r)

That will print out your restaurants, sorted by date visited.

But for this application, you don't want to see your whole restaurant list. You just want it to pick a single place, chosen at random. How do you choose just one from a long list in a CouchDB database?

Skip and Limit

First you need to be able to skip items. For instance, if you have four restaurants in your database,

restaurant_list = db.view('bydate/get_restaurants_by_date', skip=2)

will skip the first two restaurants and give you only the third and fourth.
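
If it helps to picture what skip does, it behaves much like slicing a Python list. The sketch below is only an analogy: it uses an in-memory list with made-up restaurant names and a hypothetical skip_rows helper, and it doesn't talk to CouchDB at all.

```python
# Analogy only: a plain Python list standing in for the rows of the view.
# The restaurant names are invented for illustration.
rows = ["Thai Palace", "Burger Barn", "Sushi Spot", "Taco Town"]

def skip_rows(rows, skip=0):
    """Return the rows left after skipping the first `skip` of them."""
    return rows[skip:]

# Skipping two of four rows leaves the third and fourth.
remaining = skip_rows(rows, skip=2)
print(remaining)
```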

You can choose a random restaurant by skipping a random number less than the total number of restaurants. So first get that total:

# Get the total number of restaurants in the database:
totview = db.view('bydate/get_restaurants_by_date')
tot = len(totview.rows)

Then use Python's random module to pass an appropriate skip value:

import random

# Start at a random offset into the view by date:
restaurant_list = db.view('bydate/get_restaurants_by_date',
                          skip=random.randint(0, tot-1))

Why set the maximum at tot-1 rather than tot? Because if you have 4 restaurants, the most you'd ever want to skip is three; you don't want to skip all four!
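
A quick way to convince yourself: random.randint includes both endpoints, so with four restaurants randint(0, tot-1) can return 0, 1, 2 or 3, but never 4. Here's a small check, with tot hard-coded to 4 just for the example:

```python
import random

tot = 4   # pretend there are four restaurants

# Draw lots of skip values and collect the distinct ones we see.
seen = set(random.randint(0, tot - 1) for _ in range(1000))
print(sorted(seen))

# Every value is a legal skip: at least 0, and never the full total.
assert min(seen) >= 0 and max(seen) <= tot - 1
```

If you'd rather not write the -1, random.randrange(tot) picks from the same range.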

Now you have a list of restaurants starting at a random point in the list. But you really only need one restaurant. Use limit to specify how many you want:

# Start at a random offset into the view by date, and fetch just one row:
restaurant_list = db.view('bydate/get_restaurants_by_date',
                          skip=random.randint(0, tot-1),
                          limit=1)

Now if you iterate over the list or check len(restaurant_list), you'll see there's only one item.
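
Putting the pieces together, the whole pick might look something like the sketch below. pick_random_restaurant is a made-up helper name, not part of couchdb-python, and it assumes db is the Database object opened earlier. It also guards against an empty database, where there's nothing to choose.

```python
import random

def pick_random_restaurant(db, view_name='bydate/get_restaurants_by_date'):
    """Return one randomly chosen row from the view, or None if it's empty.

    Hypothetical helper: db is assumed to be a couchdb Database object.
    """
    # First query: count the rows in the view.
    tot = len(db.view(view_name).rows)
    if tot == 0:
        return None   # nothing to choose from

    # Second query: skip a random number of rows, then take just one.
    result = db.view(view_name, skip=random.randint(0, tot - 1), limit=1)
    return result.rows[0]
```

With the db from earlier, r = pick_random_restaurant(db) gives you a single row. Since the view emits the date as the key and the name as the value, r.value is the restaurant's name and r.key is the date you last went there.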

Digression: Efficiency

But wait. You're doing almost the same query twice. Isn't it inefficient to make the database search twice for that restaurant list?

That's where you hit the big difference between CouchDB and traditional SQL databases. In CouchDB, queries -- views -- are built into the database as part of the design document. The first time you query a new view, CouchDB builds an index for it, a structure called a B-tree that gives fast access to the rows in the view. From then on, as you add or change documents, CouchDB only has to update that B-tree with the changes instead of rebuilding it. So the first query on a new view can be slow, but after that, accessing the view, or any part of it, is efficient because the B-tree is already there.
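
Even so, the counting query earlier pulls every row back to Python just to count them. One way to trim that, offered here as an assumption rather than something from Part 1, is to ask for zero rows and read the total that CouchDB reports alongside a map view's results. couchdb-python exposes that figure as the total_rows attribute of the view results. count_restaurants is a made-up helper name.

```python
def count_restaurants(db, view_name='bydate/get_restaurants_by_date'):
    """Return the number of rows in a map view without fetching the rows.

    Hypothetical helper: assumes db is a couchdb Database object and that
    the view has no reduce function (total_rows is None for reduce views).
    """
    # limit=0 asks for no rows; the response still reports the view's total.
    return db.view(view_name, limit=0).total_rows
```

You could then use tot = count_restaurants(db) in place of len(totview.rows).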
