LinuxPlanet
Advanced Recoll Setup: Indexing Your Data the Convenient Way
Running Recollindex

A. Lizard
Monday, August 11, 2008 12:30:02 PM

Recollindex is the program that actually does the hard drive indexing.

The following is the dynamic content script. I'm using time to find out how long it takes to run. I put these scripts in /home/username/.recoll; you can put them wherever you please, as long as they are owned by your user and executable. For instance:

# chmod a+x recollindex-time.sh

sudo in the script tells the script to execute /sbin/poweroff using root privileges.

======================== script begin
#!/bin/sh
# recollindex-time.sh
# run recollindex for dynamic content once, then shut down
# may be modified to create and continuously update a log
rm -rf /home/username/.recoll/log-dynamic.txt
/usr/bin/time recollindex -c /home/username/.recoll/xapiandb-eudora > /home/username/.recoll
sudo /sbin/poweroff
======================== script end

To run:

$ sh /path-to/.recoll/recollindex-time.sh

The static content script follows the same pattern:

======================== script begin
#!/bin/sh
# recollindex-static-time.sh
# run recollindex for static content once, then shut down
# may be modified to create and continuously update a log
rm -rf /home/username/.recoll/log-static.txt
/usr/bin/time recollindex > /home/username/.recoll/log-static.txt 2>&1
sudo /sbin/poweroff
======================== script end

To run:

$ sh /path-to/.recoll/recollindex-static-time.sh
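
Both scripts follow the same pattern: delete the old log, run the indexer with its output captured, then power off. A minimal sketch of that pattern as one reusable function might look like the following. The name run_logged is hypothetical and not part of Recoll. The sketch measures elapsed time with date instead of /usr/bin/time, so it does not depend on that binary being installed.

```shell
#!/bin/sh
# Sketch only: run_logged is a hypothetical helper, not part of Recoll.
# Usage: run_logged LOGFILE COMMAND [ARGS...]
run_logged() {
    log_file=$1
    shift
    rm -f "$log_file"              # start each run with a fresh log
    start=$(date +%s)
    "$@" > "$log_file" 2>&1        # capture stdout and stderr together
    status=$?
    end=$(date +%s)
    echo "elapsed: $((end - start))s, exit status: $status" >> "$log_file"
    return "$status"
}

# Examples (commented out so that sourcing this file has no side effects):
# run_logged "$HOME/.recoll/log-static.txt" recollindex
# run_logged "$HOME/.recoll/log-dynamic.txt" \
#     recollindex -c "$HOME/.recoll/xapiandb-eudora"
# sudo /sbin/poweroff
```

The dynamic example assumes the log is meant to go to log-dynamic.txt, the file that script deletes before it starts. Because run_logged returns the indexer's exit status, you could also chain the shutdown with && so that the machine stays on for inspection if indexing fails. The scripts above power off unconditionally.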

Typical messages from a dynamic update:

:../internfile/mh_html.cpp:105:textHtmlToDoc: final transcode had 8 errors for [unknown]
:2:../internfile/mh_html.cpp:105:textHtmlToDoc: final transcode had 129 errors for [unknown]
:2:../internfile/mh_mail.cpp:511:walkmime: transcode failed from cs 'unknown-8bit' to UTF-8
:2:../internfile/mh_mail.cpp:511:walkmime: transcode failed from cs 'DEFAULT_CHARSET' to UTF-8
:3:../rcldb/rcldb.cpp:918:dumb_string: unac failed for [From: solidbusinessopportunity@yahoo.com
To: username@mindspring.com
Date: Sun, 30 Dec 2001 17:22:51
Subject: I COULDN'T BELIEVE IT!

Would you spend $1,000 in order to receive $30,000 in return?

[snip]
================== end log

These error messages are not significant, and can be ignored.
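
If the log grows long, you may want a quick way to confirm that it contains only these kinds of messages. The following is a rough sketch that counts messages by source file and function. The name summarize_log is hypothetical, and the pattern assumes every message carries a "file.cpp:LINE:function:" prefix like the sample lines above. Lines without that prefix, such as quoted mail text, are skipped.

```shell
#!/bin/sh
# Sketch only: summarize_log is a hypothetical helper, not a Recoll tool.
# It assumes message lines look like ".../file.cpp:LINE:function: text".
# Usage: summarize_log LOGFILE
summarize_log() {
    # Keep only "file.cpp function" from each matching line,
    # then count identical pairs, most frequent first.
    sed -n 's/^.*\/\([A-Za-z0-9_]*\.cpp\):[0-9]*:\([A-Za-z0-9_]*\):.*$/\1 \2/p' "$1" |
        sort | uniq -c | sort -rn
}

# Example:
# summarize_log "$HOME/.recoll/log-dynamic.txt"
```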

Setting Up the Recoll GUI

Assuming that you've run the indexer at least once for both your static and dynamic databases (I suggest static first), you can set up the Recoll GUI (see Figure 1). You can use the defaults, except that you'll have to add the dynamic database directory alongside the static database directory already in the search path.

Open Recoll via Start > Utilities > Local Text Search (recoll).

From the top menu, choose Preferences > External Index Dialog > External Indexes and click Browse. Follow the path until you find the dynamic database directory, then click OK. You'll see that path in the External Indexes window with a checkbox. Check it, and you're ready to search.

Searching

I regard the GUI setup as fairly self-explanatory. For a simple search, open Recoll. If you expect to use it a lot, you might want to drag and drop the menu item onto the desktop or add it to the KDE Taskbar, as I do. Pull down the menu marked Query Language, pick Any term (OR) or All Terms (AND), and type in your keywords. Alternatively, open Tools and choose Advanced Search to combine multiple Boolean operators. The best way to become familiar with the UI is to do some searches and find out from experience what the menus and icons do. There is also a Help menu, and documentation is available.

Database updating

As I said, I run a personal workstation that's usually shut down at night, so I don't use a cron job to run these scripts automatically. Instead, I start the indexing runs manually before going to bed, when a reminder program tells me to do so. As a KDE user, I use Kalarm for this.

Set up two alerts in Kalarm. Open it via Start > Utilities > PIM > Personal Alarm Scheduler (Kalarm). This is the KDE 3.x path; I don't know what the KDE 4 or GNOME equivalent is.

Open a new alert with File > New and set up your reminder message, including the script name and path. Click the Recurrence tab and set the reminder interval. For a weekly cycle, you can select days of the week. I run my dynamic indexing three times a week, on Monday, Wednesday, and Friday.

Then set up a second alert the same way for your static indexing, using the static script. I'd run this every 3-6 months. You can set months by name, too.
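
If you would rather have a script decide whether tonight is an indexing night than rely on a reminder, the Monday, Wednesday, Friday cycle can be expressed in a few lines of shell. This is only a sketch: the name index_job_for_day is hypothetical, and it covers the weekly dynamic schedule only, not the static run every few months.

```shell
#!/bin/sh
# Sketch only: index_job_for_day is a hypothetical helper.
# Argument: ISO weekday number as printed by `date +%u`
# (1 = Monday ... 7 = Sunday).
index_job_for_day() {
    case $1 in
        1|3|5) echo dynamic ;;   # Monday, Wednesday, Friday
        *)     echo none ;;      # no indexing scheduled on other days
    esac
}

# Example:
# if [ "$(index_job_for_day "$(date +%u)")" = dynamic ]; then
#     sh /path-to/.recoll/recollindex-time.sh
# fi
```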

Documentation and further information

The onboard help is kept at file:///usr/share/recoll/doc/usermanual.htm.
Further reading: the Recoll website, and "Recoll: A Linux Desktop Search Engine That Works."

About the Author: A. Lizard is an Internet consultant who lives in the San Francisco Bay Area. He has been writing technology articles for publication since 1987.

Figure 1