April 23, 2019

Advanced Tips for Search-and-Replace in Linux

Search and Replace Power Tools

  • September 29, 2009
  • By Juliet Kemp


In my previous article about regular expressions, I gave some examples of how you can use them on the command line with various utilities. Regexps can also be used within many text editors (sometimes with a slightly different syntax, but the gist is the same). I'll use Vim and Emacs as examples; for other editors you may need to check the manual for the syntax details.

Search-and-replace is likely to be the operation you'll most often use regexps for in an editor. First let's look at a straightforward non-regexp search-and-replace. Let's say that you've just decided to rename a variable from foo to fooOne. In Vim, hit Esc to return to normal mode, then type this command:

:%s/foo/fooOne/g

% means that the operation should be carried out throughout the whole document. The important part is s/foo/fooOne/, which means "replace 'foo' with 'fooOne'". The final g means "global"; without it you'll just replace the first instance on every line, but with it, you replace every occurrence.
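To see the difference the trailing g makes, here is a rough Python sketch of the same substitution. Python is not part of this article's toolset and the sample text is invented; the sketch only mimics Vim's per-line behaviour by replacing the first occurrence on each line, then replacing every occurrence.

```python
# Invented sample text: "foo" appears twice on the first line.
text = "foo = foo + 1\nprint(foo)"

# Roughly what :%s/foo/fooOne/ does (no g): only the first match on each line changes.
first_only = "\n".join(
    line.replace("foo", "fooOne", 1) for line in text.split("\n")
)

# Roughly what :%s/foo/fooOne/g does: every match on every line changes.
every_match = text.replace("foo", "fooOne")

print(first_only)
print(every_match)
```

Without the g equivalent, the second foo on the first line is left untouched.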

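To do the same search-and-replace in Emacs, hit M-x then type replace-string RET foo RET fooOne. Emacs replaces from the cursor position to the end of the buffer, so move to the top of the file first if you want to cover the whole document.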

However, while this non-regexp operation would replace foo with fooOne, it would also replace foobar with fooOnebar, which you probably didn't want. To get around this, use the word boundary markers \< and \>:

:%s/\<foo\>/fooOne/g

This restricts the replacement to occur only when 'foo' exists as a word on its own (with a word boundary character on each side of it). In Emacs:
M-x replace-regexp RET \<foo\> RET fooOne
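The effect of the word boundaries can be checked with a small Python sketch (an illustration outside the editors, using made-up text). One assumption to note: Python's re module has no \< and \> markers, so the sketch uses \b, which matches at either edge of a word.

```python
import re

# Invented sample text containing both "foo" and "foobar".
text = "foo = foobar + foo"

# Plain substring replacement also rewrites the "foo" inside "foobar".
plain = text.replace("foo", "fooOne")

# \bfoo\b only matches "foo" when it stands alone as a word.
whole_word = re.sub(r"\bfoo\b", "fooOne", text)

print(plain)       # fooOne = fooOnebar + fooOne
print(whole_word)  # fooOne = foobar + fooOne
```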


Backreferences (as used in the previous tutorial) can also be very useful. For example, say you wanted to change all the date references in a file from US style (09/22/09) to UK style, with a long year and a dot instead of a slash (22.09.2009). This regexp would do the trick in Vim:

:%s#\<\(\d\+\)/\(\d\+\)/\(\d\{2\}\)\>#\2.\1.20\3#g

For Emacs, use:
M-x replace-regexp RET \<\([[:digit:]]+\)/\([[:digit:]]+\)/\([[:digit:]]\{2\}\)\> RET \2.\1.20\3

OK, that looks quite complicated! First of all, note that in the Vim version we use # rather than / as the delimiter, giving us s###g rather than s///g. This makes the command easier to read when the pattern itself contains /, and means that you don't need to escape any / characters.

As discussed in the previous article, each pair of escaped brackets, \(PATT\), stores whatever PATT matched so that a backreference can refer to it later. Here we have three such groups, with a word boundary before the first and after the last (the \< and \>), and a slash between each of them (as in 09/22/09).

The first pattern we're looking for is \d\+: this means at least one digit character (\d), and here it captures the month. So this will match 9, 09, 12, etc. In Emacs, this is written [[:digit:]]+ (there is no need to escape the + in Emacs regexp syntax, as you must do in Vim). You can also use [[:digit:]] instead of \d in Vim if you prefer.

The second pattern is the same as the first one, and matches the day of the month. The third pattern, \d\{2\}, matches exactly 2 digit characters (\{n\} matches exactly n repetitions of the preceding item), because years aren't usually written as single digits.

The replace string is then straightforward: reorder the three backreferences so that the day digits come first, then the month, then the year with 20 in front of it, all separated by a period.
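The whole date rewrite can be tried outside the editors with a short Python sketch. This is an illustration under a few assumptions: Python's re syntax leaves brackets, + and {2} unescaped and uses \b in place of \< and \>, and the function name us_to_uk and the sample sentences are hypothetical.

```python
import re

# Same structure as the editor patterns: month/day/two-digit-year,
# with a word boundary at each end.
US_DATE = re.compile(r"\b(\d+)/(\d+)/(\d{2})\b")

def us_to_uk(text):
    """Reorder to day.month.year and prefix the two-digit year with 20."""
    return US_DATE.sub(r"\2.\1.20\3", text)

print(us_to_uk("Due 09/22/09, not 9/5/09."))  # Due 22.09.2009, not 5.9.2009.

# A four-digit year is left alone: after matching two digits of "2009"
# there is no word boundary, so the pattern fails to match.
print(us_to_uk("Due 09/22/2009."))            # Due 09/22/2009.
```

The second call shows why the closing word boundary matters: without it, 09/22/2009 would be rewritten to 22.09.200909.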
