Searching text using the SoundEx algorithm
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. It can be a useful way to store names for searching so that similar names can be found. As defined by Wikipedia:
The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling. The algorithm mainly encodes consonants; a vowel will not be encoded unless it is the first letter. Soundex is the most widely known of all phonetic algorithms (in part because it is a standard feature of popular database software, although not included with SQLiteDatabase).
SoundEx has these rules: 1. Retain the first letter of the name and drop all other occurrences of a, e, i, o, u, y, h, w.
- Replace consonants with digits as follows (after the first letter): - b, f, p, v → 1 
- c, g, j, k, q, s, x, z → 2 
- d, t → 3 
- l → 4 
- m, n → 5 
- r → 6 
 
- If two or more letters with the same number are adjacent in the original name (before step 1), only retain the first letter; also two letters with the same number separated by 'h' or 'w' are coded as a single number, whereas such letters separated by a vowel are coded twice. This rule also applies to the first letter. 
- If you have too few letters in your word that you can't assign three numbers, append with zeros until there are three numbers. If you have more than 3 letters, just retain the first 3 numbers. 
This is a Xojo SoundEx function:
Public Function SoundEx(word As String) As String
  Const kLength As Integer = 4
  Var value As String
  Var size As Integer = word.Length
  ' Make sure the word is at least two characters in length
  If (size > 1) Then
    word = word.Uppercase
    ' Convert the word to a character array for faster processing
    Var chars() As String = word.ToArray("")
    ' For storing the SoundEx character codes
    Var code() As String
    ' The current and previous character codes
    Var prevCode As Integer = 0
    Var currCode As Integer = 0
    ' Add the first character
    code.AddRow(chars(0))
    Var loopLimit As Integer = size - 1
    ' Loop through all the characters and convert them to the proper character code
    For i As Integer = 0 To loopLimit
      Select Case chars(i)
      Case "H", "W"
        currCode = -1
      Case "A", "E", "I", "O", "U", "Y"
        currCode = 0
      Case "B", "F", "P", "V"
        currCode = 1
      Case "C", "G", "J", "K", "Q", "S", "X", "Z"
        currCode = 2
      Case "D", "T"
        currCode = 3
      Case "L"
        currCode = 4
      Case "M", "N"
        currCode = 5
      Case "R"
        currCode = 6
      End Select
      If i > 0 Then
        ' two letters With the same number separated by 'h' or 'w' are coded as a single number
        If currCode = -1 Then currCode = prevCode
        ' Check to see if the current code is the same as the last one
        If currCode <> prevCode Then
          ' Check to see if the current code is 0 (a vowel); do not proceed
          If currCode <> 0 Then
            code.AddRow(currCode.ToString)
          End If
        End If
      End If
      prevCode = currCode
      ' If the buffer size meets the length limit, then exit the loop
      If (code.LastRowIndex = kLength - 1) Then
        Exit For
      End If
    Next
    ' Pad the code if required
    size = code.LastRowIndex + 1
    For j As Integer = size To kLength - 1
      code.AddRow("0")
    Next
    ' Set the return value
    value = String.FromArray(code, "")
  End If
  ' Return the computed soundex
  Return value
End Function
You call the SoundEx function like this:
Var result As String
result = SoundEx("Robert") ' R163
result = SoundEx("Rupert") ' R163
result = SoundEx("Rubin") ' R150
result = SoundEx("Ashcraft") ' A261
result = SoundEx("Ashcroft") ' A261
result = SoundEx("Tymczak") ' T522
result = SoundEx("Pfister") ' P236
See also
Full Text Searching topic