Spellchecking with (Cake)PHP

Years ago I wrote some CakePHP 1.3 libs that query Google’s spellchecker API. This works well for small, infrequent lookups.
But as soon as you need it more heavily, a solution running on your own server is not only much faster but can also handle high-frequency lookups. You can, for instance, check a complete book with hundreds of thousands of words in seconds.

The pspell library seems to be deprecated as of PHP 5.3. The way to go is Enchant.

Enchant PHP Extension

On Windows you don’t have to do much. WAMP with PHP 5.3 ships with the Enchant extension.
You only need to activate it (php_enchant), either through the WAMP menu or by manually removing the leading ; in front of this extension’s line in php.ini.
Don’t forget to restart Apache afterwards.

On Linux you might first want to run apt-get install php5-enchant to install the basic library. You may then need to enable the extension in php.ini. If anyone has details on this, please let me know and I will update this tutorial.

Once it is running you should see an "enchant" module entry on your phpinfo page.
On my system it seems to be Version 1.1.0.
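
If you would rather verify this from code than from the phpinfo page, something like the following should do. It is only a minimal sketch: enchantStatus() is a made-up helper name (it is not part of the Tools plugin) and it only uses the extension's own functions.

```php
<?php
// Hypothetical helper (not part of SpellLib): reports whether the Enchant
// extension is loaded, its reported version, and the backends it found.
function enchantStatus() {
    $status = array(
        'loaded' => extension_loaded('enchant'),
        'version' => phpversion('enchant'), // false if the extension is missing
        'providers' => array(),
    );
    if (!$status['loaded']) {
        return $status;
    }

    $broker = enchant_broker_init();
    if ($broker === false) {
        return $status;
    }

    // Each provider entry describes one backend (myspell, ispell, ...).
    $providers = enchant_broker_describe($broker);
    if (is_array($providers)) {
        foreach ($providers as $provider) {
            $status['providers'][] = $provider['name'];
        }
    }
    return $status;
}

print_r(enchantStatus());
```

If 'loaded' comes back as false, the extension is not active for the PHP binary you are running (CLI and Apache can use different php.ini files).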

SpellLib for CakePHP2

The lib is in my Tools plugin and is called SpellLib (in /Lib/).

We need to provide the lib with dictionaries. Those can be found online in various places. One is here. You can probably use all kinds of dictionaries that end with the extension .dic.
Now store them in your global vendors folder: /vendors/dictionaries/[engine]/ where [engine] is your preferred engine (defaults to myspell). If you want to store them in a different path, see the Options section at the end of this post.

Once they are in your vendors directory you can check on them:

// load the class from the Tools plugin's Lib folder
App::uses('SpellLib', 'Tools.Lib');

$this->SpellLib = new SpellLib();
$dicts = $this->SpellLib->listDictionaries();

This should return a list with at least one tag like en_GB or de_DE, depending on what you downloaded.
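
For comparison, this is roughly what such a listing looks like with the bare Enchant extension. It is a sketch under assumptions, not a copy of what SpellLib does internally: listInstalledDictionaries() is a hypothetical name, and it only sees dictionaries in the locations Enchant searches by default.

```php
<?php
// Hypothetical helper (not part of SpellLib): returns the language tags
// Enchant can currently see, or an empty array if nothing is available.
function listInstalledDictionaries() {
    if (!extension_loaded('enchant')) {
        return array();
    }
    $broker = enchant_broker_init();
    if ($broker === false) {
        return array();
    }

    // Each entry contains lang_tag, provider_name, provider_desc
    // and provider_file.
    $dicts = enchant_broker_list_dicts($broker);
    if (!is_array($dicts)) {
        return array();
    }

    $tags = array();
    foreach ($dicts as $dict) {
        $tags[] = $dict['lang_tag'];
    }
    // The same tag can be offered by more than one backend.
    return array_values(array_unique($tags));
}

print_r(listInstalledDictionaries());
```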

Basic usage (with English spell checking):

$this->SpellLib = new SpellLib();
if ($this->SpellLib->check($word)) {
    //everything is fine
} else {
    //contains an array of words that could be the correct ones
    $suggestions = $this->SpellLib->suggestions($word);
}
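
If you are curious what a check like this boils down to at the extension level, here is a minimal sketch using only the plain Enchant functions. spellCheckWord() and its return convention are my own assumptions for illustration, not SpellLib's actual code, and it expects a dictionary for the given language tag to be installed where Enchant can find it.

```php
<?php
// Hypothetical helper (not SpellLib's code). Returns:
//   null    if the word cannot be checked (no extension or no dictionary),
//   array() if the word is spelled correctly,
//   a list of suggestions otherwise.
function spellCheckWord($word, $lang = 'en_GB') {
    if (!extension_loaded('enchant')) {
        return null;
    }
    $broker = enchant_broker_init();
    if ($broker === false || !enchant_broker_dict_exists($broker, $lang)) {
        return null;
    }
    $dict = enchant_broker_request_dict($broker, $lang);
    if ($dict === false) {
        return null;
    }

    if (enchant_dict_check($dict, $word)) {
        return array();
    }
    $suggestions = enchant_dict_suggest($dict, $word);
    return is_array($suggestions) ? $suggestions : array();
}

var_dump(spellCheckWord('speling'));
```

Keeping "cannot check" (null) apart from "no suggestions" (empty array) avoids treating a missing dictionary as a correctly spelled word.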

See the test case for details.

Options

If you want to store the dictionary files in another vendor path, you can configure this using the Configure class or simply by passing the path to the constructor:

// in your configs
$config['Spell'] = array(
    'path' => CakePlugin::path('Tools') . 'Vendor' . DS . 'dictionaries' . DS,
    'engine' => ENCHANT_ISPELL,
    'lang' => 'de_DE'
);

// passing it on as `path` param:
$this->SpellLib = new SpellLib(array('path'=>CakePlugin::path('Tools') . 'Vendor' . DS . 'dictionaries' . DS));
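
At the extension level, a custom dictionary folder is set with enchant_broker_set_dict_path(). The sketch below shows the idea; brokerWithDictPath() is a hypothetical helper and not necessarily how SpellLib applies its path option. Note that this function is deprecated in PHP 8.0 and may not exist at all in newer builds, which is why the sketch checks for it first.

```php
<?php
// Hypothetical helper (not part of SpellLib): creates a broker and points
// one backend at a custom dictionary folder. Returns the broker, or false
// if the path could not be set.
function brokerWithDictPath($path, $engine = null) {
    if (!extension_loaded('enchant')
        || !function_exists('enchant_broker_set_dict_path')
    ) {
        return false;
    }
    if ($engine === null) {
        // The ENCHANT_* constants only exist once the extension is loaded.
        $engine = ENCHANT_MYSPELL;
    }

    $broker = enchant_broker_init();
    if ($broker === false) {
        return false;
    }

    // @ hides the deprecation notice that PHP 8 emits for this call.
    if (!@enchant_broker_set_dict_path($broker, $engine, $path)) {
        return false;
    }
    return $broker;
}

// Example call; the temp directory only stands in for a real dictionary folder.
$broker = brokerWithDictPath(sys_get_temp_dir() . DIRECTORY_SEPARATOR);
var_dump($broker !== false);
```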

To use other languages dynamically, German for example, use the lang param:

$this->SpellLib = new SpellLib(array('lang'=>'de_DE'));

Any feedback is appreciated!
