CentOS 5.8 MySQL, PHP, and Sphinx Install and Configure

This is how I installed Sphinx on a Virtual Machine with the same versions of PHP and MySQL as my production server. I realize that MySQL is many major versions behind, but the version of CPanel on my production server does not support MySQL 5.5.x.

Step 0: If you have Cpanel or an equivalent package manager, make sure to disable automatic MySQL updates. Otherwise, when Cpanel tries to upgrade MySQL to 5.5.x, things will break.

Step 1: Install CentOS. I used the Net Install so as to only download what I need to get a bare server up and running. Follow this as a good tutorial:

http://www.if-not-true-then-false.com/2010/centos-netinstall-network-installation/

Step 2: Install MySQL 5.1 from RPM packages. If you need a different version, then install it as needed. Personally, I look forward to when I can replace my MySQL 5.5.x version with Percona 5.5.x. Because the Remi repositories have migrated to 5.5, but CentOS 5.x is stuck on 5.0, I need to follow these instructions to get 5.1.62. However, the script failed, so I manually downloaded the RPMs to a directory them ran rpm to install them. Make sure to get all RPMs of the correct architecture. I missed one and it caused quite the headache later.

rpm -Uvh ${PERL_FILE}
rpm -Uvh ${MYSQL_SHARE2_LIB}
rpm -Uvh --force ${MYSQL_SHARED_LIB}
rpm -Uvh ${MYSQL_DEVEL__LIB}
rpm -Uvh ${MYSQL_CLIENT_LIB}
rpm -Uvh ${MYSQL_SERVER_LIB}</code>

Be sure to set the root password.

mysqladmin -u root password NEWPASSWORD

Once MySQL is installed, you can check it at the command line.  Now its time to install Sphinx. Download the rpm and the source tarball from here. I used 2.0.4. Install the rpm as usual. While its installing, unpack the source files. Additionally, download the MySQL 5.1.62 source files from MySQL’s website and unzip them. To use the SphinxSE plugin, you have to compile it, then install it manually.

cp -r sphinx-2.0.4-release/mysqlse mysql-5.1.62/storage/sphinx
cd mysql-5.1.62
sh BUILD/autorun.sh
./configure
make

You may see some documentation to use

./conifgure --with-plugins=sphinx

If you use that method, Sphinx is compiled into MySQL, not as a plugin. I wanted to use it as a plugin so I could leverage the RPM package management. If you compile Sphinx into MySQL, skip installing it my RPM, and follow the “make” command with “make install”. Once the compilation is complete, do this:

cd ./mysql-5.1.62/storage/sphinx
ls -la

There should be a directory called .libs. This is where all of the shared object files exist.

cd .libs
ls -la

You should see files like ha_sphinx.*. If you only see sphinx.* files, it typically means you used the –with-plugins=sphinx argument. This means you you need to rerun ./configure forward. If you followed these directions, but they still don’t show up, its time to get your google on. Once you locate ha_sphinx.so and all of its friends, you need to copy them into your MySQL plugins directory (not in the source area). For my install:

cp ha_sphinx.* /usr/lib64/mysql/plugin/.

Some blogs report /var/lib/mysql/plugins path, but that must be for other distributions or versions of CentOS. YMMV. Once the files are copied, either via mysql command line or PHPMyAdmin execute these commands.

INSTALL PLUGIN sphinx SONAME 'ha_sphinx.so';
SHOW ENGINES;

If the first command does not fail, then Sphinx should show up in the list when SHOW ENGINES is executed. If all is good so far, then the next step is to setup the sphinx.conf file, index your data, setup your sphinx index table, then query your index table joined to your data. Your sphinx.conf file should be located at /etc/sphinx/sphinx.conf.

source some_unique_name_src
{
type = mysql
sql_host = localhost
sql_user = your_db_user
sql_pass = your_db_password
sql_db = your_db_name
sql_port = 3306
sql_query = SELECT id, UNIX_TIMESTAMP(date_added) as date_added, field_1, field_2 from table_to_be_indexed
sql_attr_uint = id
sql_attr_timestamp = date_added
sql_query_info = SELECT * FROM documents WHERE id=$id
}

index some_unique_name_index
{
source = some_unique_name_src
path = /var/lib/sphinx/some_unique_name
docinfo = extern
charset_type = sbcs
}

The rest of the sphinx.conf file was stock. You can optimize these settings, I’m sure they are not already.Notice that the source of the index is the name of the source above. These were the only changes I made to get it up and running. YMMV. Once your conf file is complete, its time to try your hand at indexing. From the command line:

sudo -u sphinx indexer --all --rotate

I don’t think the –rotate is important the first time. I need to search the docs regarding its role. Either way, you should not get any errors. It should take less than a few minutes depending on your data size. Next, get searchd to start without erroring out.

service searchd start
ps aux | grep searchd

If the second command does not return searchd, try running it at the command line. I received errors when run at the command line, but not the service. Once I debugged the errors, “service searchd start” worked flawlessly. Once this is done, create a table in the MySQL database of table_to_be_indexed like this:

CREATE TABLE sphinx_index
(
id INTEGER UNSIGNED NOT NULL,
weight INTEGER NOT NULL,
query VARCHAR(3072) NOT NULL, group_id INTEGER, INDEX(query)
) ENGINE=SPHINX CONNECTION="sphinx://127.0.0.1:9312/some_unique_name_index";

Note that the CONNECTION has the index name in it. Also, the documentation uses localhost instead of 127.0.0.1. I could not get this to work. It might be affected by your firewall, your MySQL settings, or even SELinux. Google has some details, but its sketchy. Before we continue lets review our steps:

  1. Install MySQL from RPM from mysql.com
  2. Compile MySQL and SphinxSE from source downloaded from mysql.com and sphinxsearch.com
  3. Copy shared object (.so) files to MySQL plugin directory. Install the plugin from within MySQL
  4. Setup sphinx.conf file
  5. index MySQL table
  6. Run searchd
  7. Create Sphinx table in database to query

Now, if everything worked as it should (ha!), its time to run the MySQL query joining your data table and your index table. Here is the query:

SELECT *
FROM `sphinx_index`
JOIN  `table_to_be_indexed`
ON `table_to_be_indexed`.`id` = `sphinx_index`.`id`
WHERE `sphinx_index`.`query` = 'terms to search your index; mode=any';

query should be the query terms for which you want to search. Mode is defined in sphinx’s documentation. If everything is working, this should return results. My first few issues were searchd not starting, and it was not available via localhost, 127.0.0.1 was required. Hopefully, this helps everyone. I know I have lots of optimizing, but if you have any issues or improvements, let me know. Good luck!

P.S. Regarding performance, I am seeing anywhere between 5x and 20x improvement in fulltext searching, plus adding data to the database is 100x faster, while index creation is taken outside the database server, allowing it to run in parallel, not tying up database resources, or even on another machine.

Leave a Reply

Your email address will not be published. Required fields are marked *