[nycphp-talk] Tuning MySQL Full Text Search

Ben Sgro (ProjectSkyLine) ben at projectskyline.com
Wed Aug 22 12:42:45 EDT 2007


Hello Rob,

I'm happy w/the relevance, but the order isn't right, and too many results 
are being returned (which is my own issue to fix).

> ( (3 * MATCH(title) AGAINST ('term')) + (1 * MATCH(body) AGAINST ('term')) )

That's really cool. I didn't realize you could do that, and it's 
something I'd like to do.
Without BOOLEAN mode, the results were really off: very few results, and not 
that accurate.
Once I added BOOLEAN mode, the results got a lot better.
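
A minimal sketch of how Rob's weighting might be folded into the query 
quoted below while keeping BOOLEAN mode. The table name `content` and the 
literal 'term' are placeholders (the original uses DATABASE_TABLE_CONTENT 
and the escaped $searchStr), and the 3:1 weights are just Rob's example 
values. It assumes MyISAM tables; the per-column MATCH() calls would 
normally want their own FULLTEXT indexes on title and on body.

```sql
-- Sketch only: `content` and 'term' are placeholder names, not from the thread.
-- Title matches count three times as much as body matches.
SELECT id, title, body, links_to,
       (3 * MATCH(title) AGAINST ('term' IN BOOLEAN MODE))
     + (1 * MATCH(body)  AGAINST ('term' IN BOOLEAN MODE)) AS score
FROM content
-- Filter on the combined (title, body) match, as in the original query.
WHERE MATCH(title, body) AGAINST ('term' IN BOOLEAN MODE)
ORDER BY score DESC;
```

The WHERE clause is unchanged, so the same rows come back; only the 
ordering changes.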

There is also the problem that common words aren't returning anything, 
such as a search for "water". It should return results, since the keyword 
"water" is very frequent throughout the site.

- Ben

Ben Sgro, Chief Engineer
ProjectSkyLine - Defining New Horizons
+1 718.487.9368 (N.Y. Office)

Our company: www.projectskyline.com
Our products: www.project-contact.com

This e-mail is confidential information intended only for the use of the 
individual to whom it is addressed.
----- Original Message ----- 
From: "Rob Marscher" <rmarscher at beaffinitive.com>
To: "NYPHP Talk" <talk at lists.nyphp.org>
Sent: Wednesday, August 22, 2007 12:31 PM
Subject: Re: [nycphp-talk] Tuning MySQL Full Text Search


> On Aug 22, 2007, at 10:50 AM, Ben Sgro (ProjectSkyLine) wrote:
>> I'd like to tune this to have different weights for words, because I'm 
>> not happy with the search results.
>>         $dbObject->DatabaseQuery('SELECT id, title, body, links_to,'
>>                                 . ' MATCH(title, body)'
>>                                 . ' AGAINST (' . $dbObject->Safe($searchStr)
>>                                 . ' IN BOOLEAN MODE)'
>>                                 . ' AS score FROM ' . DATABASE_TABLE_CONTENT
>>                                 . ' WHERE MATCH (title, body)'
>>                                 . ' AGAINST (' . $dbObject->Safe($searchStr)
>>                                 . ' IN BOOLEAN MODE)'
>>                                 . ' ORDER BY score DESC',
>>                                 constReturnArray, LOG_LEVEL_DEBUG);
>
> Hey Ben,
>
> Are you not happy with the relevance sorting? Or is it not returning 
> rows that you think should match... or returning too many rows? What's an 
> example of how you could weight the words?
>
> You should test this to see if it's true... but I've seen it mentioned 
> that the score returned by "IN BOOLEAN MODE" is an integer with the 
> number of terms matched. Without "IN BOOLEAN MODE", it gives a floating 
> point number that "is computed based on the number of words in the row, 
> the number of unique words in that row, the total number of words in the 
> collection, and the number of documents (rows) that contain a particular 
> word."
>
> Actually... do you need the "IN BOOLEAN MODE"? Without it, results are 
> automatically sorted on relevance.
>
> Also... you could put a higher weight on title matches over body matches:
> ( (3 * MATCH(title) AGAINST ('term')) + (1 * MATCH(body) AGAINST ('term')) )
>
> -Rob
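
Rob's suggestion to test how the two modes score rows could be tried with 
a side-by-side query along these lines. This is only a sketch: `content` 
stands in for DATABASE_TABLE_CONTENT and 'term' for the escaped search 
string, and it assumes a FULLTEXT index on (title, body), which the 
natural-language MATCH() needs.

```sql
-- Sketch only: `content` and 'term' are placeholder names, not from the thread.
-- Shows the natural-language score and the boolean-mode score for the same rows.
SELECT id, title,
       MATCH(title, body) AGAINST ('term')                 AS natural_score,
       MATCH(title, body) AGAINST ('term' IN BOOLEAN MODE) AS boolean_score
FROM content
WHERE MATCH(title, body) AGAINST ('term' IN BOOLEAN MODE)
ORDER BY boolean_score DESC
LIMIT 20;
```

Comparing the two columns for a few real search terms should show whether 
the boolean score is coarse enough to explain the ordering problem.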



