2012-04-06

Answers

5

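UPDATE: I moved my original answer down.

I have a strange suggestion.

You may want to use the MySQL utility called myisam_ftdump.

Below is a dump of the FULLTEXT index for the sample data in my original answer: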

C:\MySQL_5.5.12\data\sandro>myisam_ftdump -vc txtdata 1 
     2   0.4054651 everyhing 
     2   0.4054651 impossible 
     1   1.3862944 knew 
     3   -0.4054651 know 
     2   0.4054651 nothing 
     1   1.3862944 people 
     2   0.4054651 possible 
     1   1.3862944 probable 
     1   1.3862944 something 

If you can write this output to a text file, you can have PHP parse it for the words you are looking for.
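A minimal sketch of that parsing step, written in Python rather than PHP only to keep it short; the same logic ports directly. It assumes the three columns printed by the -c option are the number of rows containing the word, its global weight, and the word itself. The function names `parse_ftdump` and `matched_words` are made up for this example, and the `\w+` tokenizer is only an approximation of how MySQL splits text.

```python
import re

def parse_ftdump(text):
    """Parse `myisam_ftdump -c` output into {word: (row_count, weight)}."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # blank lines, the command line itself, etc.
        count, weight, word = parts
        try:
            stats[word] = (int(count), float(weight))
        except ValueError:
            continue  # three-token lines that are not "count weight word"
    return stats

def matched_words(row_text, search_words, stats):
    """Return the search words that are indexed and occur in this row's text.

    Assumption: splitting on \\w+ is close enough to MySQL's own tokenizer
    for simple English text.
    """
    tokens = set(re.findall(r"\w+", row_text.lower()))
    return [w for w in search_words if w in stats and w in tokens]

dump = """
     2   0.4054651 everyhing
     2   0.4054651 impossible
     1   1.3862944 knew
     3   -0.4054651 know
     2   0.4054651 nothing
     1   1.3862944 people
     2   0.4054651 possible
     1   1.3862944 probable
     1   1.3862944 something
"""

stats = parse_ftdump(dump)
print(stats["possible"])
print(matched_words("I knew everyhing is possible", ["possible", "knew"], stats))
```

Keep in mind that the dump describes the index as a whole, not individual rows. To tie words back to rows, you still have to fetch each row's text and test it yourself, which is what `matched_words` does.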

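ORIGINAL ANSWER

With or without boolean mode, the answer is no.

However, you can display a ranking based on word occurrence and overall string length, as follows:

Sample data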

DROP DATABASE IF EXISTS sandro;
CREATE DATABASE sandro; 
use sandro 
CREATE TABLE txtdata 
(
    id int not null auto_increment, 
    txt VARCHAR(255), 
    primary key (id), 
    FULLTEXT (txt) 
) ENGINE=MyISAM; 
INSERT INTO txtdata (txt) VALUES 
('I know Nothing is possible'), 
('We know nothing is impossible'), 
('I knew everyhing is possible'), 
('We know everyhing is impossible'),
('For may people something is probable'); 

Here are the rankings for various searches:

mysql> SELECT *,MATCH(txt) AGAINST ('possible knew') as score FROM txtdata; 
+----+--------------------------------------+--------------------+
| id | txt                                  | score              |
+----+--------------------------------------+--------------------+
|  1 | I know Nothing is possible           | 0.3919430673122406 |
|  2 | We know nothing is impossible        |                  0 |
|  3 | I knew everyhing is possible         |   1.73200523853302 |
|  4 | We know everyhing is impossible      |                  0 |
|  5 | For may people something is probable |                  0 |
+----+--------------------------------------+--------------------+
5 rows in set (0.00 sec) 

mysql> SELECT *,MATCH(txt) AGAINST ('possible know') as score FROM txtdata; 
+----+--------------------------------------+--------------------+
| id | txt                                  | score              |
+----+--------------------------------------+--------------------+
|  1 | I know Nothing is possible           | 0.3919430673122406 |
|  2 | We know nothing is impossible        |                  0 |
|  3 | I knew everyhing is possible         | 0.3919430673122406 |
|  4 | We know everyhing is impossible      |                  0 |
|  5 | For may people something is probable |                  0 |
+----+--------------------------------------+--------------------+
5 rows in set (0.00 sec) 

mysql> SELECT *,MATCH(txt) AGAINST ('impossible knew') as score FROM txtdata; 
+----+--------------------------------------+--------------------+
| id | txt                                  | score              |
+----+--------------------------------------+--------------------+
|  1 | I know Nothing is possible           |                  0 |
|  2 | We know nothing is impossible        | 0.3919430673122406 |
|  3 | I knew everyhing is possible         |  1.340062141418457 |
|  4 | We know everyhing is impossible      | 0.3919430673122406 |
|  5 | For may people something is probable |                  0 |
+----+--------------------------------------+--------------------+
5 rows in set (0.00 sec) 

mysql> SELECT *,MATCH(txt) AGAINST ('impossible know') as score FROM txtdata; 
+----+--------------------------------------+--------------------+
| id | txt                                  | score              |
+----+--------------------------------------+--------------------+
|  1 | I know Nothing is possible           |                  0 |
|  2 | We know nothing is impossible        | 0.3919430673122406 |
|  3 | I knew everyhing is possible         |                  0 |
|  4 | We know everyhing is impossible      | 0.3919430673122406 |
|  5 | For may people something is probable |                  0 |
+----+--------------------------------------+--------------------+
5 rows in set (0.00 sec) 

mysql> 
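A note on why `know` never raises a score in the queries above. It occurs in 3 of the 5 rows, and the weights printed by myisam_ftdump are consistent with ln((N - n) / n), where N is the number of rows and n is the number of rows containing the word. That value goes negative once a word is in more than half the rows, which fits MyISAM's documented habit of ignoring such words in natural-language mode. The snippet below is plain arithmetic on the dump figures, sketched in Python; it is not MySQL's actual code, and `global_weight` is a name made up for this example.

```python
import math

TOTAL_ROWS = 5  # rows in txtdata

def global_weight(rows_with_word, total_rows=TOTAL_ROWS):
    """ln((N - n) / n): rarer words get larger weights."""
    return math.log((total_rows - rows_with_word) / rows_with_word)

# n = 1: knew, people, probable, something
# n = 2: everyhing, impossible, nothing, possible
# n = 3: know
for n in (1, 2, 3):
    print(n, round(global_weight(n), 7))
# 1 1.3862944
# 2 0.4054651
# 3 -0.4054651
```

The three printed values match the weight column of the dump for words found in one, two, and three rows respectively.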
+0

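The problem with the result is that the score would have to be normalized, and that seems impossible to me because it depends on the number of rows and other completely dynamic factors. So I guess I have to do what I want in PHP? – 2012-04-06 18:58:28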

+0

And how would I count the words using that dump file? I mean, I don't see how it connects to the rows in the result. – 2012-04-07 12:59:35