sql record counts: Determining Relevance with Least Amount of Queries

Tuesday, February 14, 2012

Determining Relevance with Least Amount of Queries

Suppose I have a table of record IDs and known record keywords like
this...
id|keywords
=================================
134|-PRNTR-FL-PRNT-GRDLN-
167|-PRNT-GRDLN-
198|-GRDLN-
009|-PRNT-
451|-XCL-
893|-BLB-
800|-JST-BLX-
(Note vowels removed as a crude form of compression for this example to
be legible.)
...and someone submits a search form with the keyword values...
XCL PRNT GRDLN
...and I want to return a relevance result like...
relevance|id|keywords
=================================
2|134|-PRNTR-FL-PRNT-GRDLN-
2|167|-PRNT-GRDLN-
1|198|-GRDLN-
1|009|-PRNT-
1|451|-XCL-
...where ids 134 and 167 had 2 found occurrences, the others except 893
and 800 had 1 occurrence, and anything else that didn't match is not
returned. The highest number of occurrences are sorted to the top.
How does one achieve this in ANSI-92 SQL in the least amount of
queries?*untested*
select t.*, relevance
from your_table t
join(
select id, count(*) relevance
from your_table t
join (
select '%-XCL-%' token
union all
select '%- PRNT-%'
union all
select '%-GRDLN-%'
) tokens
on t.keywords like token
) r
on t.id = r.id
BTW, your table seems to violate 1NF

Tuesday, February 14, 2012

Determining Relevance with Least Amount of Queries

No comments:

Post a Comment

sql record counts

Blog Archive

About Me