I need to return all the text results(s), if any, that share the maximum length left bounded substring common to the search string.
Given a search for "StackOverflow" in a table column containing
"Stack",
"Sta",
"StackOv",
"StackOverthrow",
"StackOverSlow",
"StackFlow",
"Soverflow",
"StackOverCrow",
"StackOverSlow",
etc.
the query would return "StackOverthrow" as it contains the greatest number of matching characters, as well as StackOverSlow and StackOverCrow in a unique result set.
Currently Im doing something inefficient which is to start with a LIKE search for the first characters and to continue repeating and extending the search string until nothing is found, and keeping the last good result.
i.e.
select names from table where name like 'XX%';
"S" ->Results
"St"->Results
. .
"StackOver"->Results
"StackOverf"-> No results (Last result returning items beginning with StackOver etc as being the correct answer)
I know that this approach is extremely inefficient, can anyone provide a single query to achieve this result? I know I could search for all combinations at once and filter for the longest results in code, however, I think the DB should be better at this.
Edit1: Note the example above is somewhat of a simplification. The vast majority of data in the DB is between 2 and 10 chars, with the most common match length of about 3 chars. There are upwards of 100K records in the table.
Edit2: Apologies, I needed to clarify that there may be more than one correct result, and that the results can contain duplicates that need to be removed. Currently with my inefficient method selecting distinct is easy.
解决方案
With an index on name, the following ought to be extremely performant:
SELECT DISTINCT name
FROM myTable
WHERE name LIKE CASE
WHEN NOT EXISTS(SELECT * FROM myTable WHERE name LIKE 'S%') THEN '%'
WHEN NOT EXISTS(SELECT * FROM myTable WHERE name LIKE 'St%') THEN 'S%'
WHEN NOT EXISTS(SELECT * FROM myTable WHERE name LIKE 'Sta%') THEN 'St%'
WHEN NOT EXISTS(SELECT * FROM myTable WHERE name LIKE 'Stac%') THEN 'Sta%'
WHEN NOT EXISTS(SELECT * FROM myTable WHERE name LIKE 'Stack%') THEN 'Stac%'
WHEN NOT EXISTS(SELECT * FROM myTable WHERE name LIKE 'StackO%') THEN 'Stack%'
WHEN NOT EXISTS(SELECT * FROM myTable WHERE name LIKE 'StackOv%') THEN 'StackO%'
WHEN NOT EXISTS(SELECT * FROM myTable WHERE name LIKE 'StackOve%') THEN 'StackOv%'
WHEN NOT EXISTS(SELECT * FROM myTable WHERE name LIKE 'StackOver%') THEN 'StackOve%'
WHEN NOT EXISTS(SELECT * FROM myTable WHERE name LIKE 'StackOverf%') THEN 'StackOver%'
WHEN NOT EXISTS(SELECT * FROM myTable WHERE name LIKE 'StackOverfl%') THEN 'StackOverf%'
WHEN NOT EXISTS(SELECT * FROM myTable WHERE name LIKE 'StackOverflo%') THEN 'StackOverfl%'
WHEN NOT EXISTS(SELECT * FROM myTable WHERE name LIKE 'StackOverflow%') THEN 'StackOverflo%'
ELSE 'StackOverflow%'
END
See it on sqlfiddle.