Two questions - Search and Top Level Content Display (day)

Random stuff about serendipity. Discussion, Questions, Paraphernalia.
Post Reply
genesis
Regular
Posts: 28
Joined: Wed Jun 11, 2008 5:26 pm

Two questions - Search and Top Level Content Display (day)

Post by genesis »

Two questions...

First, can Serepdipity be told to display one days worth of postings on the top page instead of a number of posts? It appears no, from the config screen....

Second, the "basic search" doesn't appear to use Tsearch2 on Postgresql. Is there support for that somewhere?

Thanks!
garvinhicking
Core Developer
Posts: 30022
Joined: Tue Sep 16, 2003 9:45 pm
Location: Cologne, Germany
Contact:

Re: Two questions - Search and Top Level Content Display (da

Post by garvinhicking »

Hi!

That is true, there is no option for this. The only way to make this work would be to use the smarty serendipity_fetchPrintEntries function and pass it the SQL to it that would be needed to filter by a single date.
Second, the "basic search" doesn't appear to use Tsearch2 on Postgresql. Is there support for that somewhere?
Sadly not, we currently have no active pgsql developers that work on s9y core code, so pgsql support is currently limited to making basic SQL work, not any customized enhancmenets.

Best regards,
Garvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
genesis
Regular
Posts: 28
Joined: Wed Jun 11, 2008 5:26 pm

Post by genesis »

OK.... if I can muddle my way through how all the pieces work implementing Tsearch functionality should be trivial - that I can do, if I can figure out the components necessary to hook into the right places.

I assume the various documentation on the software and its internals will get me where I need to go, assuming I can understand them? :lol:
garvinhicking
Core Developer
Posts: 30022
Joined: Tue Sep 16, 2003 9:45 pm
Location: Cologne, Germany
Contact:

Post by garvinhicking »

Hi!

The current search functionality can be found in s9y's include/functions_entries.inc.php file, function serendipity_searchEntries(). There already is a PGSQL switch, so if you tell me the required SQL/Code, I could insert it in the official distro :)

HTH,
Garvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
genesis
Regular
Posts: 28
Joined: Wed Jun 11, 2008 5:26 pm

Post by genesis »

To do searches on Full Text in Postgresql, you must have the "tsearch2" module loaded (this assumes 8.x or later)

You also must have an index on the table in question structured as:

"idx_message" gin (to_tsvector('english'::text, message))
"idx_subject" gin (to_tsvector('english'::text, subject))

This gives you both the message and subject fields

To query, you do:

select * from post where to_tsvector('english', message) @@ to_tsquery('bite & me');

This tells the engine to parse the message (you can also parse the subject if you want) and the query terms are "bite" and "me", both of which must be present. Booleans are acceptable (& and |)

This snippet is from my forum code which uses tsearch2 for full-text search capability. It is EXTREMELY fast.

Note that if you want support for other than English, you have to have a lexeme loaded for that language into the tsearch2 module in the database or it won't work.

A more complex example is:

select * from post where to_tsvector('english', subject) @@ to_tsquery('hellfire') and to_tsvector('english', message) @@ to_tsquery('bite & me')

This searches for both the word "hellfire" in the subject line AND both the words "bite" and "me" in the message body. All must be present to return a match.

Does this help?
garvinhicking
Core Developer
Posts: 30022
Joined: Tue Sep 16, 2003 9:45 pm
Location: Cologne, Germany
Contact:

Post by garvinhicking »

Hi!

That helps a lot. In fact, inside the file I mentioned you currently see this:

Code: Select all

if ($serendipity['dbType'] == 'postgres' ||
        $serendipity['dbType'] == 'pdo-postgres') {
        $cond['group']     = '';
        $cond['distinct']  = 'DISTINCT';
        $cond['find_part'] = "(title ILIKE '%$term%' OR body ILIKE '%$term%' OR extended ILIKE '%$term%')";
if you modify this to:

Code: Select all

if ($serendipity['dbType'] == 'postgres' ||
        $serendipity['dbType'] == 'pdo-postgres') {
        $cond['group']     = '';
        $cond['distinct']  = 'DISTINCT';
        $cond['find_part'] = "(
to_tsvector('english', title) @@ to_tsquery('$term') OR
to_tsvector('english', body) @@ to_tsquery('$term') OR
to_tsvector('english', extended) @@ to_tsquery('$term')
)");
That this could work?

Regards,
Garvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
genesis
Regular
Posts: 28
Joined: Wed Jun 11, 2008 5:26 pm

Post by genesis »

That appears to work for single terms, but not for booleans.... How do I get a log somewhere of the exact query that is being submitted? The search return shows the correct boolean string, but I get nothing back.

If I submit the query by hand off the command line to the database (with complex booleans) I get the correct result.
garvinhicking
Core Developer
Posts: 30022
Joined: Tue Sep 16, 2003 9:45 pm
Location: Cologne, Germany
Contact:

Post by garvinhicking »

Hi!

I believe booleans are converted from/to HTML entities.

You could patch that mentioned PHP code:

if ($serendipity['dbType'] == 'postgres' ||
$serendipity['dbType'] == 'pdo-postgres') {
$cond['group'] = '';
$cond['distinct'] = 'DISTINCT';
$cond['find_part'] = "(
to_tsvector('english', title) @@ to_tsquery('$term') OR
to_tsvector('english', body) @@ to_tsquery('$term') OR
to_tsvector('english', extended) @@ to_tsquery('$term')
)");
$fp = fopen('/tmp/debug.log', 'a');
fwrite($fp, $cond['find_part'] . "\n");
fclose($fp);
[/code]

This creates a /tmp/debug.log file. You can change the path to match your filesystem.

Regards,
Garvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
genesis
Regular
Posts: 28
Joined: Wed Jun 11, 2008 5:26 pm

Post by genesis »

That's what's happening....

(
to_tsvector('english', title) @@ to_tsquery('bite & me') OR
to_tsvector('english', body) @@ to_tsquery('bite & me') OR
to_tsvector('english', extended) @@ to_tsquery('bite & me')
)

That sucks.... can you think of a reasonable way to prevent that? Without it, booleans won't work....

(Php's a new one for me.... I'm a "C" programmer :) )
garvinhicking
Core Developer
Posts: 30022
Joined: Tue Sep 16, 2003 9:45 pm
Location: Cologne, Germany
Contact:

Post by garvinhicking »

Hi!

You could use:

Code: Select all

if ($serendipity['dbType'] == 'postgres' ||
$serendipity['dbType'] == 'pdo-postgres') {
$term = str_replace('&', '&', $term);
$cond['group'] = '';
$cond['distinct'] = 'DISTINCT';
$cond['find_part'] = "(
to_tsvector('english', title) @@ to_tsquery('$term') OR
to_tsvector('english', body) @@ to_tsquery('$term') OR
to_tsvector('english', extended) @@ to_tsquery('$term')
)");
$fp = fopen('/tmp/debug.log', 'a');
fwrite($fp, $cond['find_part'] . "\n");
fclose($fp); 
Is there a SQL statement for postgresql that one can use to check if this vector function is available? This would help s9y to check if it can use that query or not.

Regards,
Garvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
genesis
Regular
Posts: 28
Joined: Wed Jun 11, 2008 5:26 pm

Post by genesis »

Code: Select all

marketticker=> select routine_name from information_schema.routines where routine_name like 'to_tsvector' and specific_catalog='marketticker';  
 routine_name 
--------------
 to_tsvector
 to_tsvector
 to_tsvector
 to_tsvector
 to_tsvector
(5 rows)
"specific_catalog" is the database you're looking in (someone COULD HAVE loaded tsearch after creating the database; its loaded into the template, so unless intentionally loaded into your schema afterward, having it in the template doesn't help you if the database was created first) so you have to check your database catalog.

Note that the function is overloaded so it will show up in this query more than once; there are lots of other fields available, but I assume all you're looking for is the presence. If so you could use:

Code: Select all

marketticker=> select count(routine_name) > 0 from information_schema.routines where routine_name like 'to_tsvector' and specific_catalog='marketticker';  
?column?
-------
   t
(1 row)
Returns "t" (true) if the functions are present, and "f" (false) if not.

information_schema showed up around 7.4ish, so anyone with anything reasonably current in the context of Postgresql should have it.
garvinhicking
Core Developer
Posts: 30022
Joined: Tue Sep 16, 2003 9:45 pm
Location: Cologne, Germany
Contact:

Post by garvinhicking »

Hi!

I just committed this to the official s9y repository:

http://svn.berlios.de/viewcvs/serendipi ... 1&view=rev

Can you check if that works fine for you as well?

Regards,
Garvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
genesis
Regular
Posts: 28
Joined: Wed Jun 11, 2008 5:26 pm

Post by genesis »

One error:

Code: Select all

   to_tsvector('english', title)    @@to_tsquery('$term') OR
            to_tsvector('english', body)     @@to_tsquery('$term') OR
            to_tsvector('english', extended) @@to_tsquery('$term')
            )";
Note that you have an extra closing parenthesis just before the semicolon in what you committed; this is "as corrected" and works.
garvinhicking
Core Developer
Posts: 30022
Joined: Tue Sep 16, 2003 9:45 pm
Location: Cologne, Germany
Contact:

Post by garvinhicking »

Hi!

Good catch, sorry for that error. Great to hear that it works :)

Regards,
Garvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
genesis
Regular
Posts: 28
Joined: Wed Jun 11, 2008 5:26 pm

Post by genesis »

Cool - this is a fairly substantial improvement in the search functionality, especially for very large blogs (its smoking fast) and the boolean capability is something that people like as well.
Post Reply