tag plugin - question

Creating and modifying plugins.
Post Reply
robi-bobi
Regular
Posts: 40
Joined: Mon Nov 21, 2005 12:40 pm
Location: Bulgaria
Contact:

tag plugin - question

Post by robi-bobi »

I just started to use tag plugin and it looks great

But, related entries shown by it do not look ok. It looks like the plugin shows the most recent entry which has at least one same tag as current entry.

I think that it should show entries which share the most tags with current one (and order them the same way: time desc)

Is it just my installation that works this way (I could not find setting for this), or this is default behavior.
garvinhicking
Core Developer
Posts: 30022
Joined: Tue Sep 16, 2003 9:45 pm
Location: Cologne, Germany
Contact:

Re: tag plugin - question

Post by garvinhicking »

Hi!

You are right, entries are just sorted by timestamp:

Code: Select all

       $q = "SELECT DISTINCT e1.entryid,
                     e2.title
                FROM {$serendipity['dbPrefix']}entrytags AS e1
           LEFT JOIN {$serendipity['dbPrefix']}entries   AS e2
                  ON e1.entryid = e2.id
               WHERE e1.tag IN ('" . implode("', '", $tags) . "')
                 AND e1.entryid != " . (int)$postID . "
                 AND e2.isdraft = 'false'
                     " . (!serendipity_db_bool($serendipity['showFutureEntries'
            ORDER BY  e2.timestamp DESC
               LIMIT " . $this->get_config('show_related_count', 10);
(from function getRelatedEntries() within serendipity_event_freetag

Sadly I don't really know how to group by most common intersecting tags, I don't know if this is even possible with PostgreSQL or SQLite...

Best regards,
Garvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
robi-bobi
Regular
Posts: 40
Joined: Mon Nov 21, 2005 12:40 pm
Location: Bulgaria
Contact:

Post by robi-bobi »

I understand

maybe by using http://php.net/manual/en/function.similar-text.php we can achieve similar effect?

Retrieving all tag records in DB for all entries and using similar_text() on them with current tags?

Sure, this should be cached - with probably long period of cache :)
Maybe the first cache should be made after saving the entry, so users won't feel any delay

I don't know how this would behave on blogs with LOTS of entries, but I suppose that with <1000 entries this should work fine.

This can be added as option to the current plugin configuration with warning that it may slow down the site.
garvinhicking
Core Developer
Posts: 30022
Joined: Tue Sep 16, 2003 9:45 pm
Location: Cologne, Germany
Contact:

Post by garvinhicking »

Hi!
maybe by using http://php.net/manual/en/function.similar-text.php we can achieve similar effect?
I'm afraid that won't really work. It would mean to fetch ALL entries body texts for EACH entry to compare them next to each other. That's quite performance intensive and would drag everything down. Building a cache would mean to rebuild the cache for All entries for each new entry that is made. If you have a couple of entries, that would mean you might need to wait minutes before you can publish a new entry.

The way to go would be to try to find a suitable SQL Query that does what you need, and then see if it works across MySQL, psotgresql and SQLite.

Best regards,
Garvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
robi-bobi
Regular
Posts: 40
Joined: Mon Nov 21, 2005 12:40 pm
Location: Bulgaria
Contact:

Post by robi-bobi »

hm, sorry, it looks like both similar_text and levenshtein() are for single words only
robi-bobi
Regular
Posts: 40
Joined: Mon Nov 21, 2005 12:40 pm
Location: Bulgaria
Contact:

Post by robi-bobi »

garvinhicking wrote: The way to go would be to try to find a suitable SQL Query that does what you need
Garvin
I agree with that, just was thinking loud :)

edit: another thought: http://dev.mysql.com/doc/refman/5.0/en/ ... earch.html (which might come close but adds other problems ...)
robi-bobi
Regular
Posts: 40
Joined: Mon Nov 21, 2005 12:40 pm
Location: Bulgaria
Contact:

Post by robi-bobi »

Code: Select all

CREATE TABLE `entry_tags` (
  `entry_id` int(11) NOT NULL,
  `tag_id` int(11) NOT NULL,
  PRIMARY KEY  (`entry_id`,`tag_id`)
);

SELECT b.entry_id, COUNT(*) as score FROM entry_tags a
JOIN entry_tags b ON
  a.tag_id = b.tag_id AND
  a.entry_id = OUR_ENTRY_ID
GROUP BY b.entry_id
HAVING b.entry_id != OUR_ENTRY_ID
ORDER BY COUNT(*) DESC;
not my sql, so no credits go to me

even if the tables are not connected with such 'link' table, the query looks promising
garvinhicking
Core Developer
Posts: 30022
Joined: Tue Sep 16, 2003 9:45 pm
Location: Cologne, Germany
Contact:

Post by garvinhicking »

Hi!

Can you check if that works with SQLite and postgresql?

The query looks like this with actual s9y tables:

Code: Select all

SELECT b.entryid, COUNT(*) as score FROM serendipity_entrytags a
JOIN serendipity_entrytags b ON
  a.tag = b.tag AND
  a.entryid = OUR_ENTRY_ID
GROUP BY b.entryid
HAVING b.entryid != OUR_ENTRY_ID
ORDER BY COUNT(*) DESC
Regards,
Garvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
Post Reply