Preparing for import?

Posted: Fri Jul 25, 2008 7:08 pm
by Mekk
I am considering importing into a Serendipity blog some texts I wrote in the past, which I currently keep as (slightly templated) HTML pages. Of course no generic filter will help me, but I am ready to do some scripting (that is, parse my pages and generate an import batch).

The question is: which of the import formats would you suggest for the task? It would be nice if it were well documented and relatively simple, while still preserving some basic data (title, date, meta-description, and of course the article body).

Or maybe I should rather try saving directly to the s9 database?
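
A rough sketch of the "parse my pages" step, using only Python's standard html.parser module. The page layout (a title tag, a description meta tag, and the article inside the body tag) is an assumption about the templated pages, and PageExtractor and extract_page are made-up names:

```python
from html.parser import HTMLParser


class PageExtractor(HTMLParser):
    """Collect <title>, the description meta tag, and the raw markup inside <body>."""

    def __init__(self):
        # Keep entities such as &amp; untouched so the body markup survives as written.
        super().__init__(convert_charrefs=False)
        self.title = ""
        self.description = ""
        self.body_parts = []
        self._in_title = False
        self._in_body = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            if (attr.get("name") or "").lower() == "description":
                self.description = attr.get("content") or ""
        elif tag == "body":
            self._in_body = True
        elif self._in_body:
            self.body_parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> have no separate end tag to emit.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "body":
            self._in_body = False
        elif self._in_body:
            self.body_parts.append(f"</{tag}>")

    def _text(self, text):
        if self._in_title:
            self.title += text
        elif self._in_body:
            self.body_parts.append(text)

    def handle_data(self, data):
        self._text(data)

    def handle_entityref(self, name):
        self._text(f"&{name};")

    def handle_charref(self, name):
        self._text(f"&#{name};")


def extract_page(html_text):
    """Return title, description and body markup of one HTML page as a dict."""
    parser = PageExtractor()
    parser.feed(html_text)
    parser.close()
    return {
        "title": parser.title.strip(),
        "description": parser.description,
        "body": "".join(parser.body_parts).strip(),
    }
```

HTML comments inside the body are dropped by this sketch, and a real template would probably need extra rules to skip navigation or footer markup.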

Posted: Fri Jul 25, 2008 8:22 pm
by judebert
If you've got the PHP and SQL skillz, just go straight to the database. It's loads easier.

Posted: Sat Jul 26, 2008 10:19 am
by garvinhicking
Hi!

I fully agree with judebert, that would be a lot easier.

If you really want to create your own importer module, include/admin/importers/generic.inc.php would IMHO be the easiest code to adapt. The WordPress importer is far too complex for a simple HTML import...

Regards,
Garvin

Posted: Sat Jul 26, 2008 11:54 am
by Mekk
Is there a description of the database schema? In particular, is it all about dumping to s9en_entries, or must something else happen too (like sequence updates, permalink generation, etc.)?

PS: I am a Perl and Python guy. It took me some time to accept the necessity of using a PHP application ;-)

Posted: Sat Jul 26, 2008 8:40 pm
by garvinhicking
Hi!
Mekk wrote: Is there a description of the database schema?
Sadly, so far I have only done that work for my German book; I don't believe there's anything online yet. But it would be a great addition to our technical documentation. We do not have that many foreign keys, so much of it should actually be self-explanatory.
Mekk wrote: In particular, is it all about dumping to s9en_entries, or must something else happen too (like sequence updates, permalink generation, etc.)?
Sequencing is handled by the underlying DB system (autoincrement keys). Permalinks only need to be created when you are using a permalink scheme without %id% in the URL pattern.

If you want to assign categories to the entries, you need to fill the serendipity_entrycat DB table (entryid -> categoryid relation). Apart from that, you can fill in custom entry properties in the serendipity_entryproperties DB table (things like entry passwords, ...).

In most cases, filling serendipity_entries does 99% of the work.

Regards,
Garvin
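
The entry-plus-category step described above could look roughly like this. It is only a sketch: an in-memory SQLite database stands in for the real MySQL/PostgreSQL one, the two table layouts are cut down to the columns named in this thread, and PREFIX and import_entry are made-up names:

```python
import sqlite3

PREFIX = "serendipity_"  # assumed table prefix; yours may differ

conn = sqlite3.connect(":memory:")  # stand-in for the real blog database
conn.executescript(f"""
    -- Reduced, assumed table layouts: only the columns needed for this sketch.
    CREATE TABLE {PREFIX}entries (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        title TEXT, timestamp INTEGER, body TEXT
    );
    CREATE TABLE {PREFIX}entrycat (entryid INTEGER, categoryid INTEGER);
""")


def import_entry(conn, title, timestamp, body, category_ids=()):
    """Insert one entry, then link it to each category; returns the new entry id."""
    cur = conn.execute(
        f"INSERT INTO {PREFIX}entries (title, timestamp, body) VALUES (?, ?, ?)",
        (title, timestamp, body),
    )
    entry_id = cur.lastrowid  # id assigned by the database, not by the script
    conn.executemany(
        f"INSERT INTO {PREFIX}entrycat (entryid, categoryid) VALUES (?, ?)",
        [(entry_id, cat_id) for cat_id in category_ids],
    )
    conn.commit()
    return entry_id


entry_id = import_entry(conn, "Old essay", 1167609600, "<p>Text</p>", category_ids=[3])
```

The point of reading the id back from the cursor is that the script never has to guess or manage the sequence itself.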

Posted: Mon Jul 28, 2008 3:32 pm
by judebert
Welcome to Serendipity and PHP! I was a Perl coder, and I've recently started investigating Python. I'm loving PHP! Its versatility is akin to Python's, but it keeps the C-like (and Perl-like) blocking syntax.

If you've got access to the database, you could export it to SQL (another language it's worthwhile to learn, simply because it's everywhere), then write a Python script to modify it, then import the SQL to your new Serendipity DB.

Since the schema is only available in German, I'd make a test blog with one author, one category, and one entry. I'd use that as an example of how the tables need to be populated.

Posted: Mon Jul 28, 2008 4:50 pm
by Mekk
I wrote a couple of kilobytes of SQL in my life ;-) Although recently, when writing in Python, I kind of like SQLAlchemy.

OK, back to the subject. Am I safe to do

INSERT INTO mypfx_entries(title, timestamp, body, author, authorid, isdraft)
VALUES (... here sensibly taken values, with body being just a piece of text ...)

?

In particular, I have no clue whether NULL is safe in comments, trackbacks, extended, and exflag (or, in general, how those work).

Also, will that suffice, or must I also put something into mypfx_access, mypfx_entryproperties, mypfx_permalinks, mypfx_references, mypfx_suppress - or anywhere else?

I plan to save the imported articles as drafts and review them before publishing, but I'd nevertheless like to have a sensible date, author, title, meta tags, etc. set up by default.

Posted: Mon Jul 28, 2008 5:59 pm
by garvinhicking
Hi!

About the entries-table:

id is the autoincremented primary key; it is assigned automatically if you leave it out of your INSERT statement.

title is the article's title

timestamp is the unix timestamp of the entry creation

body holds the entry body

extended holds the extended entry body

comments holds the number of comments to that entry (set to 0 for new entries)

trackbacks holds the number of trackbacks to that entry (set to 0 for new entries)

exflag is set to "1" if an entry contains an extended body

author holds the author name (deprecated; using authorid is preferred)

authorid holds the ID of the author

isdraft is set to "true" (or "t" on PGSQL) when the entry is a draft. "false" ("f") means the entry is online.

allow_comments can be set to true/false (t/f) to indicate whether comments on this entry are open

last_modified holds the unix timestamp of the last update (for a new entry, set it to the current time, like the entry's timestamp)

moderate_comments can be set to true/false (t/f) to indicate whether comments on that entry are moderated.

HTH,
Garvin
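
Put together, the column list above suggests defaults along these lines. This is a sketch, not a checked schema: build_entry_row is a made-up helper, "true" is taken to mark a draft (consistent with "false" meaning the entry is online), the empty author string is an assumption, and the :name placeholder style is the one Python's sqlite3 module accepts, so other drivers may need a different form:

```python
def build_entry_row(title, body, authorid, created, extended="", draft=True):
    """Return column -> value for one imported entry, using the defaults listed above."""
    return {
        "title": title,
        "timestamp": created,            # Unix timestamp of entry creation
        "body": body,
        "extended": extended,
        "exflag": 1 if extended else 0,  # 1 only when there is an extended body
        "comments": 0,                   # new entries start without comments
        "trackbacks": 0,                 # ... and without trackbacks
        "author": "",                    # deprecated column; left empty (assumption)
        "authorid": authorid,
        "isdraft": "true" if draft else "false",  # use "t"/"f" on PostgreSQL
        "allow_comments": "true",
        "moderate_comments": "false",
        "last_modified": created,        # same value as timestamp for a fresh import
    }


row = build_entry_row("Old essay", "<p>Teaser</p>", authorid=1, created=1167609600)
columns = ", ".join(row)
placeholders = ", ".join(f":{name}" for name in row)
# mypfx_ is the table prefix used in the question above.
sql = f"INSERT INTO mypfx_entries ({columns}) VALUES ({placeholders})"
```

Leaving id out of the column list lets the database assign it.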

Posted: Mon Jul 28, 2008 6:04 pm
by Mekk
Thank you very much.

If I want to give an article a specific old date, should I patch timestamp, last_modified, or both?

Should I set author to NULL if authorid is properly set?

Posted: Mon Jul 28, 2008 6:11 pm
by garvinhicking
Hi!

You should set both timestamp and last_modified to the UNIX timestamp of the date that should be associated with your imported entry.

You can set author to either NULL or an empty string '' - it shouldn't matter.

HTH,
Garvin
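
For the date handling, a small Python sketch of turning an old article date into the UNIX timestamp used for both columns. It assumes the date strings are meant as UTC; to_unix_timestamp is a made-up helper name:

```python
from datetime import datetime, timezone


def to_unix_timestamp(text, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse a date string as UTC and return whole seconds since the Unix epoch."""
    moment = datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
    return int(moment.timestamp())


stamp = to_unix_timestamp("2007-01-01 00:00:00")
# Both columns get the same value for an imported entry.
values = {"timestamp": stamp, "last_modified": stamp}
```

If the original pages carry local times, the timezone would have to be adjusted before converting.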

Posted: Fri Aug 01, 2008 7:47 am
by medienluemmel
I've written a program to write the complete content offline and then export it to serendipity format. The export output looks like this:

Code:

DELETE FROM sp_category;
DELETE FROM sp_entrycat;
DELETE FROM sp_entries;
ALTER TABLE sp_category AUTO_INCREMENT = 1;
ALTER TABLE sp_entrycat AUTO_INCREMENT = 1;
ALTER TABLE sp_entries AUTO_INCREMENT = 1;
INSERT INTO sp_category (categoryid, category_name, category_left, category_right, parentid) VALUES (null, 'Category 1', 1, 8, 0);
INSERT INTO sp_category (categoryid, category_name, category_left, category_right, parentid) VALUES (null, 'Subcategory 1', 2, 3, 1);
INSERT INTO sp_category (categoryid, category_name, category_left, category_right, parentid) VALUES (null, 'Subcategory 2', 4, 5, 1);
INSERT INTO sp_category (categoryid, category_name, category_left, category_right, parentid) VALUES (null, 'Subcategory 3', 6, 7, 1);
INSERT INTO sp_category (categoryid, category_name, category_left, category_right, parentid) VALUES (null, 'Category 1', 9, 22, 0);
INSERT INTO sp_entries (id, title, timestamp, body, comments, trackbacks, extended, exflag, author, authorid, isdraft, allow_comments, last_modified, moderate_comments) SELECT null, 'Title of Content 1', UNIX_TIMESTAMP('2007-01-01 00:00:00'), 'Content-Teaser', 0, 0, 'Content-Text', 1, 'Author-Name', (SELECT authorid FROM sp_authors WHERE realname = 'Author-Name'), 'false', 'true', UNIX_TIMESTAMP('2007-01-01 00:00:00'), 'false';
INSERT INTO sp_entrycat (entryid, categoryid) SELECT (SELECT id FROM sp_entries WHERE title ='Content-Title'), 0;
INSERT INTO sp_entrycat (entryid, categoryid) SELECT (SELECT id FROM sp_entries WHERE title ='Content-Title'), (SELECT categoryid FROM sp_category WHERE category_name LIKE 'Category-Title');
In my case it works completely; one thing was a little bit tricky: to get the right values for the category_left and category_right entries.
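
The category_left and category_right values in the output above follow the usual nested-set numbering: count up once when entering a category and once more when leaving it. A sketch of that numbering in Python, where the tree format (name, list of children) and the function name assign_bounds are assumptions for illustration:

```python
def assign_bounds(nodes, start=1, rows=None):
    """Walk the category tree depth-first and number each node's left/right bounds.

    Returns (rows, next_free_number); rows are in the order the nodes were visited.
    """
    if rows is None:
        rows = []
    counter = start
    for name, children in nodes:
        row = {"name": name, "left": counter}
        rows.append(row)
        # Children are numbered inside the parent's interval.
        _, counter = assign_bounds(children, counter + 1, rows)
        row["right"] = counter
        counter += 1
    return rows, counter


tree = [
    ("Category 1", [("Subcategory 1", []), ("Subcategory 2", []), ("Subcategory 3", [])]),
    ("Category 2", []),
]
rows, _ = assign_bounds(tree)
bounds = [(r["name"], r["left"], r["right"]) for r in rows]
```

For the first category and its three subcategories this gives the same pairs as in the export above (1/8, 2/3, 4/5, 6/7), and the next top-level category starts at 9. The parentid column is not handled here.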

Posted: Fri Aug 01, 2008 7:50 am
by medienluemmel
Oh, one thing I forgot: To get the authorid with

Code:

(SELECT authorid FROM sp_authors WHERE realname = 'Author-Name')
the name of the author(s) has to be present on your serendipity system.

Posted: Mon Aug 04, 2008 3:15 pm
by judebert
Sounds like an interesting script! Perl, I assume?

Posted: Tue Aug 05, 2008 9:21 am
by medienluemmel
judebert wrote: Sounds like an interesting script! Perl, I assume?
It's not a script; it's a C++ application with a (user-defined) content and category database. With this application you can create the export SQL statements for Serendipity, and also for an older version of WordPress.