Faster Inserts with PostgreSQL

August 31, 2005
databases

As I mentioned yesterday, I'm working on optimizing lots of inserts into a database, and I need solutions for different DB servers. Today I have been working with PostgreSQL. Although the PostgreSQL COPY command is the recommended way to do bulk inserts, I can't seem to get it to work with ColdFusion and cfquery - I've tried lots of different ways of doing it.
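
For reference, here is roughly the COPY syntax I was attempting (the table, columns, and file path below are just placeholders):

COPY tableName (intColumn, charColumn)  -- placeholder table and columns
FROM '/path/to/data.csv' WITH CSV;      -- path is resolved on the database server

One catch is that COPY ... FROM a file reads the file from the database server's filesystem, not the client's, which may be part of why it's awkward to drive from cfquery.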

Fast Insert Solution on PostgreSQL:

So far the fastest solution I have come up with is using PREPARE to create a temporary prepared statement (yes, I'm aware of cfqueryparam, but this method still gave the highest insert speed in my tests). So here's how you do it:

PREPARE preparedInsert (int, varchar) AS
  INSERT INTO tableName (intColumn, charColumn)
  VALUES ($1, $2);
EXECUTE preparedInsert (1,'a');
EXECUTE preparedInsert (2,'b');
EXECUTE preparedInsert (3,'c');
DEALLOCATE preparedInsert;

You're basically creating a function that allows you to pass variables to your INSERT statement. Inside the first set of parentheses you list the types of your variables; the variables are then referred to as $1, $2, etc. inside the statement.

Next you can EXECUTE the statement as many times as you need to (this can all be done in a single SQL batch, inside one cfquery tag).

Finally, when I'm done, I DEALLOCATE the prepared statement; otherwise, if you try to PREPARE a statement named preparedInsert again during the same connection you will get an error.

Performance Results

I found that this method was about 20% faster than multiple INSERT statements sent in the same SQL batch (same cfquery tag). It was about 2-5 times faster than individual INSERT statements - that is, where each insert ran as its own SQL request, or its own cfquery tag. Finally, it was about 25% faster than using bind parameters (cfqueryparam).
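
For comparison, the multiple-INSERT baseline above was just a batch of plain INSERT statements sent together in one request, something like:

INSERT INTO tableName (intColumn, charColumn) VALUES (1, 'a');
INSERT INTO tableName (intColumn, charColumn) VALUES (2, 'b');
INSERT INTO tableName (intColumn, charColumn) VALUES (3, 'c');

The prepared statement comes out ahead because the server only has to parse and plan the INSERT once, rather than once per row.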





Comments

I've nothing to back this up, but doesn't PostgreSQL support the multi-record INSERT syntax that MySQL supports? I.e. INSERT INTO table (col1, col2) values (1, 'a'), (2, 'b'), (3, 'c'); This has a HUGE effect, because it's an atomic action at the storage level, and therefore only has to update the indexes once, rather than once per record.
Hi Barney, I tested this, and no, it does not work on PostgreSQL. I really wish it did.
Good stuff Pete. I finally got around to trying PostgreSQL out recently, and I'll probably be using it on an insert-intensive application soon. It's really come a long way on the Windows platform as of version 8 - it's quite impressive.
Yeah, PostgreSQL is a very impressive RDBMS. My only gripe is that there aren't any solid mirroring or clustering solutions built in (I think there are some third-party solutions), and full-text indexing is also a third-party thing. Not that I really need either of those features right now, it's just nice to know they are there - MySQL has both of those features.
Hello! I wonder if it is possible in PostgreSQL to take a folder of flat files and load them all into a table using something like copy table_name from 'folder_name/*.dat'?
PostgreSQL doesn't have MySQL's convenient multi-value insert, but you can achieve something similar as follows: INSERT INTO table_name SELECT 1, 'a' UNION SELECT 2, 'b' UNION SELECT 3, 'c' UNION SELECT 4, 'd' UNION ... Combine this syntax with multiple INSERTs per transaction, and you can get quite a boost. In my case it beat the prepared statement approach by a significant margin. (N.B. I have only tested this in PostgreSQL 8.1, if that matters) I wonder if PostgreSQL provides a faster way to construct the row-set than SELECT..UNION ?
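Spelled out with the transaction batching mentioned above, it would look something like this (the table and values are just placeholders):

BEGIN;
INSERT INTO table_name        -- first batch of rows, built with UNION
  SELECT 1, 'a'
  UNION SELECT 2, 'b'
  UNION SELECT 3, 'c';
INSERT INTO table_name        -- further INSERTs in the same transaction
  SELECT 4, 'd'
  UNION SELECT 5, 'e';
COMMIT;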
That's a cool technique Dave, thanks for posting!
I hope this will help me. Thanks Dave, I was looking for something like this.
Hi guys, me again. I need to insert "multi-value" rows into a table, but in a form like this: INSERT INTO table_name (integer, integer, varchar, integer, varchar..) VALUES (SELECT (id from other table), SELECT (id from third table), 'TEXT VALUE', SELECT (id from fourth table), INTEGER VALUE.. etc. In MySQL it would look something like this, but how can I achieve it in PostgreSQL? Thank you for your assistance :) I'm using PostgreSQL 8.1.3
Hi all, I need to load .csv files into Postgres (Greenplum). Can anyone help me with how to load them? Thanks
A really efficient (once-off) way of getting CSVs into a database is simply to use OpenOffice Spreadsheet / Excel. Load your data into the tool, then write a quick formula on the first row, in the first empty column next to your data, using the "SELECT a,b,c UNION" form. Make the formula ="SELECT " & A1 & "," & B1 & "," & C1 & " UNION" If you have strings or dates, just wrap them in quotes as appropriate. Now copy that formula down to every row in the sheet (say 30,000 times) - highlight all the rows and press Ctrl-D. You're done; copy the auto-generated SQL out of the spreadsheet into an SQL file and execute it. This is not good for automated, repetitive tasks, but it works brilliantly for seeding databases.
I am using INSERT statements with around 30 parameters, half of them constant thanks to the PREPARE. Still, I only get a boost of 25%. I don't really understand why INSERT is so slow anyway - do I really have to write all the data to a CSV and then COPY it into the database? That doesn't make any sense to me.

Anyways, Postgres 8.4 seems to have a more efficient way of dealing with these things...
This method of insertion will still take a lot of time. Suppose a table has 40 columns and we want to insert 75 rows; with the method described above we would still have to issue 75 separate EXECUTE statements. Is there another method to insert multiple rows with a single query?
