Importing a large CSV into a Postgres database

When importing very large CSV files into a database, we might first try the import functionality of tools like DBeaver, but with a large dataset this can take hours. It can also incur extra data-transfer costs if your database sits in another VPC and you are pushing the data over the internet.

Postgres provides the COPY command, which is built for exactly this purpose and is lightning fast. psql's \copy meta-command wraps it on the client side, so you can use it even when you have no access to the database server itself. Here is an example where the command is run from a host that lives in the same VPC as the database and only has database connectivity.

ubuntu@mario-bastion:~/k8s$ psql -h aws-postgresxxx.rds.amazonaws.com -p 5432 -U user -d dbname -c "\copy targetdb.table (column)  from '/home/ubuntu/k8s/large.csv' with delimiter as ','"

Password for user user:

COPY 36840341

And that’s it, you are done. The COPY 36840341 output confirms the number of rows that were imported.
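If your CSV file has a header row or quoted values, \copy accepts the same options as the server-side COPY command. A minimal sketch, reusing the hypothetical host, table, and file names from the example above and assuming the file starts with a header line:

psql -h aws-postgresxxx.rds.amazonaws.com -p 5432 -U user -d dbname -c "\copy targetdb.table (column) from '/home/ubuntu/k8s/large.csv' with (format csv, header true, delimiter ',')"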

Sometimes there can be issues with the available free storage on RDS, and you might see errors like the one below. In that case, the storage allocated to the RDS instance needs to be checked and, if necessary, increased.

ERROR:  could not extend file "base/17979/18129": wrote only 4096 of 8192 bytes at block 125131
HINT:  Check free disk space.
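To get a rough idea of how much space the target database and table are already consuming, you can use Postgres' built-in size functions. A minimal sketch, reusing the hypothetical connection details and table name from the example above:

psql -h aws-postgresxxx.rds.amazonaws.com -p 5432 -U user -d dbname -c "SELECT pg_size_pretty(pg_database_size('dbname')) AS db_size, pg_size_pretty(pg_total_relation_size('targetdb.table')) AS table_size;"

This only tells you how much space Postgres is using; if the RDS instance itself has run out of storage, the allocated storage has to be increased from the AWS console or CLI.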

About: Muhammad Ali

I am a Java and Tibco expert. I have a Master's degree in Software Engineering of Distributed Systems from the Royal Institute of Technology (KTH), Stockholm, Sweden, and I am an experienced software engineer and integrator with over a decade of industry experience across the Europe, Gulf, and Asia regions. I am a strong open-source believer and encourage the "sharing is caring" principle.