Lists: | pgsql-novice |
---|
From: | balasubramaniam <balasubramaniam(dot)b(at)gmail(dot)com> |
---|---|
To: | pgsql-novice(at)postgresql(dot)org |
Subject: | Bulk load billions of records into Postgres cluster |
Date: | 2017-07-01 04:29:32 |
Message-ID: | CACFhHyuehAgUm6cQ4RbELZC5HSnc9Zsi9hpQjo+g2q+kVW1i-Q@mail.gmail.com |
Hi All,
We have a proven NoSQL production setup with a few billion rows. We are
planning to move towards a more structured data model with a few tables.
I am looking for a completely open-source and battle-tested database and
Postgres seems to be the right start.
Due to our increasing scale demands, I am planning to start with a PostgreSQL
cluster. The ability to ingest data at scale, around a few TB, as quickly as
possible is critical for our use case. I have read through the official
documentation </docs/current/static/populate.html> and also about the
COPY FROM command, but neither talks specifically about a cluster setup.
1) What is the standard and fastest way to ingest billions of records into
Postgres at scale?
2) Is there a tool to generate the SQL script for the COPY FROM command, ready
to use? I want to avoid writing and maintaining another custom tool.
Thanks in advance,
bala
From: | Aleksey Tsalolikhin <atsaloli(dot)tech(at)gmail(dot)com> |
---|---|
To: | balasubramaniam <balasubramaniam(dot)b(at)gmail(dot)com> |
Cc: | pgsql-novice(at)postgresql(dot)org |
Subject: | Re: Bulk load billions of records into Postgres cluster |
Date: | 2017-07-15 16:18:48 |
Message-ID: | CA+jMWock9NZL=uvb=TQcXxiG9WyeDCYC=f2ubex4SreK+z4cgw@mail.gmail.com |
Hi Bala. What are you going to export from? If that NoSQL database can
dump its data in CSV format, Postgres can read it in using COPY FROM or
\copy (e.g.,
https://stackoverflow.com/questions/2987433/how-to-import-csv-file-data-into-a-postgresql-table).
In other words, it doesn't have to be an SQL script for this data transfer.
Aleksey
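On the second question (generating the load command rather than writing it by hand), here is a minimal sketch of how a psql \copy line could be built from the header row of a CSV dump. It is only an illustration, not an existing tool: the table name `events`, the path `/tmp/events.csv`, the sample columns, and the helper names `quote_ident` and `build_copy_command` are all hypothetical, and it assumes the CSV header names match the target table's columns exactly.

```python
import csv
import io


def quote_ident(name):
    """Double-quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_copy_command(table, csv_path, header_stream):
    """Return a psql \\copy line whose column list is taken from the CSV header row.

    table, csv_path and the stream contents are supplied by the caller;
    nothing here talks to a database.
    """
    if "'" in csv_path:
        # Keep the sketch simple: refuse paths that would need extra quoting.
        raise ValueError("csv_path must not contain a single quote")
    columns = next(csv.reader(header_stream))
    column_list = ", ".join(quote_ident(c.strip()) for c in columns)
    return (
        f"\\copy {quote_ident(table)} ({column_list}) "
        f"FROM '{csv_path}' WITH (FORMAT csv, HEADER true)"
    )


# Hypothetical sample data standing in for the first lines of a real dump.
sample = io.StringIO("id,user_name,created_at\n1,bala,2017-07-01\n")
print(build_copy_command("events", "/tmp/events.csv", sample))
```

The printed line can be pasted into psql or saved to a file and run with `psql -f`. Note that quoting the identifiers makes them case-sensitive, so the header names have to match the column names as they were created.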