Re: START/END line number for COPY FROM

Lists: pgsql-hackers
From: Surafel Temesgen <surafel3000(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: START/END line number for COPY FROM
Date: 2018-12-20 13:02:11
Message-ID: CALAY4q8nGSXp0P5uf56vn-mD7reWqZP5k6PS1CGUm26X4FsYJA@mail.gmail.com

Hi,

Currently we can skip the header line on COPY FROM, but the ability to
start and stop copying at arbitrary lines could be used to divide a long
copy operation, to copy a subset of the file, and to skip a footer.
Attached is a patch for it.

Regards

Surafel

Attachment Content-Type Size
copy_from_start_stop_line_v1.patch text/x-patch 10.5 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Surafel Temesgen <surafel3000(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: START/END line number for COPY FROM
Date: 2018-12-21 16:21:14
Message-ID: 27439.1545409274@sss.pgh.pa.us

Surafel Temesgen <surafel3000(at)gmail(dot)com> writes:
> Currently we can skip the header line on COPY FROM, but the ability to
> start and stop copying at arbitrary lines could be used to divide a long
> copy operation, to copy a subset of the file, and to skip a footer.
> Attached is a patch for it.

I do not think this is a good idea. We have resisted attempts to add
ETL-like features to COPY on the grounds that it would add complexity
and cost performance, and that that's not what COPY is for. This
seems to fall squarely in the domain of something you should be doing
with another tool.

regards, tom lane
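
For illustration, a minimal sketch of the "another tool" approach: pick the
wanted line range outside the server and hand only those lines to
COPY ... FROM STDIN. The function name select_lines and its 1-based,
inclusive numbering are assumptions made for this example; they are not
taken from the patch or from PostgreSQL.

```python
import io
import sys
from itertools import islice
from typing import Iterable, Iterator, Optional


def select_lines(lines: Iterable[str], start: int = 1,
                 end: Optional[int] = None) -> Iterator[str]:
    """Yield lines start..end (1-based, inclusive); end=None means 'to EOF'.

    Hypothetical helper: the numbering convention is an assumption.
    """
    if start < 1:
        raise ValueError("start is 1-based and must be >= 1")
    if end is not None and end < start:
        return iter(())
    # islice uses 0-based indices with an exclusive stop, so lines
    # start..end map to indices (start - 1)..(end - 1), i.e. stop == end.
    return islice(lines, start - 1, end)


if __name__ == "__main__":
    # A header on line 1, three data rows, and a footer on line 5.
    sample = io.StringIO("id,name\n1,a\n2,b\n3,c\ntotal=3\n")
    # Write only the data rows; this output could be piped to psql.
    sys.stdout.writelines(select_lines(sample, start=2, end=4))
```

The selected lines could then be piped into psql running COPY ... FROM STDIN.
Note that this counts physical lines, so a CSV file with newlines inside
quoted fields would need a CSV-aware reader instead.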


From: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To: Surafel Temesgen <surafel3000(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: START/END line number for COPY FROM
Date: 2019-01-04 14:37:30
Message-ID: 6db582e8-2349-d322-aa36-fe654becbc31@2ndquadrant.com

On 20/12/2018 14:02, Surafel Temesgen wrote:
> Currently we can skip the header line on COPY FROM, but the ability to
> start and stop copying at arbitrary lines could be used to divide a long
> copy operation, to copy a subset of the file, and to skip a footer.

It seems a bit fragile to me if I want to skip a footer and need to
figure out the total line count, subtract one, and then oh, was it zero-
or one-based.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
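
The fragility described above comes from expressing a footer as an absolute
line number. As a sketch only, a preprocessing step can instead drop the
last N lines relative to the end of the stream, without knowing the total
line count. The name drop_footer is hypothetical and not part of the patch.

```python
from collections import deque
from typing import Deque, Iterable, Iterator


def drop_footer(lines: Iterable[str], footer_lines: int = 1) -> Iterator[str]:
    """Yield every line except the last `footer_lines`, in a single pass.

    Hypothetical helper: no total line count and no zero- versus
    one-based arithmetic is needed.
    """
    if footer_lines < 0:
        raise ValueError("footer_lines must be >= 0")
    held: Deque[str] = deque()
    for line in lines:
        held.append(line)
        # Once more than footer_lines are held back, the oldest held
        # line cannot be part of the footer, so it is safe to emit.
        if len(held) > footer_lines:
            yield held.popleft()
    # Whatever is still held at end of input is the footer; discard it.
```

For example, list(drop_footer(["h", "a", "b", "f"])) keeps "h", "a" and "b"
and drops the trailing "f".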


From: Surafel Temesgen <surafel3000(at)gmail(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: START/END line number for COPY FROM
Date: 2019-01-05 08:27:32
Message-ID: CALAY4q8dM8ss4G9J7mkybC_DnusmQ0SX3LGNX=hHmgdh+iW6MA@mail.gmail.com

Hi,
On Fri, Jan 4, 2019 at 5:37 PM Peter Eisentraut <
peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:

> It seems a bit fragile to me if I want to skip a footer and need to
> figure out the total line count, subtract one, and then oh, was it zero-
> or one-based.
>
>
But normally we don't say "start copying from line number 0".
regards
Surafel


From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Surafel Temesgen <surafel3000(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: START/END line number for COPY FROM
Date: 2019-01-05 10:09:44
Message-ID: CAKJS1f-vNDrtdWYG-tSjK=UzYMCJH5_Si9XtNrudy5VHqh+96Q@mail.gmail.com

On Fri, 21 Dec 2018 at 02:02, Surafel Temesgen <surafel3000(at)gmail(dot)com> wrote:
> Currently we can skip the header line on COPY FROM, but the ability to start and stop copying at arbitrary lines could be used to divide a long copy operation, to copy a subset of the file, and to skip a footer. Attached is a patch for it.

I'm struggling a bit to see the sense in this. If you really want to
improve the performance of a long copy, then I think it makes more
sense to have performed the backup in multiple pieces in the first
place. Having the database read the input stream and ignore the first
N lines sounds like a bit of a waste of effort, and effort that
wouldn't be required if the COPY TO had been done in multiple pieces.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Surafel Temesgen <surafel3000(at)gmail(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: START/END line number for COPY FROM
Date: 2019-01-06 11:59:05
Message-ID: CALAY4q--jqJxGV2Z-rBDRp9Ra_C7rsKHnPSSFRnO1z8eBgOw1A@mail.gmail.com

Hi,
On Sat, Jan 5, 2019 at 1:10 PM David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
wrote:

> On Fri, 21 Dec 2018 at 02:02, Surafel Temesgen <surafel3000(at)gmail(dot)com>
> wrote:
> > Currently we can skip the header line on COPY FROM, but the ability to
> > start and stop copying at arbitrary lines could be used to divide a long
> > copy operation, to copy a subset of the file, and to skip a footer.
> > Attached is a patch for it.
>
> I'm struggling a bit to see the sense in this. If you really want to
> improve the performance of a long copy, then I think it makes more
> sense to have performed the backup in multiple pieces in the first
> place.
>

We are not always in control of the data being imported; it may come
from an external system.

regards
Surafel


From: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To: Surafel Temesgen <surafel3000(at)gmail(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: START/END line number for COPY FROM
Date: 2019-01-07 12:59:58
Message-ID: 0d61a3fe-abaf-edd4-60b5-ca54b1678105@2ndquadrant.com

On 06/01/2019 12:59, Surafel Temesgen wrote:
> We are not always in control of the data being imported; it may come
> from an external system.

But the problem that David described remains: If your data loading
requirement is so complicated that you need to load the file in chunks,
then doing it by line numbers will require you to skip over the leading
lines at every subsequent chunk. That's not going to be good for larger
files.

I think your problem needs a different solution.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
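
A rough sketch of the cost argument above. It assumes each chunk is loaded
by a separate pass that reads the file from the top, skips everything
before its start line, and stops at its end line; how the patch actually
behaves is not shown in this thread. Both function names are hypothetical.
The second function shows a single-pass split for comparison.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def lines_scanned_by_ranges(total_lines: int, chunk_size: int) -> int:
    """Total lines read when every chunk is a separate pass from line 1.

    Assumption: a pass stops reading once it reaches its end line.
    """
    if total_lines < 0 or chunk_size < 1:
        raise ValueError("total_lines must be >= 0 and chunk_size >= 1")
    scanned = 0
    for start in range(0, total_lines, chunk_size):
        end = min(start + chunk_size, total_lines)
        # This pass skips `start` leading lines, then loads `end - start`.
        scanned += end
    return scanned


def chunked(lines: Iterable[str], chunk_size: int) -> Iterator[List[str]]:
    """Split a stream into consecutive chunks, reading each line once."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be >= 1")
    it = iter(lines)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


if __name__ == "__main__":
    # Ten passes over a 1,000,000-line file in 100,000-line ranges.
    print(lines_scanned_by_ranges(1_000_000, 100_000))  # prints 5500000
```

Under these assumptions the range-based approach reads 5.5 million lines to
load 1 million, and the overhead grows with the number of chunks, whereas a
single-pass split reads each line once.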