Re: Problems Formulating a SELECT

Lists: pgsql-sql
From: Charles Hauser <chauser(at)duke(dot)edu>
To: PSQL_list <pgsql-sql(at)postgresql(dot)org>
Subject: Problems Formulating a SELECT
Date: 2002-10-07 19:24:10
Message-ID: 1034018651.2282.38.camel@pandorina.biology.duke.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

I am trying to formulate a SELECT and could use some suggestions.

From the TABLES below I would like to find ALL contigs which contain the
same clones except that one (or more) has read='y' and the other(s) have
read='x'.
Or stated another way: find all contigs composed of (at least) both (x
and y) reads from the same clone.

For example:

In the data below, the contig '20020630.488.1'(contig_id:13805) is
composed of 2 clones (clone_id='12018' and '12019') which are
894027G09.x and 894027G09.y, respectively.

Example:

TABLE clone 'A' 'B'

clone_id 12018 12019
project 894 894
plate 27 27
row G G
col 9 9
read x y

Table clone_contig:

clone_id contig_id
12018 13805
12019 13805

TABLE contig:

contig_id 13805
assembly 20020630
ace 488
ver 1

CREATE TABLE clone (
clone_id SERIAL PRIMARY KEY,
project INTEGER REFERENCES library(project) NOT NULL,
plate INTEGER NOT NULL,
row CHAR(1) NOT NULL,
col INTEGER NOT NULL,
read CHAR(1) NOT NULL,
ver INTEGER NOT NULL,
seq TEXT NOT NULL,
L INTEGER NOT NULL,
Qvals TEXT NOT NULL,
TL INTEGER NOT NULL,
MQAT INTEGER NOT NULL,
Qstart INTEGER NOT NULL,
Qend INTEGER NOT NULL,
gb_id INTEGER REFERENCES gb(gb_id) NULL,
unigene BOOLEAN NULL,
UNIQUE (project,plate,row,col,read,ver)
);

CREATE TABLE contig (
contig_id SERIAL PRIMARY KEY,
assembly DATE NOT NULL,
ace INTEGER NOT NULL,
ver INTEGER NOT NULL,
length INTEGER NOT NULL,
seq TEXT NOT NULL,
UNIQUE (assembly,ace,ver)
);

CREATE TABLE clone_contig(
clone_id INTEGER REFERENCES clone(clone_id) ON DELETE CASCADE,
contig_id INTEGER REFERENCES contig(contig_id) ON DELETE CASCADE,
UNIQUE(clone_id,contig_id)
);

regards,

Charles


From: Richard Huxton <dev(at)archonet(dot)com>
To: Charles Hauser <chauser(at)duke(dot)edu>, PSQL_list <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Problems Formulating a SELECT
Date: 2002-10-08 08:38:37
Message-ID: 200210080938.37634.dev@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

On Monday 07 Oct 2002 8:24 pm, Charles Hauser wrote:
> I am trying to formulate a SELECT and could use some suggestions.
>
> From the TABLES below I would like to find ALL contigs which contain the
> same clones except that one (or more) has read='y' and the other(s) have
> read='x'.
> Or stated another way: find all contigs composed of (at least) both (x
> and y) reads from the same clone.
>
> For example:
>
> In the data below, the contig '20020630.488.1'(contig_id:13805) is
> composed of 2 clones (clone_id='12018' and '12019') which are
> 894027G09.x and 894027G09.y, respectively.
>
> TABLE clone 'A' 'B'
>
> clone_id 12018 12019
> project 894 894
> plate 27 27
> row G G
> col 9 9
> read x y
>
> Table clone_contig:
>
> clone_id contig_id
> 12018 13805
> 12019 13805

How about something like:

CREATE VIEW contig_x_vw AS SELECT contig_id FROM clone_config, clone WHERE
clone_config.clone_id = clone.clone_id AND read='x';
CREATE VIEW contig_y_vw AS [same but for y]

SELECT config_id FROM contig_x_vw, contig_y_vw WHERE contig_x_vw.clone_id =
contig_y_vw.clone.id;

You don't need the views, but they make the example easier.

- Richard Huxton


From: Charles Hauser <chauser(at)duke(dot)edu>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: PSQL_list <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Problems Formulating a SELECT
Date: 2002-10-08 15:04:19
Message-ID: 1034089459.4283.21.camel@pandorina.biology.duke.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

Richard,

Thanks, a followup.

I believe this will not work (novice, so take w/ grain of salt).

I tried the following:

chlamy_est=> CREATE VIEW contig_x_vw AS SELECT contig_id FROM clone_contig, clone WHERE
chlamy_est-> clone_contig.clone_id = clone.clone_id AND read='x';
CREATE
chlamy_est=> CREATE VIEW contig_y_vw AS SELECT contig_id FROM clone_contig, clone WHERE
chlamy_est-> clone_contig.clone_id = clone.clone_id AND read='y';
CREATE
chlamy_est=> SELECT contig_id FROM contig_x_vw, contig_y_vw WHERE contig_x_vw.clone_id =
chlamy_est-> contig_y_vw.clone.id;
ERROR: Column reference "contig_id" is ambiguous

If I understand the logic, you SELECT:
all contig_ids where where clone.read = 'x' (VIEW contig_x_vw)
all contig_ids where where clone.read = 'y' (VIEW contig_y_vw)
find isect of these two wehere their clone_ids are same

However, their clone_ids will never be the same as in the example.
cloneA.project=cloneB.project
cloneA.plate=cloneB.plate
cloneA.row=cloneB.row
cloneA.col=cloneB.col

TABLE clone 'A' 'B'

clone_id 12018 12019
project 894 894
plate 27 27
row G G
col 9 9
read x y

??

Charles

> >
> > In the data below, the contig '20020630.488.1'(contig_id:13805) is
> > composed of 2 clones (clone_id='12018' and '12019') which are
> > 894027G09.x and 894027G09.y, respectively.
> >
> > TABLE clone 'A' 'B'
> >
> > clone_id 12018 12019
> > project 894 894
> > plate 27 27
> > row G G
> > col 9 9
> > read x y
> >
> > Table clone_contig:
> >
> > clone_id contig_id
> > 12018 13805
> > 12019 13805
>
> How about something like:
>
> CREATE VIEW contig_x_vw AS SELECT contig_id FROM clone_config, clone WHERE
> clone_config.clone_id = clone.clone_id AND read='x';
> CREATE VIEW contig_y_vw AS [same but for y]
>
> SELECT config_id FROM contig_x_vw, contig_y_vw WHERE contig_x_vw.clone_id =
> contig_y_vw.clone.id;
>
> You don't need the views, but they make the example easier.
>
> - Richard Huxton
>
>


From: Richard Huxton <dev(at)archonet(dot)com>
To: Charles Hauser <chauser(at)duke(dot)edu>
Cc: PSQL_list <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Problems Formulating a SELECT
Date: 2002-10-08 16:44:15
Message-ID: 200210081744.15596.dev@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

On Tuesday 08 Oct 2002 4:04 pm, Charles Hauser wrote:
> Richard,
>
> Thanks, a followup.
>
> I believe this will not work (novice, so take w/ grain of salt).
>
> I tried the following:
>
> chlamy_est=> CREATE VIEW contig_x_vw AS SELECT contig_id FROM clone_contig,
> clone WHERE chlamy_est-> clone_contig.clone_id = clone.clone_id AND
> read='x';
> CREATE
> chlamy_est=> CREATE VIEW contig_y_vw AS SELECT contig_id FROM clone_contig,
> clone WHERE chlamy_est-> clone_contig.clone_id = clone.clone_id AND
> read='y';
> CREATE
> chlamy_est=> SELECT contig_id FROM contig_x_vw, contig_y_vw WHERE
> contig_x_vw.clone_id = chlamy_est-> contig_y_vw.clone.id;
> ERROR: Column reference "contig_id" is ambiguous

My bad, I didn't test it - for the last one you'll want

SELECT contig_x_vw.contig_id FROM contig_x_vw ...

It doesn't matter which id you use (since you want the value in each) but you
will need to tell PG which one you want.

> If I understand the logic, you SELECT:
> all contig_ids where where clone.read = 'x' (VIEW contig_x_vw)
> all contig_ids where where clone.read = 'y' (VIEW contig_y_vw)
> find isect of these two wehere their clone_ids are same
>
>
> However, their clone_ids will never be the same as in the example.

It's checking the "contig_id"s are the same (from clone_contig) - if that
isn't what's wanted you can check the columns below match. Otherwise, you're
quite right, that's what it's doing.

> cloneA.project=cloneB.project
> cloneA.plate=cloneB.plate
> cloneA.row=cloneB.row
> cloneA.col=cloneB.col
>
>
> TABLE clone 'A' 'B'
>
> clone_id 12018 12019
> project 894 894
> plate 27 27
> row G G
> col 9 9
> read x y

Does that get you any further? If not, I'm back at my desk tomorrow 9am GMT

- Richard Huxton


From: Charles Hauser <chauser(at)duke(dot)edu>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Problems Formulating a SELECT
Date: 2002-10-08 18:19:15
Message-ID: 1034101155.4283.78.camel@pandorina.biology.duke.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

Richard,

Thanks again.

> My bad, I didn't test it - for the last one you'll want
>
> SELECT contig_x_vw.contig_id FROM contig_x_vw ...
>
> It doesn't matter which id you use (since you want the value in each) but you
> will need to tell PG which one you want.

No problem, after I sent the email I caught the source of the ' "contig_id" is ambiguous' ERROR.

>
> It's checking the "contig_id"s are the same (from clone_contig) - ....

A contig_id match is what I wanted, but also a requirement that
clone.project,plate,col & row also match for cloneA and cloneB. I added
what I think gives me the correct match params (see below) and a quick
survey of the results looks to be ok.

Is there a method to remove duplicate results? For instance the query
below in part yields :

chlamy_est-> ;
contig_id
-----------
27170
27173
27173
27179
27179
27179
27179
27179

The repetitive occurrence of the same contig_id (27179) occurs as a
result of multiple pairs of clones matching the criteria for a given
contig_id. So for contig_id 27179 there are 5 pairs which match:

cloneA cloneB contig_id
894018D05.x1 <-> 894018D05.y1 27179
894080H12.x1 <-> 894080H12.y1 27179
894094E04.x2 <-> 894094E04.y2 27179
894095H03.x1 <-> 894095H03.y2 27179
963037B05.x2 <-> 963037B05.y1 27179

CREATE VIEW contig_x_vw AS SELECT contig_id,project,plate,row,col FROM clone_contig, clone WHERE
clone_contig.clone_id = clone.clone_id AND read='x';

CREATE VIEW contig_y_vw AS SELECT contig_id,project,plate,row,col FROM clone_contig, clone WHERE
clone_contig.clone_id = clone.clone_id AND read='y';

SELECT contig_x_vw.contig_id FROM contig_x_vw, contig_y_vw
WHERE contig_x_vw.project = contig_y_vw.project AND
contig_x_vw.plate = contig_y_vw.plate AND
contig_x_vw.col = contig_y_vw.col AND
contig_x_vw.row = contig_y_vw.row AND
contig_x_vw.contig_id = contig_y_vw.contig_id
;


> > cloneA.project=cloneB.project
> > cloneA.plate=cloneB.plate
> > cloneA.row=cloneB.row
> > cloneA.col=cloneB.col
> >
> >
> > TABLE clone 'A' 'B'
> >
> > clone_id 12018 12019
> > project 894 894
> > plate 27 27
> > row G G
> > col 9 9
> > read x y
>
> Does that get you any further? If not, I'm back at my desk tomorrow 9am GMT
>
> - Richard Huxton
>
>


From: Richard Huxton <dev(at)archonet(dot)com>
To: Charles Hauser <chauser(at)duke(dot)edu>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Problems Formulating a SELECT
Date: 2002-10-09 08:47:15
Message-ID: 200210090947.15918.dev@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

On Tuesday 08 Oct 2002 7:19 pm, Charles Hauser wrote:
> Richard,
>
[snip]
> Is there a method to remove duplicate results? For instance the query
> below in part yields :

SELECT DISTINCT ... is what you're after. I'd do it in the views so the join
has less rows to compare against.

> chlamy_est-> ;
> contig_id
> -----------
> 27170
> 27173
> 27173
> 27179
> 27179
> 27179
> 27179
> 27179

--
Richard Huxton