Re: GSoC project: K-medoids clustering in Madlib

From: Akansha Singh <akansha(dot)singh(at)oracle(dot)com>
To: <maxence(dot)ahlouche(at)gmail(dot)com>
Cc: <atri(dot)jiit(at)gmail(dot)com>, <pgsql-students(at)postgresql(dot)org>, <devel(at)madlib(dot)net>, <Sujit(dot)Philip(at)emc(dot)com>, <Rahul(dot)Iyer(at)emc(dot)com>
Subject: Re: GSoC project: K-medoids clustering in Madlib
Date: 2013-04-19 10:13:34
Lists: pgsql-students

Hi,
MADlib guys, any updates?

On my part, I am trying to understand the modules on GitHub. I am trying to get hands-on experience with them.

http://madlib.net/

https://github.com/madlib/madlib/


From: Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com>
To: "Iyer, Rahul" <Rahul(dot)Iyer(at)emc(dot)com>
Cc: "atri(dot)jiit(at)gmail(dot)com" <atri(dot)jiit(at)gmail(dot)com>, "pgsql-students(at)postgresql(dot)org" <pgsql-students(at)postgresql(dot)org>, "devel(at)madlib(dot)net" <devel(at)madlib(dot)net>, "Philip, Sujit" <Sujit(dot)Philip(at)emc(dot)com>
Subject: Re: GSoC project: K-medoids clustering in Madlib
Date: 2013-04-20 10:38:47
Message-ID: CAJeaomXfCTcLz2=pNCMU3qRSrsG16WOr9eWrR5VvinEz7UkgVQ@mail.gmail.com
Lists: pgsql-students

Hi all!

I've had a bit of fun with the k-means clustering, and have made a small
script to visualize the result of the classification.
However, I couldn't work out how to assign a cluster to a point from the
output of the algorithm. Could someone give me a pointer, please?

My script is written in Python 3 and uses py-postgresql (
http://python.projects.pgfoundry.org/) as its PostgreSQL interface. It also
requires Pillow (a PIL fork), which you can find here:
https://pypi.python.org/pypi/Pillow/2.0.0.

Before your first use, you may want to change the settings (at the top of
the file) to connect to your PostgreSQL server.
The script will create a table in your database, populate it with random
groups of points, and then call the k-means algorithm on it. Finally, it
will generate a PNG image, displaying the points and the centroids.
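
As a rough illustration of the data-generation step described above, here is a minimal sketch that builds random groups of 2-D points. The function name `make_groups`, its parameters, and the choice of Gaussian scatter around each group center are assumptions made for illustration; they are not taken from the attached script.

```python
import random

def make_groups(n_groups, points_per_group, spread=20.0, size=500, seed=None):
    """Return (points, labels): random 2-D points scattered around random centers."""
    rng = random.Random(seed)
    points, labels = [], []
    for label in range(n_groups):
        # Pick a random center inside the size x size canvas.
        cx, cy = rng.uniform(0, size), rng.uniform(0, size)
        for _ in range(points_per_group):
            # Scatter each point around its center with Gaussian noise.
            points.append((rng.gauss(cx, spread), rng.gauss(cy, spread)))
            labels.append(label)
    return points, labels

points, labels = make_groups(n_groups=3, points_per_group=100, seed=42)
print(len(points), sorted(set(labels)))  # 300 [0, 1, 2]
```

Keeping the generating label next to each point is what later makes it possible to compare the original grouping with the one found by the clustering algorithm.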

For a first run, use something like this:
./k-means_test.py --regen -o clustered_data.png

You can call "./k-means_test.py -h" for a list of available options.

Attached are my script and an example of its output.

By the way, I'll have a lot of work next week, as I have several exams
coming and a big project to do (about empirical orthogonal functions), so
I'll probably be inactive for a few days! Then I'll be on holidays, so I
will be able to focus on MADlib and GSoC :)

Regards,
Maxence

2013/4/19 Iyer, Rahul <Rahul(dot)Iyer(at)emc(dot)com>

> Hi Akansha,
>
> I am confused about the question - MADlib is open-source and
> available from Github. If you're having trouble in fork/clone or have a
> specific question about a module, we would be glad to help you. Please be
> specific about your question.
>
> - Rahul
> ---------------------------------------------------------
> *Rahul Iyer
> *Senior Software Engineer | Predictive Analytics
> rahul(dot)iyer(at)emc(dot)com
>
> On Apr 19, 2013, at 3:13 AM, Akansha Singh wrote:
>
> Hi, MADLib guys, Any Updates..? On my Part I am trying to understand the
> modules placed in Github .I a trying to get hands on it.
> http://madlib.net/ https://github.com/madlib/madlib/
>
>
>

--
Maxence Ahlouche
06 06 66 97 00
93 avenue Paul DOUMER
24100 Bergerac

Attachment Content-Type Size
k-means_test.py application/octet-stream 5.8 KB

From: Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com>
To: "Iyer, Rahul" <Rahul(dot)Iyer(at)emc(dot)com>
Cc: "atri(dot)jiit(at)gmail(dot)com" <atri(dot)jiit(at)gmail(dot)com>, "pgsql-students(at)postgresql(dot)org" <pgsql-students(at)postgresql(dot)org>, "devel(at)madlib(dot)net" <devel(at)madlib(dot)net>, "Philip, Sujit" <Sujit(dot)Philip(at)emc(dot)com>
Subject: Re: GSoC project: K-medoids clustering in Madlib
Date: 2013-04-20 10:41:18
Message-ID: CAJeaomWgP_DeuXnpO0zTVMjX12Lq8_DHSOQ9vpV52cucEhu_sg@mail.gmail.com
Lists: pgsql-students

Oops, forgot to attach the output!

2013/4/20 Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com>

> Hi all!
>
> I've had a bit of fun with the k-means clustering, and have made a small
> script to visualize the result of the classification.
> However, I couldn't guess how to assign a cluster to a point from the
> output of the algorithm, could someone give me an indication, please?
>
> My script is written in python3, and uses py-postgresql (
> http://python.projects.pgfoundry.org/) as PostgreSQL interface. It also
> requires Pillow (a PIL fork) which you can find here :
> https://pypi.python.org/pypi/Pillow/2.0.0.
>
> Before your first use, you may want to change the settings (on top of the
> file) to connect to your PostgreSQL server.
> The script will create a table in your database, populate it with random
> groups of points, and then call the k-means algorithm on it. Finally, it
> will generate a PNG image, displaying the points and the centroids.
>
> For a first run, use something like this:
> ./k-means_test.py --regen -o clustered_data.png
>
> You can call "./k-means_test.py -h" for a list of available options.
>
> In attachment are my script and an example of its output.
>
> By the way, I'll have a lot of work next week, as I have several exams
> coming and a big project to do (about empirical orthogonal functions), so
> I'll probably be inactive for a few days! Then I'll be on holidays, so I
> will be able to focus on MADlib and GSoC :)
>
> Regards,
> Maxence
>
>
> 2013/4/19 Iyer, Rahul <Rahul(dot)Iyer(at)emc(dot)com>
>
> Hi Akansha,
>>
>> I am confused about the question - MADlib is open-source and
>> available from Github. If you're having trouble in fork/clone or have a
>> specific question about a module, we would be glad to help you. Please be
>> specific about your question.
>>
>> - Rahul
>> ---------------------------------------------------------
>> *Rahul Iyer
>> *Senior Software Engineer | Predictive Analytics
>> rahul(dot)iyer(at)emc(dot)com
>>
>> On Apr 19, 2013, at 3:13 AM, Akansha Singh wrote:
>>
>> Hi, MADLib guys, Any Updates..? On my Part I am trying to understand the
>> modules placed in Github .I a trying to get hands on it.
>> http://madlib.net/ https://github.com/madlib/madlib/
>>
>>
>>
>
>
> --
> Maxence Ahlouche
> 06 06 66 97 00
> 93 avenue Paul DOUMER
> 24100 Bergerac
>

--
Maxence Ahlouche
06 06 66 97 00
93 avenue Paul DOUMER
24100 Bergerac

Attachment Content-Type Size
image/png 4.9 KB

From: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To: Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com>
Cc: "Iyer, Rahul" <Rahul(dot)Iyer(at)emc(dot)com>, "pgsql-students(at)postgresql(dot)org" <pgsql-students(at)postgresql(dot)org>, "devel(at)madlib(dot)net" <devel(at)madlib(dot)net>, "Philip, Sujit" <Sujit(dot)Philip(at)emc(dot)com>
Subject: Re: GSoC project: K-medoids clustering in Madlib
Date: 2013-04-20 13:00:31
Message-ID: 2DAD673D-4C46-4039-9D83-AE0D5C16764C@gmail.com
Lists: pgsql-students

Sent from my iPad

On 20-Apr-2013, at 16:11, Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com> wrote:

> Oops, forgot to attach the output!
>
>
> 2013/4/20 Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com>
>> Hi all!
>>
>> I've had a bit of fun with the k-means clustering, and have made a small script to visualize the result of the classification.
>> However, I couldn't guess how to assign a cluster to a point from the output of the algorithm, could someone give me an indication, please?
>>
>> My script is written in python3, and uses py-postgresql (http://python.projects.pgfoundry.org/) as PostgreSQL interface. It also requires Pillow (a PIL fork) which you can find here : https://pypi.python.org/pypi/Pillow/2.0.0.
>>
>> Before your first use, you may want to change the settings (on top of the file) to connect to your PostgreSQL server.
>> The script will create a table in your database, populate it with random groups of points, and then call the k-means algorithm on it. Finally, it will generate a PNG image, displaying the points and the centroids.
>>
>> For a first run, use something like this:
>> ./k-means_test.py --regen -o clustered_data.png
>>
>> You can call "./k-means_test.py -h" for a list of available options.
>>
>> In attachment are my script and an example of its output.
>>
>> By the way, I'll have a lot of work next week, as I have several exams coming and a big project to do (about empirical orthogonal functions), so I'll probably be inactive for a few days! Then I'll be on holidays, so I will be able to focus on MADlib and GSoC :)
>>
>> Regards,
>> Maxence
>>
>>
>>

Very interesting! The results look encouraging, although this is in Python :)

Good work!

Regards,

Atri


From: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To: Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com>
Cc: "Iyer, Rahul" <Rahul(dot)Iyer(at)emc(dot)com>, "pgsql-students(at)postgresql(dot)org" <pgsql-students(at)postgresql(dot)org>, "devel(at)madlib(dot)net" <devel(at)madlib(dot)net>, "Philip, Sujit" <Sujit(dot)Philip(at)emc(dot)com>
Subject: Re: GSoC project: K-medoids clustering in Madlib
Date: 2013-04-20 14:31:28
Message-ID: CAOeZVie5EhMxE4XMm_5XBGg=X60mmga+FD1S1ao3N9=GYeZFtA@mail.gmail.com
Lists: pgsql-students

On Sat, Apr 20, 2013 at 7:31 PM, Maxence AHLOUCHE
<maxence(dot)ahlouche(at)gmail(dot)com> wrote:
>
>
>
> 2013/4/20 Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
>>
>> Very interesting! The results look encouraging,although this is on Python
>> :)
>>
>> Good work!
>>
>> Regards,
>>
>> Atri
>
>
> Thanks :)
> But do you know if there is a way to know the cluster that a point has been
> assigned to? Can the objective function have something to do with it? I
> haven't understood why it was returned yet!

I didn't get your question. Could you please elaborate a bit more?

Regards,

Atri


From: Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com>
To: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Cc: "Iyer, Rahul" <Rahul(dot)Iyer(at)emc(dot)com>, "pgsql-students(at)postgresql(dot)org" <pgsql-students(at)postgresql(dot)org>, "devel(at)madlib(dot)net" <devel(at)madlib(dot)net>, "Philip, Sujit" <Sujit(dot)Philip(at)emc(dot)com>
Subject: Re: GSoC project: K-medoids clustering in Madlib
Date: 2013-04-20 14:41:20
Message-ID: CAJeaomV1mSrqwg3u3QZCju+KaNoV2omiSjVtn2zQxk8Pt1V5HA@mail.gmail.com
Lists: pgsql-students

Sure!

The k-means algorithm tries to group the points, but how can we know to
which group a point has been assigned?
What I mean is that, in the output, I would like to color the points with
the same color as the centroid they "depend" on.

And another question, which I thought could be related to the first one:
why does the algorithm return the objective function? What is its use?

Thanks for spending time on my questions :)

2013/4/20 Atri Sharma <atri(dot)jiit(at)gmail(dot)com>

> On Sat, Apr 20, 2013 at 7:31 PM, Maxence AHLOUCHE
> <maxence(dot)ahlouche(at)gmail(dot)com> wrote:
> >
> >
> >
> > 2013/4/20 Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
> >>
> >> Very interesting! The results look encouraging,although this is on
> Python
> >> :)
> >>
> >> Good work!
> >>
> >> Regards,
> >>
> >> Atri
> >
> >
> > Thanks :)
> > But do you know if there is a way to know the cluster that a point has
> been
> > assigned to? Can the objective function have something to do with it? I
> > haven't understood why it was returned yet!
>
> I didnt get your question. Could you please elaborate a bit more?
>
> Regards,
>
> Atri
>

--
Maxence Ahlouche
06 06 66 97 00
93 avenue Paul DOUMER
24100 Bergerac


From: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To: Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com>
Cc: "Iyer, Rahul" <Rahul(dot)Iyer(at)emc(dot)com>, "pgsql-students(at)postgresql(dot)org" <pgsql-students(at)postgresql(dot)org>, "devel(at)madlib(dot)net" <devel(at)madlib(dot)net>, "Philip, Sujit" <Sujit(dot)Philip(at)emc(dot)com>
Subject: Re: GSoC project: K-medoids clustering in Madlib
Date: 2013-04-20 14:59:41
Message-ID: CAOeZVifHBMpaNcMyRhAtQE0=0kLGeF1ptekDkKLzjx1djxzCdQ@mail.gmail.com
Lists: pgsql-students

On Sat, Apr 20, 2013 at 8:11 PM, Maxence AHLOUCHE
<maxence(dot)ahlouche(at)gmail(dot)com> wrote:
> Sure!
>
> The k-means algorithms tries to group the points, but how can we know to
> which group a point has been assigned?
> What I mean is that, on the output, I would like to color the points with
> the same color as the centroid they "depend" on.
>
> And another question, which I thought could be related to the first one, is
> why does the algorithms returns the objective function? What's its use?
>
> Thanks ffor spending time for my questions :)

No problem

You can probably maintain a data structure for this purpose. A simple
Vector per centroid would suffice, I think. You will need to empty the
Vectors at each iteration of the algorithm, until the algorithm finishes.
Then the Vectors will contain the final memberships.

So, for each Vector, you designate the current centroid and put the
points assigned to that centroid's group in that Vector. Then, if
another iteration of your algorithm is needed, you empty the
Vectors and reassign the centroids.
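
The bookkeeping described above can be sketched roughly as follows: one list per centroid, emptied and refilled at every iteration, until the centroids stop moving. This is a plain-Python illustration with hypothetical names (`assign`, `kmeans`), assuming 2-D points and Euclidean distance; it is not MADlib's implementation. It also returns the sum of squared point-to-centroid distances, a common choice of k-means objective function, which is typically used to judge how tight the final clusters are.

```python
import math

def assign(points, centroids):
    """Return one membership list per centroid, filled with its nearest points."""
    members = [[] for _ in centroids]  # fresh, empty lists on every call
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: math.dist(p, centroids[i]))
        members[nearest].append(p)
    return members

def kmeans(points, centroids, max_iter=100):
    """Lloyd-style k-means on 2-D points; returns (centroids, members, objective)."""
    centroids = list(centroids)
    for _ in range(max_iter):
        members = assign(points, centroids)
        # New centroid = mean of its members (an empty cluster keeps its centroid).
        updated = [
            (sum(x for x, _ in m) / len(m), sum(y for _, y in m) / len(m)) if m else c
            for m, c in zip(members, centroids)
        ]
        if updated == centroids:
            break  # no centroid moved, so the memberships are final
        centroids = updated
    members = assign(points, centroids)
    # Objective: sum of squared distances from each point to its centroid.
    objective = sum(math.dist(p, c) ** 2
                    for m, c in zip(members, centroids) for p in m)
    return centroids, members, objective

final_centroids, final_members, objective = kmeans(
    [(0, 0), (0, 2), (10, 0), (10, 2)], [(0, 0), (10, 0)])
print(final_centroids, objective)  # [(0.0, 1.0), (10.0, 1.0)] 4.0
```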

Atri
--
Regards,

Atri
l'apprenant


From: Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com>
To: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Cc: "Iyer, Rahul" <Rahul(dot)Iyer(at)emc(dot)com>, "pgsql-students(at)postgresql(dot)org" <pgsql-students(at)postgresql(dot)org>, "devel(at)madlib(dot)net" <devel(at)madlib(dot)net>, "Philip, Sujit" <Sujit(dot)Philip(at)emc(dot)com>
Subject: Re: GSoC project: K-medoids clustering in Madlib
Date: 2013-04-21 17:46:46
Message-ID: CAJeaomW50Bt-RFZs73b6M4iDaqj8mFFrfjXzHsdhSJb4iJzOHQ@mail.gmail.com
Lists: pgsql-students

Hello!

I've improved my visualizer quite a bit:

- It is now able to color the points according to their assigned
cluster. It occurred to me last night that this is the main drawback
of the k-means algorithm: a point is assigned to the nearest centroid's
cluster, which is easy to compute.
- It is now able to load data from a file, if it has been previously
generated, so it is no longer mandatory to specify the desired number of
clusters when using "old" data.
- It displays the original clustering and the k-means++ clustering for
comparison purposes, and it is very easy to add new clusterings.
- It is available on GitHub!
https://github.com/viodlen/clustering_visualizer
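
The assignment rule in the first bullet above can be sketched as follows. The function name `nearest_centroid`, the color palette, and the sample coordinates are made up for illustration; the sketch assumes Euclidean distance on 2-D points and is not the visualizer's actual code.

```python
import math

def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

centroids = [(1.0, 1.0), (8.0, 8.0)]
points = [(0.5, 1.5), (9.0, 7.0), (2.0, 2.0)]
palette = ["red", "blue"]  # one color per centroid

# Color every point like the centroid it depends on.
colors = [palette[nearest_centroid(p, centroids)] for p in points]
print(colors)  # ['red', 'blue', 'red']
```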

Still, there is room for improvement:

- It only tests with 2-dimensional spaces. This won't change, as higher
dimensions would be difficult to visualize.
- It only uses the Euclidean distance. This can be fixed, but would be
heavy to implement, and probably ugly (a hashmap to match Python's
distance function with MADlib's).
- For now, the points are distributed within the clusters following a
Gaussian distribution. This can be easily changed, and probably will be
in a future version.
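
The mapping mentioned for the Euclidean-distance limitation above could look roughly like this. The MADlib function names used as keys (`dist_norm1`, `dist_norm2`, `squared_dist_norm2`) are assumed from MADlib's distance functions and should be checked against the installed version; the Python side (`DISTANCES`, `distance`) is a hypothetical helper, not part of the visualizer.

```python
import math

# Keys: names of MADlib distance functions (assumed; verify against your MADlib).
# Values: Python callables computing the same distance on coordinate tuples.
DISTANCES = {
    "dist_norm1": lambda a, b: sum(abs(x - y) for x, y in zip(a, b)),
    "dist_norm2": math.dist,
    "squared_dist_norm2": lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)),
}

def distance(name, a, b):
    """Look up the Python counterpart of a MADlib distance function and apply it."""
    if name not in DISTANCES:
        raise ValueError("no Python equivalent registered for %r" % name)
    return DISTANCES[name](a, b)

print(distance("dist_norm1", (0, 0), (3, 4)))          # 7
print(distance("dist_norm2", (0, 0), (3, 4)))          # 5.0
print(distance("squared_dist_norm2", (0, 0), (3, 4)))  # 25
```

With a table like this, the same name can be passed to the database call and used on the Python side when coloring points, so the two cannot silently disagree.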

Attached is an output example. It shows the poor results of the
k-means algorithm on contiguous clusters.

It also shows the benefit of the k-means++ algorithm: the small light-blue
cluster (in the original clustering) is correctly identified as a complete
cluster by the k-means++ algorithm, whereas it was only part of a bigger
cluster with the simple k-means algorithm.

The characteristics of the k-means algorithm make it easy to calculate the
clusters only from the points and the centroids, but this won't be true for
the k-medoids algorithm. So, sadly, if I implement the latter, it won't be
possible to keep the same function signature: some more data will have to
be returned, along with the centroids list.

I've also wondered if it would be useful to implement the clustering
algorithms for non-float vectors (for example, strings), provided the user
gives a distance function for this type?
--
Maxence Ahlouche
06 06 66 97 00
93 avenue Paul DOUMER
24100 Bergerac

Attachment Content-Type Size
image/png 18.3 KB

From: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To: Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com>
Cc: "Iyer, Rahul" <Rahul(dot)Iyer(at)emc(dot)com>, "pgsql-students(at)postgresql(dot)org" <pgsql-students(at)postgresql(dot)org>, "devel(at)madlib(dot)net" <devel(at)madlib(dot)net>, "Philip, Sujit" <Sujit(dot)Philip(at)emc(dot)com>
Subject: Re: GSoC project: K-medoids clustering in Madlib
Date: 2013-04-21 18:03:07
Message-ID: 46950856-D053-4ED5-8E4C-9F620E15C981@gmail.com
Lists: pgsql-students

Sent from my iPad

On 21-Apr-2013, at 23:16, Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com> wrote:

> Hello!
>
> I've pretty much improved my visualizer:
> It is now able to color the points according to their assigned cluster. It has occured to me last night that it was the main inconvenient of the k-means algorithm: a point is assigned to the nearest centroid's cluster, which is easy to compute.
> It's now able to load data from a file, if it has been previously generated, so it is no longer mandatory to specify the number of clusters wished when using "old" data.
> It displays the original clustering and the k-means++ clustering for comparison purposes, and it is very easy to add new clusterings.
> It is available on GitHub! https://github.com/viodlen/clustering_visualizer
> Still, there is room for improvements:
>
> It only tests with 2-dimensional spaces. This won't evolve, as it would get difficult to visualize.
> It only uses the euclidean distance. This can be fixed, but would be heavy to implement, and probably ugly (a hashmap to match the python's distance function with the MADlib's one).
> For now, the points are reparted in the clusters folowing a gaussian law (not sure of my vocabulary here). This can be easily fixed, and will probably be in a future version.
> In attachment is an output example. It shows the poor results of the k-means algorithm on contiguous clusters.
>
> It also shows the interest of the k-means++ algorithm: the small light-blue cluster (in the original clustering) is correctly identified as a complete cluster by the k-means++ algorithm, when it was only a part of a bigger cluster with the simple k-means algorithm.
>
>
> The characteristics of the k-means algorithm make it easy to calculate the clusters only from the points and the centroids, but this won't be true for the k-medoids algorithm. So, sadly, if I implement the latter, it won't be possible to keep the same function signature: some more data will have to be returned, along with the centroids list.
>
>
>
> I've also wondered if it would be useful to implement the clustering algorithms for non-float vectors (for example, strings), provided the user gives a distance function for this type?
>
>

Interesting! Good work!

Could you draw up a summary of your findings about the performance of the different algorithms, and say which one should be implemented (k-means++ or k-medoids), or whether both should be?

Regards,

Atri


From: Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com>
To: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, hellerstein(at)cs(dot)berkeley(dot)edu
Cc: "Iyer, Rahul" <Rahul(dot)Iyer(at)emc(dot)com>, "pgsql-students(at)postgresql(dot)org" <pgsql-students(at)postgresql(dot)org>, "devel(at)madlib(dot)net" <devel(at)madlib(dot)net>, "Philip, Sujit" <Sujit(dot)Philip(at)emc(dot)com>
Subject: Re: GSoC project: K-medoids clustering in Madlib
Date: 2013-04-21 21:21:14
Message-ID: CAJeaomWVrTvP5O3oYePCnmTnMQYkz2_JkC_8MgrrDf6ui+z1uA@mail.gmail.com
Lists: pgsql-students

2013/4/21 Atri Sharma <atri(dot)jiit(at)gmail(dot)com>

>
> Interesting! Good work!
>
> Could you draw up a summary, giving your findings about the performance of
> different algorithms,and which one should be implemented,or both(k means++
> vs k medoids).
>
> Regards,
>
> Atri
>

From the few articles I've already read, I've found that k-medoids
clustering usually runs faster on standard datasets (such as the ones I
generate). But I'll look for more detailed information during the week, and
report what I find here!
By the way, do you have any idea of other kinds of datasets that would be
useful to test?

2013/4/21 <hellerstein(at)cs(dot)berkeley(dot)edu>

> Very cool!
>
> May I suggest generating a visualization in a web toolkit? Perhaps the
> new vega library would be simplest (http://trifacta.github.io/vega/) or
> the more popular but lower-level D3.js?
>
> More generally, a project to connect MADlib outputs to vega vis
> specifications seems like it would be enormously useful!
>
> Joe
>

I'll take a look at it during my holidays, in a week! It would indeed be
nice if one just had to open a webpage to test my work!
Regarding your other idea, aren't MADlib outputs PostgreSQL/Greenplum
outputs? If so, only a database connector is required, which probably
already exists (I may be wrong: I had never heard of D3.js or Vega before,
and I don't know the MADlib project well yet).
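
As a rough sketch of that connector idea: once the rows have been fetched with any PostgreSQL driver, turning them into the inline data block of a Vega specification is mostly a matter of serializing them to JSON. The helper name `rows_to_vega_data`, the column names, and the sample rows below are hypothetical; no database call is shown, and the exact shape Vega expects should be checked against its documentation.

```python
import json

def rows_to_vega_data(rows, columns, name="points"):
    """Wrap query result rows as a named data set with inline values."""
    return {"name": name, "values": [dict(zip(columns, row)) for row in rows]}

# Rows as a driver might return them: (x, y, cluster_id).
rows = [(1.0, 2.0, 0), (8.5, 7.5, 1)]
data = rows_to_vega_data(rows, ["x", "y", "cluster"])
print(json.dumps(data, indent=2))
```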

--
Maxence Ahlouche
06 06 66 97 00
93 avenue Paul DOUMER
24100 Bergerac


From: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To: hellerstein(at)cs(dot)berkeley(dot)edu
Cc: Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com>, "Iyer, Rahul" <Rahul(dot)Iyer(at)emc(dot)com>, "devel(at)madlib(dot)net" <devel(at)madlib(dot)net>, "Philip, Sujit" <Sujit(dot)Philip(at)emc(dot)com>, "pgsql-students(at)postgresql(dot)org" <pgsql-students(at)postgresql(dot)org>
Subject: Re: [MADlib-devel] GSoC project: K-medoids clustering in Madlib
Date: 2013-04-25 05:16:04
Message-ID: CAOeZVidRd8yrfNfig1YjTj6DTJRFRcv+yeoSYkZp28+cspK4vg@mail.gmail.com
Lists: pgsql-students

On 4/21/13, hellerstein(at)cs(dot)berkeley(dot)edu <hellerstein(at)cs(dot)berkeley(dot)edu> wrote:
> Very cool!
>
>
> May I suggest generating a visualization in a web toolkit? Perhaps the new
> vega library would be simplest (http://trifacta.github.io/vega/) or the more
> popular but lower-level D3.js?
>
>
> More generally, a project to connect MADlib outputs to vega vis
> specifications seems like it would be enormously useful!

Yes, the idea seems awesome. Generating these kinds of results in a web
toolkit should serve multiple purposes.

Regards,

Atri


From: Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com>
To: "Iyer, Rahul" <Rahul(dot)Iyer(at)emc(dot)com>
Cc: "atri(dot)jiit(at)gmail(dot)com" <atri(dot)jiit(at)gmail(dot)com>, "pgsql-students(at)postgresql(dot)org" <pgsql-students(at)postgresql(dot)org>, "devel(at)madlib(dot)net" <devel(at)madlib(dot)net>, "Philip, Sujit" <Sujit(dot)Philip(at)emc(dot)com>
Subject: Re: GSoC project: K-medoids clustering in Madlib
Date: 2013-05-02 12:04:38
Message-ID: CAJeaomUm=My26fb5_x9dvfxPdVZd5RPpFG=NL87UWgLM8mEpUQ@mail.gmail.com
Lists: pgsql-students

Hello!

I've submitted my proposal on the GSoC website.
I've been inactive for the last few days because I was setting up a server
to run all my tests on. As this is the first time I've done this, I've run
into a bunch of (usually stupid) problems, but it's now almost entirely
configured! I hope I'll soon be able to provide a web visualization for the
k-means algorithm, but it will probably be a simple PNG at first. It will
allow you to test my work without having to download or install anything.

Regards,
Maxence


From: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To: Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com>
Cc: "Iyer, Rahul" <Rahul(dot)Iyer(at)emc(dot)com>, "pgsql-students(at)postgresql(dot)org" <pgsql-students(at)postgresql(dot)org>, "devel(at)madlib(dot)net" <devel(at)madlib(dot)net>, "Philip, Sujit" <Sujit(dot)Philip(at)emc(dot)com>
Subject: Re: GSoC project: K-medoids clustering in Madlib
Date: 2013-05-02 12:30:27
Message-ID: 890D6A77-C166-4526-82EF-1AEB110AEC28@gmail.com
Lists: pgsql-students

Sent from my iPad

On 02-May-2013, at 17:34, Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com> wrote:

> Hello!
>
> I've submitted my proposal on the GSoC website.
> I've been inactive for the last few days, because I was setting up a server in order to make all my tests. As it is the first time I do this, I've met a bunch of (usually stupid) problems, but it's now almost entirely configured! I hope I'll soon be able to provide a web visualization for the k-means algorithm, but it will probably be a simple png at first. It will allow you to test my work without having to download or install anything.
>
>

Great. All the best, and looking forward to the web visualisation.

Regards,

Atri


From: Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com>
To: "Philip, Sujit" <Sujit(dot)Philip(at)emc(dot)com>, "devel(at)madlib(dot)net" <devel(at)madlib(dot)net>, "Iyer, Rahul" <Rahul(dot)Iyer(at)emc(dot)com>
Cc: "pgsql-students(at)postgresql(dot)org" <pgsql-students(at)postgresql(dot)org>, hlinnakangas(at)vmware(dot)com, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Subject: Re: GSoC project: K-medoids clustering in Madlib
Date: 2013-05-08 10:57:08
Message-ID: CAJeaomXOTfn=nMwFTaKmBq6pDMBY5e6U+HunGay5qeALCC6ATA@mail.gmail.com
Lists: pgsql-students

Hi!

I'm pasting here the comment Heikki Linnakangas left on my GSoC proposal
(available here:
http://www.google-melange.com/gsoc/proposal/review/google/gsoc2013/viod/1):

I don't think we have anyone from the madlib project signed up as a mentor
currently. Is there someone in the madlib project that would be willing to
mentor this? We'll need to get a mentor assigned for this ASAP, before we
can even consider this.

So, would anyone from MADlib be interested in co-mentoring this project? I
think Atri Sharma was willing to mentor this project, on the PostgreSQL
side.

Thanks in advance!

Regards,
Maxence


From: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To: Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com>
Cc: "Philip, Sujit" <Sujit(dot)Philip(at)emc(dot)com>, "devel(at)madlib(dot)net" <devel(at)madlib(dot)net>, "Iyer, Rahul" <Rahul(dot)Iyer(at)emc(dot)com>, "pgsql-students(at)postgresql(dot)org" <pgsql-students(at)postgresql(dot)org>, "hlinnakangas(at)vmware(dot)com" <hlinnakangas(at)vmware(dot)com>
Subject: Re: GSoC project: K-medoids clustering in Madlib
Date: 2013-05-08 12:21:45
Message-ID: 785CF657-13B8-4DC7-913C-4CB62AD1ABC9@gmail.com
Lists: pgsql-students

Sent from my iPad

On 08-May-2013, at 16:27, Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com> wrote:

> Hi!
>
> I'm pasting here the comment Heikki Linnakangas left on my GSoC proposal (available here: http://www.google-melange.com/gsoc/proposal/review/google/gsoc2013/viod/1)
>
> I don't think we have anyone from the madlib project signed up as a mentor currently. Is there someone in the madlib project that would be willing to mentor this? We'll need to get a mentor assigned for this ASAP, before we can even consider this.
>
> I don't think we have anyone from the madlib project signed up as a mentor currently. Is there someone in the madlib project that would be willing to mentor this? We'll need to get a mentor assigned for this ASAP, before we can even consider this.
> So, would anyone from MADlib be interested in co-mentoring this project? I think Atri Sharma was willing to mentor this project, on the PostgreSQL side.
>
> Thanks in advance!
>
> Regards,
> Maxence

I am still available as a co-mentor. Rahul, would you be willing to be the mentor, please?

Regards,

Atri


From: Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com>
To: "Philip, Sujit" <Sujit(dot)Philip(at)emc(dot)com>, "devel(at)madlib(dot)net" <devel(at)madlib(dot)net>, "Iyer, Rahul" <Rahul(dot)Iyer(at)emc(dot)com>, akansha(dot)singh(at)oracle(dot)com
Cc: "pgsql-students(at)postgresql(dot)org" <pgsql-students(at)postgresql(dot)org>, hlinnakangas(at)vmware(dot)com, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Subject: Re: GSoC project: K-medoids clustering in Madlib
Date: 2013-05-08 12:24:52
Message-ID: CAJeaomWot53HaWEkGg3AS=+3XB+xLfsXG=ede1ZgCuJ7C=tk1Q@mail.gmail.com
Lists: pgsql-students

Akansha Singh said:

> HI, I would like to extend my help to mentor if possible.
>

Sure, you should contact the GSoC organizers at PostgreSQL! I'm not in
charge of deciding who will mentor this project.

Regards,
Maxence


From: Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com>
To: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Cc: "Philip, Sujit" <Sujit(dot)Philip(at)emc(dot)com>, "devel(at)madlib(dot)net" <devel(at)madlib(dot)net>, "Iyer, Rahul" <Rahul(dot)Iyer(at)emc(dot)com>, "pgsql-students(at)postgresql(dot)org" <pgsql-students(at)postgresql(dot)org>, "hlinnakangas(at)vmware(dot)com" <hlinnakangas(at)vmware(dot)com>
Subject: Re: GSoC project: K-medoids clustering in Madlib
Date: 2013-05-13 17:20:25
Message-ID: CAJeaomX8dSKg3m419ka4kvY4Mx4f4-rGw2w_4FCqYvV+ij=v9A@mail.gmail.com
Lists: pgsql-students

2013/5/8 Atri Sharma <atri(dot)jiit(at)gmail(dot)com>

>
> On 08-May-2013, at 16:27, Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com>
> wrote:
>
> Hi!
>
> I'm pasting here the comment Heikki Linnakangas left on my GSoC proposal
> (available here:
> http://www.google-melange.com/gsoc/proposal/review/google/gsoc2013/viod/1
> ):
>
> I don't think we have anyone from the madlib project signed up as a mentor
> currently. Is there someone in the madlib project that would be willing to
> mentor this? We'll need to get a mentor assigned for this ASAP, before we
> can even consider this.
>
> So, would anyone from MADlib be interested in co-mentoring this project? I
> think Atri Sharma was willing to mentor this project, on the PostgreSQL
> side.
>
>
> I am still available as a co mentor.Rahul, would you be willing to be the
> mentor,please?
>

Ping, MADlib?

--
Maxence Ahlouche
06 06 66 97 00
93 avenue Paul DOUMER
24100 Bergerac


From: Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com>
To: akansha(dot)singh(at)oracle(dot)com
Cc: "Philip, Sujit" <Sujit(dot)Philip(at)emc(dot)com>, "devel(at)madlib(dot)net" <devel(at)madlib(dot)net>, "Iyer, Rahul" <Rahul(dot)Iyer(at)emc(dot)com>, "pgsql-students(at)postgresql(dot)org" <pgsql-students(at)postgresql(dot)org>, "hlinnakangas(at)vmware(dot)com" <hlinnakangas(at)vmware(dot)com>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Subject: Re: GSoC project: K-medoids clustering in Madlib
Date: 2013-05-14 06:04:29
Message-ID: CAJeaomWwme4PsjQa9CUXW8Qg=_3syPX0DGNp6+_cng4YSTvZSA@mail.gmail.com
Lists: pgsql-students

Hello!

Akansha Singh said:

> Hi, I will be obliged if given a chance to mentor.
>

On the MADlib side, then? If so, just leave a comment here:
http://www.google-melange.com/gsoc/proposal/review/google/gsoc2013/viod/1 ,
so that Heikki is notified!

Regards,
Maxence

--
Maxence Ahlouche
06 06 66 97 00
93 avenue Paul DOUMER
24100 Bergerac


From: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To: Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com>
Cc: Akansha Singh <akansha(dot)singh(at)oracle(dot)com>, "Philip, Sujit" <Sujit(dot)Philip(at)emc(dot)com>, "devel(at)madlib(dot)net" <devel(at)madlib(dot)net>, "Iyer, Rahul" <Rahul(dot)Iyer(at)emc(dot)com>, "pgsql-students(at)postgresql(dot)org" <pgsql-students(at)postgresql(dot)org>, "hlinnakangas(at)vmware(dot)com" <hlinnakangas(at)vmware(dot)com>
Subject: Re: GSoC project: K-medoids clustering in Madlib
Date: 2013-05-14 06:42:56
Message-ID: CAOeZVidT7NynXwGgwp6v9y4HgN6uTBkFgJNMa7U2WJ2cVrhd9w@mail.gmail.com
Lists: pgsql-students

On Tue, May 14, 2013 at 11:34 AM, Maxence AHLOUCHE
<maxence(dot)ahlouche(at)gmail(dot)com> wrote:
> Hello!
>
> Akansha Singh said:
>>
>> Hi, I will be obliged if given a chance to mentor.
>
>
> On the MADlib side so? Just put a comment here then:
> http://www.google-melange.com/gsoc/proposal/review/google/gsoc2013/viod/1 ,
> so that Heikki can be notified!
>
> Regards,
> Maxence
>
> --
> Maxence Ahlouche
> 06 06 66 97 00
> 93 avenue Paul DOUMER
> 24100 Bergerac

You cannot be a mentor and a student at the same time. I think a
senior member of the community clearly expressed this to Ms. Akansha
earlier.

Maxence, we need a member of the MADlib community to be the mentor. I
would suggest you contact them about this.

Regards,

Atri

--
Regards,

Atri
l'apprenant


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Cc: Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com>, Akansha Singh <akansha(dot)singh(at)oracle(dot)com>, "Philip, Sujit" <Sujit(dot)Philip(at)emc(dot)com>, "devel(at)madlib(dot)net" <devel(at)madlib(dot)net>, "Iyer, Rahul" <Rahul(dot)Iyer(at)emc(dot)com>, "pgsql-students(at)postgresql(dot)org" <pgsql-students(at)postgresql(dot)org>
Subject: Re: GSoC project: K-medoids clustering in Madlib
Date: 2013-05-14 07:06:07
Message-ID: 5191E25F.8050203@vmware.com
Lists: pgsql-students

On 14.05.2013 09:42, Atri Sharma wrote:
> Maxence, we need a member of the MADLib community to be the mentor. I
> would suggest you to contact them for the same.

Well, Sujit Philip and Rahul Iyer are CC'd on this thread. Looking at
the mailing list archives and commit history, I believe they are the two
most active people working on MADlib. I would expect Sujit or Rahul to
mentor, or to point to someone else who knows the code and the
community well enough to mentor.

Sujit, Rahul, is either of you interested in mentoring any of the
proposed GSoC projects under the PostgreSQL umbrella? If you are, we
need to get you signed up in the next couple of days, so that you can
take part in reviewing and ranking the proposals.

- Heikki


From: Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Rahul Iyer <riyer(at)gopivotal(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, "Philip, Sujit" <Sujit(dot)Philip(at)emc(dot)com>, Caleb Welton <cwelton(at)gopivotal(dot)com>, Sujit Philip <sphilip(at)gopivotal(dot)com>, "pgsql-students(at)postgresql(dot)org" <pgsql-students(at)postgresql(dot)org>
Subject: Re: GSoC project: K-medoids clustering in Madlib
Date: 2013-05-25 09:21:40
Message-ID: CAJeaomXbmAdwOBUU-OAcQpG_RbwhwJj0fk+V=aa+0KKn3JtNLA@mail.gmail.com
Lists: pgsql-students

2013/5/24 Josh Berkus <josh(at)agliodbs(dot)com>

>
> Well, the problem is that you needed to do it over a week ago. Google
> has pretty strict deadlines for GSOC. As such, we had to reject all
> proposals for work on Madlib this year.
>
> Next year, we will get one or more of your team signed up early in the
> GSOC process.

Well, sad :(
I'm gonna try to work on my project this summer anyway, but as I have
another job, I won't be able to meet the deadlines I had planned.

Maybe next year then, if my university considers GSoC a valid placement!

Regards,
Maxence

--
Maxence Ahlouche
06 06 66 97 00
93 avenue Paul DOUMER
24100 Bergerac