[Gsoc] Bugs to work on

Soumitra Agarwal

[Gsoc] Bugs to work on

Hi,

I am Soumitra Agarwal, a senior-year undergraduate at the Indian Institute of
Technology, Guwahati. I have been working on machine-learning-based
algorithms for the past 2 years as part of my course curriculum (focused on
Computer Vision).

I was part of Red Hen Lab for Google Summer of Code 2016 and would love
to contribute to Scilab as part of GSoC 2018 (and otherwise as well). I am
well versed in both traditional and recent machine learning algorithms
(presently doing research related to neural trees), and thus the project
titled 'Machine Learning features in Scilab' is of primary interest to me.

I completed the build for Scilab and can display my name on the banner. I
had a few queries regarding the projects (pertaining to which bugs to work
on before submitting a proposal, etc.) and wanted to confirm whether this is
the best channel for communication and whether I am already too late.

Regards,
Soumitra Agarwal





--
Sent from: http://mailinglists.scilab.org/Scilab-GSOC-Mailing-Lists-Archives-f2646148.html
_______________________________________________
gsoc mailing list
[hidden email]
http://lists.scilab.org/mailman/listinfo/gsoc
mandroid6

Re: Bugs to work on

Hi Soumitra,

I went through your proposal for the Machine Learning project. Good work
with the presentation!

Investing in improving and cleaning the already-present Atoms toolboxes
would surely be useful, but this by itself could consume an entire summer's
effort. There are a lot of toolboxes to cover if we decide to take up this
task.

As you have mentioned, your approach carries the same idea of integrating
Jupyter within Scilab but isn't necessarily an extension of last year's
work. For the Jupyter integration, we currently need a Scilab-side module
(PIMS) which understands the Python commands necessary to communicate with
the kernel.

Could you briefly explain how this summer's work would differ from the
current Jupyter integration?

I like the part about "implementing a feeder/subscriber mechanism".
How are you planning to create the mentioned mechanism?

Even a rough idea would be sufficient at this stage.

Thanks.

Mandar



Soumitra Agarwal

Re: Bugs to work on

Hi Mandar,

I could probably take up one or two toolboxes (that would only account for
10% of the summer), during which I can also get more accustomed to the
codebase.

Regarding the integration, the basic idea is to have pre-written Python
scripts stored on the server side; then, using a shell command from Scilab
(unix?), we execute the required file with user-driven parameters. I have
written a very small example that runs locally  here
<https://github.com/SoumitraAgarwal/ScilabDump/tree/master/Jupyter>
(running Unix.sci prints "Hello Scilab!" by executing the Python script). I
am also proposing that we store the models locally (on the Scilab side) and
keep storing and updating the model on the same side. This is how the
feeder/subscriber mechanism should work:

1. Scilab calls a shell script with some pre-stated parameters, which in
turn pings the Jupyter server, providing it with the parameters and the
Python script to execute (which is written and stored on the Jupyter
server).

2. Deep learning models generally run and update over several thousand
iterations, so a subscriber would wait on the feeder (Jupyter side), which
hands it the updated model once every few iterations.

Storing the model locally lets many users share a single server. I hope
that clarifies my idea.
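To make the mechanism concrete, here is a minimal local sketch in Python of
the feeder/subscriber idea (the checkpoint path and the dummy update step
are purely illustrative, not part of the actual proposal): the feeder
publishes a pickled snapshot of the model every few iterations, and the
subscriber simply loads the latest published snapshot.

```python
import pickle
import tempfile
from pathlib import Path

CHECKPOINT_EVERY = 100  # feeder publishes a snapshot every N iterations

def feeder(model_path: Path, iterations: int) -> dict:
    """Server-side loop: trains a dummy 'model' and periodically
    publishes (pickles) it where a subscriber can pick it up."""
    model = {"iteration": 0, "weight": 0.0}
    for i in range(1, iterations + 1):
        model["iteration"] = i
        model["weight"] += 0.01  # stand-in for a real update step
        if i % CHECKPOINT_EVERY == 0:
            with open(model_path, "wb") as f:
                pickle.dump(model, f)
    return model

def subscriber(model_path: Path) -> dict:
    """Scilab-side consumer: loads the latest published model snapshot."""
    with open(model_path, "rb") as f:
        return pickle.load(f)

if __name__ == "__main__":
    path = Path(tempfile.gettempdir()) / "scilab_demo_model.pkl"
    feeder(path, iterations=250)
    latest = subscriber(path)
    # only full checkpoints are visible; iteration 200 was the last publish
    print(latest["iteration"])
```

In the real setup the feeder would run on the Jupyter server and the
subscriber would pull the pickle back to the Scilab machine; the decoupling
via snapshots is the point of the sketch.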

Regards,
Soumitra



mandroid6

Re: Bugs to work on

Hi Soumitra,

Thanks for the explanation. That definitely clears up your approach.

I am still skeptical about running unix scripts within Scilab on the local
machine. I took a look at your sample script (Unix.sci). Passing all the
arguments required to run a model through Jupyter would require us to edit
the unix scripts. After discussing this further, it may yet prove to be a
good approach.

Also, I partly agree with your idea of storing the models locally, but that
assumes we are not creating a separate workspace/environment for every
user. One major point of having a Jupyter integration is to enable any
system which "simply" supports Scilab to use machine learning through the
available server, without having to deal with model weights/training data
locally.

I think one focus this year could be the connection between the feeder and
the subscriber through secure channels (SSL/SSH maybe), while also allowing
multiple users to use the same server at any given moment.
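As a rough sketch of what that could look like: the Scilab-side dispatcher
might build a per-user SSH command line, so each user reaches their own
account (and hence workspace) on the shared server. Everything below is
hypothetical for illustration: the hostname, the remote script path, and
the parameter names are made up, not part of any existing Scilab code.

```python
import shlex

def build_ssh_command(user: str, host: str, remote_script: str,
                      params: dict) -> list:
    """Builds the argv for invoking a remote Python script over SSH.
    Giving each Scilab user their own login lets the server keep
    per-user workspaces apart while still sharing one machine."""
    remote = "python3 {} {}".format(
        shlex.quote(remote_script),
        " ".join("--{}={}".format(k, shlex.quote(str(v)))
                 for k, v in sorted(params.items())),
    )
    return ["ssh", "{}@{}".format(user, host), remote]

# hypothetical user/host/script, for illustration only
cmd = build_ssh_command("alice", "ml.example.org", "/srv/models/train.py",
                        {"epochs": "10", "lr": "0.01"})
print(" ".join(cmd))
```

Quoting the remote arguments with shlex.quote matters here, since the whole
remote command travels through the SSH client's shell.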

Let's see how the discussions unfold over the next few weeks.
All the best till then!

Mandar





Soumitra Agarwal

Re: Bugs to work on

Hi Mandar,

I have written some code  here
<https://github.com/SoumitraAgarwal/ScilabDump/tree/master/Jupyter>   which
should be able to demo the functioning I was explaining (and probably
reduce your scepticism ;) ). The code works as follows:

1. The Driver.sci script is the one we expect the user to edit. This
script stores the arguments, the script to run (on the server), and the
server on which to run it in 3 files on the local machine. Once done, it
calls Unix.sci. The user's work is finished at this point.

2. Unix.sci picks up the arguments, script, and server from the files and
then pings the SSH server (for the demo I installed my RSA public key on
the server mentioned in the code). A connection to the server is
established and the Python script printer.py is called (on the server side,
but since you cannot see the server I was running this on, I have included
a copy).

3. printer.py picks up the inline arguments given to it by Unix.sci and
proceeds.
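The three steps above can be sketched as one self-contained Python demo.
The stage names mirror the repository's files, but the behaviour is
simplified, and the SSH hop is replaced by a local subprocess call (since
there is no server in this example); treat it as an illustration of the
hand-off between stages, not the actual implementation.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

def driver(workdir: Path, script: str, server: str, args: list) -> None:
    """Mimics Driver.sci: records what to run, where, and with which
    arguments, in three plain files the next stage reads."""
    (workdir / "script.txt").write_text(script)
    (workdir / "server.txt").write_text(server)
    (workdir / "args.txt").write_text("\n".join(args))

def unix_stage(workdir: Path) -> str:
    """Mimics Unix.sci: reads the three files and dispatches the call.
    A real version would prefix the command with `ssh user@server`;
    here the 'remote' script runs locally so the demo is self-contained."""
    script = (workdir / "script.txt").read_text()
    args = (workdir / "args.txt").read_text().splitlines()
    out = subprocess.run([sys.executable, script, *args],
                         capture_output=True, text=True, check=True)
    return out.stdout.strip()

if __name__ == "__main__":
    work = Path(tempfile.mkdtemp())
    # stand-in for printer.py on the Jupyter server
    printer = work / "printer.py"
    printer.write_text("import sys; print('Hello ' + ' '.join(sys.argv[1:]))")
    driver(work, str(printer), "user@jupyter-host", ["Scilab!"])
    print(unix_stage(work))  # prints: Hello Scilab!
```

The file-based hand-off is what keeps the user-facing stage (Driver) free
of any knowledge of how the dispatch stage (Unix) reaches the server.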

I hope this opens us up to new possibilities that we can explore given the
setup. As for storing the models locally: once the work of printer.py is
done (this is just a dummy script; we would probably have a deep learning
script here which would give us a model as output), we ping our local
machine from the server and save/pickle the model into one of our local
directories. This mechanism ensures that all the computation related to the
Python code happens on the server side, and the user can directly use
Python deep learning code on the fly, in exchange for a small part of their
local storage.

I do like the idea of creating a separate workspace though, if we are not
short on storage.

I tried an SSH implementation just now, which seems like the way to go
forward. Here is a happy screenshot of the driver script running:

<http://mailinglists.scilab.org/file/t497762/34.png>

Thanks for the feedback. I would love to hear more.

Regards,
Soumitra




mandroid6

Re: Bugs to work on

Hey Soumitra,

Yes, I understand the approach, and yes, it would work in most cases.

My only concern is the need to edit the unix script (maybe through Scilab
scripts directly?) for every parameter that's needed for bigger models
(neural networks). We can definitely design a cleaner workflow over the
summer; my comments were just initial thoughts on the idea.

For separate workspaces, our main goal right now would be just to
demonstrate this on a remote server, even if the storage space is limited.
It can be scaled later as required.

It's great to see that you are trying out different ideas (SSH) before we
actually finalize the project's direction. Good going!

Mandar



