[Gsoc] Progress and wiki GSoC 2018

Soumitra Agarwal

[Gsoc] Progress and wiki GSoC 2018

Hi!

I am presently working on the project 'Machine Learning features in Scilab'
as part of the GSoC 2018 program.

I will be posting updates and upgrades (including daily progress and other
tit-bits) on the wiki page
<https://wiki.scilab.org/agarwalsoumitra1504%40gmail.com/Daily%20reports%20for%20Machine%20learning%20features%20in%20Scilab>.
Progress can also be followed in the repository
<https://github.com/SoumitraAgarwal/Scilab-gsoc>.

Best,
Soumitra



--
Sent from: http://mailinglists.scilab.org/Scilab-GSOC-Mailing-Lists-Archives-f2646148.html
_______________________________________________
gsoc mailing list
[hidden email]
http://lists.scilab.org/mailman/listinfo/gsoc
Soumitra Agarwal

Re: Progress and wiki GSoC 2018

Hi everyone!

It has been more than 2 weeks since I began working on the project titled
'Machine learning features in Scilab' for Google Summer of Code 2018 and I
think that this is a good time to share my progress with the community.


The coding effort was divided into two streams, namely:

1. *Development*: an initiative to create a standalone machine learning
toolbox written completely in Scilab.
2. *Experimentation*: an initiative to run machine learning scripts already
written in Python through a feeder-subscriber mechanism, which a user can
call while the scripts reside on a server. It also includes any other
effort to make machine learning easier to do in Scilab, apart from the
development part.

Development


The standalone machine learning toolbox presently contains the following
parts:

 1. Algorithms
 2. Preprocessing
 3. Visualisation

The following algorithms have been implemented in the form of macros in the
GitHub repository
<https://github.com/SoumitraAgarwal/Scilab-gsoc/tree/master/Development/Algorithms>:

 - Decision Tree classification (CART)
 - Linear regression
 - Logistic regression
 - Naive Bayes (Gaussian)
 - Polynomial regression
 - K-Means clustering

The following pre-processing methods have also been added to the set of
macros:

- Normalization
- Scaling (Zero mean, unit variance)
- Train Test split
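
As a rough illustration of what these three pre-processing steps compute, here is a minimal Python sketch. It is not the Scilab macro code from the repository. The function names are hypothetical, and "Normalization" is assumed here to mean min-max rescaling into [0, 1], which the post does not specify.

```python
import random
from statistics import mean, pstdev


def normalize(values):
    """Min-max normalization (an assumed meaning): rescale into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: avoid division by zero
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Scale to zero mean and unit (population) variance."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - mu) / sigma for v in values]


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]
```

For example, `normalize([2, 4, 6])` maps the smallest value to 0 and the largest to 1, and `train_test_split(range(8), 0.25)` returns six training rows and two test rows.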

Experimentation


The work under the experimentation domain began with setting up a GCP
machine running an IPython (Jupyter) server, with only a specific set of
keys able to log into the machine. This machine acts as our server and does
the computation for the Python scripts in which the machine learning
algorithms are pre-written. Our client logs into the machine, starts up a
kernel and copies the kernel configuration file to its local machine. The
scripts for this can be found in the GitHub sub-repository
<https://github.com/SoumitraAgarwal/Scilab-gsoc/tree/master/Experimentation>.
This can then be integrated with the approach used in last year's project
to run the script as an intermediate step in a larger Scilab program.
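
To make the role of the copied kernel configuration file more concrete, here is a minimal Python sketch. It is not the code from the repository. It assumes the file is a standard Jupyter connection file (JSON with `shell_port`, `iopub_port`, `stdin_port`, `control_port` and `hb_port` fields), and that the client reaches those ports through SSH local port forwarding, which is one common approach and not necessarily the one used in this project. The function names and the `remote` value are hypothetical.

```python
import json

# Port fields found in a standard Jupyter kernel connection file.
PORT_KEYS = ("shell_port", "iopub_port", "stdin_port", "control_port", "hb_port")


def load_connection_info(text):
    """Parse the JSON text of a kernel connection file and check its ports."""
    info = json.loads(text)
    missing = [k for k in PORT_KEYS if k not in info]
    if missing:
        raise ValueError("connection file lacks: " + ", ".join(missing))
    return info


def ssh_tunnel_command(info, remote):
    """Build an ssh command that forwards each kernel port to localhost.

    `remote` is a hypothetical user@host string for the server.
    """
    cmd = ["ssh", "-N"]  # -N: forward ports only, run no remote command
    for key in PORT_KEYS:
        port = info[key]
        cmd += ["-L", f"{port}:127.0.0.1:{port}"]
    return cmd + [remote]
```

With the tunnel in place, a local Jupyter client pointed at the copied file would talk to the remote kernel as if it were running on the local machine.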

The next step was to put an authentication mechanism in place so that a
user has no permission to do anything other than run a kernel and copy its
script. How to determine which kernel a user has started still eludes us,
but using the command option of the authorized_keys file in OpenSSH we were
able to restrict a user's ability to execute commands on the server.
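
For readers unfamiliar with that OpenSSH feature, the following Python sketch builds one authorized_keys entry with a forced command. The `command="..."`, `no-pty`, `no-agent-forwarding` and `no-X11-forwarding` options are standard OpenSSH key options. The helper name, the script path and the key material are hypothetical and only illustrate the format.

```python
def restricted_key_line(public_key, forced_command,
                        extra_options=("no-pty", "no-agent-forwarding",
                                       "no-X11-forwarding")):
    """Return an authorized_keys line that forces `forced_command`.

    Whatever command the client requests, sshd runs `forced_command`
    instead when this key is used to log in.
    """
    if '"' in forced_command:
        # Keep the sketch simple: reject commands that would need escaping.
        raise ValueError("double quotes in the forced command are not handled")
    options = [f'command="{forced_command}"', *extra_options]
    return ",".join(options) + " " + public_key.strip()


# Hypothetical key and script path, for illustration only.
line = restricted_key_line(
    "ssh-ed25519 AAAAC3Nza alice@laptop",
    "/usr/local/bin/start_kernel.sh alice",
)
print(line)
```

Because each key has its own line, the forced command could in principle carry a per-user label as an argument (as `alice` does above), which may be one way to associate a started kernel with a user. This is only a suggestion and not something the current setup is known to do.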

Any advice on how to tag a kernel with a user under the present setup or any
other suggestions would be welcome.

Regards,
Soumitra Agarwal




Soumitra Agarwal

Re: Progress and wiki GSoC 2018

Hello everyone!

We are closing in on the mid-way mark of the GSoC 2018 program and I would
like to share my progress. As discussed earlier in this thread, I was
working on two simultaneous sections of Machine Learning development for
Scilab. After some debate and analysis, it was agreed that I work on a
single part (Development, i.e. the initiative to create a standalone toolbox
for Machine Learning) until my second evaluation, since it was very
straightforward, and then continue with the Experimentation section.

Today we stand with a toolbox comprising 22 algorithmic macros, 8
preprocessing macros and 2 visualisation macros (details of which can be
found on the daily updates page
<https://wiki.scilab.org/agarwalsoumitra1504%40gmail.com/Daily%20reports%20for%20Machine%20learning%20features%20in%20Scilab#preview>).

All current development is pushed to this GitHub repository:
<https://github.com/SoumitraAgarwal/Scilab-gsoc>

The agenda until the 2nd evaluation is to write the builder scripts and run
tests on the modules we have right now, so that the toolbox can be made
available for general use.

After the 2nd evaluation (though this may start earlier given the pace) we
plan on resuming work on the Experimentation section, which is the
initiative to run machine learning scripts already written in Python
through a feeder-subscriber mechanism, which a user can call while the
scripts reside on a server. It also includes any other effort to make
machine learning easier to do in Scilab, apart from the development part.

Any suggestions for amendments or direction for the project would be
welcome!

Regards,
Soumitra Agarwal


