Dear Caio, Dhruv and Amanda,
I would like to include my colleague Philippe Saadé in the exchanges on Machine Learning for Scilab.
He is an experienced mathematician working with us at ESI Group, and has an interesting vision on the subject.
He will be the scientific advisor and mentor for a joint internship on Machine Learning starting mid-June.
[hidden email]: Could you maybe share with us your view on the subject?
We can keep this exchange public if it is alright with you all, since I believe our success on the subject will depend on our capacity to centralize and merge our community efforts.
You can all collaborate on the project on our forge:
Yann @ Scilab
Hi Caio, sorry for the late reply.
I think we should ask ourselves what Scilab's focus and audience are.
I feel we lack knowledge of what Scilab users are looking for.
I, for example, want to do everything from prototyping to running the script on hundreds of Intel Xeon servers with the least possible effort.
Ideally with even less effort than it would take if the script were written in Python.
I am sure that new data structures will expand the use of Scilab.
But what advantage will this bring to users?
Python, for example, already has optimized data structures and libraries.
-- Amanda Osvaldo
On Wed, 2017-04-26 at 14:32 -0300, Caio Souza wrote:
gsoc mailing list
Attachment: GSoC-Machine_Learning.pdf (763K)
I would like to know where I should start the work on the Machine Learning toolbox, since 30th May is fast approaching.
I have already worked through all the ML algorithms/models mentioned in my proposal using scikit-learn, and have been experimenting with TensorFlow in Python.
What else should I do before 30th May?
It took me some time to join the discussion because I wanted a better understanding of the current status of your discussions, of Mandar's profile and expertise, and of what is easy or hard to do with Scilab to meet some serious and legitimate demands from Scilab's users.
As I am the last to join the discussion, I will deliberately start from a clean slate and restart the discussion with you, so that we can structure the project and converge quickly on an achievable list of goals for this GSoC.
For that purpose, I would like to list a series of questions on which we need a shared list of answers and a common understanding.
This should serve as a basis to decide what to do, how and when.
So, feel free to fill in...
As we know, we need to be active and efficient ahead of the 30th of May!
Thanks for your feedback and feel free to share your point of view.
On 18/05/2017 at 21:50, Amanda Osvaldo wrote:
It looks like you didn't receive my first email?
Sent from my mobile
On 31 May 2017 at 16:20, Amanda Osvaldo <[hidden email]> wrote:
Would it be okay if I add answers to the questions that I am able to answer?
It is an open discussion. I would also suggest that you take a look at PIMS and run the examples you already did in Python inside Scilab. Since you are the one who is going to do the work, your thoughts and challenges are welcome.
On Wed, May 31, 2017 at 2:24 PM, mandroid6 <[hidden email]> wrote:
Okay, I will do that.
As discussed with Yann yesterday, I am first assessing all existing features in Scilab and MATLAB, so that we will have a thorough list of features which need to be added as a priority.
Existing features like those in the neural network module https://atoms.scilab.org/toolboxes/neuralnetwork/2.0 can simply be modified if needed.
I have noted all the existing Scilab toolboxes listed below and am trying to work with them:
1. Artificial neural network toolbox
2. Neural Network Module
3. Regression tools - A toolbox for linear and non-linear regression analysis
4. NaN-toolbox - A statistics and machine learning toolbox
5. libsvm and liblinear - Libraries for SVM and large-scale linear classification
So should I test their functionality by working through actual datasets, or simply check their functions?
How should I proceed?
My thoughts about the questions brought up:
I can't say how mature it is; that is something to be investigated. It is useful to call Python directly from Scilab, but there are other ways to use ML frameworks without exposing them directly in Scilab code. We can call the framework from C/C++ and "link" it to Scilab, so the user wouldn't need to interact with Python.
Supposing PIMS works perfectly, it wouldn't be hard to make it work, but it would be like "writing Python @ Scilab". On the other hand, not exposing Python directly to the user would involve much more work, but would give a better overall result.
Handling large data sets without loading everything in RAM would need to be supported by the ML framework used and/or done completely by us. As far as I know, it wouldn't be possible to keep "pointers" to the ML framework, because data is organized differently in, for example, Python and Scilab. This means that anything that is used/visible by Python must sit in Python "memory space" and anything used/visible by Scilab must sit in Scilab memory space. We would have duplicated data; to handle that we would need to allocate/deallocate and exchange things often, according to use.
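To make the "exchange things often" idea concrete, here is a minimal sketch in plain Python of processing a data set chunk by chunk, so that only one chunk is held in memory at a time. It does not involve Scilab or PIMS; the helper names `iter_chunks` and `streaming_mean`, the chunk size and the in-memory sample data are all assumptions made for illustration.

```python
import csv
import io
from itertools import islice


def iter_chunks(rows, chunk_size):
    """Yield lists of at most chunk_size rows from any row iterator."""
    rows = iter(rows)
    while True:
        chunk = list(islice(rows, chunk_size))
        if not chunk:
            return
        yield chunk


def streaming_mean(rows, column, chunk_size=2):
    """Compute the mean of one numeric column, one chunk at a time."""
    total = 0.0
    count = 0
    for chunk in iter_chunks(rows, chunk_size):
        # Only this chunk is materialized; earlier chunks can be freed.
        total += sum(float(row[column]) for row in chunk)
        count += len(chunk)
    return total / count if count else float("nan")


# In-memory stand-in for a large CSV file on disk (hypothetical data).
sample = io.StringIO("x,y\n1,2\n2,4\n3,6\n4,8\n5,10\n")
reader = csv.DictReader(sample)
mean_y = streaming_mean(reader, "y")
```

The same pattern would apply to any bridge between two memory spaces: copy one chunk across, use it, release it, and move on to the next one.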
Use Cases, Usability and Easiness:
Here the first thing that comes to my mind is demos. Having at least one well-written example for each feature can demonstrate all of those things.
So far, there are too many things to achieve and it's good to have the "final picture" in our minds, but IMHO the GSoC is the first iteration of the ML toolbox, not the deadline.
On Thu, Jun 1, 2017 at 8:58 AM, mandroid6 <[hidden email]> wrote:
Report On Neural Network Module:
1. Most activation functions are included (hardlimit, sigmoid, tangent-sigmoid, linear)
2. Functions are available to create a neural network step by step, in a workflow that is easy to follow:
load data --> select/set input and target --> select model --> train using ann_FFBP_gd --> predict for the test set using ann_ADALINE_predict, etc.
3. End-to-end functions are pre-written for neural network design
4. The GUI for visualizing the training process is intuitive, and gives the user an idea of how the cost changes with epochs
1. Documentation about the entire process of neural network design, aimed at beginners, could be included
2. More engineering problems could be solved using this module, and presented to users as a quick start guide
3. A few deep learning algorithms such as CNN and RNN could also be included
Since yesterday I have been working with PIMS, calling scikit-learn methods to test whether it works correctly or not.
On many occasions I get the following two errors, even when using basic built-in Python functions (like math.sqrt):
Scilab has found a critical error (EXCEPTION_ACCESS_VIOLATION)
with "!!_invoke_" function.
recursive extraction is not valid in this context
Can anybody suggest how to debug this issue?
While working with PIMS, I realized working with it depends on inputs from people other than the mentors. So it would take time to fully work on the project through PIMS.
As discussed earlier with Yann, I am wondering whether it would be more practical to start developing a predefined list of ML algorithms in Scilab code, while trying to work with PIMS on the side.
This would help speed up the project work, as I need to start building the machine learning toolbox now.
Hence, can we agree on a final list of machine learning algorithms which I should start building in Scilab code?
As of now I have successfully implemented linear regression using scikit-learn and PIMS.
Results are as expected, no bugs found.
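For readers without the attached script, here is a minimal sketch of what a linear regression fit computes. It is not the scikit-learn/PIMS implementation mentioned above: it is plain Python using the closed-form least-squares solution for a single input feature, and the function names and toy data are assumptions made for illustration.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and co-deviation of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(xs, slope, intercept):
    """Apply the fitted line to new inputs."""
    return [slope * x + intercept for x in xs]


# Hypothetical toy data lying exactly on y = 2x + 1.
X_train = [1.0, 2.0, 3.0, 4.0]
y_train = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_simple_linear_regression(X_train, y_train)
y_pred = predict([5.0, 6.0], slope, intercept)
```

Comparing the coefficients from a sketch like this with those returned through PIMS is one simple way to check that data is passed correctly between Scilab and Python.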
There is still a difficulty in using sklearn.cross_validation to split the dataset into training and test sets.
For now, I have manually created 4 separate csv files for X_train, X_test and y_train and y_test.
Here X is the input data, and y is the target data.
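A manual split like the one described above can be sketched in plain Python, without scikit-learn. The function name `train_test_split`, the 25% test fraction, the fixed seed and the toy data below are assumptions for illustration only; this is not the scikit-learn function of the same name.

```python
import random


def train_test_split(X, y, test_fraction=0.25, seed=0):
    """Shuffle row indices once and split X and y consistently."""
    if len(X) != len(y):
        raise ValueError("X and y must have the same number of rows")
    indices = list(range(len(X)))
    # A dedicated Random instance keeps the split reproducible.
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(X) * test_fraction))
    test_idx = indices[:n_test]
    train_idx = indices[n_test:]
    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test


# Hypothetical toy data: one feature per row, target y = 2x + 1.
X = [[float(i)] for i in range(8)]
y = [2.0 * row[0] + 1.0 for row in X]
X_train, X_test, y_train, y_test = train_test_split(X, y)
```

Because the same shuffled indices are used for X and y, each input row stays paired with its target. The four resulting lists correspond to the four CSV files.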
Please find the required files attached below.
Steps to reproduce results:
1. Windows 8.1 64 bit
2. PIMS 1.1 through atomsInstall
3. python 2.7 64 bit
4. Place all attached csv files in the same folder as the Scilab script
5. Run the linear_regression.sce script
As discussed with Philippe Saadé and Yann Debray yesterday, I was trying to work through the "Comparison of kernel ridge regression and SVR" example from the scikit-learn documentation using PIMS.
In this document I am sharing the issues I encountered while doing so.
Please suggest any alternative methods I should try to resolve the above issues.
While looking at the PIMS approach for library implementation, I was searching for a common list of all essential ML model design methods using Python libraries. The article below can act as a working cheat-sheet for this project.
I am adding the github profile link in the ml project document created by Caio.
After a detailed discussion with Simon Marchetto regarding the possible use of PIMS for the machine learning toolbox, several notable points emerged:
1. A few of the issues mentioned above can be solved right now, but that does not guarantee a fully working machine learning implementation using scikit-learn
2. Issues related to the use of default Python syntax, such as that for creating lists and dictionaries, can be reduced. However, many other default syntax forms and built-in methods will take time to be resolved.
3. Help documentation for machine learning functions through the common Scilab help interface is required; this may be possible in a later release
4. Visualization of numpy arrays, Python lists and other data types is essential for any user trying to use PIMS in Scilab for machine learning. We can start with numpy arrays and gradually include all other data types over time.
5. We need to discuss possible approaches for building the machine learning toolbox in Scilab, in a joint exchange with Philippe Saadé, Simon Marchetto, Yann Debray and my project mentors.
So when would it be possible to have this exchange?