[Scilab-users] More rapid calculation

fujimoto2005

[Scilab-users] More rapid calculation

I am using Scilab 6.0.0 for Windows.
I am running a simulation that uses a huge random matrix of size
10^4 x 25000.
My PC has 8 cores and 16 threads.
The simulation takes a considerable time to finish, yet CPU utilization
is as low as 10-20%. Is there any way to increase the CPU usage and finish
the calculation sooner?
I do not use loops; I use matrix functions.
I also considered parallel computing, but I have read that it cannot be used
on Windows.

Best regards.



--
Sent from: http://mailinglists.scilab.org/Scilab-users-Mailing-Lists-Archives-f2602246.html
_______________________________________________
users mailing list
[hidden email]
http://lists.scilab.org/mailman/listinfo/users
Samuel GOUGEON

Re: More rapid calculation

On 14/02/2018 at 14:35, fujimoto2005 wrote:
> I am using 6.00 for windows.
> I am doing a simulation using a random number matrix with a huge size. The
> size of the random matrix is 10 ^ 4 x 25000.
> I am using a PC equipped with 8 cores and 16 threads.
> It takes a considerable time to finish the simulation, but CPU utilization
> is as low as 10-20%. Is there any way to increase the CPU usage and finish
> the calculation sooner?

So maybe it is a RAM issue. If you need a lot of intermediate memory and
you do not have it (here you need 2 GB per copy of the matrix, assuming
double-precision decimal numbers), the system usually falls back on disk
space, which is unbearably slow.
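
As a rough check of the 2 GB figure, the arithmetic can be sketched in Scilab. This assumes every element is stored as an 8-byte double; the variable names are only illustrative.

```scilab
// Memory footprint of one 10^4 x 25000 matrix.
// Assumption: each element is an 8-byte double-precision number.
nRows = 1e4;
nCols = 25000;
bytesPerDouble = 8;

gbPerCopy = nRows * nCols * bytesPerDouble / 1e9;   // decimal gigabytes
mprintf("%.1f GB per copy\n", gbPerCopy);
```

Every intermediate result of the same size costs the same amount again, so a chain of matrix operations can hold several such copies at once.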

My two cents..
Samuel

fujimoto2005

Re: More rapid calculation

Dear Samuel
Thanks for your reply.

My PC is equipped with 64GB SDRAM.
Isn't it enough?

Best regards.



mottelet

Re: More rapid calculation

In reply to this post by fujimoto2005
Hello,

A priori, there is no reason why your calculation should use more than
one CPU core, which explains why you see only about 1/8 = 12.5% CPU use.

S.


Antoine Monmayrant

Re: More rapid calculation

Hello,


If your problem is embarrassingly parallel (i.e. you run your simulation
many times independently for different random matrices), you might speed
up the overall simulation by running more than one instance of Scilab in
parallel.
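
A minimal sketch of what each instance could run, under stated assumptions: the script name worker.sce, the variable instanceId, the seed offset and the result file names are all hypothetical, and the matrix is a small stand-in for the real simulation. How the instances are started is left open.

```scilab
// Hypothetical worker script (e.g. worker.sce), run once in each Scilab instance.
// Assumption: instanceId is set to a different value in each instance.
instanceId = 1;

// Seed the generator differently per instance so the random matrices differ.
grand("setsd", 1000 + instanceId);

// Small stand-in for the real simulation: one random matrix, one mean per row.
Z = grand(200, 500, "nor", 0, 1);
rowMeans = mean(Z, "c");

// Each instance writes its own result file; a final script can load and combine them.
save("result_" + string(instanceId) + ".dat", "rowMeans");
```

Using a distinct seed per instance matters here: without it, two instances could produce the same random matrices and the runs would not be independent.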


Antoine


fujimoto2005

Re: More rapid calculation

In reply to this post by mottelet
Dear Mottelet

But the wiki states: "If we use the Intel MKL on Windows, then Scilab
use all the cores available on the processor."
https://wiki.scilab.org/Documentation/ParallelComputingInScilab

So I expected that there would be some way to increase the CPU usage.
Am I wrong?



mottelet

Re: More rapid calculation

If your program does not take advantage of the Intel MKL library, it means that its CPU time is not dominated by linear algebra operations. Unless you tell us more, we won't be able to help...

S.

Samuel GOUGEON

Re: More rapid calculation

On 14/02/2018 at 19:15, [hidden email] wrote:
>
> If your program does not take advantage of the MKL Intel library, it
> means that its CPU usage is not dominated by linear algebra stuff.
>
I was actually wondering whether the bottleneck is something other than
the CPU, which is why I was thinking about the RAM.
But with 64 GB, a RAM shortage would mean that more than 32 copies
(64 GB / 2 GB per copy) are simultaneously defined or reserved
(internally and/or in the Scilab program).
Fujimoto2005, could that be the case in your Scilab code?
Are you properly clearing all intermediate variables after use?
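
For illustration, releasing an intermediate matrix once it is no longer needed could look like the sketch below. The variable names and sizes are placeholders, not taken from the attached code.

```scilab
// Illustrative only: free a large intermediate matrix after its last use.
Z = rand(1000, 2500);     // stand-in for a large random matrix
W = cumsum(Z, "c");       // result that is kept for later steps

// Release the memory held by Z; W stays available.
clear("Z");

mprintf("Z still defined: %s\n", string(isdef("Z")));
mprintf("W still defined: %s\n", string(isdef("W")));
```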

Samuel

PS: What a powerful PC! :)

mottelet

Re: More rapid calculation

In reply to this post by mottelet

Here is an example where 100% CPU is used (4 cores) with pure linear algebra.

S.

Attachment: cpu.png (103K)

mottelet

Re: More rapid calculation

In reply to this post by mottelet

Here is an example (pure linear algebra) using 100% cpu (4 cores) :

http://www.utc.fr/~mottelet/Images/cpu.png

S.

fujimoto2005

Re: More rapid calculation

Dear all
Thank you for your replies.

I have attached my code and a snapshot of the task manager.
The snapshot shows a typical situation.
CPU utilization is usually between 10 and 20%.
About three times during the run, CPU utilization briefly jumps to
40-50%.
Memory usage does not exceed 20 GB.
At least 40 GB is always free.

Thank you.

corSmplebaseZ1andZ2.sce
<http://mailinglists.scilab.org/file/t497065/corSmplebaseZ1andZ2.sce>  

snapshot_of_task_manager.png
<http://mailinglists.scilab.org/file/t497065/snapshot_of_task_manager.png>  




fujimoto2005

Re: More rapid calculation

Sorry, my code contained an "exec" line that runs a user file.
I have attached a revised version that includes that code as a user function instead.
Please see it.

Best regards

corSmplebaseZ1andZ2_with_UserFunc.sce
<http://mailinglists.scilab.org/file/t497065/corSmplebaseZ1andZ2_with_UserFunc.sce>



mottelet

Re: More rapid calculation

Hello,

In your code, most of the CPU time is spent in lines 40-54 (random
generation of big matrices), then in lines 54-60, where the
bottlenecks include repmat (which you use twice) and cumsum.
In previous posts by Heinz Nabielek related to code
optimization, you may have noticed that matrix multiplication by a
vector of ones gives the same result as repmat BUT uses the BLAS! For example,
compare these timings (in seconds), with size(timePoints_V) = [1 25000] and sample = 5000:

tic;repmat(timePoints_V,2*sample,1);disp(toc())

    12.372273

tic;ones(2*sample,1)*timePoints_V;disp(toc())

    1.823105

On my machine (Mac Pro, OS X, Scilab 6.0.0), this last piece of code uses
100% CPU (four cores).
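
A self-contained variant of this comparison, as a sketch: the sizes are reduced so that it runs quickly, and the contents of timePoints_V are an assumption (the real vector comes from the attached script). It also checks that both constructions give the same matrix.

```scilab
// Reduced sizes compared to the post, so the comparison runs quickly.
timePoints_V = linspace(0, 1, 2500);   // assumed 1 x 2500 row vector
sample = 500;

tic();
A = repmat(timePoints_V, 2*sample, 1);
tRepmat = toc();

tic();
B = ones(2*sample, 1) * timePoints_V;  // column of ones times a row vector
tOnes = toc();

mprintf("repmat: %.3f s, ones*v: %.3f s\n", tRepmat, tOnes);
mprintf("same result: %s\n", string(and(A == B)));
```

The timings depend on the machine and on the linear algebra library Scilab is linked with, so the ratio may differ from the one reported above.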

S.

--
Stéphane Mottelet
Research Engineer
EA 4297 Transformations Intégrées de la Matière Renouvelable
Département Génie des Procédés Industriels
Sorbonne Universités - Université de Technologie de Compiègne
CS 60319, 60203 Compiègne cedex
Tel : +33(0)344234688
http://www.utc.fr/~mottelet

fujimoto2005

Re: More rapid calculation

Dear Mottelet
Thank you for your useful advice.

1. Replacing repmat(timePoints_V,2*sample,1) with
timePoints_M=ones(2*sample,1)*timePoints_V reduced the calculation time
by 25 seconds.

2. "cumsum" is not a bottleneck: it takes only 2 seconds to finish.
Moreover, if I replace cumsum(wY1_M,'c') with the linear algebra version
wY1_M*triu(ones(time_step,time_step)), the calculation time increases to 2
minutes, although the CPU usage rises greatly. "cumsum" seems to be an
efficient function.

3. Is there any way to speed up the generation of random matrices? The
attached file is a snapshot of the task manager while generating random
matrices. It shows that CPU utilization remains low despite using many slots.
What is the cause?
snapshot_of_task_manager2.png
<http://mailinglists.scilab.org/file/t497065/snapshot_of_task_manager2.png>

4. If I could generate several random matrices with fewer rows at the
same time and combine them at the end, I expect the processing time would
be shorter. Is such a thing possible?
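
The observation in point 2 can be reproduced on a small scale. The sketch below uses a small random stand-in for wY1_M (the sizes are assumptions) and shows that both expressions give the same cumulative sums up to rounding. The matrix product needs about time_step multiply-adds per output element where cumsum needs one addition, which is consistent with it being slower overall even when it keeps more cores busy.

```scilab
// Small stand-in for wY1_M; the real matrix in the simulation is far larger.
time_step = 200;
wY1_M = rand(50, time_step);

// Running sum along each row: about time_step additions per row.
C1 = cumsum(wY1_M, "c");

// Same result as a matrix product: about time_step^2 multiply-adds per row.
C2 = wY1_M * triu(ones(time_step, time_step));

// The two results agree up to floating-point rounding.
mprintf("max difference: %e\n", max(abs(C1 - C2)));
```

If time_step is of the order of 25000, as the matrix size suggests, the triangular matrix alone would hold 25000^2 doubles, i.e. about 5 GB.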

Best regards.



Samuel GOUGEON

[Scilab-users] repmat() slow compared to .* and .*. <= Re: More rapid calculation

In reply to this post by mottelet
Thank you Stéphane for pointing out the slowness of repmat().

Additional tests show that the Kronecker product is even slightly faster
than .*

A new version of repmat() is proposed for review:
https://codereview.scilab.org/19782
It is rewritten mainly using .*., which greatly simplifies the code.

This version is more than 7x faster than the current one, and uses both
CPUs of my PC.
That is roughly the ratio 12.37/1.82 ~ 6.8 that you reported above.
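
To illustrate the idea only (this is not the code under review), tiling a matrix with the Kronecker operator .*. can be sketched as follows; the matrix A and the repetition counts are arbitrary examples.

```scilab
// Tile A in an m x n grid of blocks, once with repmat and once with .*.
A = [1 2; 3 4];
m = 2;
n = 3;

R1 = repmat(A, m, n);
R2 = ones(m, n) .*. A;   // Kronecker product: each entry of ones(m,n) becomes a copy of A

mprintf("same result: %s\n", string(and(R1 == R2)));
```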

Best regards
Samuel

mottelet

Re: repmat() slow compared to .* and .*. <= Re: More rapid calculation

Hello Samuel,

It is a good initiative. Looking at your proposed code, I see that you
use "execstr" on strings which are forged on the fly. Although the
resulting expression will be faster (this was the goal), AFAIK such
constructs are not "compilable" in the same way as the plain expression.
For the time being, Scilab does not use JIT compilation, but I think
that such constructs are typically not optimal and that if/then/else
constructs should be used instead.
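
The difference between the two styles can be sketched on a toy case. This is a hypothetical example and not the proposed repmat() code: the operator name op and the two supported cases are invented for illustration.

```scilab
a = 2;
b = 3;
op = "+";    // hypothetical operator name chosen at run time

// Variant 1: forge the instruction as a string and evaluate it.
execstr("r1 = a " + op + " b");

// Variant 2: explicit branches, no string evaluation.
select op
case "+" then
    r2 = a + b;
case "-" then
    r2 = a - b;
else
    error("Unsupported operator: " + op);
end

mprintf("r1 = %d, r2 = %d\n", r1, r2);
```

The second variant only works when the set of cases is known in advance, which is the limit of this approach.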

S.


Samuel GOUGEON

Re: repmat() slow compared to .* and .*. <= Re: More rapid calculation

Hello Stéphane,

On 16/02/2018 at 08:24, Stéphane Mottelet wrote:
> Hello Samuel,
>
> It is a good initiative. Looking at your proposed code, I see that you use "execstr" on strings which are forged on the fly. Although the resulting expression will be faster (this was the goal), AFAIK such constructs are not "compilable" in the same way as the plain expression. For the time being, Scilab does not use JIT compilation, but I think that such constructs are typically not optimal and that if/then/else constructs should be used instead.

Here, an if/then/else (or rather select/case) construct is not possible, since the set of cases is open-ended and not known in advance. The construct would in any case need a final else branch containing an execstr() instruction.

But, even if avoiding execstr() is not a priority, there is another solution here, which is now implemented.

The final execstr() that routes to overloads cannot be avoided. This is typical when handling open-ended, unknown cases.

Samuel

