# [Scilab-users] Count specific values in text file

18 messages

## [Scilab-users] Count specific values in text file

Hello all,

Basic query. I have text files of Pi and e to a million places, and I want to scan the number for the occurrences of particular values and the separation of those values in the number.

This code snippet works on a vector:

    // create vector of elements
    A = [1 2 3 4 5 6 7 8 9 1 1 1 1 1 1 4 4 5 5 5 5 8 8];
    // count the number of occurrences of a value, e.g. 1
    Val_count1 = sum(A==1)
    disp(Val_count1) // answer 7

I can open the text file with pinum=mopen('pi-million.txt','rt'), but does it need to be changed to a vector where each value is an element? At the moment the text file is just a single line of digits, with no decimal point at the start (e.g. 314159... etc.).

Any pointers would be good. Sorry if this is a basic one!

Thanks

--
Sent from: http://mailinglists.scilab.org/Scilab-users-Mailing-Lists-Archives-f2602246.html
_______________________________________________
users mailing list
[hidden email]
http://lists.scilab.org/mailman/listinfo/users

## Re: {EXT} Count specific values in text file

Hello,

> On behalf of arctica1963
> Sent: Tuesday 11 February 2020 13:11
>
> I have text files of Pi and e to a million places [...]
>
> I can open the text file with pinum=mopen('pi-million.txt','rt') - but does it
> need to be changed to a vector where each value is an element? At the
> moment the text file is just a single line of numbers (no decimal point at the
> start, e.g. 314159...etc).

You might have a look at csvRead and csvTextScan:

https://help.scilab.org/docs/6.0.2/en_US/csvRead.html
https://help.scilab.org/docs/6.0.2/en_US/csvTextScan.html

Hope this helps, regards

--
Christophe Dang Ngoc Chan
Mechanical calculation engineer

## Re: Count specific values in text file

Hello,

On 11/02/2020 at 13:10, arctica1963 wrote:
> Hello all,
>
> Basic query. I have text files of Pi and e to a million places and I want to
> scan the number for the occurrences of particular values and the separation
> of those values in the number.

I am afraid I do not clearly understand the purpose. Could you post a sample and the expected result for it?

You may also have a look at grep().

Regards
Samuel

## Re: [EXTERNAL] Count specific values in text file

Hi

Is that what you're expecting, i.e. a way to reshape your matrix among other things?

Paul

    mode(0)
    A = [1 2 3 4 5 6 7 8 9 1 1 1 1 1 1 4 4 5 5 5 5 8 8];
    // indices of the elements equal to 1, then their number
    loc = find(A == 1);
    occ = size(loc, "*");
    printf("Number of occurrences = %d\n", occ);

    // reshaping
    B = [1 2 3 4 5 6 7 8 9 1 1 1 ;
         1 1 1 4 4 5 5 5 5 8 8 0];
    [n, m] = size(B);
    C = matrix(B, (n*m), 1);
    // then look for your specific values

## Re: [EXTERNAL] Count specific values in text file

Hi,

is this what you're looking for?

```
// path to the txt file
path = 'pathToFile'
// read the file as string
piAsString = csvRead(path, [], ['.'], 'string')
// split the string at the decimal
[piAsString] = strsplit(piAsString, '.');
// just get the digits
piDigits = strtod(strsplit(piAsString(2)));
// search: how often does a certain value appear in a specific range within the digits
searchRange = 100;
searchVal = 1;
[locations] = find(piDigits(1:searchRange) == searchVal);
printf("The number %d appears %d times in the first %d digits of Pi\n", searchVal, length(locations), searchRange);
printf("The number %d appears at following locations: \n", searchVal);
disp(locations');
```

Best regards,
Philipp

## Re: [EXTERNAL] Count specific values in text file

Hello Philipp

Your suggestion is kind of what I am trying to do, but the text file is not a CSV structure. It is just a single, very big number on one row. A small chunk:

31415926535897932384626433832795028841971693993751058209749445923078164 etc. (no spaces between digits)

How best to load the text file and then count the number of occurrences of a specific digit (e.g. 1)? At the moment the file essentially contains a very big integer.

Interested to see the best method of doing this.

Thanks
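
One possible approach, sketched under the assumption that the digits are already available as a single Scilab string: ascii() turns the string into character codes, and subtracting the code of "0" gives one numeric digit per element, so the sum(A==1) idiom from the first message applies unchanged. The variable names s, d and n1 are illustrative, and the string is the sample chunk quoted above rather than the real file.

```scilab
// Sample chunk from the thread; in practice s would be read from the file
s = "31415926535897932384626433832795028841971693993751058209749445923078164";

// Character codes minus the code of "0" give the digits as a row vector
d = ascii(s) - ascii("0");

// Count how many times the digit 1 occurs
n1 = sum(d == 1);
mprintf("%d digits, %d occurrences of 1\n", length(d), n1);
// prints: 71 digits, 6 occurrences of 1
```

Working on character codes avoids converting a million one-character strings with strtod, which is the main reason for this design choice.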

## Re: [EXTERNAL] Count specific values in text file

Hi,

the tabul function might help.

Sample data can be downloaded from https://thestarman.pcministry.com/math/pi/picalcs.htm (unzip PimultDP.Zip).

The following lines do the "counting" job. I have not tried the million decimal places, good luck.

    --> data = mgetl('PI25K_DP.TXT');
    --> data2 = strsplit(data);
    --> tabul(data2)
     ans  =

           ans(1)

    !9  !
    !8  !
    !7  !
    !6  !
    !5  !
    !4  !
    !3  !
    !2  !
    !1  !
    !0  !

           ans(2)

       2509.
       2465.
       2480.
       2541.
       2567.
       2549.
       2491.
       2403.
       2519.
       2476.
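
A variant of the same idea on numeric digits, as a hedged sketch: to my knowledge tabul returns a two-column matrix (value, count) when given a numeric vector, and accepts "i" as a second argument for increasing order. The names s, d and m are illustrative, and the input is the short sample chunk from earlier in the thread, not the 25K file.

```scilab
// Sample chunk from the thread (71 digits, no separator)
s = "31415926535897932384626433832795028841971693993751058209749445923078164";

// One numeric digit per element
d = ascii(s) - ascii("0");

// Frequency table: column 1 holds the digit, column 2 its count;
// "i" requests increasing order of the digits (0 first)
m = tabul(d, "i");
disp(m);
```

The second column can then be indexed directly, e.g. m(2, 2) is the count of the digit 1.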

## Re: [EXTERNAL] Count specific values in text file

Hi,

Option 1: You manually put a "." inside the file and run the script. Since you only mentioned "pi" and "e", it's little effort.

Option 2: You can still use csvRead even without having a "." as separator.

Note: the search range is adapted to count only the digits behind the decimal sign. You may change it as you wish.

Best regards,
Philipp

```
// path to the txt file
path = 'pathToFile'
// read the file as string
piAsString = csvRead(path, [], [], 'string')
// split the string at '' (use no token)
[piAsString] = strsplit(piAsString, '');
// convert string to double
piDigits = strtod(piAsString);
// search: how often a certain value appears in a specific range within the digits
searchRange = 100;
searchVal = 1;
// adapt the range, to search only places behind the decimal sign
[locations] = find(piDigits(2:searchRange+1) == searchVal);
// uncomment if wished
// printf("location \t number\n");
// for i = 2:searchRange+1
//     printf("%d \t %d \n", i-1, piDigits(i));
// end
printf("The number %d appears %d times in the first %d digits of Pi\n", searchVal, length(locations), searchRange);
printf("The number %d appears at following locations: \n", searchVal);
disp(locations');
```

## Re: [EXTERNAL] Count specific values in text file

On 12/02/2020 at 08:57, arctica1963 wrote:
> Hello Philipp
>
> Your suggestion is kind of what I am trying to do, but the text file is not
> a CSV structure. It is just a single, very big number on one row. A small
> chunk:
>
> 31415926535897932384626433832795028841971693993751058209749445923078164 etc
> (no spaces between digits)
>
> How best to load the text file and then count the number of occurrences of a
> specific digit (e.g. 1)? At the moment the file essentially contains a very
> big integer.

    --> t = "31415926535897932384626433832795028841971693993751058209749445923078164";
    --> length(regexp(t,"/1/"))
     ans  =
       6.
    --> length(regexp(t,"/41/"))
     ans  =
       2.
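
The regexp call above returns the start positions of the matches, which also covers the "separation" part of the original question. A minimal sketch of that step, using strindex and diff instead of regexp (an alternative, not the method from the post; the names pos and gaps are illustrative):

```scilab
// Same sample string as in the post above
t = "31415926535897932384626433832795028841971693993751058209749445923078164";

// Positions of every "1" in the string
pos = strindex(t, "1");

// Distance between consecutive occurrences
gaps = diff(pos);

disp(pos);   // 2 4 38 41 50 69
disp(gaps);  // 2 34 3 9 19
```

Note that the positions are counted from the leading 3, so subtract 1 to get the decimal place. The same call works for multi-digit patterns such as strindex(t, "41").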

## Re: [EXTERNAL] Count specific values in text file

Hi Philipp,

Your script works fine using csvRead.

One odd thing: I tested a few files with pi, and the ones with >= 500,000 digits gave an error:

    csvRead: can not read file pi-million.txt: Error in the column structure.

(the same for 500,000), but it worked with 250,000. Not sure if this is an internal bug/limit in csvRead.

Thanks for the pointers everyone

Lester

## Re: [EXTERNAL] Count specific values in text file

Hello Lester,

Could this be related to http://bugzilla.scilab.org/show_bug.cgi?id=15788 ?
Which version of Scilab are you using (the bug above was fixed in 6.0.2)?

Antoine

## Re: [EXTERNAL] Count specific values in text file

Hi all,

Nice challenge. I have a simple answer to display a histogram (with 10 classes) of the PI digits stored in a pi.txt file (without any dot separator):

    F = "pi.txt";
    fd = mopen(F, 'r');
    histplot(ascii(strcat(string(0:9))), mget(fileinfo(F)(1), "c", fd));
    mclose(fd);

The idea is to read the file using mget() and store each ASCII value into a vector. These data are then plotted as a histogram over the ASCII values of "0" to "9".

--
Clément
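
The same byte-wise read can feed a count instead of a plot. The sketch below is an illustration only: it writes its own small sample file into TMPDIR (the file name pi_sample.txt is hypothetical) so that it runs on its own, and it has not been tried on the million-digit file.

```scilab
// Write a small sample file so the sketch is self-contained
F = fullfile(TMPDIR, "pi_sample.txt");
mputl("31415926535897932384626433832795028841971693993751058209749445923078164", F);

// Read the whole file as character codes, as in the histplot example above
fd = mopen(F, "rb");
codes = mget(fileinfo(F)(1), "c", fd);
mclose(fd);

// Keep only the codes of "0".."9"; this drops the end-of-line characters
codes = codes(codes >= ascii("0") & codes <= ascii("9"));
d = codes - ascii("0");

mprintf("%d digits read, %d occurrences of 1\n", length(d), sum(d == 1));
// prints: 71 digits read, 6 occurrences of 1
```

Filtering on the code range matters because fileinfo reports the size in bytes, which includes any trailing newline written by mputl.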

## Re: [EXTERNAL] Count specific values in text file

Hello Antoine

I am using version 6.0.2. Not sure if it is the same issue, as there is only one line in the file, but a lot of digits to handle. The error message is not exactly clear.

Lester

## Re: [EXTERNAL] Count specific values in text file

Hello Lester,

Could you:
(1) Share the file so that we can try & see whether we can reproduce the bug?
(2) Share a minimum working example that shows the bug / no bug depending on the length of the file?

This could help us confirm/report a bug.

Cheers,
Antoine

## Re: [EXTERNAL] Count specific values in text file

PI250KDP.TXT   PI500KDP.TXT

    clear
    // path to the txt file
    path = 'PI250KDP.txt'
    // read the file as string
    piAsString = csvRead(path, [], [], 'string')
    // split the string at '' (use no token)
    [piAsString] = strsplit(piAsString, '');
    // convert string to double
    piDigits = strtod(piAsString);
    // search: how often a certain value appears in a specific range within the digits
    searchRange = 10000;
    searchVal = 1;
    // adapt the range, to search only places behind the decimal sign
    [locations] = find(piDigits(1:searchRange+1) == searchVal); // small tweak here
    printf("The number %d appears %d times in the first %d digits of Pi\n", searchVal, length(locations), searchRange);
    printf("The number %d appears at following locations: \n", searchVal);
    disp(locations');

The 250K file works; the 500k file fails.

## Re: [EXTERNAL] Count specific values in text file

To add on, it might not just affect csvRead, but also other file IO functions as well.

How to reproduce:

    // Not OK
    a = ones(1,300000);
    b = strcat(string(a));
    mputl(b,'test.txt');
    c = mgetl('test.txt');

When a has 250,000 elements, mgetl gets the string correctly, but when a has 300,000 elements, it returns an empty string.

On the other hand, a file with a single column of data works fine.

    // OK
    a = ones(1,300000);
    b = string(a);
    mputl(b,'test.txt');
    c = mgetl('test.txt');

thanks.

rgds,
CL

## Re: [EXTERNAL] Count specific values in text file

Well, that sounds like a bug! Could you report it?

Cheers,
Antoine

## Re: [EXTERNAL] Count specific values in text file

sure, here it is:

rgds,
CL