100,000! in less than 5 minutes using 8GB RAM machine.
At IIT KGP, our professor Biswas who taught C-class has told us that one of our seniors wrote a factorial program and it can print up 100 and more than that and it consists of some big number lines.
Our Professor Pawan Kumar sir, who is so humble, who always guides least performing guys has taught us on coming up with our own algorithms, in a way we were almost reinventing those algorithms without knowing the algorithm and that has enabled me thought of this.
That got me thinking, solve it myself instead of copying. I wrote this program before i learned about char arrays, pointers and any advanced stuff in C. I just knew “while, for and integer array”, nothing more than that.
The idea behind this algorithm is to calculate k-th last digit at k-th iteration using carry array.
Later i used malloc, char arrays and it could print up to 500,000. It can scale to up to any number if machine has memory, Not only just factorials but also computes multiplications of array of numbers. I could print 400,000 factorial in 3GB RAM Machine without using file printing.
#include<stdio.h>
#include<stdlib.h>
int main()
{
//max outputsize is 5,000,000. 5 millions. maxOutCounter * maxOutputIndex.
//change these values if you are getting out of memory exception.
int *z,i,j,k,n,t,currentPointer=-1,outputModuloCounter=0,maxOutCounter=50000,maxOutputIndex=100;
char **output;
printf("enter n value\n");
scanf("%d",&n);
if(n==0) {
printf("\n\nfactorial of 0 is:\n\n1\n");
return 0;
}
z=(int *)malloc(n*sizeof(int));
output = (char**)malloc(sizeof(char*) * maxOutputIndex);
for(i=1;i<n;i++)
z[i]=0;
z[0]=1;
for(j=0;j<n;j++)
{
while(z[j] > 0)
{
t=z[j]%10;
for(i=j+1;i<n;i++)
{
z[i]=(i+1)*t+z[i];
t=z[i]%10;
z[i]=z[i]/10;
if(t == 0 && z[i] == 0 && z[n-1] == 0)
break;
}
if(currentPointer == -1 || outputModuloCounter == maxOutCounter) {
outputModuloCounter = 0;
currentPointer += 1;
output[currentPointer] = (char *)malloc(maxOutCounter*sizeof(char));
}
output[currentPointer][outputModuloCounter++]=t+48;
z[j]=z[j]/10;
}
}
printf("\n\nfactorial of (%d) is:\n\n",n);
j = currentPointer;
k = outputModuloCounter;
while(j >= 0) {
for(i = k-1;i >= 0;i--) {
printf("%c",output[j][i]);
}
k = maxOutCounter;
j--;
}
printf("\n");
printf("no of digits are %d \n", (currentPointer * maxOutCounter) + outputModuloCounter);
return 0;
}
The above implementation is in-memory optimised version. This avoids out of memory exceptions even though output doesn’t require such space.
Github Link to above Implementation .
The above one requires memory and it can slow the program as we are holding large memory, so i made changes output to a file and this will never run into out of memory exception and this is cool. it can run for days without any issues.
#include<stdio.h>
#include<stdlib.h>
long reverse(char *fileName1, char *fileName2);
// Give a file as a argument.
int main(int argc, char * argv[])
{
int *z,i,j,n,t;
FILE *fp;
long digits;
char *temp = "temp.txt";
printf("enter n value\n");
scanf("%d",&n);
if(n==0)
{
printf("\n\nfactorial of 0 is:\n\n1\n");
return 0;
}
z=(int *)malloc(n*sizeof(int));
fp = fopen(temp, "w");
for(i=1;i<n;i++)
z[i]=0;
z[0]=1;
for(j=0;j<n;j++)
{
while(z[j] > 0)
{
t=z[j]%10;
for(i=j+1;i<n;i++)
{
z[i]=(i+1)*t+z[i];
t=z[i]%10;
z[i]=z[i]/10;
if(t == 0 && z[i] == 0 && z[n-1] == 0)
break;
}
fputc(t+48, fp);
z[j]=z[j]/10;
}
}
fclose(fp);
printf("no of digits are %d \n", reverse(temp, argv[1]));
remove(temp);
return 0;
}
//Reverse Content of fileName1 to fileName2.
long reverse(char *fileName1, char *fileName2)
{
FILE *f1, *f2;
char ch;
long count, lastPosition;
if (f1 = fopen(fileName1, "r"))
{
if(f2 = fopen(fileName2, "w"))
{
fseek(f1, -1L, 2);
lastPosition = ftell(f1);
count = lastPosition+1;
while (count)
{
ch = fgetc(f1);
fputc(ch, f2);
fseek(f1, -2L, 1);
count--;
}
}
else
{
printf("Error writing to file %s\n", fileName2);
}
}
else
{
printf("Error opening file %s\n", fileName1);
}
fclose(f1);
fclose(f2);
return lastPosition+1;
}
Link to above implementation
Link to the Implementation from 2008.
Let’s go through below example for our approach on multiplication. (I will add visualisations)
14 * 47 * 23
Initialize array of 4 elements as a[0]= 1, a[1]=a[2]=a[3]=0;
1st iteration:
1a: 1 * 14 + array[1]= 14 , take the last digit (4) and store 1 into array[1]
1b: Do 4 * 47 + array[2] = 188, take the last digit (8) and store 18 into array[2]
1c: Do 8 * 23 + array[3] = 184, take last digit (4) and store 18 into array[2]
So 4 is the last digit.
and do array[0] = array[0]/10; (the value becomes 0)
In second iteration we calculate 2nd last digit.
2nd Iteration:
Start from array[1] since array[0] is 0.
2a: Do 1 * 47 + array[2] = 65, take last digit (5) and store 6 into array[2]
2b: Do 5 * 23+ array[3] = 133, take last digit (3) and store 13 into array[3]
2nd last digit is 3 and array[1] becomes 0.
3rd iteration:
Start from array[2] since array[1] is 0.
3a: Do 6 * 23 + array[3] = 151, take last digit 1 and store 15.
3rd last element is 1 and array[2] becomes 0;
4th iteration:
Start form array[3] since array[2] is 0.
There are no more numbers to be multiplied here. so take last digit 5.
4rd last digit is 5 and array[3] becomes 1.
5th iteration:
5th last digit is 1 and array[3] = 0;
There are no more values to be calculated and hence program stops.
Take all digits in reverse order and our number here is 15134, as it is expected.
This is the best algorithm i have ever seen. there are many optimizations we can do on top of this.
If you output to a file instead of holding in in-memory, all we need memory is O(n) in-memory and 2 * #fact(n) ~ n*logn bytes disk memory, nothing more than that.
The time complexity is O(n² log-n) and required multiplications are also same.
If you consider multiplication and addition time, it will be O(n² log³n) consisting n²logn multiplications O(n) memory.
Time complexity: O(n² * log³-n)
Multiplications: 1+ 2+ …. + n*log-n
Additions: 1+ 2+ …. + n*log-n
Divisions: 1+ 2+ …. + n*log-n
Space complexity: O(n) in-memory and 2*n*log-n disk memory.
We can reduce the complexity further by calculating carry array using just addition instead of multiplication. Since we always multiply with a single digit and positive value or 0, so that carryarray calculation can become 10*logn instead of logn*logn. This means that we can calculate factorial just using addition, division, remainder.
Hope you have enjoyed reading this.
— — — — — — — — — I will add optimizations as i get time — — — — — —