Today I am writing about the first question on the website http://projecteuler.net. This site is dedicated to fun math puzzles which require computer programming knowledge to solve.

Question 1:

If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.

Find the sum of all the multiples of 3 or 5 below 1000.

I generally approach problems like these with a level of “brute force”: I find a solution that works, and then I refine it to be more efficient. That works sometimes, but as I found when I progressed to calculating 1000-digit numbers, it is generally less than optimal.

Prime: My first attempt sees me cycling through all numbers from 1-999, testing to see if each is divisible by 3 or 5, and adding it to my rolling sum if it is. The code for that looks something like this:

double Run()
{
	double limit = 1000;
	double sum = 0;

	for(int i = 1; i < limit; i++)
	{
		if(i % 3 == 0 || i % 5 == 0)
			sum += i;
	}

	return sum;
}

This Euler problem is pretty easy, and this runs fairly fast. Solving the puzzle as stated took 7 microseconds. If I adjust the cap to solve for numbers less than 1,000,000, the solution is found in 6.7 milliseconds (~1000 times longer). Finally, finding the solution for numbers 1 to 1,000,000,000 takes 7.8 seconds (~1000 times longer still). I like this method because you can see how linear the execution time is: if I add three zeros to the limit, I add three zeros to the time as well. Clean.

Optimization 1: The first thing I decided to try was to take the comparison out entirely. Instead of cycling from 1-999 and checking each number to see if it is divisible by 3 or 5, I just add all of the multiples of 3 together and all of the multiples of 5 together. Obviously, this causes some redundancy, since multiples of 15 are counted twice, so I need to loop a third time to remove one instance of the multiples of 15. Before actually doing this, I wasn't sure whether it would be an optimization or worse than my original: I am removing comparisons, but I am looping three times instead of once. Here is the code I came up with:

double limit;
double Calc(double num);

double Run()
{
	limit = 1000000;
	return Calc(3) + Calc(5) - Calc(15);
}

double Calc(double num)
{
	double sum = 0;
	for(double i = num; i < limit; i += num)
		sum += i;

	return sum;
}

I split the loops up via the Calc function in the above code. First I tested it from 1-999, and it completed correctly in 4.9 microseconds. Next, I ran it from 1-1,000,000, and it completed in 4.5 milliseconds. Finally, 1-1,000,000,000 finished in 4.4 seconds. So we see speed gains in the ballpark of a 1:1.5 ratio with this optimization.

Optimization 1 works because:

  • Fewer iterations — no matter what the limit is, Optimization 1 runs limit/3 + limit/5 + limit/15 = 60% of the iterations of Prime
  • Fewer comparisons than Prime (zero, in fact)

Optimization 2: The next step of the optimization process is to remove the iteration loop altogether. I had an inkling that this was possible, but it was the Project Euler forums that showed me the formula to do it. Given that in Optimization 1 we are adding up all of the multiples of the given number (3+6+9+12+15+...+999) inside the loop, we can begin to see a better solution. The first step is to factor out the given number and substitute ‘p’ for the count of multiples below the limit (so that we can change the limit in the future): 3 * (1+2+3+4+5+...+p). Next (and this is where the Project Euler forums helped a lot) we substitute the formula for the sum of a series of consecutive numbers (also explained here) and change 3 to our variable ‘num’: num * p * (p+1) / 2. That's it! Now we can modify the Calc function to determine the answer without any iteration at all. Here it is:

double Calc(double num)
{
	// p is the number of multiples of num below the limit
	int p = (limit - 1) / num;
	return num * (p * (p + 1.0)) / 2;
}

After running this version we can see an interesting pattern. Numbers 1-1,000 ran in 309 nanoseconds. Numbers 1-1,000,000 ran in 309 nanoseconds. Finally, numbers 1-1,000,000,000 ran in 309 nanoseconds. (It is worth noting that 309 nanoseconds seems to be the smallest nonzero length of time QueryPerformanceCounter will report for me; I cannot get a value between 0 and 309 nanoseconds.) What we have found is a constant runtime. It does not matter how big our set is; since we are using a closed-form formula, the speed does not change. The speed increase over Optimization 1 obviously depends on the set, but from the timings above the ratios are roughly 1:16, 1:14,600, and 1:14,239,482.

This concludes my optimization of Project Euler's first problem. If you have any comments or further optimization ideas, please post them and I will definitely try them out.