PrevNext

If you've never encountered any number theory before, AoPS is a good place to start.

Resources
AoPS

practice problems, set focus to number theory!

AoPS

good book

Resources

Prime Factorization

A positive integer aa is called a divisor or a factor of a non-negative integer bb if bb is divisible by aa, which means that there exists some integer kk such that b=kab = ka. An integer n>1n > 1 is prime if its only divisors are 11 and nn. Integers greater than 11 that are not prime are composite.

Every positive integer has a unique prime factorization: a way of decomposing it into a product of primes, as follows:

n=p1a1p2a2pkakn = {p_1}^{a_1} {p_2}^{a_2} \cdots {p_k}^{a_k}

where the pip_i are distinct primes and the aia_i are positive integers.

Now, we will discuss how to find the prime factorization of any positive integer.

C++

vector<int> factor(int n) {
vector<int> ret;
for (int i = 2; i * i <= n; i++) {
while (n % i == 0) {
ret.push_back(i);
n /= i;
}
}
if (n > 1) { ret.push_back(n); }
return ret;
}

Java

List<Integer> factor(int n) {
List<Integer> factors = new ArrayList<>();
for (int i = 2; i * i <= n; i++) {
while (n % i == 0) {
factors.add(i);
n /= i;
}
}
if (n > 1) { factors.add(n); }
return factors;
}

Python

from typing import List
def factor(n: int) -> List[int]:
ret = []
i = 2
while i * i <= n:
while n % i == 0:
ret.append(i)
n //= i
i += 1
if n > 1:
ret.append(n)
return ret

This algorithm runs in O(n)\mathcal{O}(\sqrt{n}) time, because the for loop checks divisibility for at most n\sqrt{n} values. Even though there is a while loop inside the for loop, dividing nn by ii quickly reduces the value of nn, which means that the outer for loop runs less iterations, which actually speeds up the code.

Let's look at an example of how this algorithm works, for n=252n = 252.

iinnret\texttt{ret}
22252252{}\{\}
22126126{2}\{2\}
226363{2,2}\{2, 2\}
332121{2,2,3}\{2, 2, 3\}
3377{2,2,3,3}\{2, 2, 3, 3\}

At this point, the for loop terminates, because ii is already 3 which is greater than 7\lfloor \sqrt{7} \rfloor. In the last step, we add 77 to the list of factors vv, because it otherwise won't be added, for a final prime factorization of {2,2,3,3,7}\{2, 2, 3, 3, 7\}.

Focus Problem – try your best to solve this problem before continuing!

Solution - Counting Divisors

The most straightforward solution is just to do what the problem asks us to do - for each xx, find the number of divisors of xx in O(x)\mathcal{O}(\sqrt x) time.

C++

#include <iostream>
using namespace std;
int main() {
int n;
cin >> n;
for (int q = 0; q < n; q++) {
int x;
int div_num = 0;

Java

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
public class Divisors {
public static void main(String[] args) throws IOException {
BufferedReader read = new BufferedReader(new InputStreamReader(System.in));
int queryNum = Integer.parseInt(read.readLine());
StringBuilder ans = new StringBuilder();
for (int q = 0; q < queryNum; q++) {

Python

Warning!

Due to Python's big constant factor, the following code TLEs on quite a few test cases.

ans = []
for _ in range(int(input())):
div_num = 0
x = int(input())
i = 1
while i * i <= x:
if x % i == 0:
div_num += 1 if i**2 == x else 2
i += 1
ans.append(div_num)
print("\n".join(str(i) for i in ans))

This solution runs in O(nx)\mathcal{O}(n \sqrt x) time, which is just fast enough to get AC. However, we can actually speed this up to get an O((x+n)logx)\mathcal{O}((x + n) \log x) solution!

First, let's discuss an important property of the prime factorization. Consider:

x=p1a1p2a2pkakx = {p_1}^{a_1} {p_2}^{a_2} \cdots {p_k}^{a_k}

Then the number of divisors of xx is simply (a1+1)(a2+1)(ak+1)(a_1 + 1) \cdot (a_2 + 1) \cdots (a_k + 1).

Why is this true? The exponent of pip_i in any divisor of xx must be in the range [0,ai][0, a_i] and each different exponent results in a different set of divisors, so each pip_i contributes ai+1a_i + 1 to the product.

xx can have O(logx)\mathcal{O}(\log x) distinct prime factors, so if we can find the prime factorization of xx efficiently, we can use it with the above property to answer queries in O(logx)\mathcal{O}(\log x) time instead of the previous O(x)\mathcal{O}(\sqrt x) time.

Here's how we find the prime factorization of xx in O(logx)\mathcal{O}(\log x) time with O(xlogx)\mathcal{O}(x \log x) preprocessing:

  1. For each k106k \leq 10^6, find any prime number that divides kk. To find this, we can use the Sieve of Eratosthenes which runs in O(nlogn)\mathcal{O}(n \log n), where nn is the larger numbers we consider. There's also a version of the sieve that runs in linear time, but we won't be needing it.
  2. We can find the prime factorization of xx by repeatedly dividing it by the prime numbers we calculated earlier until x=1x = 1.

Using this method gives us the following code:

C++

#include <iostream>
using namespace std;
const int MAX_N = 1e6;
// max_div[i] contains the largest prime that goes into i
int max_div[MAX_N + 1];
int main() {
for (int i = 2; i <= MAX_N; i++) {

Java

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
public class Divisors {
private static final int MAX_N = (int)Math.pow(10, 6);
public static void main(String[] args) throws IOException {
// maxDiv[i] contains the largest prime that can divide i
int[] maxDiv = new int[MAX_N + 1];
for (int i = 2; i <= MAX_N; i++) {

Python

MAX_N = 10**6
# max_div[i] contains the largest prime that can go into i
max_div = [0 for _ in range(MAX_N + 1)]
for i in range(2, MAX_N + 1):
if max_div[i] == 0:
for j in range(i, MAX_N + 1, i):
max_div[j] = i
ans = []

GCD & LCM

GCD

The greatest common divisor (GCD) of two integers aa and bb is the largest integer that is a factor of both aa and bb. In order to find the GCD of two non-negative integers, we use the Euclidean Algorithm, which is as follows:

gcd(a,b)={ab=0gcd(b,amodb)b0\gcd(a, b) = \begin{cases} a & b = 0 \\ \gcd(b, a \bmod b) & b \neq 0 \end{cases}

This algorithm can be implemented with a recursive function as follows:

Java

public int gcd(int a, int b) { return b == 0 ? a : gcd(b, a % b); }

C++

int gcd(int a, int b) { return b == 0 ? a : gcd(b, a % b); }

For C++14, you can use the built-in __gcd(a,b).

In C++17, there exists std::gcd and std::lcm in the <numeric> header, so there's no need to code your own GCD and LCM if you're using that version.

Python

def gcd(a: int, b: int) -> int:
return a if b == 0 else gcd(b, a % b)

You won't have to actually implement this in-contest, as the built-in math library has a gcd and lcm function.

This function runs in O(logab)\mathcal{O}(\log ab) time because ab    b(moda)<b2a\le b \implies b\pmod a <\frac{b}{2}.

The worst-case scenario for the Euclidean algorithm is when aa and bb are consecutive Fibonacci numbers FnF_n and Fn+1F_{n + 1}. In this case, the algorithm will calculate that gcd(Fn,Fn+1)=gcd(Fn1,Fn)==gcd(0,F1)\gcd(F_n, F_{n + 1}) = \gcd(F_{n - 1}, F_n) = \dots = \gcd(0, F_1). This takes a total of n+1n+1 calls, which is proportional to log(FnFn+1)\log \left(F_n F_{n+1}\right).

LCM

The least common multiple (LCM) of two integers aa and bb is the smallest integer divisible by both aa and bb. The LCM can be calculated with the GCD using this property:

lcm(a,b)=abgcd(a,b)\operatorname{lcm}(a, b) = \frac{a \cdot b}{\gcd(a, b)}

Warning!

Coding lcm\text{lcm} as a * b / gcd(a, b) might cause integer overflow if the value of a * b is greater than the max size of the data type of a * b (e.g. the max size of int in C++ and Java is around 2 billion). Dividng a by gcd(a, b) first, then multiplying it by b will prevent integer overflow if the result fits in an int.

Also, these two functions are associative, meaning that if we want to take the GCD or LCM of more than two elements, we can do so two at a time, in any order. For example,

gcd(a1,a2,a3,a4)=gcd(a1,gcd(a2,gcd(a3,a4))).\gcd(a_1, a_2, a_3, a_4) = \gcd(a_1, \gcd(a_2, \gcd(a_3, a_4))).

Euler's Totient Function

Resources
cp-algo

Theory and Exercises

CF

Well-covered article

Properties

Euler's totient function - written using phi ϕ(n)\phi(n) - counts the number of positive integers in the interval [1,n][1,n] which are coprime to nn. Two numbers aa and bb are coprime if their greatest common divisor is equal to 1 i.e. gcd(a,b)=1gcd(a,b)=1.

Here are the values of ϕ(n)\phi(n) for the first 20 numbers:

n1234567891011121314151617181920
ϕ(n)\phi(n)112242646410412688166188

The totient function is multiplicative meaning that ϕ(nm)=ϕ(n)ϕ(m)\phi(nm)=\phi(n) \cdot \phi(m), where nn and mm are coprime - gcd(n,m)=1gcd(n, m)=1. For example ϕ(15)=ϕ(35)=ϕ(3)ϕ(5)=24=8\phi(15)=\phi(3 \cdot 5)=\phi(3) \cdot \phi(5) = 2 \cdot 4 = 8.

Let's take a look at some edge cases for ϕ(n)\phi(n):

  • If n is a prime number then ϕ(n)=n1\phi(n)=n-1 because gcd(n,x)=1gcd(n, x)=1 for all 1x<n1 \leq x < n
  • If n is a power of a prime number, n=pqn=p^q where p is a prime number and 1q1 \leq q then
    there are exactly pq1p^{q-1} numbers divisible by pp, so ϕ(pq)=pqpq1=pq1(p1)\phi(p^q)=p^{q} - p^{q-1} = p^{q-1}(p - 1)

Using the multiplicative property and the last edge case we can compute the value of ϕ(n)\phi(n) from the factorization of number nn. Let the factorization be n=p1q1p2q2pkqkn=p_1^{q_1} \cdot p_2^{q_2} \cdot \ldots \cdot p_k^{q_k} where pip_i is a prime factor of nn, then:

ϕ(n)=ϕ(p1q1)ϕ(p2q2)ϕ(pkqk)=p1q11(p11)p2q21(p21)pkqk1(qk1)\phi(n)=\phi(p_1^{q_1}) \cdot \phi(p_2^{q_2}) \cdot \ldots \cdot \phi(p_k^{q_k}) = p_1^{q_1-1}(p_1 - 1) \cdot p_2^{q_2-1}(p_2 - 1) \cdot \ldots \cdot p_k^{q_k-1}(q_k - 1)

Below is an implementation for factorization in O(n)\mathcal{O}(\sqrt{n}). It can be a little bit tricky to understand why we subtract ans/pans/p from ansans. For example ans=pqxans=p^q \cdot x ,where pp is a prime factor and xx is the rest of the prime factorization. By subtracting ansp=pq1x\frac{ans}{p}=p^{q-1} \cdot x we end up with: pqxpq1x=pq1x(p1)p^q \cdot x - p^{q-1} \cdot x = p^{q-1} \cdot x \cdot (p - 1) which is exactly the form of ϕ(n)\phi(n) described a few lines above.

C++

int phi(int n) {
int ans = n;
for (int p = 2; p * p <= n; p++) {
if (n % p == 0) {
while (n % p == 0) { n /= p; }
ans -= ans / p;
}
}
if (n > 1) { ans -= ans / n; }
return ans;
}

Java

public static int phi(int n) {
int ans = n;
for (int p = 2; p * p <= n; p++) {
if (n % p == 0) {
while (n % p == 0) { n /= p; }
ans -= ans / p;
}
}
if (n > 1) { ans -= ans / n; }
return ans;
}

Python

def phi(n: int) -> int:
ans = n
p = 2
while p * p <= n:
if n % p == 0:
while n % p == 0:
n //= p
ans -= ans // p
p += 1
if n > 1:
ans -= ans // n
return ans

Usually in problems we need to precompute the totient of all numbers between 11 and nn, then factorizing is not efficient. The idea is the same as the Sieve of Eratosthenes. Since it's almost the same with the Sieve of Eratosthenes the time complexity will be: O(NloglogN)\mathcal{O}(N\log\log{N}).

C++

void precompute() {
for (int i = 1; i < MAX_N; i++) { phi[i] = i; }
for (int i = 2; i < MAX_N; i++) {
// If i is prime
if (phi[i] == i) {
for (int j = i; j < MAX_N; j += i) { phi[j] -= phi[j] / i; }
}
}
}

Java

public static void precompute() {
for (int i = 1; i < MAX_N; i++) { phi[i] = i; }
for (int i = 2; i < MAX_N; i++) {
// If i is prime
if (phi[i] == i) {
for (int j = i; j < MAX_N; j += i) { phi[j] -= phi[j] / i; }
}
}
}

Python

def precompute():
for i in range(1, MAX_N):
phi[i] = i
for i in range(2, MAX_N):
# If i is prime
if phi[i] == i:
for j in range(i, MAX_N, i):
phi[j] -= phi[j] // i

Focus Problem – try your best to solve this problem before continuing!

Solution

We are asked to compute the sum of GCDs for all pairs (i,j)(i, j) such that 1i<jn1 \le i < j \le n:

i=1nj=i+1ngcd(i,j).\sum_{i=1}^{n} \sum_{j=i+1}^{n} \gcd(i,j).

Let's define a helper function f(n)f(n) which computes the sum of GCDs for a fixed second element nn:

f(n)=i=1ngcd(i,n).f(n) = \sum_{i=1}^{n} \gcd(i, n).

This function sums gcd(i,n)\gcd(i, n) for all ini \le n. The terms where gcd(i,n)=d\gcd(i, n) = d are exactly those where dd divides nn and gcd(id,nd)=1\gcd(\frac{i}{d}, \frac{n}{d}) = 1. The number of such integers ii is given by Euler's totient function ϕ(nd)\phi(\frac{n}{d}). Thus, we can rewrite f(n)f(n) as a sum over the divisors of nn:

f(n)=dndϕ(nd).f(n) = \sum_{d|n} d \cdot \phi\left(\frac{n}{d}\right).

We can compute f(n)f(n) for all nn up to 10610^6 efficiently. Instead of factoring every number, we iterate through every possible divisor ii and update all its multiples jj. For a fixed ii, we add the contribution iϕ(ji)i \cdot \phi(\frac{j}{i}) to f(j)f(j) for all jj that are multiples of ii.

Finally, the problem asks for pairs where i<ji < j. The value f(j)f(j) includes the case i=ji=j (where gcd(j,j)=j\gcd(j, j) = j), which we must exclude. The answer for a given nn is the prefix sum of these adjusted values:

ans[n]=j=1n(f(j)j).\text{ans}[n] = \sum_{j=1}^{n} (f(j) - j).

The total complexity of this precomputation is bounded by the harmonic series sum: i=1NNiNlnN\sum_{i=1}^{N} \frac{N}{i} \approx N \ln N

C++

#include <bits/stdc++.h>
using namespace std;
using ll = long long;
const int MAX_N = 1e6;
ll phi[MAX_N + 1];
ll f[MAX_N + 1];
ll sum[MAX_N + 1];
int main() {

Problems

StatusSourceProblem NameDifficultyTags
ACEasy
Show TagsPrime Factorization
CFEasy
Show TagsDivisibility, Modular Arithmetic
CFEasy
Show TagsNT
CFEasy
Show TagsDivisibility
CSESEasy
Show TagsFunctional Graph, Prime Factorization
CSESNormal
Show TagsDivisibility
CFNormal
Show TagsPrime Factorization
CCNormal
Show TagsDivisibility
CSESHard
Show TagsDivisibility
SPOJHard
Show TagsDivisibility, Euler Totient, LCM
CFHard
Show TagsDivisibility
ACHard
Show TagsDivisibility

Module Progress:

Join the USACO Forum!

Stuck on a problem, or don't understand a module? Join the USACO Forum and get help from other competitive programmers!

PrevNext