The FFT in Competitive Programming

So I was grinding some problems on Codeforces last weekend when I came across yet another polynomial multiplication task. I coded a simple O(n²) approach, then Time limit exceeded.

That's when I realized I needed the Fast Fourier Transform (FFT) algorithm. If you're serious about competitive programming, FFT is one of those tools that you absolutely need in your arsenal. It transforms polynomial multiplication from an O(n²) pain into a O(n log n).

Why FFT?

Let's be real - FFT problems don't show up every day. But when they do, they're usually worth a lot of points. More importantly, understanding FFT gives you a powerful technique for dealing with convolutions and certain types of optimization problems.

Some classic problem types where FFT shines:

Polynomial multiplication
String matching with wildcards
Counting the number of ways to make a sum with given elements
Substring problems that involve counting occurrences

The Core Idea

At its heart, FFT is about representing polynomials in different ways. The standard way is to list the coefficients: 3x² + 2x + 1. But here's the magic - we can also represent a polynomial by its values at specific points.

If I know the values of a polynomial at n different points, I can reconstruct the entire polynomial. FFT is essentially an efficient way to convert between these two representations.

The Implementation

I made this FFT code using GPT (note i used c++ as it is the only valid language a competitive programmer should use). If you don't understand the code, then don't worry most top coders also have problem understanding it. Take your time to understand it, or use black-box approach while solving questions.

#include <bits/stdc++.h>
using namespace std;
typedef complex<double> cd;
const double PI = acos(-1);

void fft(vector<cd> &a, bool invert) {
    int n = a.size();
    if (n == 1) return;

    vector<cd> a0(n / 2), a1(n / 2);
    for (int i = 0; 2 * i < n; i++) {
        a0[i] = a[2 * i];
        a1[i] = a[2 * i + 1];
    }
    fft(a0, invert);
    fft(a1, invert);

    double ang = 2 * PI / n * (invert ? -1 : 1);
    cd w(1), wn(cos(ang), sin(ang));
    for (int i = 0; 2 * i < n; i++) {
        a[i] = a0[i] + w * a1[i];
        a[i + n / 2] = a0[i] - w * a1[i];
        if (invert) {
            a[i] /= 2;
            a[i + n / 2] /= 2;
        }
        w *= wn;
    }
}

vector<int> multiply(vector<int> const &a, vector<int> const &b) {
    vector<cd> fa(a.begin(), a.end()), fb(b.begin(), b.end());
    int n = 1;
    while (n < a.size() + b.size()) 
        n <<= 1;
    fa.resize(n);
    fb.resize(n);

    fft(fa, false);
    fft(fb, false);
    for (int i = 0; i < n; i++)
        fa[i] *= fb[i];
    fft(fa, true);

    vector<int> result(n);
    for (int i = 0; i < n; i++)
        result[i] = round(fa[i].real());
    return result;
}

This implementation is straightforward - the fft function handles the transform itself, while multiply takes care of padding, calling FFT, multiplying in the frequency domain, and then calling inverse FFT.

A More Optimized Implementation

In contests, milliseconds matter. Here's a faster implementation that uses bit-reversal and iterative FFT:

#include <bits/stdc++.h>
using namespace std;
typedef complex<double> cd;
const double PI = acos(-1);

int reverse(int num, int lg_n) {
    int res = 0;
    for (int i = 0; i < lg_n; i++) {
        if (num & (1 << i))
            res |= 1 << (lg_n - 1 - i);
    }
    return res;
}

void fft(vector<cd> & a, bool invert) {
    int n = a.size();
    int lg_n = 0;
    while ((1 << lg_n) < n) lg_n++;

    for (int i = 0; i < n; i++) {
        if (i < reverse(i, lg_n))
            swap(a[i], a[reverse(i, lg_n)]);
    }

    for (int len = 2; len <= n; len <<= 1) {
        double ang = 2 * PI / len * (invert ? -1 : 1);
        cd wlen(cos(ang), sin(ang));
        for (int i = 0; i < n; i += len) {
            cd w(1);
            for (int j = 0; j < len / 2; j++) {
                cd u = a[i+j], v = a[i+j+len/2] * w;
                a[i+j] = u + v;
                a[i+j+len/2] = u - v;
                w *= wlen;
            }
        }
    }

    if (invert) {
        for (cd & x : a)
            x /= n;
    }
}

vector<int> multiply(vector<int> const& a, vector<int> const& b) {
    vector<cd> fa(a.begin(), a.end()), fb(b.begin(), b.end());
    int n = 1;
    while (n < a.size() + b.size()) 
        n <<= 1;
    fa.resize(n);
    fb.resize(n);

    fft(fa, false);
    fft(fb, false);
    for (int i = 0; i < n; i++)
        fa[i] *= fb[i];
    fft(fa, true);

    vector<int> result(n);
    for (int i = 0; i < n; i++)
        result[i] = round(fa[i].real());
    return result;
}

Handling Modular Arithmetic

In most CP problems, you'll need to handle modular arithmetic. For small modulos, you can use the following approach:

vector<int> multiply_mod(vector<int> const& a, vector<int> const& b, int mod) {
    vector<cd> fa(a.begin(), a.end()), fb(b.begin(), b.end());
    int n = 1;
    while (n < a.size() + b.size()) 
        n <<= 1;
    fa.resize(n);
    fb.resize(n);

    fft(fa, false);
    fft(fb, false);
    for (int i = 0; i < n; i++)
        fa[i] *= fb[i];
    fft(fa, true);

    vector<int> result(n);
    for (int i = 0; i < n; i++)
        result[i] = (int)(round(fa[i].real())) % mod;
    return result;
}

For larger modulos like 1e7 or 998244343 which we normally see, you might need to use the Number Theoretic Transform (NTT) instead, but that's a topic for another post. The resources linked have covered this if you need.

A Real CP Problem Example

Let's solve a classic problem: given two polynomials A and B, find their product.

#include <bits/stdc++.h>
using namespace std;
typedef complex<double> cd;
const double PI = acos(-1);

void fft(vector<cd> &a, bool invert) {
    int n = a.size();
    int lg_n = 0;
    while ((1 << lg_n) < n) lg_n++;

    for (int i = 0; i < n; i++) {
        int rev = 0;
        for (int j = 0; j < lg_n; j++) {
            if (i & (1 << j))
                rev |= 1 << (lg_n - 1 - j);
        }
        if (i < rev)
            swap(a[i], a[rev]);
    }

    for (int len = 2; len <= n; len <<= 1) {
        double ang = 2 * PI / len * (invert ? -1 : 1);
        cd wlen(cos(ang), sin(ang));
        for (int i = 0; i < n; i += len) {
            cd w(1);
            for (int j = 0; j < len / 2; j++) {
                cd u = a[i+j], v = a[i+j+len/2] * w;
                a[i+j] = u + v;
                a[i+j+len/2] = u - v;
                w *= wlen;
            }
        }
    }

    if (invert)
        for (int i = 0; i < n; i++)
            a[i] /= n;
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(0);
    
    int n, m;
    cin >> n >> m;
    
    vector<int> a(n+1), b(m+1);
    for (int i = 0; i <= n; i++)
        cin >> a[i];
    for (int i = 0; i <= m; i++)
        cin >> b[i];
    
    vector<cd> fa(a.begin(), a.end()), fb(b.begin(), b.end());
    int size = 1;
    while (size < a.size() + b.size())
        size <<= 1;
    fa.resize(size);
    fb.resize(size);
    
    fft(fa, false);
    fft(fb, false);
    
    for (int i = 0; i < size; i++)
        fa[i] *= fb[i];
        
    fft(fa, true);
    
    vector<int> result(n + m + 1);
    for (int i = 0; i <= n + m; i++)
        result[i] = round(fa[i].real());
    
    for (int i = 0; i <= n + m; i++) {
        cout << result[i];
        if (i < n + m) cout << " ";
    }
    cout << "\n";
    
    return 0;
}

Tips for FFT in Contests

Have a template ready: FFT isn't something you want to implement from scratch during a contest. not even tourist or jiangly does it.
Watch out for numerical errors: FFT involves floating-point calculations, which can lead to precision issues. Always use round() when converting back to integers.
Know when to use NTT: For problems with large modulos, consider using NTT instead.
Optimize your code: The iterative version of FFT is much faster than the recursive one.
Consider memory usage: FFT requires O(n) additional memory, which can be an issue for large inputs.

Search This Blog

FFT tricks in Competitive Programming.

FFT in Competitve Programming

The FFT in Competitive Programming

Why FFT?

The Core Idea

The Implementation

A More Optimized Implementation

Handling Modular Arithmetic

A Real CP Problem Example

Tips for FFT in Contests

More FFT Problem Types in Competitive Programming

String Matching and Pattern Finding

Dynamic Programming Optimization

Number Theory Problems

Signal Processing Inspired Problems

Final Thoughts

Comments

Post a Comment