Different results when using OpenMP and FFTW

You are calling fftw_execute with the same plan p in all threads. A plan records the addresses of the input and output buffers that were supplied to the plan constructor (fftw_plan_dft_whatever), so every concurrent call to fftw_execute(p) reads and writes those same shared buffers and not the threads' private copies of in and out. The solution is to construct a separate plan for each thread:

#pragma omp parallel private(j,i,mxy) firstprivate(in,out)
{
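    // FFTW's planner routines are not thread-safe, so plan creation is
    // serialised with a named critical section; only fftw_execute may
    // run concurrently. The plan is declared first because in C the
    // critical construct must be followed by a statement, not a declaration.
    fftw_plan my_p;
    #pragma omp critical (plan_ops)
    my_p = fftw_plan_dft_whatever(..., in, out, ...);
    // my_p now refers to the private in and out arrays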

    #pragma omp for
    for(int j = 0; j < Ny; j++) {
        for(int i = 0; i < Nx; i++){
            mxy   = i + j*Nx;
            in[i+1] = b_2D[mxy] + I*0.0 ;
        }
        fftw_execute(my_p);
        for(int i = 0; i < Nx; i++){
            mxy   = i + j*Nx;
            b_2D[mxy] = cimag(out[i+1]) ;
        }
    }

    // See comment above for the constructor operation
    #pragma omp critical (plan_ops)
    fftw_destroy_plan(my_p);
}
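
To see why the shared plan ignores the firstprivate copies, here is a minimal sketch using a toy stand-in for a plan. Everything named toy_* is hypothetical and is not part of FFTW; the only assumption carried over from FFTW is that a plan stores the buffer addresses given at creation time. The "transform" simply doubles each element, and the example is sequential so the outcome is deterministic.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_N 4

/* Hypothetical stand-in for an FFTW plan: it remembers the buffers
   it was created with, which is the property that matters here. */
typedef struct {
    const double *in;
    double *out;
    size_t n;
} toy_plan;

toy_plan toy_plan_create(const double *in, double *out, size_t n) {
    assert(in != NULL && out != NULL);
    toy_plan p = { in, out, n };
    return p;
}

/* Stand-in for fftw_execute: takes no buffer arguments, so it always
   reads and writes the buffers stored in the plan. */
void toy_execute(const toy_plan *p) {
    for (size_t k = 0; k < p->n; k++)
        p->out[k] = 2.0 * p->in[k];
}

/* Shared plan: built on in_a/out_a, then executed by a "thread" that
   filled its own in_b. Returns 1 if out_b was left untouched. */
int shared_plan_ignores_private_buffers(void) {
    double in_a[TOY_N] = {0}, out_a[TOY_N] = {0};          /* plan's buffers */
    double in_b[TOY_N] = {1, 2, 3, 4}, out_b[TOY_N] = {0}; /* private copies */
    toy_plan shared = toy_plan_create(in_a, out_a, TOY_N);
    toy_execute(&shared);   /* reads in_a, writes out_a */
    (void)in_b;             /* the private input is never read */
    return out_b[3] == 0.0;
}

/* Per-thread plan: built on the private buffers, so the result lands
   in out_b. Returns the last output element. */
double per_thread_plan_result(void) {
    double in_b[TOY_N] = {1, 2, 3, 4}, out_b[TOY_N] = {0};
    toy_plan mine = toy_plan_create(in_b, out_b, TOY_N);
    toy_execute(&mine);
    return out_b[3];
}
```

Calling shared_plan_ignores_private_buffers() returns 1, and per_thread_plan_result() returns 8.0, which mirrors the difference between the original code and the per-thread plan above. If creating one plan per thread is undesirable, FFTW also documents new-array execute functions such as fftw_execute_dft(p, in, out), which take the buffers as arguments; they require the new arrays to match the original ones in size and alignment, so check the manual before relying on them.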
