gradient domain path tracing (GPT)

(L) [2015/08/14] [bachi] [gradient domain path tracing (GPT)] Wayback!

>> MohamedSakr wrote:2 days ago I found this online, [LINK https://gist.github.com/BachiLi/4f5c6e5a4fef5773dab1]
tested it, works pretty well, but my mathematical experience couldn't allow me to modify the resultant poisson reconstruction much.
from the results, some how they are matching L2 minimization, how to do L1 minimization?? "although in the code it is mentioned that it uses Laplacian, and I guess Laplacian is L1, but the results are some how compared to the paper L2 images"
[LINK https://mediatech.aalto.fi/publications/graphics/GPT/supplemental_material/index.html]
also read here about L1 and L2 minimizations:
[LINK http://www.quora.com/What-is-the-difference-between-L1-and-L2-regularization]

Hi,
You are right that the code uses L2 optimization instead of L1.  For the L1 optimization, I believe GPT people use the iterative reweighted least-squares (IRLS) algorithm.  To understand IRLS, we need to start from the linear algebra solution for the L2 problem:
Following the nomenclature in the GPT paper, let the "primal image" be a vector v_p, and the "gradient images" be vectors v_x and v_y.  The L2 optimization seeks for an image v such that:
alpha * (v-v_p)^2 + (Dx*v - v_x)^2 + (Dy*v - v_y)^2
is minimized.  The Dx and Dy here are the finite difference operators which can be represented as matrices.  From calculus 101 we know that if there is a minimal value it must be at where the derivative is 0, so we can solve the L2 optimization by solving the following linear system:
(alpha * I -(Dx^T*Dx +Dy^T*Dy)) v = alpha * v_p - (Dx * v_x + Dy * v_y),
where I is the identity matrix (the Dx^T*Dx and Dy^T*Dy are the laplacian operators, I think this is what you were mentioning).  Although the matrices are enormously large, they are sparse so it is solvable by using iterative techniques like conjugate gradient.

Now, for the L1 optimization we need to minimize the L1 cost function:
alpha * |v-v_p| + |Dx*v - v_x| + |Dy*v - v_y|,
where the squared differences are replaced by absolute differences.  The derivatives trick is no longer directly applicable because of the absolute values.  The strategy used by the IRLS algorithm is to cast back the problem to the previous L2 optimization by minimizing the following cost function:
alpha * (v-v_p)^2 * w_p + (Dx*v - v_x)^2 * w_x + (Dy*v - v_y)^2 * w_y.
Here the weight w_p, w_x, w_y are vectors and * represents dot product.  If we can set
w_p = 1/|v-v_p|, w_x = 1.0/|Dx*v - v_x|, w_y = 1.0/|Dy*v - v_y|,
the cost function becomes the original L1 cost function.  Furthermore, we can solve this cost function, again, by using the derivative trick.  The corresponding linear system is:
(alpha * w_p * I - (w_x*Dx^T*Dx + w_y*Dy^T*Dy)) v = alpha * w_p * v_p - (w_x * Dx * v_x + w_y * Dy * v_y)
Unfortunately, to set w we need the final image v, but to solve the image v we need the weight w, which is a chicken and egg problem.  The solution to the problem is to start from an initial guess and gradually converge to correct answer.  The complete IRLS algorithm looks like this:
Let w_p, w_x, w_y = 1
Until converge
-- Solve for v: (alpha * w_p * I - (w_x*Dx^T*Dx + w_y*Dy^T*Dy)) v = alpha * w_p * v_p - (w_x * Dx * v_x + w_y * Dy * v_y)
-- Let w_p = 1 / (|v - v_p| + eps), w_x = 1 / (|v - v_x| + eps), w_y = 1 / (|v - v_y| + eps)
(eps is a small number to prevent division by zero)
And that's it.  I hope this is clear enough for you to implement L1 optimization.  I did not put it in the code because I feel that this is not the core component of the algorithm, and I did not find any open source implementation of this...
P.S. It would be great if we can have the beloved latex support back :p

(L) [2015/08/15] [MohamedSakr] [gradient domain path tracing (GPT)] Wayback!

ok,
after some googling, tracing code line by line, I can understand the code to some extent and convert the code "as it is" to C++
now the function takes mainly a single matrix X,
function [Z, obj]= LRR_IRLS(X,lambda,rc,rho,display)
where I can imagine X here can represent v_p.
so how to use the gradients here? v_x and v_y
edit: the function code:
Code: [LINK # Select all]function [Z, obj]= LRR_IRLS(X,lambda,rc,rho,display)
% Solve the smoothed LRR problem (6) by IRLS shown in the following paper:
% Canyi Lu, Zhouchen Lin, Shuicheng Yan, Smoothed Low Rank and Sparse
% Matrix Recovery by Iteratively Reweighted Least Squares Minimization,
% IEEE Transactions on Image Processing (TIP), 2014
%
% Written by Canyi Lu (canyilu@gmail.com), December 2014.
%
if nargin < 5
display = false;
end
[~, n] = size(X);
XtX = lambda*(X'*X);
maxiter = 500;
mu = rc*norm(X,2);
tol2 = 1e-5;
I = eye(n);
Z = zeros(n,n);
Z_old = Z;
W1 = eye(n,n);
W2 = ones(n,1);
W = W1*diag(W2);
if display
obj = zeros(maxiter,1);
end
for t = 1 : maxiter
% calculate Z: XtX*Z + Z*W - XtX = 0
% X = lyap(A,B,C) solves AX+XB+C=0.
Z = lyap(XtX,W,-XtX);

% calculate W1 = (Z^T*Z+mu*I)^{-0.5} with SVD
% [~,S,V]=svd(Z,'econ');
% s = diag(S);
% s = 1./sqrt( s.*s + mu^2 );
% W1 = V*diag(s)*V';

% or calculate W1 = (Z^T*Z+mu*I)^{-0.5} without SVD
W1 = (Z'*Z+mu^2*I)^(-0.5);
% calculate W2 which is a diagonal matrix
E = X-X*Z;
E = dot(E,E);
W2 = sqrt((E+mu^2));
W = W1*diag(W2);

% update mu
mu = mu/rho;

% compute the objective function value
if display
EE = X-X*Z;
obj(t) = nuclearnorm(Z)+lambda*sum(sqrt(sum(EE.*EE)));
end
if norm(Z_old-Z,'fro')/norm(Z,'fro')<tol2
break;
end
Z_old = Z;
end
if display
if t<maxiter
obj(t+1:end) = [];
end
else
obj = [];
end

(L) [2015/08/15] [MohamedSakr] [gradient domain path tracing (GPT)] Wayback!

>> bachi wrote:Let w_p, w_x, w_y = 1
Until converge
-- Solve for v: (alpha * w_p * I - (w_x*Dx^T*Dx + w_y*Dy^T*Dy)) v = alpha * w_p * v_p - (w_x * Dx * v_x + w_y * Dy * v_y)
-- Let w_p = 1 / (|v - v_p| + eps), w_x = 1 / (|v - v_x| + eps), w_y = 1 / (|v - v_y| + eps)
(eps is a small number to prevent division by zero)
I think I was so dumb  [SMILEY :D]
I looked for a "ready to go" solution, but may be I can create one from these equations.
Code: [LINK # Select all]void L1_MinimizationSolve(int width, int height,
   const double* imgData, const double* imgGradX,
   const double* imgGradY, double dataCost,
   int iterations, double* imgOut) {
   const double eps = 1e-6;
   int nodeCount = width * height;
   double* LapY = (double*) malloc(sizeof(double) * height);
   double* LapX = (double*) malloc(sizeof(double) * width);
   for(int x = 0; x < width; x++) {
      LapX[x] = 2.0 * cos(M_PI * x / (width - 1));
   }
   for(int y = 0; y < height; y++) {
      LapY[y] = -4.0 + (2.0 * cos(M_PI * y / (height - 1)));
   }

   //Bachi
   //Let w_p, w_x, w_y = 1
   //Until converge
   // -- Solve for v: (alpha * w_p * I - (w_x*Dx^T*Dx + w_y*Dy^T*Dy)) v = alpha * w_p * v_p - (w_x * Dx * v_x + w_y * Dy * v_y)
   // -- Let w_p = 1 / (|v - v_p| + eps), w_x = 1 / (|v - v_x| + eps), w_y = 1 / (|v - v_y| + eps)
   //(eps is a small number to prevent division by zero)
   double* w_p = (double*) malloc(sizeof(double) * nodeCount);
   double* w_x = (double*) malloc(sizeof(double) * nodeCount);
   double* w_y = (double*) malloc(sizeof(double) * nodeCount);
   memset (w_p, 1, sizeof (double) * nodeCount);
   memset (w_x, 1, sizeof (double) * nodeCount);
   memset (w_y, 1, sizeof (double) * nodeCount);

   for(int it = 0; it < iterations; ++it) {
      //Solve for v: (alpha * w_p * I - (w_x*Dx^T*Dx + w_y*Dy^T*Dy)) v = alpha * w_p * v_p - (w_x * Dx * v_x + w_y * Dy * v_y)
      //how?

      //updating weights
      //is this correct?
      for(int n = 0; n < nodeCount; ++n) {
         w_p[n] = 1 / min(abs(imgOut[n] - imgData[n]), eps);
         w_x[n] = 1 / min(abs(imgOut[n] - imgGradX[n]), eps);
         w_y[n] = 1 / min(abs(imgOut[n] - imgGradY[n]), eps);
      }
   }
}
now the whole problem is the first equation, how to solve it? "using FFTW3 for example", I'm terrible with matrix solvers and need to learn this part. "mathematically, I understand how it works, but I don't know how to apply it in FFTW3 for example"

(L) [2015/08/16] [MohamedSakr] [gradient domain path tracing (GPT)] Wayback!

questions: "just to clarify terms of the equations"
>> Let w_p, w_x, w_y = 1
Until converge
-- Solve for v: (alpha * w_p * I - (w_x*Dx^T*Dx + w_y*Dy^T*Dy)) v = alpha * w_p * v_p - (w_x * Dx * v_x + w_y * Dy * v_y)
-- Let w_p = 1 / (|v - v_p| + eps), w_x = 1 / (|v - v_x| + eps), w_y = 1 / (|v - v_y| + eps)
(eps is a small number to prevent division by zero)
I will solve the equations using Eigen, what I need to know: (which is a matrix, which is a vector, order of multiplication)
alpha : real
w_p, w_x, w_y : column vector
I : identity
v : result matrix
Dx * v_x, Dy * v_y : input gradient matrix difference
v_p : input image matrix
and I wanna confirm about last step equation, as it is written 2 times with different values:
>> Here the weight w_p, w_x, w_y are vectors and * represents dot product. If we can set
w_p = 1/|v-v_p|, w_x = 1.0/|Dx*v - v_x|, w_y = 1.0/|Dy*v - v_y|,
>> -- Let w_p = 1 / (|v - v_p| + eps), w_x = 1 / (|v - v_x| + eps), w_y = 1 / (|v - v_y| + eps)
edit:
the whole system should be iterative linear system "as far as I understand" , Ax = b, where A is the result matrix v, x is the result column vector of left hand side, b is the result column vector of right hand side.
edit2:
>> alpha * (v-v_p)^2 * w_p + (Dx*v - v_x)^2 * w_x + (Dy*v - v_y)^2 * w_y.
Here the weight w_p, w_x, w_y are vectors and * represents dot product.
and: w_p = 1 / (|v - v_p| + eps)
a little confusion here, you mentioned w_p is a vector, and |v - v_p| is a matrix, why adding eps to a matrix?, and the division makes w_p as a matrix "not a vector".

(L) [2015/08/17] [MohamedSakr] [gradient domain path tracing (GPT)] Wayback!

here is the code, but it gives black output, not sure what I'm doing wrong here.
Code: [LINK # Select all]void L1_MinimizationSolve(int width, int height,
const double* imgData, const double* imgGradX,
const double* imgGradY, double dataCost,
int iterations, double* imgOut) {
//Bachi
//Let w_p, w_x, w_y = 1
//Until converge
// -- Solve for v: (alpha * w_p * I - (w_x*Dx^T*Dx + w_y*Dy^T*Dy)) v = alpha * w_p * v_p - (w_x * Dx * v_x + w_y * Dy * v_y)
// -- Let w_p = 1 / (|v - v_p| + eps), w_x = 1 / (|v - v_x| + eps), w_y = 1 / (|v - v_y| + eps)
//(eps is a small number to prevent division by zero)
//ftLapY is Dx^T*Dx
//ftLapX is Dy^T*Dy
const double eps = 1e-6;
int nodeCount = width * height;
double* fftBuff = (double*)fftw_malloc(sizeof(*fftBuff) * nodeCount);
Vec* w_p = (Vec*)fftw_malloc(sizeof(*w_p) * nodeCount);
Vec* w_x = (Vec*)fftw_malloc(sizeof(*w_x) * nodeCount);
Vec* w_y = (Vec*)fftw_malloc(sizeof(*w_y) * nodeCount);

//initialize weights
for (int n = 0; n < nodeCount; ++n) {
   w_p[n].x = 1.;
   w_p[n].y = 1.;
   w_p[n].z = 1.;
   w_x[n].x = 1.;
   w_x[n].y = 1.;
   w_x[n].z = 1.;
   w_y[n].x = 1.;
   w_y[n].y = 1.;
   w_y[n].z = 1.;
}
double* wp = (double*)w_p;
double* wx = (double*)w_x;
double* wy = (double*)w_y;
//compute two 1D lookup tables for computing the DCT of a 2D Laplacian on the fly
double* ftLapY = (double*)fftw_malloc(sizeof(*ftLapY) * height);
double* ftLapX = (double*)fftw_malloc(sizeof(*ftLapX) * width);
for (int x = 0; x < width; x++) {
   ftLapX[x] = 2.0 * cos(M_PI * x / (width - 1));
}
for (int y = 0; y < height; y++) {
   ftLapY[y] = -4.0 + (2.0 * cos(M_PI * y / (height - 1)));
}
//Create a DCT-I plan for, which is its own inverse.
fftw_plan fftPlan;
fftPlan = fftw_plan_r2r_2d(height, width,
   fftBuff, fftBuff,
   FFTW_REDFT00, FFTW_REDFT00, FFTW_ESTIMATE); //use FFTW_PATIENT when plan can be reused
for (int it = 0; it < iterations; ++it) {
   for (int iChannel = 0; iChannel < 3; iChannel++) {
      int nodeAddr = 0;
      int pixelAddr = iChannel;
      int rightPixelAddr = 3 + iChannel;
      int topPixelAddr = (width * 3) + iChannel;
      double dcSum = 0.0;
      // compute h_hat from u, gx, gy (see equation 48 in Bhat's paper), as well as the DC term of u's DCT.
      for (int y = 0; y < height; y++)
      for (int x = 0; x < width; x++,
         nodeAddr++, pixelAddr += 3, rightPixelAddr += 3, topPixelAddr += 3) {
         // Compute DC term of u's DCT without computing the whole DCT.
         double dcMult = 1.0;
         if ((x > 0) && (x < width - 1))
            dcMult *= 2.0;
         if ((y > 0) && (y < height - 1))
            dcMult *= 2.0;
         dcSum += dcMult * imgData[pixelAddr];
         //fftBuff[nodeAddr] = dataCost * imgData[pixelAddr];
         fftBuff[nodeAddr] = dataCost * imgData[pixelAddr] * wp[pixelAddr];
         // Subtract g^x_x and g^y_y, with boundary factor of -2.0 to account for boundary reflections implicit in the DCT
         if ((x > 0) && (x < width - 1))
            //fftBuff[nodeAddr] -= (imgGradX[rightPixelAddr] - imgGradX[pixelAddr]);
            fftBuff[nodeAddr] -= (imgGradX[rightPixelAddr] - imgGradX[pixelAddr]) * wx[pixelAddr];
         else
            //fftBuff[nodeAddr] -= (-2.0 * imgGradX[pixelAddr]);
            fftBuff[nodeAddr] -= (-2.0 * imgGradX[pixelAddr]) * wx[pixelAddr];
         if ((y > 0) && (y < height - 1))
            //fftBuff[nodeAddr] -= (imgGradY[topPixelAddr] - imgGradY[pixelAddr]);
            fftBuff[nodeAddr] -= (imgGradY[topPixelAddr] - imgGradY[pixelAddr]) * wy[pixelAddr];
         else
            //fftBuff[nodeAddr] -= (-2.0 * imgGradY[pixelAddr]);
            fftBuff[nodeAddr] -= (-2.0 * imgGradY[pixelAddr]) * wy[pixelAddr];
      }
      //transform h_hat to H_hat by taking the DCT of h_hat
      fftw_execute(fftPlan);
      //compute F_hat using H_hat (see equation 29 in Bhat's paper)
      nodeAddr = 0;
      for (int y = 0; y < height; y++)
      for (int x = 0; x < width; x++, nodeAddr++) {
         //float ftLapResponse = ftLapY[y] + ftLapX[x];
         double ftLapResponse = ftLapY[y] * wy[pixelAddr] + ftLapX[x] * wx[pixelAddr];
         //fftBuff[nodeAddr] /= (dataCost - ftLapResponse);
         fftBuff[nodeAddr] /= (dataCost * wp[pixelAddr] - ftLapResponse);
      }
      /* Set the DC term of the solution to the value computed above (i.e., the DC term of imgData).
      * set dcSum to the desired average when dataCost=0
      */
      fftBuff[0] = dcSum;
      //transform F_hat to f_hat by taking the inverse DCT of F_hat
      fftw_execute(fftPlan);
      double fftDenom = 4.0 * (width - 1) * (height - 1);
      pixelAddr = iChannel;
      for (int iNode = 0; iNode < nodeCount; iNode++, pixelAddr += 3) {
         imgOut[pixelAddr] = fftBuff[iNode] / fftDenom;
         wp[pixelAddr] = 1. / min(abs(imgOut[pixelAddr] - imgData[pixelAddr]), eps);
         wx[pixelAddr] = 1. / min(abs(imgOut[pixelAddr] - imgGradX[pixelAddr]), eps);
         wy[pixelAddr] = 1. / min(abs(imgOut[pixelAddr] - imgGradY[pixelAddr]), eps);
      }
   }
}
fftw_free(fftBuff);
fftw_free(w_p);
fftw_free(w_x);
fftw_free(w_y);
fftw_free(ftLapX);
fftw_free(ftLapY);
fftw_destroy_plan(fftPlan);
}
I tested it without the new additions "wx, wy, wp..." , 1 iteration, and it worked similar to the L2 code "no problems" , once I put the wx, wy, wp "which equals 1 in the first iteration" , it gives a black image, certainly VS2013 compiler bug...
edit:
it works now, testing with high iteration count "10000" , with 100 iterations the result was bad.

(L) [2015/08/17] [MohamedSakr] [gradient domain path tracing (GPT)] Wayback!

almost there "corrected many errors" , not the correct result yet, may be math gurus can help  [SMILEY :D]
Code: [LINK # Select all]void L1_MinimizationSolve(int width, int height,
const double* imgData, const double* imgGradX,
const double* imgGradY, double dataCost,
int iterations, double* imgOut) {
//Bachi
//Let w_p, w_x, w_y = 1
//Until converge
// -- Solve for v: (alpha * w_p * I - (w_x*Dx^T*Dx + w_y*Dy^T*Dy)) v = alpha * w_p * v_p - (w_x * Dx * v_x + w_y * Dy * v_y)
// -- Let w_p = 1 / (|v - v_p| + eps), w_x = 1 / (|Dx*v - v_x| + eps), w_y = 1 / (|Dy*v - v_y| + eps)
//(eps is a small number to prevent division by zero)
//ftLapY is Dx^T*Dx
//ftLapX is Dy^T*Dy
const double eps = 1e-6;
int nodeCount = width * height;
double* fftBuff = (double*)fftw_malloc(sizeof(*fftBuff) * nodeCount);
//4 instead of 3, VS2013 got a compiler bug here "may be due to alignment"...
std::vector<double> wp(nodeCount * 4, 1.);
std::vector<double> wx(nodeCount * 4, 1.);
std::vector<double> wy(nodeCount * 4, 1.);
//compute two 1D lookup tables for computing the DCT of a 2D Laplacian on the fly
double* ftLapY = (double*)fftw_malloc(sizeof(*ftLapY) * height);
double* ftLapX = (double*)fftw_malloc(sizeof(*ftLapX) * width);
for (int x = 0; x < width; x++) {
   ftLapX[x] = 2.0 * cos(M_PI * x / (width - 1));
}
for (int y = 0; y < height; y++) {
   ftLapY[y] = -4.0 + (2.0 * cos(M_PI * y / (height - 1)));
}
//Create a DCT-I plan for, which is its own inverse.
fftw_plan fftPlan;
fftPlan = fftw_plan_r2r_2d(height, width,
   fftBuff, fftBuff,
   FFTW_REDFT00, FFTW_REDFT00, FFTW_ESTIMATE); //use FFTW_PATIENT when plan can be reused
for (int it = 0; it < iterations; ++it) {
   for (int iChannel = 0; iChannel < 3; iChannel++) {
      int nodeAddr = 0;
      int pixelAddr = iChannel;
      int rightPixelAddr = 3 + iChannel;
      int topPixelAddr = (width * 3) + iChannel;
      double dcSum = 0.0;
      // compute h_hat from u, gx, gy (see equation 48 in Bhat's paper), as well as the DC term of u's DCT.
//#pragma omp parallel for schedule(dynamic, 1) // OpenMP
      for (int y = 0; y < height; y++)
      for (int x = 0; x < width; x++,
         nodeAddr++, pixelAddr += 3, rightPixelAddr += 3, topPixelAddr += 3) {
         // Compute DC term of u's DCT without computing the whole DCT.
         double dcMult = 1.0;
         if ((x > 0) && (x < width - 1))
            dcMult *= 2.0;
         if ((y > 0) && (y < height - 1))
            dcMult *= 2.0;
         dcSum += dcMult * imgData[pixelAddr];
         //if (wp[pixelAddr] != 1. || wx[pixelAddr] != 1. || wy[pixelAddr] != 1.)
         // std::cout << "problem with weight value at:" << x << ", " << y << ": " << wp[pixelAddr] << " " << wx[pixelAddr] << " " << wy[pixelAddr] << std::endl;
         //fftBuff[nodeAddr] = dataCost * imgData[pixelAddr];
         fftBuff[nodeAddr] = dataCost * imgData[pixelAddr] * wp[pixelAddr];
         // Subtract g^x_x and g^y_y, with boundary factor of -2.0 to account for boundary reflections implicit in the DCT
         if ((x > 0) && (x < width - 1))
            //fftBuff[nodeAddr] -= (imgGradX[rightPixelAddr] - imgGradX[pixelAddr]);
            fftBuff[nodeAddr] -= (imgGradX[rightPixelAddr] - imgGradX[pixelAddr]) * wx[pixelAddr];
         else
            //fftBuff[nodeAddr] -= (-2.0 * imgGradX[pixelAddr]);
            fftBuff[nodeAddr] -= (-2.0 * imgGradX[pixelAddr]) * wx[pixelAddr];
         if ((y > 0) && (y < height - 1))
            //fftBuff[nodeAddr] -= (imgGradY[topPixelAddr] - imgGradY[pixelAddr]);
            fftBuff[nodeAddr] -= (imgGradY[topPixelAddr] - imgGradY[pixelAddr]) * wy[pixelAddr];
         else
            //fftBuff[nodeAddr] -= (-2.0 * imgGradY[pixelAddr]);
            fftBuff[nodeAddr] -= (-2.0 * imgGradY[pixelAddr]) * wy[pixelAddr];
      }
      //transform h_hat to H_hat by taking the DCT of h_hat
      fftw_execute(fftPlan);
      //compute F_hat using H_hat (see equation 29 in Bhat's paper)
      nodeAddr = 0;
//#pragma omp parallel for schedule(dynamic, 1) // OpenMP
      for (int y = 0; y < height; y++)
      for (int x = 0; x < width; x++, nodeAddr++) {
         //float ftLapResponse = ftLapY[y] + ftLapX[x];
         double ftLapResponse = (ftLapY[y] * wy[pixelAddr]) + (ftLapX[x] * wx[pixelAddr]);
         //fftBuff[nodeAddr] /= (dataCost - ftLapResponse);
         fftBuff[nodeAddr] /= ((dataCost * wp[pixelAddr]) - ftLapResponse);
      }
      /* Set the DC term of the solution to the value computed above (i.e., the DC term of imgData).
      * set dcSum to the desired average when dataCost=0
      */
      fftBuff[0] = dcSum;
      //transform F_hat to f_hat by taking the inverse DCT of F_hat
      fftw_execute(fftPlan);
      double fftDenom = 4.0 * (width - 1) * (height - 1);
      nodeAddr = 0;
      pixelAddr = iChannel;
      rightPixelAddr = 3 + iChannel;
      topPixelAddr = (width * 3) + iChannel;
//#pragma omp parallel for schedule(dynamic, 1) // OpenMP
      for (int y = 0; y < height; y++)
      for (int x = 0; x < width; x++,
         nodeAddr++, pixelAddr += 3, rightPixelAddr += 3, topPixelAddr += 3) {
         //pixelAddr = iChannel + iNode * 3;
         imgOut[pixelAddr] = fftBuff[nodeAddr] / fftDenom;

         wp[pixelAddr] = 1. / dmax(abs(imgOut[pixelAddr] - imgData[pixelAddr]) , eps);
         if ((x > 0) && (x < width - 1))
            //fftBuff[nodeAddr] -= (imgGradX[rightPixelAddr] - imgGradX[pixelAddr]);
            wx[pixelAddr] = 1. / dmax(abs(imgOut[rightPixelAddr] - imgOut[pixelAddr] - imgGradX[pixelAddr]) , eps);
         else
            //fftBuff[nodeAddr] -= (-2.0 * imgGradX[pixelAddr]);
            wx[pixelAddr] = 1. / dmax(abs(-2.0 * imgOut[pixelAddr] - imgGradX[pixelAddr]) , eps);
         if ((y > 0) && (y < height - 1))
            //fftBuff[nodeAddr] -= (imgGradY[topPixelAddr] - imgGradY[pixelAddr]);
            wy[pixelAddr] = 1. / dmax(abs(imgOut[topPixelAddr] - imgOut[pixelAddr] - imgGradY[pixelAddr]) , eps);
         else
            //fftBuff[nodeAddr] -= (-2.0 * imgGradY[pixelAddr]);
            wy[pixelAddr] = 1. / dmax(abs(-2.0 * imgOut[pixelAddr] - imgGradY[pixelAddr]) , eps);
      }
   }
}
fftw_free(fftBuff);
fftw_free(ftLapX);
fftw_free(ftLapY);
fftw_destroy_plan(fftPlan);
}

gradient domain path tracing (GPT) back

Board: Home Board index Raytracing General Development