OpenCL kernel to calculate pi does not return the correct value











Good day,



I have an OpenCL kernel that uses the Leibniz formula to calculate pi. My issue is that the value I get back isn't pi, but just 4.



    __kernel void calculatePi(int numIterations, __global float *outputPi,
                              __local float* local_result, int numWorkers)
    {
        __private const uint gid = get_global_id(0);
        __private const uint lid = get_local_id(0);
        __private const uint offset = numIterations*gid*2;
        __private float sum = 0.0f;

        // Have the first worker initialize local_result
        if (gid == 0)
        {
            for (int i = 0; i < numWorkers; i++)
            {
                local_result[i] = 0.0f;
            }
        }

        // Have all workers wait until this is completed
        barrier(CLK_GLOBAL_MEM_FENCE);

        // Have each worker calculate their portion of pi
        // This is a private value
        for (int i=0; i<numIterations; i++)
        {
            if (i % 2 == 0)
            {
                sum += 1 / (1 + 2*i + offset);
            }
            else
            {
                sum -= 1 / (1 + 2*i + offset);
            }
        }

        // Have each worker move their value to the appropriate
        // local_result slot so that the first worker can see it
        // when reducing next
        local_result[gid] = sum;

        // Make sure all workers complete this task before continuing
        barrier(CLK_LOCAL_MEM_FENCE);

        // Have the first worker add up all of the other worker's values
        // to get the final value
        if (lid == 0)
        {
            outputPi[0] = 0;
            for (int i = 0; i < numWorkers; i++)
            {
                outputPi[0] += local_result[i];
            }

            outputPi[0] *= 4;
        }
    }


I've routed each of my inputs to the output to verify that they are what I expect: numIterations is 16 and numWorkers is also 16.



When sum is calculated for the first worker, I would expect it to be
1 - 1/3 + 1/5 - 1/7 + 1/9 - 1/11 + 1/13 - 1/15 + 1/17 - 1/19 + 1/21 - 1/23 + 1/25 - 1/27 + 1/29 - 1/31



Using this calculator for the first 16 terms, I expect the result to be around 3.2: https://scratch.mit.edu/projects/19546118/



If I change the last part of the kernel as follows, so that I can look at a single worker's calculated value of "sum":



    // Have the first worker add up all of the other worker's values
    // to get the final value
    if (lid == 0)
    {
        outputPi[0] = sum * 4;
    }


Then the value returned for the first worker is 4 instead of the expected 3.2.



If I change the condition to any value other than lid == 0, every other worker reports its sum as 0. So my question is: why are these the calculated values? Am I doing something wrong with my sum variable? As I understand it, sum is a private variable, and each worker runs its own for loop sequentially while the workers themselves execute in parallel.



Here's a link to my GitHub repository, which has the kernel and main code uploaded.



https://github.com/TreverWagenhals/TreverWagenhals/tree/master/School/Heterogeneous%20Computing/Lab2



Thanks










  • I would expect offset = numIterations / (gid*2); or am I wrong?
    – OznOg
    11 hours ago










  • No, I don't think so. The first term for the next worker (gid == 1) should be 1/33 (I showed the first 16 terms above), so the offset is 16*1*2 = 32. In the first iteration of the for loop, the calculation would then be + 1/(1 + 2*0 + 32) = + 1/33, which is what is expected. Just for sanity, I tried what you suggested too, and it results in a floating point exception (16/(15*2) = 16/30, which is not an integer anymore), so it makes sense that it isn't correct.
    – Trever Wagenhals
    11 hours ago

















Tags: c, for-loop, kernel, opencl, pi














edited 11 hours ago by OznOg

asked 11 hours ago by Trever Wagenhals


























1 Answer













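You are performing integer division in your code; the division should be done in floating point. Both operands of 1 / (1 + 2*i + offset) are integers, so every term truncates to 0 except the very first one (gid == 0, i == 0), which is 1/1 = 1. That is why the first worker reports 4 and all the others report 0: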



    if (i % 2 == 0)
    {
        sum += 1.0f / (1 + 2*i + offset); // notice the 1.0f: a float literal forces floating-point division
    }
    else
    {
        sum -= 1.0f / (1 + 2*i + offset);
    }





answered 9 hours ago by OznOg






























             
