Is it true that on a modern processor, parallelism is possible on a single core?


























Final Edit: I just realized that when I use the word "parallelism", I almost always mean ILP (instruction-level parallelism). I originally thought even a single instruction could be divided into several phases, and that at that level there would still be some parallelism, but then I realized this is meaningless. Neither my title nor my example mentioned parallelism across more than one thread, as done by HyperThreading, so @user110971's answer is the correct one, without any remaining doubt. (On the philosophical level, I just needed a base case to terminate my recursive search for the deepest level of parallelism.)



Edit 3: I made a graph for my Edit 2, and I found this YouTube video about HyperThreading useful.
[image: graph illustrating the Edit 2 definitions]



Edit 2: In short, for this question I adopt the definitions on Wikipedia:




  • Parallel: two threads run independently at every physical instant, so neither thread ever interrupts the other.

  • Concurrent: two threads run independently, but interleaving is allowed; i.e., execution is not restricted to being parallel, and one thread can interrupt the other.

  • In short, for me and the Wikipedia authors, concurrency includes parallelism. Thanks.


Edit: Just to be clear, by parallelism I mean true parallelism. I add "true" because the people I have talked to tend to think parallel == concurrent. (See my second link.)





Is it true that on a modern processor, "true" parallelism is possible on a single core? I asked elsewhere but didn't get a definitive answer. What I want to know is, e.g., whether at t = 0 two instructions can be fetched and executed at the same physical instant.



My question came from here:




parallel computing is impossible on a (one-core) single processor, as only one computation can occur at any instant (during any single clock cycle).






























  • Aside from HyperThreading? – cHao, Dec 7 '18 at 22:38






  • It's the only way to achieve true parallelism with a single core, yes. Superscalar processing can do several things at once, but it can still only run one thread at a time. – cHao, Dec 7 '18 at 22:50






  • Because it's a kind of parallelism, and thus is worth bringing up. It's an example of the processor executing two or more instructions in a single clock cycle, just like you asked for. It's just not the kind of parallelism most people are talking about when they use the word. – cHao, Dec 7 '18 at 22:53








  • I don't get it: what does "true" or "false" or "untrue" parallelism mean? – Al Kepp, Dec 7 '18 at 23:21






  • @ptr_user7813604 I looked at the first link; that one says "true parallelism". But this suspicious pseudo-term is not defined there. So maybe you could define the basic terms in your question, instead of providing just links. – Al Kepp, Dec 7 '18 at 23:34


















parallel cpu multicore

asked Dec 7 '18 at 21:07 by ptr_user7813604; edited Dec 8 '18 at 16:39


3 Answers



















It is indeed possible to have parallelism on a superscalar processor. A superscalar processor can execute multiple instructions at the same time by using multiple execution units.

[image: superscalar pipeline diagram]

There are certain limitations depending on the architecture, so this is not unrestricted parallelism. If you have to calculate
$$A = B + C,$$
$$D = A + 3,$$
you cannot execute both instructions at the same time, because the second needs the result of the first. However, you can execute
$$A = B + C,$$
$$D = D + 3,$$
simultaneously by utilizing two ALUs.

So, as an answer to your question: you can have a certain level of parallelism on a single core, as long as your instructions do not use the same hardware resources.






– user110971, answered Dec 7 '18 at 21:23, edited Dec 7 '18 at 21:26









  • And then many DSP architectures are explicitly MIMD (like SIMD, but the instructions to the different execution units are different). Unlike "implicit superscalar execution", this VLIW and "explicitly parallel instruction computing" is scheduled by the linker, not discovered by the CPU logic. – Ben Voigt, Dec 7 '18 at 22:51








  • @BenVoigt not only DSPs; there's also Itanium featuring VLIW. – Ruslan, Dec 8 '18 at 8:48




















On some processors this is (sometimes) possible. Since different instructions use different processor resources (ALU, floating point, load, store, etc), it's sometimes possible to parallelize some of them. For example, see here for details on how that works on an Ivy Bridge (x86) CPU: https://dendibakh.github.io/blog/2018/03/21/port-contention






– Nate Strickland, answered Dec 7 '18 at 21:22






















There are lots of different types of parallelism.

Instruction-level parallelism is a feature of any superscalar processor: multiple instructions are in progress at any point in time. However, those instructions come from the same thread of control.

Thread-level parallelism within a single core is possible with hyperthreading: two separate threads use different core resources at the same time. One thread can use the integer ALU while another is executing a load or store.

Data-level parallelism is also a type of parallelism. SIMD units can execute the same instruction on multiple registers at the same time. For instance, if you need to apply the same blur transformation to every pixel in an image, you might be able to process eight pixels in parallel, still within a single thread of control.



















  • Great, and my apologies: I think my "e.g. whether at t=0, two instructions are fetched and executed" was a poor example; it's not the same thing as I thought it was. – ptr_user7813604, Dec 8 '18 at 0:47










  • So is it that with hyperthreading nothing is duplicated in a single core, but the two threads use different core resources at the same time? – ptr_user7813604, Dec 8 '18 at 0:50










  • A single core may just as well contain multiple copies of the same unit. For example, it is typical to have two or more integer units and one floating-point unit. Or two full integer units, one integer unit with restricted capabilities, and one floating-point unit. Or … (you get the idea). – Jörg W Mittag, Dec 8 '18 at 8:09






  • What Jörg W Mittag said. You have multiple schedulers, so if the primary scheduler isn't using all the integer or floating-point units, then a secondary scheduler can run another thread on those unused units. This is also key to why some workloads get faster and some get much slower: if the primary workload is using almost every unit, the secondary "hyperthread" can't actually get anything done without waiting for those units. The two threads also compete for shared resources like cache and memory bandwidth, so if they don't produce enough useful work to justify the added cache contention, the result is slower. – Katastic Voyage, Dec 8 '18 at 10:21










  • @JörgWMittag: You mean the architectural state? – ptr_user7813604, Dec 8 '18 at 12:41











    Your Answer





    StackExchange.ifUsing("editor", function () {
    return StackExchange.using("mathjaxEditing", function () {
    StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
    StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
    });
    });
    }, "mathjax-editing");

    StackExchange.ifUsing("editor", function () {
    return StackExchange.using("schematics", function () {
    StackExchange.schematics.init();
    });
    }, "cicuitlab");

    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "135"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2felectronics.stackexchange.com%2fquestions%2f411049%2fis-it-true-that-on-a-modern-processor-parallelism-is-possible-on-a-single-core%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    3 Answers
    3






    active

    oldest

    votes








    3 Answers
    3






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    17












    $begingroup$

    It is indeed possible to have parallelism on a superscalar processor. A superscalar processor can execute multiple instructions at the same time by using multiple execution units.
    pipeline
    There are certain limitations depending on the architecture. It is not true parallelism. If you have to calculate
    $$A = B + C,$$
    $$D = A + 3,$$
    you cannot execute both instructions at the same time. However you can execute
    $$A = B + C,$$
    $$D = D + 3,$$
    simultaneously by utilizing two ALUs.



    So as an answer to your question, you can have a certain level of parallelism on a single core, as long as your instructions do not use the same hardware resources.






    share|improve this answer











    $endgroup$









    • 3




      $begingroup$
      And then many DSP architectures are explicitly MIMD (like SIMD, but the instructions to the different execution units are different). Unlike "implicit superscalar execution", this VLIW and "Explicitly parallel instruction computing" is scheduled by the linker, not discovered by the CPU logic.
      $endgroup$
      – Ben Voigt
      Dec 7 '18 at 22:51








    • 1




      $begingroup$
      @BenVoigt not only DSP, there's also Itanium featuring VLIW.
      $endgroup$
      – Ruslan
      Dec 8 '18 at 8:48
















    17












    $begingroup$

    It is indeed possible to have parallelism on a superscalar processor. A superscalar processor can execute multiple instructions at the same time by using multiple execution units.
    pipeline
    There are certain limitations depending on the architecture. It is not true parallelism. If you have to calculate
    $$A = B + C,$$
    $$D = A + 3,$$
    you cannot execute both instructions at the same time. However you can execute
    $$A = B + C,$$
    $$D = D + 3,$$
    simultaneously by utilizing two ALUs.



    So as an answer to your question, you can have a certain level of parallelism on a single core, as long as your instructions do not use the same hardware resources.






    share|improve this answer











    $endgroup$









    • 3




      $begingroup$
      And then many DSP architectures are explicitly MIMD (like SIMD, but the instructions to the different execution units are different). Unlike "implicit superscalar execution", this VLIW and "Explicitly parallel instruction computing" is scheduled by the linker, not discovered by the CPU logic.
      $endgroup$
      – Ben Voigt
      Dec 7 '18 at 22:51








    • 1




      $begingroup$
      @BenVoigt not only DSP, there's also Itanium featuring VLIW.
      $endgroup$
      – Ruslan
      Dec 8 '18 at 8:48














    17












    17








    17





    $begingroup$

    It is indeed possible to have parallelism on a superscalar processor. A superscalar processor can execute multiple instructions at the same time by using multiple execution units.
    pipeline
    There are certain limitations depending on the architecture. It is not true parallelism. If you have to calculate
    $$A = B + C,$$
    $$D = A + 3,$$
    you cannot execute both instructions at the same time. However you can execute
    $$A = B + C,$$
    $$D = D + 3,$$
    simultaneously by utilizing two ALUs.



    So as an answer to your question, you can have a certain level of parallelism on a single core, as long as your instructions do not use the same hardware resources.






    share|improve this answer











    $endgroup$



    It is indeed possible to have parallelism on a superscalar processor. A superscalar processor can execute multiple instructions at the same time by using multiple execution units.
    pipeline
    There are certain limitations depending on the architecture. It is not true parallelism. If you have to calculate
    $$A = B + C,$$
    $$D = A + 3,$$
    you cannot execute both instructions at the same time. However you can execute
    $$A = B + C,$$
    $$D = D + 3,$$
    simultaneously by utilizing two ALUs.



    So as an answer to your question, you can have a certain level of parallelism on a single core, as long as your instructions do not use the same hardware resources.







    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited Dec 7 '18 at 21:26

























    answered Dec 7 '18 at 21:23









    user110971user110971

    3,3441717




    3,3441717








    • 3




      $begingroup$
      And then many DSP architectures are explicitly MIMD (like SIMD, but the instructions to the different execution units are different). Unlike "implicit superscalar execution", this VLIW and "Explicitly parallel instruction computing" is scheduled by the linker, not discovered by the CPU logic.
      $endgroup$
      – Ben Voigt
      Dec 7 '18 at 22:51








    • 1




      $begingroup$
      @BenVoigt not only DSP, there's also Itanium featuring VLIW.
      $endgroup$
      – Ruslan
      Dec 8 '18 at 8:48














    • 3




      $begingroup$
      And then many DSP architectures are explicitly MIMD (like SIMD, but the instructions to the different execution units are different). Unlike "implicit superscalar execution", this VLIW and "Explicitly parallel instruction computing" is scheduled by the linker, not discovered by the CPU logic.
      $endgroup$
      – Ben Voigt
      Dec 7 '18 at 22:51








    • 1




      $begingroup$
      @BenVoigt not only DSP, there's also Itanium featuring VLIW.
      $endgroup$
      – Ruslan
      Dec 8 '18 at 8:48








    3




    3




    $begingroup$
    And then many DSP architectures are explicitly MIMD (like SIMD, but the instructions to the different execution units are different). Unlike "implicit superscalar execution", this VLIW and "Explicitly parallel instruction computing" is scheduled by the linker, not discovered by the CPU logic.
    $endgroup$
    – Ben Voigt
    Dec 7 '18 at 22:51






    $begingroup$
    And then many DSP architectures are explicitly MIMD (like SIMD, but the instructions to the different execution units are different). Unlike "implicit superscalar execution", this VLIW and "Explicitly parallel instruction computing" is scheduled by the linker, not discovered by the CPU logic.
    $endgroup$
    – Ben Voigt
    Dec 7 '18 at 22:51






    1




    1




    $begingroup$
    @BenVoigt not only DSP, there's also Itanium featuring VLIW.
    $endgroup$
    – Ruslan
    Dec 8 '18 at 8:48




    $begingroup$
    @BenVoigt not only DSP, there's also Itanium featuring VLIW.
    $endgroup$
    – Ruslan
    Dec 8 '18 at 8:48













    10












    $begingroup$

    On some processors this is (sometimes) possible. Since different instructions use different processor resources (ALU, floating point, load, store, etc), it's sometimes possible to parallelize some of them. For example, see here for details on how that works on an Ivy Bridge (x86) CPU: https://dendibakh.github.io/blog/2018/03/21/port-contention






    share|improve this answer









    $endgroup$


















      10












      $begingroup$

      On some processors this is (sometimes) possible. Since different instructions use different processor resources (ALU, floating point, load, store, etc), it's sometimes possible to parallelize some of them. For example, see here for details on how that works on an Ivy Bridge (x86) CPU: https://dendibakh.github.io/blog/2018/03/21/port-contention






      share|improve this answer









      $endgroup$
















        10












        10








        10





        $begingroup$

        On some processors this is (sometimes) possible. Since different instructions use different processor resources (ALU, floating point, load, store, etc), it's sometimes possible to parallelize some of them. For example, see here for details on how that works on an Ivy Bridge (x86) CPU: https://dendibakh.github.io/blog/2018/03/21/port-contention






        share|improve this answer









        $endgroup$



        On some processors this is (sometimes) possible. Since different instructions use different processor resources (ALU, floating point, load, store, etc), it's sometimes possible to parallelize some of them. For example, see here for details on how that works on an Ivy Bridge (x86) CPU: https://dendibakh.github.io/blog/2018/03/21/port-contention







        share|improve this answer












        share|improve this answer



        share|improve this answer










        answered Dec 7 '18 at 21:22









        Nate StricklandNate Strickland

        4706




        4706























            10












            $begingroup$

            There are lots of different types of parallelism.



            Instruction level parallelism is a feature of any superscalar processor. Multiple instructions are in progress at any point in time. However, those instructions are from the same thread of control.



            Thread level parallelism within a single core is possible with hyperthreading -- two separate threads using different core resources at the same time. One thread can use the integer ALU while another is executing a load or store.



            Data level parallelism is also a type of parallelism. SIMD units can execute the same instructions on multiple registers at the same time. For instance, if you need to apply the same blur transformation to every pixel in an image, you might be able to do that 8 pixels in parallel, but within the same thread of control.






            share|improve this answer









            $endgroup$













            answered Dec 8 '18 at 0:27









            Evan

            • $begingroup$
              Great, and apologies that I think my "e.g. whether at t=0, two instructions are fetched and executed" is a wrong example, it's not the same as I thought it should be....
              $endgroup$
              – ptr_user7813604
              Dec 8 '18 at 0:47










            • $begingroup$
              So is that for hyperthreading there is nothing duplicate in a single core, but using different core resources at the same time?
              $endgroup$
              – ptr_user7813604
              Dec 8 '18 at 0:50










            • $begingroup$
              A single core may just as well contain multiple copies of the same unit. For example, it is typical to have two or more integer units and one floating point unit. Or, two full integer units, one integer unit with restricted capabilities, and one floating point unit. Or, … (you get the idea).
              $endgroup$
              – Jörg W Mittag
              Dec 8 '18 at 8:09






            • $begingroup$
              What Jorg W Mittag said. You have multiple schedulers. So if the primary scheduler isn't using all the INTEGER or FLOAT units, then a secondary scheduler could run another thread using those unused units. This is also key to why some workloads get faster and some get way slower. If the primary workload is using almost every unit, the secondary "hyperthread" can't actually get anything done without waiting for those units. They also compete for other shared things like cache and memory traffic. So if they're not producing enough useful work to justify the new cache contention, it's slower.
              $endgroup$
              – Katastic Voyage
              Dec 8 '18 at 10:21










            • $begingroup$
              @JörgWMittag: You mean Architectural_state?
              $endgroup$
              – ptr_user7813604
              Dec 8 '18 at 12:41


















            Thanks for contributing an answer to Electrical Engineering Stack Exchange!

