Conditionally $project from different fields












1














I have the unfortunate problem of a data structure changing on me around six months ago. Therefore, I have a document that used to look like...



{
fruits: [
{
id: 123
},
{
id: 456
}
]
}


(Important to note that the id is not the ObjectId BSON type, it's just a random series of characters generated by the client side).



...but now has the id key changed to.



{
fruits: [
{
fruit_id: 'xxx'
},
{
fruit_id: 'yyy'
}
]
}


So, I'm trying to do a $project to have both id and fruit_id changed to something generic like general_id, so that I can continue with another aggregation like $group and just reference the one field



I've tried something along the lines of:



[
$unwind: {
path: '$fruits'
},
$project: {
general_id: {
$cond: {
if: {
'fruits.fruit_id': {
$type: ['string']
}
},
then: '$fruits.fruit_id',
else: '$fruits.id'
}
}
}
]









share|improve this question





























    1














    I have the unfortunate problem of a data structure changing on me around six months ago. Therefore, I have a document that used to look like...



    {
    fruits: [
    {
    id: 123
    },
    {
    id: 456
    }
    ]
    }


    (Important to note that the id is not the ObjectId BSON type, it's just a random series of characters generated by the client side).



    ...but now has the id key changed to.



    {
    fruits: [
    {
    fruit_id: 'xxx'
    },
    {
    fruit_id: 'yyy'
    }
    ]
    }


    So, I'm trying to do a $project to have both id and fruit_id changed to something generic like general_id, so that I can continue with another aggregation like $group and just reference the one field



    I've tried something along the lines of:



    [
    $unwind: {
    path: '$fruits'
    },
    $project: {
    general_id: {
    $cond: {
    if: {
    'fruits.fruit_id': {
    $type: ['string']
    }
    },
    then: '$fruits.fruit_id',
    else: '$fruits.id'
    }
    }
    }
    ]









    share|improve this question



























      1












      1








      1







      I have the unfortunate problem of a data structure changing on me around six months ago. Therefore, I have a document that used to look like...



      {
      fruits: [
      {
      id: 123
      },
      {
      id: 456
      }
      ]
      }


      (Important to note that the id is not the ObjectId BSON type, it's just a random series of characters generated by the client side).



      ...but now has the id key changed to.



      {
      fruits: [
      {
      fruit_id: 'xxx'
      },
      {
      fruit_id: 'yyy'
      }
      ]
      }


      So, I'm trying to do a $project to have both id and fruit_id changed to something generic like general_id, so that I can continue with another aggregation like $group and just reference the one field



      I've tried something along the lines of:



      [
      $unwind: {
      path: '$fruits'
      },
      $project: {
      general_id: {
      $cond: {
      if: {
      'fruits.fruit_id': {
      $type: ['string']
      }
      },
      then: '$fruits.fruit_id',
      else: '$fruits.id'
      }
      }
      }
      ]









      share|improve this question















      I have the unfortunate problem of a data structure changing on me around six months ago. Therefore, I have a document that used to look like...



      {
      fruits: [
      {
      id: 123
      },
      {
      id: 456
      }
      ]
      }


      (Important to note that the id is not the ObjectId BSON type, it's just a random series of characters generated by the client side).



      ...but now has the id key changed to.



      {
      fruits: [
      {
      fruit_id: 'xxx'
      },
      {
      fruit_id: 'yyy'
      }
      ]
      }


      So, I'm trying to do a $project to have both id and fruit_id changed to something generic like general_id, so that I can continue with another aggregation like $group and just reference the one field



      I've tried something along the lines of:



      [
      $unwind: {
      path: '$fruits'
      },
      $project: {
      general_id: {
      $cond: {
      if: {
      'fruits.fruit_id': {
      $type: ['string']
      }
      },
      then: '$fruits.fruit_id',
      else: '$fruits.id'
      }
      }
      }
      ]






      mongodb mongoose mongodb-query aggregation-framework






      share|improve this question















      share|improve this question













      share|improve this question




      share|improve this question








      edited Nov 23 '18 at 10:27

























      asked Nov 23 '18 at 4:25









      Modermo

      552414




      552414
























          1 Answer
          1






          active

          oldest

          votes


















          1














          It really depends on what you are after here, but for a general case of knowing two possibilities it's probably better to use $ifNull to return the value for the field if present, otherwise return the other field's value.



          Adding a little more data for demonstration since you probably don't want to loose anything else in the array elements:



          {
          _id: 1,
          fruits: [
          {
          id: 123,
          data: 1
          },
          {
          id: 456,
          data: 2
          }
          ]
          },
          {
          _id: 2,
          fruits: [
          {
          fruit_id: 'xxx',
          data: 1
          },
          {
          fruit_id: 'yyy',
          data: 2
          }
          ]
          },
          {
          _id: 3,
          fruits: [
          {
          fruit_id: 'xxx',
          data: 1,
          },
          {
          fruit_id: 'yyy',
          data: 2
          },
          {
          id: 123,
          data: 3
          },
          {
          id: 456,
          data: 4
          }
          ]
          }


          Then you can either do the process using $unwind as the first step, which does make path naming easier and especially with $addFields instead of $project:



          Model.aggregate([
          { "$unwind": "$fruits" },
          { "$addFields": {
          "fruits": {
          "id": "$$REMOVE",
          "fruit_id": "$$REMOVE",
          "general_id": { "$ifNull": [ "$fruits.id", "$fruits.fruit_id" ] }
          }
          }}
          ])


          That uses $$REMOVE from MongoDB 3.6 and above ( which should be the minimal version you are using ) in order to "remove" the fields you don't want. You don't need to do that and can just declare everything you actually want with $project if you don't have support.



          Then of course has the alternate with an $ifNull expression.



          This gives results on that data like:



          { "_id" : 1, "fruits" : { "data" : 1, "general_id" : 123 } }
          { "_id" : 1, "fruits" : { "data" : 2, "general_id" : 456 } }
          { "_id" : 2, "fruits" : { "data" : 1, "general_id" : "xxx" } }
          { "_id" : 2, "fruits" : { "data" : 2, "general_id" : "yyy" } }
          { "_id" : 3, "fruits" : { "data" : 1, "general_id" : "xxx" } }
          { "_id" : 3, "fruits" : { "data" : 2, "general_id" : "yyy" } }
          { "_id" : 3, "fruits" : { "data" : 3, "general_id" : 123 } }
          { "_id" : 3, "fruits" : { "data" : 4, "general_id" : 456 } }


          If you wanted to $group on that value, then there's no need for an intermediate "project" of any kind. Just do the $ifNull directly in that stage:



          Model.aggregate([
          { "$unwind": "$fruits" },
          { "$group": {
          "_id": { "$ifNull": [ "$fruits.id", "$fruits.fruit_id" ] },
          "count": { "$sum": 1 }
          }}
          ])


          And output:



          { "_id" : "yyy", "count" : 2 }
          { "_id" : "xxx", "count" : 2 }
          { "_id" : 456, "count" : 2 }
          { "_id" : 123, "count" : 2 }


          Or if you actually did not need to $unwind the array for other purposes, you can use $map and some other manipulation with $objectToArray and $arrayToObject:



          Model.aggregate([
          { "$addFields": {
          "fruits": {
          "$map": {
          "input": "$fruits",
          "in": {
          "$mergeObjects": [
          { "$arrayToObject": {
          "$filter": {
          "input": { "$objectToArray": "$$this" },
          "cond": { "$not": { "$in": [ "$$this.k", ["fruit_id","id"] ] } }
          }
          }},
          {
          "general_id": { "$ifNull": ["$$this.id","$$this.fruit_id"] }
          }
          ]
          }
          }
          }
          }}
          ])


          Which returns results like:



          {
          "_id" : 1,
          "fruits" : [
          {
          "data" : 1,
          "general_id" : 123
          },
          {
          "data" : 2,
          "general_id" : 456
          }
          ]
          }
          {
          "_id" : 2,
          "fruits" : [
          {
          "data" : 1,
          "general_id" : "xxx"
          },
          {
          "data" : 2,
          "general_id" : "yyy"
          }
          ]
          }
          {
          "_id" : 3,
          "fruits" : [
          {
          "data" : 1,
          "general_id" : "xxx"
          },
          {
          "data" : 2,
          "general_id" : "yyy"
          },
          {
          "data" : 3,
          "general_id" : 123
          },
          {
          "data" : 4,
          "general_id" : 456
          }
          ]
          }


          Adding an $unwind after that returns just the same as before. But the more complex operations are probably better suited to where you want to keep this as an array.



          This time we removed the id and fruit_id by converting each array element into an array of "key/value" pairs via $objectToArray. We then $filter the array based on those "k" values, which are the names of the fields. The $arrayToObject makes this an object again, with all other content except those fields.



          The $mergeObjects is to $map what $addFields is to the root "document", in that it takes multiple objects and "merges" them together. Thus the "filtered" object as described before, and the new object with only the general_id key and it's value translated from whichever field was present.



          Lists of More than Two Fields



          As a final note, $ifNull works better than $cond where you have just two values, but neither is actually that great if there is a larger possible list. You can nest $cond expressions or even use $switch, but really it's probably best to filter content out via the $objectToArray as seen before:



          var valid_names = [ "id", "fruit_id", "apple_id", "orange_id" ];

          Model.aggregate([
          { "$unwind": "$fruits" },
          { "$group": {
          "_id": {
          "$arrayElemAt": [
          { "$map": {
          "input": {
          "$filter": {
          "input": { "$objectToArray": "$fruits" },
          "cond": { "$in": [ "$$this.k", valid_names ] }
          }
          },
          "in": "$$this.v"
          }},
          0
          ]
          },
          "count": { "$sum": 1 }
          }}
          ])


          That usually makes the most sense, otherwise for working with such a list in dynamic way you end up building aggregation pipeline stages in code, such as with using $switch would be:



          var valid_names = [ "id", "fruit_id", "apple_id", "orange_id" ];

          var branches = valid_names.map(name =>
          ({
          "case": { "$gt": [`$fruits.${name}`, null ] },
          "then": `$fruits.${name}`
          })
          )

          Model.aggregate([
          { "$unwind": "$fruits" },
          { "$group": {
          "_id": { "$switch": { branches, "default": null } },
          "count": { "$sum": 1 }
          }}
          ])


          Which looks cleaner in your code, but actually sends a much larger pipeline in BSON:



          [
          { "$unwind" : "$fruits" },
          { "$group" : {
          "_id" : {
          "$switch" : {
          "branches" : [
          {
          "case" : { "$gt" : [ "$fruits.id", null ] },
          "then" : "$fruits.id"
          },
          {
          "case" : { "$gt" : [ "$fruits.fruit_id", null ] },
          "then" : "$fruits.fruit_id"
          },
          {
          "case" : { "$gt" : [ "$fruits.apple_id", null ] },
          "then" : "$fruits.apple_id"
          },
          {
          "case" : { "$gt" : [ "$fruits.orange_id", null ] },
          "then" : "$fruits.orange_id"
          }
          ],
          "default" : null
          }
          },
          "count" : { "$sum" : 1 }
          }}
          ]





          share|improve this answer



















          • 1




            This is a very comprehensive answer. Thanks for that.
            – Modermo
            Nov 23 '18 at 10:27











          Your Answer






          StackExchange.ifUsing("editor", function () {
          StackExchange.using("externalEditor", function () {
          StackExchange.using("snippets", function () {
          StackExchange.snippets.init();
          });
          });
          }, "code-snippets");

          StackExchange.ready(function() {
          var channelOptions = {
          tags: "".split(" "),
          id: "1"
          };
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function() {
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled) {
          StackExchange.using("snippets", function() {
          createEditor();
          });
          }
          else {
          createEditor();
          }
          });

          function createEditor() {
          StackExchange.prepareEditor({
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: true,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: 10,
          bindNavPrevention: true,
          postfix: "",
          imageUploader: {
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          },
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          });


          }
          });














          draft saved

          draft discarded


















          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53440639%2fconditionally-project-from-different-fields%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown

























          1 Answer
          1






          active

          oldest

          votes








          1 Answer
          1






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          1














          It really depends on what you are after here, but for a general case of knowing two possibilities it's probably better to use $ifNull to return the value for the field if present, otherwise return the other field's value.



          Adding a little more data for demonstration since you probably don't want to loose anything else in the array elements:



          {
          _id: 1,
          fruits: [
          {
          id: 123,
          data: 1
          },
          {
          id: 456,
          data: 2
          }
          ]
          },
          {
          _id: 2,
          fruits: [
          {
          fruit_id: 'xxx',
          data: 1
          },
          {
          fruit_id: 'yyy',
          data: 2
          }
          ]
          },
          {
          _id: 3,
          fruits: [
          {
          fruit_id: 'xxx',
          data: 1,
          },
          {
          fruit_id: 'yyy',
          data: 2
          },
          {
          id: 123,
          data: 3
          },
          {
          id: 456,
          data: 4
          }
          ]
          }


          Then you can either do the process using $unwind as the first step, which does make path naming easier and especially with $addFields instead of $project:



          Model.aggregate([
          { "$unwind": "$fruits" },
          { "$addFields": {
          "fruits": {
          "id": "$$REMOVE",
          "fruit_id": "$$REMOVE",
          "general_id": { "$ifNull": [ "$fruits.id", "$fruits.fruit_id" ] }
          }
          }}
          ])


          That uses $$REMOVE from MongoDB 3.6 and above ( which should be the minimal version you are using ) in order to "remove" the fields you don't want. You don't need to do that and can just declare everything you actually want with $project if you don't have support.



          Then of course has the alternate with an $ifNull expression.



          This gives results on that data like:



          { "_id" : 1, "fruits" : { "data" : 1, "general_id" : 123 } }
          { "_id" : 1, "fruits" : { "data" : 2, "general_id" : 456 } }
          { "_id" : 2, "fruits" : { "data" : 1, "general_id" : "xxx" } }
          { "_id" : 2, "fruits" : { "data" : 2, "general_id" : "yyy" } }
          { "_id" : 3, "fruits" : { "data" : 1, "general_id" : "xxx" } }
          { "_id" : 3, "fruits" : { "data" : 2, "general_id" : "yyy" } }
          { "_id" : 3, "fruits" : { "data" : 3, "general_id" : 123 } }
          { "_id" : 3, "fruits" : { "data" : 4, "general_id" : 456 } }


          If you wanted to $group on that value, then there's no need for an intermediate "project" of any kind. Just do the $ifNull directly in that stage:



          Model.aggregate([
          { "$unwind": "$fruits" },
          { "$group": {
          "_id": { "$ifNull": [ "$fruits.id", "$fruits.fruit_id" ] },
          "count": { "$sum": 1 }
          }}
          ])


          And output:



          { "_id" : "yyy", "count" : 2 }
          { "_id" : "xxx", "count" : 2 }
          { "_id" : 456, "count" : 2 }
          { "_id" : 123, "count" : 2 }


          Or if you actually did not need to $unwind the array for other purposes, you can use $map and some other manipulation with $objectToArray and $arrayToObject:



          Model.aggregate([
          { "$addFields": {
          "fruits": {
          "$map": {
          "input": "$fruits",
          "in": {
          "$mergeObjects": [
          { "$arrayToObject": {
          "$filter": {
          "input": { "$objectToArray": "$$this" },
          "cond": { "$not": { "$in": [ "$$this.k", ["fruit_id","id"] ] } }
          }
          }},
          {
          "general_id": { "$ifNull": ["$$this.id","$$this.fruit_id"] }
          }
          ]
          }
          }
          }
          }}
          ])


          Which returns results like:



          {
          "_id" : 1,
          "fruits" : [
          {
          "data" : 1,
          "general_id" : 123
          },
          {
          "data" : 2,
          "general_id" : 456
          }
          ]
          }
          {
          "_id" : 2,
          "fruits" : [
          {
          "data" : 1,
          "general_id" : "xxx"
          },
          {
          "data" : 2,
          "general_id" : "yyy"
          }
          ]
          }
          {
          "_id" : 3,
          "fruits" : [
          {
          "data" : 1,
          "general_id" : "xxx"
          },
          {
          "data" : 2,
          "general_id" : "yyy"
          },
          {
          "data" : 3,
          "general_id" : 123
          },
          {
          "data" : 4,
          "general_id" : 456
          }
          ]
          }


          Adding an $unwind after that returns just the same as before. But the more complex operations are probably better suited to where you want to keep this as an array.



          This time we removed the id and fruit_id by converting each array element into an array of "key/value" pairs via $objectToArray. We then $filter the array based on those "k" values, which are the names of the fields. The $arrayToObject makes this an object again, with all other content except those fields.



          The $mergeObjects is to $map what $addFields is to the root "document", in that it takes multiple objects and "merges" them together. Thus the "filtered" object as described before, and the new object with only the general_id key and it's value translated from whichever field was present.



          Lists of More than Two Fields



          As a final note, $ifNull works better than $cond where you have just two values, but neither is actually that great if there is a larger possible list. You can nest $cond expressions or even use $switch, but really it's probably best to filter content out via the $objectToArray as seen before:



          var valid_names = [ "id", "fruit_id", "apple_id", "orange_id" ];

          Model.aggregate([
          { "$unwind": "$fruits" },
          { "$group": {
          "_id": {
          "$arrayElemAt": [
          { "$map": {
          "input": {
          "$filter": {
          "input": { "$objectToArray": "$fruits" },
          "cond": { "$in": [ "$$this.k", valid_names ] }
          }
          },
          "in": "$$this.v"
          }},
          0
          ]
          },
          "count": { "$sum": 1 }
          }}
          ])


          That usually makes the most sense, otherwise for working with such a list in dynamic way you end up building aggregation pipeline stages in code, such as with using $switch would be:



          var valid_names = [ "id", "fruit_id", "apple_id", "orange_id" ];

          var branches = valid_names.map(name =>
          ({
          "case": { "$gt": [`$fruits.${name}`, null ] },
          "then": `$fruits.${name}`
          })
          )

          Model.aggregate([
          { "$unwind": "$fruits" },
          { "$group": {
          "_id": { "$switch": { branches, "default": null } },
          "count": { "$sum": 1 }
          }}
          ])


          Which looks cleaner in your code, but actually sends a much larger pipeline in BSON:



          [
          { "$unwind" : "$fruits" },
          { "$group" : {
          "_id" : {
          "$switch" : {
          "branches" : [
          {
          "case" : { "$gt" : [ "$fruits.id", null ] },
          "then" : "$fruits.id"
          },
          {
          "case" : { "$gt" : [ "$fruits.fruit_id", null ] },
          "then" : "$fruits.fruit_id"
          },
          {
          "case" : { "$gt" : [ "$fruits.apple_id", null ] },
          "then" : "$fruits.apple_id"
          },
          {
          "case" : { "$gt" : [ "$fruits.orange_id", null ] },
          "then" : "$fruits.orange_id"
          }
          ],
          "default" : null
          }
          },
          "count" : { "$sum" : 1 }
          }}
          ]





          share|improve this answer



















          • 1




            This is a very comprehensive answer. Thanks for that.
            – Modermo
            Nov 23 '18 at 10:27
















          1














          It really depends on what you are after here, but for a general case of knowing two possibilities it's probably better to use $ifNull to return the value for the field if present, otherwise return the other field's value.



          Adding a little more data for demonstration since you probably don't want to loose anything else in the array elements:



          {
          _id: 1,
          fruits: [
          {
          id: 123,
          data: 1
          },
          {
          id: 456,
          data: 2
          }
          ]
          },
          {
          _id: 2,
          fruits: [
          {
          fruit_id: 'xxx',
          data: 1
          },
          {
          fruit_id: 'yyy',
          data: 2
          }
          ]
          },
          {
          _id: 3,
          fruits: [
          {
          fruit_id: 'xxx',
          data: 1,
          },
          {
          fruit_id: 'yyy',
          data: 2
          },
          {
          id: 123,
          data: 3
          },
          {
          id: 456,
          data: 4
          }
          ]
          }


          Then you can either do the process using $unwind as the first step, which does make path naming easier and especially with $addFields instead of $project:



          Model.aggregate([
          { "$unwind": "$fruits" },
          { "$addFields": {
          "fruits": {
          "id": "$$REMOVE",
          "fruit_id": "$$REMOVE",
          "general_id": { "$ifNull": [ "$fruits.id", "$fruits.fruit_id" ] }
          }
          }}
          ])


          That uses $$REMOVE from MongoDB 3.6 and above ( which should be the minimal version you are using ) in order to "remove" the fields you don't want. You don't need to do that and can just declare everything you actually want with $project if you don't have support.



          Then of course has the alternate with an $ifNull expression.



          This gives results on that data like:



          { "_id" : 1, "fruits" : { "data" : 1, "general_id" : 123 } }
          { "_id" : 1, "fruits" : { "data" : 2, "general_id" : 456 } }
          { "_id" : 2, "fruits" : { "data" : 1, "general_id" : "xxx" } }
          { "_id" : 2, "fruits" : { "data" : 2, "general_id" : "yyy" } }
          { "_id" : 3, "fruits" : { "data" : 1, "general_id" : "xxx" } }
          { "_id" : 3, "fruits" : { "data" : 2, "general_id" : "yyy" } }
          { "_id" : 3, "fruits" : { "data" : 3, "general_id" : 123 } }
          { "_id" : 3, "fruits" : { "data" : 4, "general_id" : 456 } }


          If you wanted to $group on that value, then there's no need for an intermediate "project" of any kind. Just do the $ifNull directly in that stage:



          Model.aggregate([
          { "$unwind": "$fruits" },
          { "$group": {
          "_id": { "$ifNull": [ "$fruits.id", "$fruits.fruit_id" ] },
          "count": { "$sum": 1 }
          }}
          ])


          And output:



          { "_id" : "yyy", "count" : 2 }
          { "_id" : "xxx", "count" : 2 }
          { "_id" : 456, "count" : 2 }
          { "_id" : 123, "count" : 2 }


          Or if you actually did not need to $unwind the array for other purposes, you can use $map and some other manipulation with $objectToArray and $arrayToObject:



          Model.aggregate([
          { "$addFields": {
          "fruits": {
          "$map": {
          "input": "$fruits",
          "in": {
          "$mergeObjects": [
          { "$arrayToObject": {
          "$filter": {
          "input": { "$objectToArray": "$$this" },
          "cond": { "$not": { "$in": [ "$$this.k", ["fruit_id","id"] ] } }
          }
          }},
          {
          "general_id": { "$ifNull": ["$$this.id","$$this.fruit_id"] }
          }
          ]
          }
          }
          }
          }}
          ])


          Which returns results like:



          {
          "_id" : 1,
          "fruits" : [
          {
          "data" : 1,
          "general_id" : 123
          },
          {
          "data" : 2,
          "general_id" : 456
          }
          ]
          }
          {
          "_id" : 2,
          "fruits" : [
          {
          "data" : 1,
          "general_id" : "xxx"
          },
          {
          "data" : 2,
          "general_id" : "yyy"
          }
          ]
          }
          {
          "_id" : 3,
          "fruits" : [
          {
          "data" : 1,
          "general_id" : "xxx"
          },
          {
          "data" : 2,
          "general_id" : "yyy"
          },
          {
          "data" : 3,
          "general_id" : 123
          },
          {
          "data" : 4,
          "general_id" : 456
          }
          ]
          }


          Adding an $unwind after that returns just the same as before. But the more complex operations are probably better suited to where you want to keep this as an array.



          This time we removed the id and fruit_id by converting each array element into an array of "key/value" pairs via $objectToArray. We then $filter the array based on those "k" values, which are the names of the fields. The $arrayToObject makes this an object again, with all other content except those fields.



          The $mergeObjects is to $map what $addFields is to the root "document", in that it takes multiple objects and "merges" them together. Thus the "filtered" object as described before, and the new object with only the general_id key and it's value translated from whichever field was present.



          Lists of More than Two Fields



          As a final note, $ifNull works better than $cond where you have just two values, but neither is actually that great if there is a larger possible list. You can nest $cond expressions or even use $switch, but really it's probably best to filter content out via the $objectToArray as seen before:



          var valid_names = [ "id", "fruit_id", "apple_id", "orange_id" ];

          Model.aggregate([
          { "$unwind": "$fruits" },
          { "$group": {
          "_id": {
          "$arrayElemAt": [
          { "$map": {
          "input": {
          "$filter": {
          "input": { "$objectToArray": "$fruits" },
          "cond": { "$in": [ "$$this.k", valid_names ] }
          }
          },
          "in": "$$this.v"
          }},
          0
          ]
          },
          "count": { "$sum": 1 }
          }}
          ])


          That usually makes the most sense, otherwise for working with such a list in dynamic way you end up building aggregation pipeline stages in code, such as with using $switch would be:



          var valid_names = [ "id", "fruit_id", "apple_id", "orange_id" ];

          var branches = valid_names.map(name =>
          ({
          "case": { "$gt": [`$fruits.${name}`, null ] },
          "then": `$fruits.${name}`
          })
          )

          Model.aggregate([
          { "$unwind": "$fruits" },
          { "$group": {
          "_id": { "$switch": { branches, "default": null } },
          "count": { "$sum": 1 }
          }}
          ])


          Which looks cleaner in your code, but actually sends a much larger pipeline in BSON:



          [
          { "$unwind" : "$fruits" },
          { "$group" : {
          "_id" : {
          "$switch" : {
          "branches" : [
          {
          "case" : { "$gt" : [ "$fruits.id", null ] },
          "then" : "$fruits.id"
          },
          {
          "case" : { "$gt" : [ "$fruits.fruit_id", null ] },
          "then" : "$fruits.fruit_id"
          },
          {
          "case" : { "$gt" : [ "$fruits.apple_id", null ] },
          "then" : "$fruits.apple_id"
          },
          {
          "case" : { "$gt" : [ "$fruits.orange_id", null ] },
          "then" : "$fruits.orange_id"
          }
          ],
          "default" : null
          }
          },
          "count" : { "$sum" : 1 }
          }}
          ]





          share|improve this answer



















          • 1




            This is a very comprehensive answer. Thanks for that.
            – Modermo
            Nov 23 '18 at 10:27














          1












          1








          1






          It really depends on what you are after here, but for a general case of knowing two possibilities it's probably better to use $ifNull to return the value for the field if present, otherwise return the other field's value.



          Adding a little more data for demonstration since you probably don't want to loose anything else in the array elements:



          {
          _id: 1,
          fruits: [
          {
          id: 123,
          data: 1
          },
          {
          id: 456,
          data: 2
          }
          ]
          },
          {
          _id: 2,
          fruits: [
          {
          fruit_id: 'xxx',
          data: 1
          },
          {
          fruit_id: 'yyy',
          data: 2
          }
          ]
          },
          {
          _id: 3,
          fruits: [
          {
          fruit_id: 'xxx',
          data: 1,
          },
          {
          fruit_id: 'yyy',
          data: 2
          },
          {
          id: 123,
          data: 3
          },
          {
          id: 456,
          data: 4
          }
          ]
          }


          Then you can either do the process using $unwind as the first step, which does make path naming easier and especially with $addFields instead of $project:



          Model.aggregate([
          { "$unwind": "$fruits" },
          { "$addFields": {
          "fruits": {
          "id": "$$REMOVE",
          "fruit_id": "$$REMOVE",
          "general_id": { "$ifNull": [ "$fruits.id", "$fruits.fruit_id" ] }
          }
          }}
          ])


          That uses $$REMOVE from MongoDB 3.6 and above ( which should be the minimal version you are using ) in order to "remove" the fields you don't want. You don't need to do that and can just declare everything you actually want with $project if you don't have support.



          Then of course has the alternate with an $ifNull expression.



          This gives results on that data like:



          { "_id" : 1, "fruits" : { "data" : 1, "general_id" : 123 } }
          { "_id" : 1, "fruits" : { "data" : 2, "general_id" : 456 } }
          { "_id" : 2, "fruits" : { "data" : 1, "general_id" : "xxx" } }
          { "_id" : 2, "fruits" : { "data" : 2, "general_id" : "yyy" } }
          { "_id" : 3, "fruits" : { "data" : 1, "general_id" : "xxx" } }
          { "_id" : 3, "fruits" : { "data" : 2, "general_id" : "yyy" } }
          { "_id" : 3, "fruits" : { "data" : 3, "general_id" : 123 } }
          { "_id" : 3, "fruits" : { "data" : 4, "general_id" : 456 } }


          If you wanted to $group on that value, then there's no need for an intermediate "project" of any kind. Just do the $ifNull directly in that stage:



          Model.aggregate([
          { "$unwind": "$fruits" },
          { "$group": {
          "_id": { "$ifNull": [ "$fruits.id", "$fruits.fruit_id" ] },
          "count": { "$sum": 1 }
          }}
          ])


          And output:



          { "_id" : "yyy", "count" : 2 }
          { "_id" : "xxx", "count" : 2 }
          { "_id" : 456, "count" : 2 }
          { "_id" : 123, "count" : 2 }


          Or if you actually did not need to $unwind the array for other purposes, you can use $map and some other manipulation with $objectToArray and $arrayToObject:



          Model.aggregate([
          { "$addFields": {
          "fruits": {
          "$map": {
          "input": "$fruits",
          "in": {
          "$mergeObjects": [
          { "$arrayToObject": {
          "$filter": {
          "input": { "$objectToArray": "$$this" },
          "cond": { "$not": { "$in": [ "$$this.k", ["fruit_id","id"] ] } }
          }
          }},
          {
          "general_id": { "$ifNull": ["$$this.id","$$this.fruit_id"] }
          }
          ]
          }
          }
          }
          }}
          ])


          Which returns results like:



          {
          "_id" : 1,
          "fruits" : [
          {
          "data" : 1,
          "general_id" : 123
          },
          {
          "data" : 2,
          "general_id" : 456
          }
          ]
          }
          {
          "_id" : 2,
          "fruits" : [
          {
          "data" : 1,
          "general_id" : "xxx"
          },
          {
          "data" : 2,
          "general_id" : "yyy"
          }
          ]
          }
          {
          "_id" : 3,
          "fruits" : [
          {
          "data" : 1,
          "general_id" : "xxx"
          },
          {
          "data" : 2,
          "general_id" : "yyy"
          },
          {
          "data" : 3,
          "general_id" : 123
          },
          {
          "data" : 4,
          "general_id" : 456
          }
          ]
          }


          Adding an $unwind after that returns just the same as before. But the more complex operations are probably better suited to where you want to keep this as an array.



          This time we removed the id and fruit_id by converting each array element into an array of "key/value" pairs via $objectToArray. We then $filter the array based on those "k" values, which are the names of the fields. The $arrayToObject makes this an object again, with all other content except those fields.



          The $mergeObjects is to $map what $addFields is to the root "document", in that it takes multiple objects and "merges" them together. Thus the "filtered" object as described before, and the new object with only the general_id key and it's value translated from whichever field was present.



          Lists of More than Two Fields



          As a final note, $ifNull works better than $cond where you have just two values, but neither is actually that great if there is a larger possible list. You can nest $cond expressions or even use $switch, but really it's probably best to filter content out via the $objectToArray as seen before:



          var valid_names = [ "id", "fruit_id", "apple_id", "orange_id" ];

          Model.aggregate([
          { "$unwind": "$fruits" },
          { "$group": {
          "_id": {
          "$arrayElemAt": [
          { "$map": {
          "input": {
          "$filter": {
          "input": { "$objectToArray": "$fruits" },
          "cond": { "$in": [ "$$this.k", valid_names ] }
          }
          },
          "in": "$$this.v"
          }},
          0
          ]
          },
          "count": { "$sum": 1 }
          }}
          ])


          That usually makes the most sense, otherwise for working with such a list in dynamic way you end up building aggregation pipeline stages in code, such as with using $switch would be:



          var valid_names = [ "id", "fruit_id", "apple_id", "orange_id" ];

          var branches = valid_names.map(name =>
          ({
          "case": { "$gt": [`$fruits.${name}`, null ] },
          "then": `$fruits.${name}`
          })
          )

          Model.aggregate([
          { "$unwind": "$fruits" },
          { "$group": {
          "_id": { "$switch": { branches, "default": null } },
          "count": { "$sum": 1 }
          }}
          ])


          Which looks cleaner in your code, but actually sends a much larger pipeline in BSON:



          [
          { "$unwind" : "$fruits" },
          { "$group" : {
          "_id" : {
          "$switch" : {
          "branches" : [
          {
          "case" : { "$gt" : [ "$fruits.id", null ] },
          "then" : "$fruits.id"
          },
          {
          "case" : { "$gt" : [ "$fruits.fruit_id", null ] },
          "then" : "$fruits.fruit_id"
          },
          {
          "case" : { "$gt" : [ "$fruits.apple_id", null ] },
          "then" : "$fruits.apple_id"
          },
          {
          "case" : { "$gt" : [ "$fruits.orange_id", null ] },
          "then" : "$fruits.orange_id"
          }
          ],
          "default" : null
          }
          },
          "count" : { "$sum" : 1 }
          }}
          ]





          share|improve this answer














          It really depends on what you are after here, but for a general case of knowing two possibilities it's probably better to use $ifNull to return the value for the field if present, otherwise return the other field's value.



          Adding a little more data for demonstration since you probably don't want to loose anything else in the array elements:



          {
          _id: 1,
          fruits: [
          {
          id: 123,
          data: 1
          },
          {
          id: 456,
          data: 2
          }
          ]
          },
          {
          _id: 2,
          fruits: [
          {
          fruit_id: 'xxx',
          data: 1
          },
          {
          fruit_id: 'yyy',
          data: 2
          }
          ]
          },
          {
          _id: 3,
          fruits: [
          {
          fruit_id: 'xxx',
          data: 1,
          },
          {
          fruit_id: 'yyy',
          data: 2
          },
          {
          id: 123,
          data: 3
          },
          {
          id: 456,
          data: 4
          }
          ]
          }


          Then you can either do the process using $unwind as the first step, which does make path naming easier and especially with $addFields instead of $project:



          Model.aggregate([
          { "$unwind": "$fruits" },
          { "$addFields": {
          "fruits": {
          "id": "$$REMOVE",
          "fruit_id": "$$REMOVE",
          "general_id": { "$ifNull": [ "$fruits.id", "$fruits.fruit_id" ] }
          }
          }}
          ])


          That uses $$REMOVE from MongoDB 3.6 and above ( which should be the minimal version you are using ) in order to "remove" the fields you don't want. You don't need to do that and can just declare everything you actually want with $project if you don't have support.



          Then of course has the alternate with an $ifNull expression.



          This gives results on that data like:



          { "_id" : 1, "fruits" : { "data" : 1, "general_id" : 123 } }
          { "_id" : 1, "fruits" : { "data" : 2, "general_id" : 456 } }
          { "_id" : 2, "fruits" : { "data" : 1, "general_id" : "xxx" } }
          { "_id" : 2, "fruits" : { "data" : 2, "general_id" : "yyy" } }
          { "_id" : 3, "fruits" : { "data" : 1, "general_id" : "xxx" } }
          { "_id" : 3, "fruits" : { "data" : 2, "general_id" : "yyy" } }
          { "_id" : 3, "fruits" : { "data" : 3, "general_id" : 123 } }
          { "_id" : 3, "fruits" : { "data" : 4, "general_id" : 456 } }


          If you wanted to $group on that value, then there's no need for an intermediate "project" of any kind. Just do the $ifNull directly in that stage:



          Model.aggregate([
          { "$unwind": "$fruits" },
          { "$group": {
          "_id": { "$ifNull": [ "$fruits.id", "$fruits.fruit_id" ] },
          "count": { "$sum": 1 }
          }}
          ])


          And output:



          { "_id" : "yyy", "count" : 2 }
          { "_id" : "xxx", "count" : 2 }
          { "_id" : 456, "count" : 2 }
          { "_id" : 123, "count" : 2 }


          Or if you actually did not need to $unwind the array for other purposes, you can use $map and some other manipulation with $objectToArray and $arrayToObject:



          Model.aggregate([
          { "$addFields": {
          "fruits": {
          "$map": {
          "input": "$fruits",
          "in": {
          "$mergeObjects": [
          { "$arrayToObject": {
          "$filter": {
          "input": { "$objectToArray": "$$this" },
          "cond": { "$not": { "$in": [ "$$this.k", ["fruit_id","id"] ] } }
          }
          }},
          {
          "general_id": { "$ifNull": ["$$this.id","$$this.fruit_id"] }
          }
          ]
          }
          }
          }
          }}
          ])


          Which returns results like:



          {
          "_id" : 1,
          "fruits" : [
          {
          "data" : 1,
          "general_id" : 123
          },
          {
          "data" : 2,
          "general_id" : 456
          }
          ]
          }
          {
          "_id" : 2,
          "fruits" : [
          {
          "data" : 1,
          "general_id" : "xxx"
          },
          {
          "data" : 2,
          "general_id" : "yyy"
          }
          ]
          }
          {
          "_id" : 3,
          "fruits" : [
          {
          "data" : 1,
          "general_id" : "xxx"
          },
          {
          "data" : 2,
          "general_id" : "yyy"
          },
          {
          "data" : 3,
          "general_id" : 123
          },
          {
          "data" : 4,
          "general_id" : 456
          }
          ]
          }


          Adding an $unwind after that returns just the same as before. But the more complex operations are probably better suited to where you want to keep this as an array.



          This time we removed the id and fruit_id by converting each array element into an array of "key/value" pairs via $objectToArray. We then $filter the array based on those "k" values, which are the names of the fields. The $arrayToObject makes this an object again, with all other content except those fields.



          The $mergeObjects is to $map what $addFields is to the root "document", in that it takes multiple objects and "merges" them together. Thus the "filtered" object as described before, and the new object with only the general_id key and it's value translated from whichever field was present.



          Lists of More than Two Fields



          As a final note, $ifNull works better than $cond where you have just two values, but neither is actually that great if there is a larger possible list. You can nest $cond expressions or even use $switch, but really it's probably best to filter content out via the $objectToArray as seen before:



          var valid_names = [ "id", "fruit_id", "apple_id", "orange_id" ];

          Model.aggregate([
          { "$unwind": "$fruits" },
          { "$group": {
          "_id": {
          "$arrayElemAt": [
          { "$map": {
          "input": {
          "$filter": {
          "input": { "$objectToArray": "$fruits" },
          "cond": { "$in": [ "$$this.k", valid_names ] }
          }
          },
          "in": "$$this.v"
          }},
          0
          ]
          },
          "count": { "$sum": 1 }
          }}
          ])


          That usually makes the most sense, otherwise for working with such a list in dynamic way you end up building aggregation pipeline stages in code, such as with using $switch would be:



          var valid_names = [ "id", "fruit_id", "apple_id", "orange_id" ];

          var branches = valid_names.map(name =>
          ({
          "case": { "$gt": [`$fruits.${name}`, null ] },
          "then": `$fruits.${name}`
          })
          )

          Model.aggregate([
          { "$unwind": "$fruits" },
          { "$group": {
          "_id": { "$switch": { branches, "default": null } },
          "count": { "$sum": 1 }
          }}
          ])


          Which looks cleaner in your code, but actually sends a much larger pipeline in BSON:



          [
          { "$unwind" : "$fruits" },
          { "$group" : {
          "_id" : {
          "$switch" : {
          "branches" : [
          {
          "case" : { "$gt" : [ "$fruits.id", null ] },
          "then" : "$fruits.id"
          },
          {
          "case" : { "$gt" : [ "$fruits.fruit_id", null ] },
          "then" : "$fruits.fruit_id"
          },
          {
          "case" : { "$gt" : [ "$fruits.apple_id", null ] },
          "then" : "$fruits.apple_id"
          },
          {
          "case" : { "$gt" : [ "$fruits.orange_id", null ] },
          "then" : "$fruits.orange_id"
          }
          ],
          "default" : null
          }
          },
          "count" : { "$sum" : 1 }
          }}
          ]






          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited Nov 23 '18 at 9:42

























          answered Nov 23 '18 at 5:34









          Neil Lunn

          97.1k22170181




          97.1k22170181








          • 1




            This is a very comprehensive answer. Thanks for that.
            – Modermo
            Nov 23 '18 at 10:27














          • 1




            This is a very comprehensive answer. Thanks for that.
            – Modermo
            Nov 23 '18 at 10:27








          1




          1




          This is a very comprehensive answer. Thanks for that.
          – Modermo
          Nov 23 '18 at 10:27




          This is a very comprehensive answer. Thanks for that.
          – Modermo
          Nov 23 '18 at 10:27


















          draft saved

          draft discarded




















































          Thanks for contributing an answer to Stack Overflow!


          • Please be sure to answer the question. Provide details and share your research!

          But avoid



          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.


          To learn more, see our tips on writing great answers.





          Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


          Please pay close attention to the following guidance:


          • Please be sure to answer the question. Provide details and share your research!

          But avoid



          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.


          To learn more, see our tips on writing great answers.




          draft saved


          draft discarded














          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53440639%2fconditionally-project-from-different-fields%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown





















































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown

































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown







          Popular posts from this blog

          Berounka

          Different font size/position of beamer's navigation symbols template's content depending on regular/plain...

          Sphinx de Gizeh