Spark: DataFrame action really slow after upgrading from 2.1.0 to 2.2.1























I just upgraded from Spark 2.1.0 to Spark 2.2.1. Has anyone seen extremely slow behavior with dataframe.filter(…).collect(), specifically a collect operation preceded by a filter? dataframe.collect() seems to run fine, but dataframe.filter(…).collect() takes forever, even though the DataFrame contains only 2 records and this runs in a unit test. When I go back to Spark 2.1.0, it returns to normal speed.



I have looked at the thread dump and could not find an obvious cause. I have also tried to make sure that all the libraries I am using are on Spark 2.2.1 as well. Any suggestions would be greatly appreciated.




























  • 2




    Need more details; this is a very generic question. Have you checked the Spark UI (stages, etc.)?
    – Ram Ghadiyaram
    8 hours ago















java scala apache-spark
















asked 8 hours ago









Karan Gupta

112



































