Excluding specific data in a column (csv-file) with regular expression

community,

I have a CSV file with two columns (a, b), and one of the columns (b) holds quite a long string in each row (e.g. System1. Help2. I need info. Please. Interesting_data). I want to reduce this string so that only Interesting_data remains. A regular expression needs a pattern, so I counted the full stops and whitespace characters that come before the data I'm interested in: there are 5 full stops and 3 whitespaces, with only words and numbers in between.
This is my code:

import pandas as pd
import numpy as np
import re

df = pd.read_csv('file.csv', sep=";", decimal=",")

result = df.b.re.findall(r'\s{8}.(\w+)')
print(result)

Unfortunately, I get this error message:

AttributeError: 'Series' object has no attribute 're'

My research on the internet didn't really help.
Is my regular expression completely wrong, or is it something else?
I am using Python 3.7.0 in a Jupyter Notebook.

Thank you!

asked Nov 21 at 16:17

Lis011235

I think your problem is that you are trying to use the returned value of the regex directly with dot notation. Try df.b[re.findall(r'\s{8}.(\w+)') instead, and be careful with the findall method because it returns all matches as a list of strings. You need to make sure that your list always contains exactly one element.
– Iulian
Nov 21 at 16:30
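To illustrate the point about findall returning a list, here is a minimal sketch using only the standard library. The sample rows and the pattern are assumptions modelled on the example string in the question (the text after the last full stop is treated as the interesting part); they are not taken from the asker's real file.

```python
import re

# Hypothetical rows modelled on the example string in the question.
rows = [
    "System1. Help2. I need info. Please. Interesting_data",
    "System7. Help3. Something else. Thanks. Other_data",
]

# Assumption: the interesting part is whatever follows the last full stop.
pattern = r"\.\s*([^.]+)$"

# re.findall needs both the pattern and the string to search,
# and it always returns a list (possibly empty) of matches.
matches = [re.findall(pattern, row) for row in rows]
print(matches)

# Taking the first element is only safe when a match exists,
# so rows without a match fall back to None here.
first = [m[0] if m else None for m in matches]
print(first)
```

Because every call returns a list, a row with no match yields an empty list rather than an error, which is why the fallback is needed before indexing.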

Use df['newcol'] = df['b'].str.extract(r'\.{5}(.*?)\s{3}') to extract the substring between 5 dots and 3 whitespaces in each cell.
– Wiktor Stribiżew
Nov 21 at 16:33
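As a rough sketch of the str.extract approach: pandas exposes string methods on a Series through the .str accessor, and there is no .re attribute, which is what the AttributeError in the question reports. The DataFrame below is hypothetical stand-in data for file.csv, and the first pattern assumes the interesting part is whatever follows the last full stop. The pattern from the comment above is shown on a made-up string that really does contain five consecutive dots and three whitespace characters.

```python
import pandas as pd

# Hypothetical data standing in for file.csv.
df = pd.DataFrame({
    "a": [1, 2],
    "b": [
        "System1. Help2. I need info. Please. Interesting_data",
        "System7. Help3. Something else. Thanks. Other_data",
    ],
})

# Assumption: the interesting part is whatever follows the last full stop.
# expand=False returns a Series because the pattern has a single group.
df["newcol"] = df["b"].str.extract(r"\.\s*([^.]+)$", expand=False)
print(df[["a", "newcol"]])

# The pattern from the comment: five consecutive dots, then the shortest
# run of characters that is followed by three whitespace characters.
dotted = pd.Series(["prefix.....Interesting_data   suffix"])
from_comment = dotted.str.extract(r"\.{5}(.*?)\s{3}", expand=False)
print(from_comment)

# On the example string from the question that pattern finds nothing,
# so str.extract yields a missing value for the row.
no_match = df["b"].str.extract(r"\.{5}(.*?)\s{3}", expand=False)
print(no_match.isna().all())
```

Which pattern is appropriate depends on what the real rows look like, so it is worth checking the result for missing values before relying on the new column.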

@Iulian isn't there a closing ` ]` missing in your code?
– Lis011235
Nov 22 at 8:51

Share an MCVE, code/file to reproduce the issue.
– Wiktor Stribiżew
Nov 22 at 8:56

It would help if you edited your question to give some worked examples showing what you are trying to achieve.
– Martin Evans
Nov 22 at 10:56


python regex python-3.x csv

