Split a string containing an alphabetical bullet list into a list












My string contains:
text = "a) Baghdad, Iraq b) United Arab Emirates (possibly)"

I want to split it into a list like this:
["Baghdad, Iraq", "United Arab Emirates (possibly)"]

The code I have used does not give me the desired result:

re.split('\s*([a-zA-Z\d][).]|•)\s*(?=[A-Z])', text)

Please help me with this.







python string python-3.x














asked Nov 22 at 12:12 by Sharjeel Ali Shaukat
edited Nov 22 at 12:24 by Patrick Artner












  • Is it possible to have a string like a) Baghdad, Iraq b) United Arab Emirates (possibly) c) Turkey if UAE is not in (b)?
    – lxop
    Nov 22 at 12:16










  • You are missing the r at the start of your regex pattern string.
    – usr2564301
    Nov 22 at 12:26






  • [s for s in re.split('\s*([a-zA-Z\d][).]|•)\s*(?=[A-Z])', text) if len(s) > 4]
    – iamklaus
    Nov 22 at 12:30










  • @SarthakNegi that fails for c) A
    – planetmaker
    Nov 22 at 12:46










  • @lxop Yes, it can also contain c), d), e) and so on.
    – Sharjeel Ali Shaukat
    Nov 23 at 5:26
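
For reference, here is a minimal sketch of what the pattern from the question appears to return on the sample string. It relies on the documented behaviour of re.split that text matched by a capturing group is included in the result, which is why the markers "a)" and "b)" show up in the output. Changing the group to a non-capturing one and dropping empty strings is one possible fix; the variable names below are illustrative only and are not from the thread.

```python
import re

text = "a) Baghdad, Iraq b) United Arab Emirates (possibly)"

# Pattern from the question, written as a raw string as suggested in the comments.
capturing = r'\s*([a-zA-Z\d][).]|•)\s*(?=[A-Z])'
# Same pattern with a non-capturing group, so the bullet markers are discarded.
non_capturing = r'\s*(?:[a-zA-Z\d][).]|•)\s*(?=[A-Z])'

# With a capturing group, re.split keeps the matched markers in the result.
with_markers = re.split(capturing, text)

# Drop the empty string produced by the match at the very start of the text.
items = [s for s in re.split(non_capturing, text) if s]

print(with_markers)
print(items)
```

Filtering on emptiness rather than on len(s) > 4 keeps short items such as a single-letter entry.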






























2 Answers
You could create the wanted data for your example using a list comprehension and a second regex:

import re

text = "a) Baghdad, Iraq b) United Arab Emirates (possibly)"

# A different first pattern than in the question (same split result);
# a second pattern then filters out the captured bullet markers.
data = [x for x in re.split(r'((?:^\s*[a-zA-Z0-9]\))|(?:\s+[a-zA-Z0-9]\)))\s*',
                            text) if x and not re.match(r"\s*[a-zA-Z]\)", x)]
print(data)

Output:

['Baghdad, Iraq', 'United Arab Emirates (possibly)']

See https://regex101.com/r/wxEEQW/1

















answered Nov 22 at 12:33 by Patrick Artner

























Instead of re.findall, you can simply use re.split:

import re

text = "a) Baghdad, Iraq b) United Arab Emirates (possibly)"
# Split on "<word character>) ", strip trailing spaces, drop empty strings.
countries = list(filter(None, map(str.rstrip, re.split(r'\w\)\s', text))))
print(countries)

Output:

['Baghdad, Iraq', 'United Arab Emirates (possibly)']
















answered Nov 22 at 15:18 by Ajax1234
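
The asker confirmed in the comments that the list can continue with c), d), e) and so on, and one comment pointed out that a length filter fails for an item like "c) A". The following is a hedged sketch of a variant that covers both cases. It assumes a marker is a single letter or digit followed by ")" or ".", sits at the start of the text or after whitespace, and is followed by whitespace. The helper name split_bullets and the MARKER constant are hypothetical and do not come from either answer.

```python
import re

# Assumed marker shape: one letter or digit followed by ")" or ".",
# at the start of the text or after whitespace, and followed by whitespace.
MARKER = re.compile(r'(?:^|\s+)[a-zA-Z0-9][).]\s+')


def split_bullets(text):
    """Return the items of an inline a) b) c) list (hypothetical helper)."""
    # No capturing group is used, so the markers themselves are discarded.
    return [part.strip() for part in MARKER.split(text) if part.strip()]


sample = ("a) Baghdad, Iraq b) United Arab Emirates (possibly) "
          "c) Turkey if UAE is not in (b)")
short = "a) Baghdad, Iraq b) United Arab Emirates (possibly) c) A"

print(split_bullets(sample))
print(split_bullets(short))
```

Requiring whitespace on both sides of the marker keeps "(possibly)" and "(b)" intact. The heuristic would still split on a free-standing initial such as the "F." in "John F. Kennedy", so it only suits input where that cannot occur.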





























