Swift SFSpeechRecognizer appending existing UITextView content











I'm using SFSpeechRecognizer in my app. A dedicated button (Start Speech Recognition) lets the user dictate a comment into a UITextView, and this works fine.

However, if the user first types some text manually and then starts speech recognition, the manually entered text is erased. The same happens when the user runs speech recognition twice on the same UITextView (dictates a first part, stops recording, then starts recording again): the text from the first session is erased.

How can I append the text recognized by SFSpeechRecognizer to the text that is already in the view?

Here is my code:



func recordAndRecognizeSpeech() {

    if recognitionTask != nil {
        recognitionTask?.cancel()
        recognitionTask = nil
    }
    let audioSession = AVAudioSession.sharedInstance()
    do {
        try audioSession.setCategory(AVAudioSessionCategoryRecord)
        try audioSession.setMode(AVAudioSessionModeMeasurement)
        try audioSession.setActive(true, with: .notifyOthersOnDeactivation)
    } catch {
        print("audioSession properties weren't set because of an error.")
    }
    self.recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
    guard let inputNode = audioEngine.inputNode else {
        fatalError("Audio engine has no input node")
    }
    let recognitionRequest = self.recognitionRequest
    recognitionRequest.shouldReportPartialResults = true

    recognitionTask = speechRecognizer?.recognitionTask(with: recognitionRequest, resultHandler: { (result, error) in
        var isFinal = false
        self.decaration.text = (result?.bestTranscription.formattedString)!

        isFinal = (result?.isFinal)!
        let bottom = NSMakeRange(self.decaration.text.characters.count - 1, 1)
        self.decaration.scrollRangeToVisible(bottom)

        if error != nil || isFinal {
            self.audioEngine.stop()
            inputNode.removeTap(onBus: 0)
            self.recognitionTask = nil
            self.recognitionRequest.endAudio()
            self.oBtSpeech.isEnabled = true
        }
    })
    let recordingFormat = inputNode.outputFormat(forBus: 0)
    inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer, when) in
        self.recognitionRequest.append(buffer)
    }
    audioEngine.prepare()

    do {
        try audioEngine.start()
    } catch {
        print("audioEngine couldn't start because of an error.")
    }

}


I tried replacing

    self.decaration.text = (result?.bestTranscription.formattedString)!

with

    self.decaration.text += (result?.bestTranscription.formattedString)!

but then each recognized sentence shows up duplicated in the text view.

Any idea how I can do this?










      swift sfspeechrecognizer
















      asked Nov 21 at 15:37









      tiamat

























          1 Answer



          accepted










Try saving the text view's existing content before starting recognition, then prefix every recognition result with that saved text:



func recordAndRecognizeSpeech() {
    // Change 1: snapshot whatever is already in the text view (typed or dictated earlier).
    let defaultText = self.decaration.text ?? ""
    // Only add a separating space when there is existing text.
    let prefix = defaultText.isEmpty ? "" : defaultText + " "

    if recognitionTask != nil {
        recognitionTask?.cancel()
        recognitionTask = nil
    }
    let audioSession = AVAudioSession.sharedInstance()
    do {
        try audioSession.setCategory(AVAudioSessionCategoryRecord)
        try audioSession.setMode(AVAudioSessionModeMeasurement)
        try audioSession.setActive(true, with: .notifyOthersOnDeactivation)
    } catch {
        print("audioSession properties weren't set because of an error.")
    }
    self.recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
    guard let inputNode = audioEngine.inputNode else {
        fatalError("Audio engine has no input node")
    }
    let recognitionRequest = self.recognitionRequest
    recognitionRequest.shouldReportPartialResults = true

    recognitionTask = speechRecognizer?.recognitionTask(with: recognitionRequest, resultHandler: { (result, error) in
        var isFinal = false
        // result is nil when the handler is called with an error, so unwrap it safely.
        if let result = result {
            // Change 2: assign (do not append) the saved text plus the full transcription so far.
            self.decaration.text = prefix + result.bestTranscription.formattedString
            isFinal = result.isFinal

            // NSRange counts UTF-16 units; skip scrolling while the view is still empty.
            let length = self.decaration.text.utf16.count
            if length > 0 {
                self.decaration.scrollRangeToVisible(NSMakeRange(length - 1, 1))
            }
        }

        if error != nil || isFinal {
            self.audioEngine.stop()
            inputNode.removeTap(onBus: 0)
            self.recognitionTask = nil
            self.recognitionRequest.endAudio()
            self.oBtSpeech.isEnabled = true
        }
    })
    let recordingFormat = inputNode.outputFormat(forBus: 0)
    inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer, when) in
        self.recognitionRequest.append(buffer)
    }
    audioEngine.prepare()

    do {
        try audioEngine.start()
    } catch {
        print("audioEngine couldn't start because of an error.")
    }
}


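result?.bestTranscription.formattedString returns the entire phrase recognized so far in the current session, not just the newest words. That's why appending with += duplicates text, and why you should instead reassign self.decaration.text (saved text plus the full transcription) each time you get a response from SFSpeechRecognizer.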
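
To make the difference concrete, here is a minimal sketch in plain Swift with no Speech or UIKit dependency. The partial results are simulated strings standing in for successive formattedString values, and composeText is a hypothetical helper introduced only for this illustration; neither is part of the original code.

```swift
// Hypothetical helper (not part of the Speech framework): joins the text
// saved before recognition started with the latest full transcription.
func composeText(savedText: String, transcription: String) -> String {
    if savedText.isEmpty { return transcription }
    if transcription.isEmpty { return savedText }
    return savedText + " " + transcription
}

// Simulated partial results: each one repeats everything recognized so far,
// which is how the answer describes formattedString behaving.
let partialResults = ["Hello", "Hello world", "Hello world again"]

// Text that was already in the view when recognition started.
let savedText = "Typed note."

// Approach 1: append every partial result. Earlier words pile up.
var appended = savedText
for partial in partialResults {
    appended += " " + partial
}

// Approach 2: rebuild from the saved text on every result.
var rebuilt = savedText
for partial in partialResults {
    rebuilt = composeText(savedText: savedText, transcription: partial)
}

print(appended)
print(rebuilt)
```

Because the snapshot is taken at the start of each recording, a second session would start from the text left by the first one, so dictated parts accumulate across sessions as well.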






                edited Nov 21 at 22:17

























                answered Nov 21 at 17:07









                Deryck Lucian
