Migrate Onju to use new components from HAVPE #104

Closed
TheStigh opened this issue Jan 6, 2025 · 5 comments

TheStigh commented Jan 6, 2025

Hi all,

This is a duplicate of the message I posted here: justLV/onju-voice#76

I've made several custom code versions for the Onju, with continuous conversation, random greeting messages, etc., and have full control of the code itself :)

What I wanted to do was implement the updated components for voice_assistant, nabu_microphone and nabu_media_player. I ran into two different issues yesterday:
1 - The microphone clearly picks up my wake word, but it does not hear my command (VAD)
2 - Whatever I try to play on the media_player just causes an error (mp3 file etc.)

The issue might be the shared I2S bus, but I'm not sure.
I could not import adf_pipeline as I did previously.

I loaded the following in the test configuration:

The esp32 section, same as for the HAVPE:

esp32:
  board: esp32-s3-devkitc-1

Then for external_components:

external_components:
  - source:
      type: git
      url: https://github.com/esphome/home-assistant-voice-pe
      ref: dev
    components:
      - media_player
      - micro_wake_word
      - microphone
      - nabu
      - nabu_microphone
      - voice_assistant
    refresh: 0s

These are the revised settings for audio + media_player + wakeword + voice_assistant:

i2s_audio:
  - id: i2s_shared
    i2s_lrclk_pin: GPIO13
    i2s_bclk_pin: GPIO18
    #access_mode: duplex

microphone:
  - platform: nabu_microphone
    i2s_din_pin: GPIO17
    adc_type: external
    pdm: false
    use_apll: true
    sample_rate: 16000
    bits_per_sample: 32bit
    i2s_mode: primary
    i2s_audio_id: i2s_shared
    channel_0:
      id: asr_mic
      amplify_shift: 0

speaker:
  - platform: i2s_audio
    sample_rate: 48000
    i2s_mode: primary
    i2s_dout_pin: GPIO12
    bits_per_sample: 32bit
    i2s_audio_id: i2s_shared
    dac_type: external
    channel: left
    timeout: 500ms
    buffer_duration: 100ms

media_player:
  - platform: nabu
    id: nabu_media_player
    name: Media Player
    internal: false
    #audio_dac:
    speaker:
    sample_rate: 48000
    volume_increment: 0.05
    volume_min: 0.4
    volume_max: 0.85


micro_wake_word:
  id: mww
  models:
    - model: https://github.com/kahrendt/microWakeWord/releases/download/okay_nabu_20241226.3/okay_nabu.json
      id: okay_nabu
    - model: hey_jarvis
      id: hey_jarvis
    - model: alexa
      id: alexa
    - model: https://github.com/kahrendt/microWakeWord/releases/download/stop/stop.json
      id: stop
      internal: true
  vad:
  microphone: asr_mic
  on_wake_word_detected:
    # If the wake word is detected when the device is muted (Possible with the software mute switch): Do nothing
    - if:
        condition:
          switch.is_on: mute
        then:
          - switch.turn_off: mute
        # Start voice assistant, stop current announcement.
        else:
          - if:
              condition:
                lambda: return id(nabu_media_player)->state == media_player::MediaPlayerState::MEDIA_PLAYER_STATE_ANNOUNCING;
              then:
                lambda: |-
                  id(nabu_media_player)
                    ->make_call()
                    .set_command(media_player::MediaPlayerCommand::MEDIA_PLAYER_COMMAND_STOP)
                    .set_announcement(true)
                    .perform();
              else:
                - voice_assistant.start:
                    wake_word: !lambda return wake_word;

voice_assistant:
  id: va
  microphone: asr_mic
  media_player: nabu_media_player
  micro_wake_word: mww
  use_wake_word: false
  noise_suppression_level: 0
  auto_gain: 0 dbfs
  volume_multiplier: 1

Spot anything wrong?

TheStigh added the bug label Jan 6, 2025
jhbruhn (Contributor) commented Jan 6, 2025

Yes, the duplex I2S bus is the problem here. It is possible to get this going by changing the i2s_audio speaker implementation so that it does not initialize the I2S driver again after the microphone has already initialized the bus.

The proper solution is to have official ESPHome support for duplex I2S buses. I think it just needs someone to implement that, or to port over the duplex i2s_audio implementation from esphome_audio.

tetele (Owner) commented Jan 6, 2025

How is this a bug with the current config, if the current config does not contain any of the components you mentioned?

I have attempted to port those components over and these are the conclusions. Until a solution is published, feel free to open a discussion with your own findings, but please don't call it a bug.

tetele (Owner) commented Jan 6, 2025

"The proper solution is to have official ESPHome support for duplex I2S buses."

That's only part of the solution, which was tackled in the new components for VPE. I trust those components will make their way into ESPHome soon enough.

The unsolvable issue is that both VA and MWW run at a 16 kHz sample rate, and if input and output share the same I2S bus, you can't run the input at 16 kHz and the output at 48 kHz (or at any other sample rate, for that matter).
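
For readers who want the arithmetic behind that: on an I2S bus the LRCLK (word select) runs at the sample rate, and BCLK runs at sample rate × bits per slot × 2 slots. The config above routes both the microphone and the speaker through GPIO13 (LRCLK) and GPIO18 (BCLK), so both directions see the same clocks. Below is a minimal sketch of that calculation, assuming standard stereo framing; `I2sClocks`, `i2s_clocks` and `can_share_clocks` are hypothetical names made up for illustration and are not part of ESPHome.

```cpp
#include <cassert>
#include <cstdint>

// Clock rates an I2S stream needs (standard framing, two slots per frame).
struct I2sClocks {
  uint32_t lrclk_hz;  // word-select clock: one full cycle per sample frame
  uint32_t bclk_hz;   // bit clock: one tick per bit of every slot
};

// Hypothetical helper for illustration only; not an ESPHome API.
I2sClocks i2s_clocks(uint32_t sample_rate_hz, uint32_t bits_per_slot) {
  assert(sample_rate_hz > 0 && bits_per_slot > 0);
  const uint32_t slots_per_frame = 2;  // left + right
  return {sample_rate_hz, sample_rate_hz * bits_per_slot * slots_per_frame};
}

// Two streams can share the LRCLK/BCLK pins only if they need identical clocks.
bool can_share_clocks(const I2sClocks &a, const I2sClocks &b) {
  return a.lrclk_hz == b.lrclk_hz && a.bclk_hz == b.bclk_hz;
}
```

With 32-bit slots, a 16 kHz stream needs a 1.024 MHz BCLK and a 48 kHz stream needs 3.072 MHz, so the two cannot coexist on one pair of clock pins.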

tetele closed this as completed Jan 6, 2025
jhbruhn (Contributor) commented Jan 6, 2025

I have had surprisingly good success on my Onju Voice with some modified components (namely this: formatBCE/home-assistant-voice-pe@652b155), running mic and speaker at 48 kHz and using this very simple resampling implementation. microWakeWord and HACloud still pick up speech totally fine with that.

The only reason I haven't published that implementation yet is that native duplex support is missing. I hacked it in by disabling the speaker's I2S driver init, but as that is not a very tidy solution, I don't recommend that anyone use it.

Also, duplex I2S buses have not been tackled yet, as Voice-PE does not need support for them.
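
To illustrate the general idea of running the bus at 48 kHz and feeding 16 kHz audio to the wake word and voice pipeline, here is a rough sketch of a 3:1 downsampler. It is not the code from the linked commit, which is not reproduced in this thread. It assumes 16-bit mono PCM and simply averages each group of three samples, which is a crude low-pass filter at best. `downsample_48k_to_16k` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, NOT the implementation from the linked commit.
// Converts 48 kHz mono PCM to 16 kHz by averaging each group of three
// input samples and emitting one output sample per group.
// Trailing samples that do not fill a whole group are ignored.
// Returns the number of samples written to `out`.
std::size_t downsample_48k_to_16k(const int16_t *in, std::size_t in_len,
                                  int16_t *out, std::size_t out_cap) {
  assert(in != nullptr && out != nullptr);
  constexpr std::size_t kFactor = 3;  // 48000 / 16000
  std::size_t written = 0;
  for (std::size_t i = 0; i + kFactor <= in_len && written < out_cap;
       i += kFactor) {
    // Sum in 32 bits so three full-scale samples cannot overflow.
    const int32_t sum = int32_t{in[i]} + in[i + 1] + in[i + 2];
    out[written++] = static_cast<int16_t>(sum / 3);
  }
  return written;
}
```

A real implementation would probably want a proper anti-aliasing filter before decimating, but a plain average like this shows why an integer ratio such as 48 kHz to 16 kHz is cheap to handle.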

TheStigh (Author) commented Jan 6, 2025

@tetele Sorry for addressing it as a bug. Would it have been better if I had filed it as a feature request?

The idea was to see whether together we could find the best way to implement the new components for all of us who already have the Onju. Even when I get my HAVPE, I won't throw the Onju away; I want to get the best out of it.

Thanks for clearing up the 16/48 kHz issue.
