    Repositories list

    • moshi

      Public
      Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
      Python
      Apache License 2.0
      6037.5k336 Updated Feb 22, 2025
    • Swift
      MIT License
      57314 Updated Feb 20, 2025
    • hibiki

      Public
      Hibiki is a model for streaming speech translation (also known as simultaneous translation). Unlike offline translation, where one waits for the end of the source utterance before starting to translate, Hibiki adapts its flow to accumulate just enough context to produce a correct translation in real time, chunk by chunk.
      Rust
      Apache License 2.0
      6079161 Updated Feb 9, 2025
    • JAX bindings for the flash-attention2 kernels
      C++
      0700 Updated Jan 16, 2025
    • sphn

      Public
      Python bindings for symphonia/opus: read various audio formats from Python and write Opus files
      Rust
      Apache License 2.0
      43040 Updated Dec 22, 2024
    • yomikomi

      Public
      A small Rust-based data loader
      Rust
      Apache License 2.0
      02200 Updated Dec 10, 2024
    • ogg-table

      Public
      Ogg Vorbis reader with fast random access
      Rust
      Other
      1600 Updated Aug 29, 2024
    • JAX bindings for the flash-attention3 kernels
      C++
      11100 Updated Aug 6, 2024