Home
Mahina HUD is an intelligent augmented reality (AR) application designed for Apple Vision Pro on visionOS, featuring integrated real-time Speech-to-Text (STT), translation, and language tools. This application is tailored for accessibility, language learning, and enhanced interaction with real-world environments through dynamic AR features.
Mahina HUD is built with the following key objectives in mind:
- Real-Time Accessibility: Providing real-time transcription and translation to make spoken content accessible, regardless of language level or hearing ability.
- Language Learning & Enrichment: Offering tools to assist users in learning new languages through live subtitling, translation, and grammar analysis.
- Innovative Use of Augmented Reality: Leveraging the full potential of visionOS in the form of a heads-up display (HUD), with future plans to support AOSP-compatible hardware (as well as Android, other Linux platforms, etc.) to push the boundaries of AR experiences.
- Offline Capabilities: Exploring onboard STT models to ensure the fastest possible inference times for STT subtitling (see the sketch below).
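As a point of reference, here is a minimal sketch of what on-device speech recognition can look like with Apple's Speech framework. The class, property names, and locale are illustrative assumptions, not taken from the Mahina codebase, and the snippet assumes the locale's on-device model is installed.

```swift
import Speech

// Illustrative only: a minimal on-device recognition setup using Apple's
// Speech framework. Class, property, and locale choices are hypothetical,
// not taken from the Mahina codebase.
final class SubtitleTranscriber {
    private let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))
    private let request = SFSpeechAudioBufferRecognitionRequest()
    private var task: SFSpeechRecognitionTask?

    func start() {
        guard let recognizer = recognizer, recognizer.isAvailable else { return }

        // Keep inference on the device when the local model supports it,
        // so subtitling stays low-latency and works offline.
        if recognizer.supportsOnDeviceRecognition {
            request.requiresOnDeviceRecognition = true
        }
        request.shouldReportPartialResults = true  // stream partial hypotheses for live subtitles

        task = recognizer.recognitionTask(with: request) { [weak self] result, error in
            if let result = result {
                // Push the latest hypothesis to the subtitle layer (print for the sketch).
                print(result.bestTranscription.formattedString)
            }
            if error != nil || result?.isFinal == true {
                self?.task = nil
            }
        }
        // Microphone buffers from an AVAudioEngine tap would be appended
        // to the request via request.append(_:) as they arrive.
    }
}
```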
We are steadily making progress with support for multiple languages, and we now offer IPA (International Phonetic Alphabet) support for over 30 languages. We are also working on unique linguistics tools such as custom Mandarin tonal display modes and Japanese pitch accent display modes. Separately, we have been building a mode for long-text analysis of full documents, supporting the import and use of Core Tools on a wide array of file types (.doc, .pdf, .txt, .rtf, etc.). We are currently using OCR and machine learning models to test the possibilities of document analysis within the integrated Mahina toolkit.
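For context, the sketch below shows one way a single imported page could be run through OCR using Apple's Vision framework. The function name and the rendered-page input are assumptions for illustration, not a description of the Mahina pipeline itself.

```swift
import Foundation
import Vision

// Illustrative only: a minimal Vision-framework OCR pass over a single page
// image (for example, one rendered page of an imported PDF). The function
// name and parameters are hypothetical.
func recognizeText(in page: CGImage, completion: @escaping ([String]) -> Void) {
    let request = VNRecognizeTextRequest { request, _ in
        let observations = request.results as? [VNRecognizedTextObservation] ?? []
        // Keep the top candidate string for each detected text region.
        completion(observations.compactMap { $0.topCandidates(1).first?.string })
    }
    request.recognitionLevel = .accurate    // favor accuracy over speed for documents
    request.usesLanguageCorrection = true

    let handler = VNImageRequestHandler(cgImage: page, options: [:])
    DispatchQueue.global(qos: .userInitiated).async {
        try? handler.perform([request])     // recognized lines arrive via the completion handler
    }
}
```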
Mahina-tk – a full-fledged linguistic toolkit webapp
Over the last few months, part of the development journey has involved converting the language tools embedded in the native visionOS application into a webapp. This approach lets us organize Mahina's Core Tools in one place and makes the principal language tools of Mahina HUD available across multiple platforms. We expect to use this method to rapidly test experimental versions of Mahina in various application forms on macOS, Linux, Windows, and iOS, and potentially as browser extensions.
October has been a busy month for the Mahina HUD project, with a focus on refining several key features and expanding the application's capabilities. We are continuously evaluating new technologies and approaches to further enhance the user experience.
Our team is also exploring new directions in subtitling, language tools, and overall user interaction, with an emphasis on expanding language support and delivering more seamless real-time responses. Future developments are set to bring updates that push the boundaries of what's possible in AR language tech.
Mahina HUD is designed to support a variety of features, including:
- Real-Time Subtitling: Used in live conversations to visualize real-time communication for language learning and translation. (A multilayer subtitling feature is currently in development to expand the available options.)
- Language Tools: Features include POS tagging, transliteration, and detailed grammar feedback (see the sketch below).
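To make the POS tagging and transliteration tools more concrete, here is a small sketch using Apple's NaturalLanguage framework and Foundation's string transforms. The sample sentence, the Japanese example, and the printed output are illustrative only, not output from Mahina itself.

```swift
import Foundation
import NaturalLanguage

// Illustrative only: part-of-speech tagging with Apple's NaturalLanguage
// framework and transliteration via Foundation's string transforms.
// The sample strings are arbitrary.
let text = "Mahina displays live subtitles in augmented reality."
let tagger = NLTagger(tagSchemes: [.lexicalClass])
tagger.string = text

tagger.enumerateTags(in: text.startIndex..<text.endIndex,
                     unit: .word,
                     scheme: .lexicalClass,
                     options: [.omitPunctuation, .omitWhitespace]) { tag, range in
    if let tag = tag {
        print("\(text[range]) -> \(tag.rawValue)")   // e.g. "displays -> Verb"
    }
    return true   // keep enumerating
}

// A rough transliteration example: Japanese kana to a Latin-script reading.
let reading = "こんにちは".applyingTransform(.toLatin, reverse: false)
print(reading ?? "")                                  // roughly "kon'nichiwa"
```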
As APIs for visionOS become available, we also plan to explore:
- Environment & Visual Tools: Tools such as subject detection and environment augmentation within AR spaces.
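Should those APIs materialize, subject detection could, for example, be prototyped with Apple's Vision framework. The sketch below is a rough illustration under that assumption; the function name and confidence threshold are hypothetical, and Mahina's eventual approach may differ.

```swift
import Vision

// Purely illustrative: one way subject detection could be prototyped with
// Apple's Vision framework once the relevant visionOS camera APIs allow it.
// The function name and confidence threshold are hypothetical.
func classifySubjects(in frame: CGImage) {
    let request = VNClassifyImageRequest { request, _ in
        guard let results = request.results as? [VNClassificationObservation] else { return }
        // Keep a handful of reasonably confident labels for an AR overlay.
        let labels = results
            .filter { $0.confidence > 0.3 }
            .prefix(5)
            .map { "\($0.identifier) (\(Int($0.confidence * 100))%)" }
        print(labels.joined(separator: ", "))
    }
    let handler = VNImageRequestHandler(cgImage: frame, options: [:])
    try? handler.perform([request])
}
```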
For a full list of supported features, visit the Core Tools Overview.
Who:
Mahina HUD is developed by PhasaTek Labs in Tokyo, Japan. This project aims to bring cutting-edge language tools to the XR industry by visualizing spoken language. The project currently utilizes third-party APIs and libraries.
Testing Status:
This project is being alpha tested internally with trusted partners. For more information, feel free to reach out to the PhasaTek Labs development team at info@phasatek.jp.