Aatlantise/jsgf_nlu

NLU JSGF task

Simple CFG development in JSGF

Task 1: English grammar extension

My understanding of JSGF and its <unk> tags is rudimentary. From what I've gathered from the documentation (linked in the pdf) and the explanations in the pdf, tokens without an explicit mapping to a tag are tagged <unk>. I have assumed so in my tasks.

I organized the generatable sentences into <play_request> <music_entity>, where <play_request> is implicitly defined as (i want to listen to | [can you] (play [me] | put on)).

The extended grammar, in [english_extension.java](english_extension.java), can generate:

  • [can you play]<unk> [beatles]<artist>
  • [can you put on]<unk> [paranoid android]<song>
  • [i want to listen to]<unk> [jazz]<genre> [music]<unk>
  • [play me]<unk> [ummagumma]<album> [by]<unk> [pink floyd]<artist>
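
To make the <play_request> <music_entity> structure concrete, here is a minimal Python sketch that expands the alternation quoted above. The one- and two-item vocabularies and the shape of `music_entities` are assumptions taken from the sample utterances; the actual rules live in english_extension.java and may differ.

```python
from itertools import product

def play_requests():
    """Expand (i want to listen to | [can you] (play [me] | put on))."""
    yield "i want to listen to"
    for can_you in ("", "can you "):  # optional [can you]
        yield f"{can_you}play"
        yield f"{can_you}play me"     # optional [me]
        yield f"{can_you}put on"

# Hypothetical vocabularies, copied from the sample utterances only.
ARTISTS = ["beatles", "pink floyd"]
SONGS = ["paranoid android"]
GENRES = ["jazz"]
ALBUMS = ["ummagumma"]

def music_entities():
    """Assumed shape of <music_entity>, inferred from the four samples."""
    yield from ARTISTS
    yield from SONGS
    for genre in GENRES:
        yield f"{genre} music"          # "music" would be tagged <unk>
    for album, artist in product(ALBUMS, ARTISTS):
        yield f"{album} by {artist}"    # "by" would be tagged <unk>

def utterances():
    return [f"{req} {ent}" for req, ent in product(play_requests(), music_entities())]
```

With these toy vocabularies the sketch yields 7 request prefixes times 6 entities. It also overgenerates pairs such as "ummagumma by beatles", since nothing here ties an album to its artist.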

As the model is scaled, music entity names are likely to contain phrases such as "can you play" or "want to listen", along with other tokens that should be tagged <unk>, so request phrases may be misparsed as part of a <music_entity>. It may then be beneficial to define <play_request> explicitly and have the model identify both <play_request> and <music_entity>. Tokens like [music] and [by] may also need to be explicitly defined.

Task 2: Korean localization

2.1. Localization

The localization, in [korean_localization.java](korean_localization.java), explicitly defines <play_request> and introduces <want_listen>, which can be used as a request on its own but can also accompany <play_request>.

2.2. Sample utterances [^1]:

The localized grammar is able to generate:

  • [비틀즈]<artist_ko> [재생해줘]<play_request>
  • [파라노이드 안드로이드]<song_ko> [듣고싶어]<want_listen>
  • [재즈]<genre_ko> [음악]<unk> [듣고싶은데]<want_listen> [틀어줘]<play_request>
  • [핑크플로이드]<artist_ko> [의]<unk> [우마구마]<album_ko> [틀어줄 수 있니]<play_request>
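
The interplay between <want_listen> and <play_request> can be sketched in Python as follows. This is an illustration under assumptions, not the contents of korean_localization.java: the vocabularies are limited to the sample tokens, and splitting <want_listen> into a sentence-final form (듣고싶어) and a connective form (듣고싶은데) is my own simplification of how the samples behave.

```python
from itertools import product

# Entity phrases as (token, tag) sequences, copied from the samples above.
ENTITIES = [
    [("비틀즈", "artist_ko")],
    [("파라노이드 안드로이드", "song_ko")],
    [("재즈", "genre_ko"), ("음악", "unk")],
    [("핑크플로이드", "artist_ko"), ("의", "unk"), ("우마구마", "album_ko")],
]

# Assumed split: the final form ends an utterance, the connective form
# is followed by a <play_request>.
WANT_LISTEN_FINAL = ["듣고싶어"]
WANT_LISTEN_CONNECTIVE = ["듣고싶은데"]
PLAY_REQUEST = ["재생해줘", "틀어줘", "틀어줄 수 있니"]

def request_tails():
    """Yield every request ending: play only, want only, or want + play."""
    for play in PLAY_REQUEST:
        yield [(play, "play_request")]
    for want in WANT_LISTEN_FINAL:
        yield [(want, "want_listen")]
    for want, play in product(WANT_LISTEN_CONNECTIVE, PLAY_REQUEST):
        yield [(want, "want_listen"), (play, "play_request")]

def render(tagged):
    """Format (token, tag) pairs in the [token]<tag> notation used above."""
    return " ".join(f"[{tok}]<{tag}>" for tok, tag in tagged)

def tagged_utterances():
    return [render(ent + tail) for ent, tail in product(ENTITIES, request_tails())]
```

Calling `tagged_utterances()` returns the four sample utterances among its results, already rendered in the bracket notation.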

2.3. Difficulties in scaling

Scaling a grammar is no easy task. I think one significant issue that may arise in scaling the above grammar is the following:

Korean transliterations of songs, artists, and genres of non-Korean origin may not accurately reflect how they are pronounced. For example, the genre jazz is most commonly written as 재즈 but is pronounced 째즈. Varying levels of fluency in English or other languages may also affect how a Korean speaker pronounces a song, artist, or genre. A corpus generated by a grammar that does not reflect such variance may not be effective for voice recognition of music entities; it may take a hybrid corpus that covers not only 재즈 but also 째즈 and jazz.
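
One way to build such a hybrid corpus is to expand each entity token into its known surface variants before generating utterances. The Python sketch below assumes a hand-maintained `VARIANTS` table (a hypothetical name); only the 재즈 entry comes from the example above.

```python
# Hypothetical variant table: canonical spelling -> surface forms to train on.
VARIANTS = {
    "재즈": ["재즈", "째즈", "jazz"],
}

def expand_variants(tokens, variants=VARIANTS):
    """Return every utterance obtained by swapping tokens for their variants."""
    results = [[]]
    for tok in tokens:
        options = variants.get(tok, [tok])  # tokens without variants stay as-is
        results = [prefix + [opt] for prefix in results for opt in options]
    return [" ".join(r) for r in results]
```

For instance, `expand_variants(["재즈", "음악", "틀어줘"])` produces one utterance per spelling of the genre while leaving the other tokens unchanged.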

2.4. Language-specific difficulties

I believe conjugation is a particularly challenging Korean-specific problem.

Korean verbs take many conjugated forms. For example, depending on how the user regards the voice recognition module (as an older friend, an assistant, a younger friend, a subordinate, etc.), they may use different types of conjugation. In that case, 재생해줘 from 2.2 [^1] may take other forms such as 재생해주세요, 재생해줘요, 재생해, or 재생해주라. Sentence-ending conjugation may also vary with the user's mood, their native dialect, and other factors. Ideally this problem is addressed end to end, as most pre-trained transformer models should learn to focus on the token 재생. Even if hardware constraints limit the system to lightweight or rule-based models, I believe feature engineering or data augmentation, both ideally human-in-the-loop, should be sufficient to address the issue.
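
For the rule-based case, a minimal Python sketch of the data augmentation idea is to pair a verb stem with a list of sentence endings, then use the same list to recognize variants. The stem 재생해 and the endings are read off the forms listed above; the names `PLAY_STEMS` and `ENDINGS` are hypothetical, and a real list would need human review for dialect and mood variants.

```python
import unicodedata

# Stem and endings taken from the forms listed above:
# 재생해, 재생해줘, 재생해줘요, 재생해주세요, 재생해주라
PLAY_STEMS = ("재생해",)
ENDINGS = ("", "줘", "줘요", "주세요", "주라")

def conjugations(stem):
    """Augmentation: generate every stem + ending surface form."""
    return [stem + ending for ending in ENDINGS]

def is_play_request(token):
    """Recognition: True if token is a known stem followed by a known ending."""
    # Normalize to composed Hangul syllables so prefix matching is reliable.
    token = unicodedata.normalize("NFC", token)
    return any(
        token.startswith(stem) and token[len(stem):] in ENDINGS
        for stem in PLAY_STEMS
    )
```

`conjugations("재생해")` reproduces the five forms listed above, and `is_play_request` accepts each of them while rejecting unrelated words that merely share the first syllables.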
