<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <link rel="stylesheet" href="./style.css">
    <title>Transformer-S2A</title>
</head>
<body>
    <div class="container">
        <div class="title-box">
            <img src="./img/THUHCSI_logo.png" alt="THUHCSI_Logo">
            <p id="paper-title">Transformer-S2A: Robust and Efficient Speech-to-Animation</p>
            <p id="submit">Submitted to ICASSP 2022</p>
            <p id="ack"><a target="_blank" href="https://digitaldomain.com/?lang=zh-hant">Digital Domain</a> created the digital avatar and provided all rendered demos.</p>
        </div>

        <div class="single-demo">
            <div class="subtitle">
                <h1>Speaking Mandarin</h1>
                <p>Both the proposed model and the baseline are trained on a Mandarin dataset.</p>
                <p>Upper: baseline (frame-level). Lower: proposed.</p>
            </div>
            <div class="media">
                <video width="100%" height="100%" controls>
                    <source src="demos/frameLevel.mp4" type="video/mp4">
                </video>
            </div>
            <div class="notion">
                <h3 >Frame Level</h3>
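                <p>Transcription: <em>随着年轻人聚集，县城就可以不断滚动提升，最终实现真正的品质提升.</em> (English: "As young people gather, the county town can keep improving step by step and eventually achieve a real improvement in quality.")</p>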
            </div>
            <div class="clear"></div>

            <div class="media">
                <video width="100%" height="100%" controls>
                    <source src="demos/seqLevel.mp4" type="video/mp4">
                </video>
            </div>
            <div class="notion">
                <h3 >Proposed</h3>
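                <p>Transcription: <em>随着年轻人聚集，县城就可以不断滚动提升，最终实现真正的品质提升.</em> (English: "As young people gather, the county town can keep improving step by step and eventually achieve a real improvement in quality.")</p>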
            </div>
            <div class="clear"></div>
        </div>

        <div class="single-demo">
            <div class="subtitle">
                <h1>Transfer to Unseen Speaker and Language (English)</h1>
                <p>The proposed S2A model is trained only on a Mandarin dataset.</p>
                <p>The compared baseline (LipSync3D) is trained on an English dataset.</p>
                <p>Left: proposed. Right: LipSync3D.</p>
            </div>
            <div class="media">
                <video width="100%" height="100%" controls>
                    <source src="demos/comparison.mp4" type="video/mp4">
                </video>
            </div>
            <div class="notion">
                <h3 >Transcription</h3>
                <p><em>I can’t promise that I’ll be an expert at it and be able to help you get better with it, but at least we can have some fun.</em></p>
            </div>
            <div class="clear"></div>
        </div>

        <div class="single-demo">
            <div class="subtitle">
                <h1>Ability to Sing</h1>
                <p>Trained only on a Mandarin speech dataset.</p>
            </div>
            <div class="media">
                <video width="100%" height="100%" controls>
                    <source src="demos/sing.mp4" type="video/mp4">
                </video>
            </div>
            <div class="notion">
                <h3 >Lyric</h3>
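                <p><em>终于做了这个决定，别人怎么说我不理，只要你也一样的肯定. —— <a id="link_lyric" target="_blank" href="https://youtu.be/BVpXUyXPKOg?t=21">《勇气》</a></em> (English: "I have finally made this decision; I don't care what others say, as long as you are just as sure.")</p>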
            </div>
            <div class="clear"></div>
        </div>
    </div>

</body>
</html>