Automated quantitative analysis of vocal fold vibration using two-dimensional scanning videokymography after transoral laser microsurgery

Article information

Clin Arch Commun Disord. 2021;6(1):39-47
Publication date (electronic) : 2021 April 30
doi : https://doi.org/10.21849/cacd.2021.00374
1Department of Speech and Language Pathology, Kosin University, Busan, Korea
2Department of Speech and Hearing Therapy, Catholic University of Pusan, Busan, Korea
Correspondence: Hee-June Park, Department of Speech and Hearing Therapy, Catholic University of Pusan, 74 Oryundae-ro, Geumjeong-gu, Busan 46265, Korea, Tel: +82-51-510-0846, Fax: +82-51-510-0848, E-mail: june@cup.ac.kr
Received 2021 April 5; Revised 2021 April 15; Accepted 2021 April 20.

Abstract

Purpose

The purpose of this study was to develop a quantitative automated analysis system for two-dimensional videokymography (2D VKG) using the threshold segmentation method, to analyze the vocal fold vibration characteristics of vocally healthy subjects and patients who underwent CO2 transoral laser microsurgery using this system, and to examine the differences in accuracy between the automatic and automated analysis methods.

Methods

Twenty-nine male patients who underwent CO2 transoral laser microsurgery and 10 vocally healthy males participated in the study. The opening quotient (OQ), phase symmetry index (PSI), and amplitude symmetry index (ASI) of 2D VKG images were analyzed quantitatively using an automated analysis program that combines glottal area extraction by the threshold segmentation method with edge detection by a manual plotting technique.

Results

Automatic analysis enabled accurate quantitative analysis in 76.9% (30/39) of the images. For the samples showing gross errors, the automated analysis corrected by the manual plotting technique was more accurate than the automatic analysis. The results of automated and automatic analysis did not differ significantly in ASI, but they differed in OQ and PSI. In addition, the vocally healthy group and the patient group differed significantly in all parameters (OQ, ASI, PSI).

Conclusions

The 2D VKG, which can evaluate the vibration of the entire vocal fold tissue in real time, is useful as a laryngeal imaging technique that visualizes the structure and function of the vocal fold. Quantitative analysis using automated and automatic analysis can increase the clinical usability of 2D VKG.

INTRODUCTION

Direct observation of the structure and function of the vocal fold is an important process for understanding normal vocalization and evaluating speech impairment. In particular, for malignant lesions such as glottic cancer, the rate of transformation from premalignant to malignant lesions is 6–22% [1], and severity increases when diagnosis is delayed, so early detection of lesions through laryngeal imaging is of utmost importance. To this end, various digital imaging techniques have been developed to obtain more accurate images of the structure and vibratory patterns of vocal fold tissue, and they are useful not only for diagnosing lesions but also for observing the progress of treatment [2].

Imaging technology using a laryngoscope has been continuously developed since Manuel Garcia directly observed his larynx, and high-resolution laryngeal videoendoscopy is currently the most widely used in clinical practice.

However, because conventional laryngeal videoendoscopy cannot resolve the very rapid vibration of the vocal fold, which is also invisible to the naked eye, various methods have been developed to evaluate vocal fold dynamics and morphology more accurately, including laryngeal videostroboscopy (LVS) [3], high-speed videoendoscopy (HSV) [4], digital kymography (DKG) [5], line scanning videokymography (VKG) [6], and two-dimensional scanning videokymography (2D VKG) [7].

The LVS and HSV systems, which are commonly used to evaluate vocal fold dynamics, make it easy to assess vocal fold vibration, which plays an important role in voice production. However, because the LVS image of vocal fold vibration is an illusory image that requires synchronization with a trigger, it is difficult to identify the mucosal waves of the vocal fold in aperiodic voices whose periodicity has severely broken down [8]. HSV is recognized as a useful tool because it allows direct observation of the actual vocal fold vibration, but it has the disadvantage that saving and playing back the data takes a long time [9]. DKG was developed to address these limitations, but it extracts only a specific line from the entire laryngeal image and, because it is obtained by post-processing HSV images, it is time consuming [10].

On the other hand, unlike DKG, 2D VKG can observe the movement of the entire vocal fold in real time, which avoids unnecessary time consumption and allows specific variables related to vocal fold vibration to be measured objectively [11]. Whereas VKG extracts 7,200 lines/s at a resolution of 240 pixels, 2D VKG acquires full HD (1,920×1,080 pixels) images at 30 frames/s, that is, 32,400 lines/s. It can also prevent the distortion caused by patient movement in VKG [12].
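The 2D VKG line rate quoted above follows from the number of image rows and the frame rate. The short Python sketch below reproduces that arithmetic; the function name `line_rate` is illustrative, and the sketch assumes that a rolling-shutter sensor reads every row of every frame exactly once.

```python
def line_rate(rows_per_frame, frames_per_second):
    """Lines scanned per second, assuming each row of each frame is read once."""
    return rows_per_frame * frames_per_second


# Full HD has 1,080 rows; at 30 frames/s this gives 32,400 lines/s.
print(line_rate(1080, 30))
```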

The tools mentioned above provide spatial or temporal information in various ways, and to use this information in clinical practice, a quantitative evaluation capable of quantifying vocal fold vibration characteristics is necessary. Measurement variables have been studied for LVS [13], VKG [14], HSV [15], DKG [16,17], 2D DKG [10], and other laryngeal imaging methods. In 2D VKG as well, various parameters can be used to measure the characteristics of vocal fold vibration, including absence of vibration of the vocal fold, duration of glottal closure, left-right asymmetry, shape of lateral peaks, laterally traveling mucosal wave, opening versus closing duration, shape of medial peak, and cycle aberrations [11,12].

These parameters can be analyzed more effectively using an analysis system. To evaluate vocal fold vibration quantitatively using objective parameters, 2D VKG also needs a quantitative analysis system such as those developed for VKG [18], the glottal area waveform (GAW), the phonovibrogram [19], HSV [20], and DKG [15].

Recently, various automatic analysis systems based on digital image processing techniques have been developed and show a high level of image recognition [15,19]. However, in automatic analysis systems, the accuracy of the analysis result is often poor even when image recognition succeeds, owing to the influence of the edge extraction algorithm, the light source, the size and shape of the vocal fold, pathological conditions, and mucus. For this reason, automated quantitative analysis using a manual plotting technique is also recognized as an effective method [20–22]. In this study, automatic analysis refers to a system that processes everything automatically, from segmentation and edge extraction to parameter calculation; automated analysis refers to a system in which parameter values are calculated after the user designates points on the image in the program; and hand-operated analysis refers to performing all tasks by hand.

In order to properly utilize 2D VKG in the clinical field, it is required to develop an analysis program capable of accurate and rapid quantitative evaluation. In addition, due to the lack of research by disease on 2D VKG, studies on various diseases and targets are needed for the evidence-based practice of 2D VKG.

Therefore, the purpose of this study is to develop an analysis system in which automatic and automated analysis methods are mixed for quantitative analysis of 2D VKG, and to quantitatively compare the vocal fold vibration characteristics of vocally healthy subjects and patients undergoing transoral laser microsurgery.

METHODS

Subjects

Twenty-nine male patients (mean age 53.2±9.8 years) who were diagnosed with T1–2 unilateral glottic cancer or unilateral laryngeal leukoplakia in the otolaryngology department and underwent CO2 transoral laser cordectomy between January and August 2020, and 10 vocally healthy males (mean age 45.0±4.8 years) as a control group, participated in this study. Subjects whose glottal area was difficult to observe because of false vocal fold over-adduction were excluded from the study.

Instrumentation

Camera system

A 2D VKG camera (USC-710HD, U-medical, Korea) with a complementary metal-oxide semiconductor (CMOS) video sensor and rolling shutter was connected to a 4 mm rigid 70° endoscope (8700 CKA, Storz, Germany) via a 16–34 mm zoom coupler (MGB, Germany). To obtain videokymographic images of the entire vocal fold, recording was performed under illumination with a 300-watt xenon light source (NOVA 300, Storz, Germany) at a frame rate of 25 frames per second and 32,400 lines per second, with a spatial resolution of full HD (1,920×1,080 pixels). The 2D VKG camera can be set to 25, 30, 50, or 60 frames per second, depending on the needs of the examiner. Auditory feedback was provided using an Android smartphone application (Function generator, Keuwl, UK) (Figure 1).

Figure 1

Schematic illustration of examination using two-dimensional scanning kymography.

2D VKG analysis program

The user interface of the program was implemented in C++ with OpenCV, as shown in Figure 2. The system’s software can record, play back, and save 2D VKG images as still images, and can analyze parameters by post-processing. Existing post-analysis of 2D VKG was performed on still images using the hand-operated method, but it is difficult to analyze the dynamic movements of the vocal fold properly in a single image, so both analysis of video and automated analysis are required. For this purpose, this study used a method that automatically extracts the kymographic region from a video without conversion to still images, and the moving kymographic image was corrected using a tracking algorithm (Figures 2–5). Edge extraction and parameter calculation are performed automatically for each frame of the video, and automated analysis that corrects the edges using a manual plotting technique is possible if necessary.

Figure 2

Graphical user interface for image processing and calculation of parameters. (1) Patient information, (2) Image load, (3) Original image window, (4) Convert image window, (5) Result kymography, (6) Still image window, (7) Movement frame, (8) Print button, (9) Data_clear button, (10) Object parameter, (11) Subject parameter.

Figure 3

A proposed flow chart of extraction algorithm for automatic tracing of the two-dimensional video kymography.

Figure 4

The calculating method of characteristic parameters in the typical 2D scanning VKG image. T, vibration period of a cycle; To and Tc, open and closed duration of the glottis; a1 and a2, open amplitude of the left and right vocal fold; m1 and m2, mucosal wave of the left and right vocal fold; t1 and t2, moments when the left and right vocal fold reach the maximum open amplitude, respectively, in the same cycle. The parameters for quantitative analysis are calculated as follows. Fundamental frequency (Hz)=Tentire/T×(frames per second), Phase symmetric index=(t1+t2)/T, Amplitude symmetric index=(a1−a2)/(a1+a2), Open quotient=To/T, Close quotient=Tc/T.

Figure 5

The sequential image of mucosal wave from two-dimensional scanning videokymography in subject with CO2 laser cordectomy (Male/36).

To analyze the 2D VKG video, the analysis was divided into the image processing steps shown in Figure 3. In this study, conversion to the Hue, Saturation, Value color model (abbreviated HSV in this paragraph, distinct from high-speed videoendoscopy) was used to overcome the problems of the Red, Green, Blue (RGB) color model and to improve the reliability of the image. First, a color object is declared for each of the input image and the reference image, and the HSV values of each image are extracted using the color object. The extracted HSV values are divided by a threshold value corresponding to an angle and stored in separate arrays for the input image and the reference image. The values of the arrays are summed, converted into an absolute value, and compared with a reference value; if the reference value is exceeded, the frame is extracted as a key frame.

After the 2D VKG image was converted to an inverted gray-scale image, Gaussian blurring was applied to soften the reflections and bright highlights produced by liquid secretions on the vocal fold under the strong light source (Formula 1).

(Formula 1) $g(x,y)=\frac{1}{2\pi\sigma^{2}}\,e^{-\frac{x^{2}+y^{2}}{2\sigma^{2}}}$
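As an illustration of the Gaussian profile in Formula 1, the following Python sketch samples g(x, y) on a small square grid and normalizes the weights so that blurring does not change the overall brightness. It is not the authors' C++/OpenCV code; the function name, the kernel size, and the normalization step are assumptions made for this example.

```python
import math


def gaussian_kernel(size, sigma):
    """Sample g(x, y) on a size x size grid centred at (0, 0) and
    normalise the weights so that they sum to 1."""
    if size < 1 or size % 2 == 0:
        raise ValueError("size must be a positive odd integer")
    half = size // 2
    kernel = [
        [
            math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            / (2.0 * math.pi * sigma * sigma)
            for x in range(-half, half + 1)
        ]
        for y in range(-half, half + 1)
    ]
    total = sum(sum(row) for row in kernel)
    return [[weight / total for weight in row] for row in kernel]


kernel = gaussian_kernel(5, 1.0)
print(kernel[2][2])  # the centre weight is the largest one
```

A larger sigma spreads the weights more evenly and therefore blurs specular highlights more strongly, at the cost of softer glottal edges.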

After that, the objects contained in the image were separated using Otsu’s method to obtain the 2D VKG image information. Otsu’s method performs automatic, clustering-based image thresholding, that is, it reduces a gray-scale image to a binary image. Statistically, the total variance can be expressed as the sum of the within-class variance and the between-class variance (Formula 2).

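(Formula 2)
$\sigma^{2}=\sigma_{\omega}^{2}+\sigma_{b}^{2}$
$\sigma_{\omega}^{2}(t)=\omega_{1}(t)\sigma_{1}^{2}(t)+\omega_{2}(t)\sigma_{2}^{2}(t)$
$\sigma_{b}^{2}(t)=\sigma^{2}-\sigma_{\omega}^{2}(t)=\omega_{1}(t)\omega_{2}(t)[\mu_{1}(t)-\mu_{2}(t)]^{2}$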

Yan, Chen, and Bless (2006) calculated the global threshold from the intensity histogram of the entire image, but in this study, segmentation was performed by applying Otsu’s method to the intensity histogram inside the region of interest only.
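The thresholding step can be sketched as follows in Python. The sketch searches for the gray level that maximizes the between-class variance of Formula 2, using only the pixels inside a rectangular region of interest. The rectangular ROI, the helper names `roi_pixels` and `otsu_threshold`, and the toy image are assumptions for illustration; the authors' program uses C++ with OpenCV and its ROI shape is not specified here.

```python
def roi_pixels(image, top, bottom, left, right):
    """Flatten rows [top, bottom) and columns [left, right) of a 2-D list."""
    return [image[r][c] for r in range(top, bottom) for c in range(left, right)]


def otsu_threshold(pixels):
    """Return the gray level t (0-255) that maximises the between-class
    variance w1 * w2 * (mu1 - mu2)**2. Pixels <= t form class 1 and
    pixels > t form class 2."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    count1 = 0  # number of pixels in class 1 so far
    sum1 = 0    # sum of gray levels in class 1 so far
    for t in range(256):
        count1 += hist[t]
        sum1 += t * hist[t]
        count2 = total - count1
        if count1 == 0 or count2 == 0:
            continue  # one class is empty, the variance is undefined
        w1, w2 = count1 / total, count2 / total
        mu1, mu2 = sum1 / count1, (sum_all - sum1) / count2
        between = w1 * w2 * (mu1 - mu2) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


# Toy image: a bright border (255) that would bias a global histogram,
# and an ROI containing dark glottis (20) next to brighter tissue (200).
image = [
    [255, 255, 255, 255],
    [255, 20, 200, 255],
    [255, 20, 200, 255],
]
t = otsu_threshold(roi_pixels(image, 1, 3, 1, 3))
mask = [[1 if image[r][c] > t else 0 for c in range(1, 3)] for r in range(1, 3)]
print(mask)
```

Restricting the histogram to the ROI keeps bright or dark structures outside the glottal region from shifting the threshold.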

A sequential algorithm was used for connected component labeling, which separates each independent area of a binary image and marks it with its own label value. The videokymography image was binarized to generate a binary image, blob coloring was performed on the binary image, and the results were checked.
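A sequential (two-pass) connected component labeling scheme of the kind described above can be sketched in Python as follows. The choice of 4-connectivity and the union-find bookkeeping are assumptions for this example, since the text does not state which connectivity or equivalence handling the program uses.

```python
def label_components(binary):
    """Two-pass connected component labelling with 4-connectivity.
    `binary` is a 2-D list of 0/1 values. Returns (labels, count), where
    labels holds 0 for background and 1..count for each blob."""
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    parent = [0]  # parent[i] is the union-find parent of provisional label i

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # First pass: assign provisional labels and record equivalences.
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c]:
                continue
            up = labels[r - 1][c] if r > 0 else 0
            left = labels[r][c - 1] if c > 0 else 0
            if up == 0 and left == 0:
                parent.append(len(parent))  # new label is its own root
                labels[r][c] = len(parent) - 1
            elif up and left:
                root_up, root_left = find(up), find(left)
                low, high = min(root_up, root_left), max(root_up, root_left)
                parent[high] = low  # merge the two equivalence classes
                labels[r][c] = low
            else:
                labels[r][c] = up or left

    # Second pass: replace provisional labels by consecutive final labels.
    final = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                root = find(labels[r][c])
                labels[r][c] = final.setdefault(root, len(final) + 1)
    return labels, len(final)


labels, count = label_components([
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
    [1, 0, 1, 1],
])
print(count)
```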

Data acquisition

2D VKG examinations were performed on subjects who had undergone CO2 laser microsurgery and on vocally healthy subjects; the surgical subjects were examined one month after surgery. To acquire 2D VKG images, an otolaryngologist inserted an endoscope into the subject’s oral cavity, and the subject sustained the vowel /i/ or /e/ for 5 seconds. Because the 2D VKG uses a rolling shutter, a sharper image can be obtained at a frequency corresponding to a multiple of the camera frame rate [23]. To this end, tones at frequencies that are multiples of the frame rate of the 2D VKG camera (100, 125, 150, 175, 200, 225, 250 Hz) were provided as auditory feedback, and the subjects were induced to phonate at the same pitch.
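The relation between the feedback tones and the camera frame rate can be checked with a few lines of Python. The sketch assumes the 25 frames per second setting reported in the camera system description; the function names are illustrative.

```python
def cycles_per_frame(f0_hz, frames_per_second):
    """Number of vibratory cycles captured within one rolling-shutter frame."""
    return f0_hz / frames_per_second


def is_frame_rate_multiple(f0_hz, frames_per_second):
    """True when f0 is an integer multiple of the frame rate, so that each
    frame contains a whole number of cycles."""
    return f0_hz % frames_per_second == 0


tones = [100, 125, 150, 175, 200, 225, 250]
for tone in tones:
    print(tone, cycles_per_frame(tone, 25), is_frame_rate_multiple(tone, 25))
```

At 25 frames per second, every listed tone yields a whole number of cycles per frame, from 4 cycles at 100 Hz to 10 cycles at 250 Hz.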

Data analysis

The recorded images were loaded into the 2D VKG analysis software for quantitative analysis. The tilt of the image that may occur during examination with a laryngoscope can increase the error rate in calculating parameters. To correct this, the 2D VKG image was post-processed so that the straight line between the anterior commissure of the vocal fold and the posterior vocal process was perpendicular to the horizontal center line.
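The tilt correction can be illustrated with a small geometric sketch in Python. It computes the signed angle between the glottal axis and the vertical image axis and shows that rotating by this angle makes the axis vertical. The landmark coordinates are hypothetical, the two landmarks are assumed to be marked as (x, y) pixel positions, and rotating the landmark points stands in for rotating the whole image.

```python
import math


def tilt_from_vertical_deg(anterior, posterior):
    """Signed angle in degrees between the line from the anterior commissure
    to the vocal process and the vertical image axis. Points are (x, y)
    pixel coordinates; 0 means the line is already vertical."""
    dx = posterior[0] - anterior[0]
    dy = posterior[1] - anterior[1]
    return math.degrees(math.atan2(dx, dy))


def rotate_point(point, centre, angle_deg):
    """Rotate `point` about `centre` with the matrix
    [[cos a, -sin a], [sin a, cos a]] applied to (x, y)."""
    a = math.radians(angle_deg)
    x, y = point[0] - centre[0], point[1] - centre[1]
    return (
        centre[0] + x * math.cos(a) - y * math.sin(a),
        centre[1] + x * math.sin(a) + y * math.cos(a),
    )


anterior = (100.0, 50.0)    # hypothetical anterior commissure position
posterior = (130.0, 250.0)  # hypothetical vocal process position
tilt = tilt_from_vertical_deg(anterior, posterior)
corrected = rotate_point(posterior, anterior, tilt)
print(round(tilt, 2), corrected)
```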

In this study, as shown in Figure 4, automatic quantitative analysis based on the threshold segmentation method was performed for the open quotient (OQ), phase symmetry index (PSI), and amplitude symmetry index (ASI), which are objective parameters of 2D VKG whose clinical usefulness was verified in previous studies [11,12].
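Several of the definitions given in the Figure 4 caption can be written as a short Python sketch: the fundamental frequency, the open and closed quotients, and the ASI. The sketch assumes that all durations are measured in pixel rows of the kymogram, that Tentire is the height of one frame in rows, and that the ASI numerator is the difference a1 − a2. The numeric inputs are made-up illustration values, not measurements from this study.

```python
def fundamental_frequency(frame_rows, period_rows, frames_per_second):
    """F0 (Hz) = Tentire / T x frames per second, with both lengths in rows."""
    return frame_rows / period_rows * frames_per_second


def open_quotient(open_rows, period_rows):
    """OQ = To / T."""
    return open_rows / period_rows


def closed_quotient(closed_rows, period_rows):
    """CQ = Tc / T."""
    return closed_rows / period_rows


def amplitude_symmetry_index(a1, a2):
    """ASI = (a1 - a2) / (a1 + a2), with a1 and a2 the open amplitudes of
    the left and right vocal fold."""
    return (a1 - a2) / (a1 + a2)


# Illustrative values: a 1,080-row frame at 25 frames/s, a 270-row cycle
# with 162 open rows and 108 closed rows, and amplitudes of 40 and 30 pixels.
print(fundamental_frequency(1080, 270, 25))
print(open_quotient(162, 270), closed_quotient(108, 270))
print(amplitude_symmetry_index(40, 30))
```

Because every quantity is a ratio of lengths measured on the same image, the quotients and the ASI do not depend on the physical pixel size.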

When digital image processing produced a gross error exceeding ±5 pixels at the glottal edge of the segmented 2D VKG image, automated quantitative analysis was performed in which the edge was corrected using a manual plotting technique.
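The ±5 pixel criterion can be expressed as a simple comparison between an automatically extracted edge and a reference edge. In the study, errors were judged by visual-perceptual inspection, so the reference trace below (for example, manually plotted points), the function names, and the sample values are hypothetical and serve only to illustrate the rule.

```python
def max_edge_deviation(auto_edge, reference_edge):
    """Largest absolute difference, in pixels, between two edge traces
    sampled on the same image rows."""
    if len(auto_edge) != len(reference_edge):
        raise ValueError("edge traces must have the same length")
    return max(abs(a - b) for a, b in zip(auto_edge, reference_edge))


def classify_error(auto_edge, reference_edge, tolerance_px=5):
    """Return 'gross' when the deviation exceeds the tolerance, else 'fine'."""
    deviation = max_edge_deviation(auto_edge, reference_edge)
    return "gross" if deviation > tolerance_px else "fine"


reference = [11, 12, 13, 14]  # hypothetical manually plotted edge positions
print(classify_error([10, 12, 15, 14], reference))  # deviation of 2 pixels
print(classify_error([10, 12, 25, 14], reference))  # deviation of 12 pixels
```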

Statistical analysis

An independent t test was performed as a non-inferiority test between automatic analysis and automated analysis and to test differences in the quantitative parameters. Statistical analysis was performed using SPSS version 21.0 (SPSS Inc., Chicago, IL, USA).

RESULTS

Vibratory characteristic of vocal fold with fibrotic change on 2D VKG

Figure 5 shows sequential images of the vocal fold vibration of a patient with fibrotic changes in the superficial lamina propria of the left vocal fold after laser microsurgery. Incomplete glottal closure, an amplitude difference caused by the decreased amplitude of the left vocal fold, and a phase difference between the left and right vocal folds can be observed. ASI and PSI differ, but the frequency of the left and right vocal folds is the same, and the number of vibratory cycles can be seen to increase from low to high pitch.

Quantitative analysis on 2D VKG

From 2D VKG images, which capture kymographic images of the entire vocal fold, the region of interest and glottal edge were extracted using Otsu’s method and quantitative analysis was performed. Forty samples were analyzed by automatic analysis, and 39 were properly segmented; the remaining one had insufficient light intensity. On visual-perceptual analysis of the images, only fine errors within ±5 pixels were observed in 30 of the 39, enabling a relatively accurate analysis. Such errors arise when the edges are unclear because of the influence of nearby pixels where the images overlap. Gross errors exceeding ±5 pixels were observed in 9 of the 39; these errors appeared when no edge was found at the posterior commissure or when the image was segmented along the mucosa of the anterior commissure.

The results of automatic analysis were compared, using a non-inferiority test, with those of automated analysis in which the errors in the automatically extracted edges were corrected with the manual plotting technique (Table 1). The values for automatic and automated analysis were 0.59 (SD=0.14) and 0.64 (SD=0.12) in OQ, 0.50 (SD=0.08) and 0.48 (SD=0.12) in PSI, and 0.68 (SD=0.12) and 0.72 (SD=0.24) in ASI, respectively; the difference was statistically significant in OQ and PSI, but not in ASI.

Non-inferiority trial between automatic and automated quantitative analysis in patients who underwent laser microsurgery

Table 2 shows the results calculated from the 2D VKG images of the two groups by automated analysis. The values for the vocally healthy group and the transoral laser microsurgery group were 0.06 (SD=0.02) and 0.64 (SD=0.12) in OQ, 0.10 (SD=0.04) and 0.48 (SD=0.12) in PSI, and 0.08 (SD=0.04) and 0.72 (SD=0.24) in ASI, respectively, and the differences were statistically significant in all parameters.

Quantitative comparison between vocally healthy and transoral laser microsurgery group

DISCUSSION

In this study, we developed an analysis system capable of quantitative analysis by automatic and automated methods by recognizing 2D VKG videos, and quantitatively analyzed the vocal fold vibration characteristics of patients undergoing transoral laser microsurgery using this system.

Various laryngeal imaging modalities using a laryngoscope have been used to detect early glottic cancer and to follow its prognosis. Among them, the laryngeal endoscope is relatively easy to use and can obtain high-quality vocal fold images, so it is used as a key procedure in clinical practice along with biopsy. However, because it is not suitable for tracking vocal fold vibration, it is difficult to observe epithelial changes or to discriminate benign tumors from malignant tumors [2].

Because LVS can show not only morphological features but also functional vocal fold vibration, quantitative analysis based on it is useful for observing changes after treatments such as radiation and phonosurgery. However, because the LVS image is a processed image rather than the actual vocal fold vibration, it is difficult to acquire stroboscopic images when periodicity is difficult to detect, as in severely dysphonic voice, diplophonia, spasm, and vocal function at voice onset and offset [24]. HSV and HSV post-processing methods are useful because they show actual vocal fold vibration and allow various modifications, but they have the disadvantage that obtaining the final result takes a long time. VKG, which can show actual vocal fold vibration in real time, has been developed, but it is limited to extracting a single pixel line from the entire laryngeal image [3].

On the other hand, 2D VKG can observe the movement of the entire vocal fold in real time, prevents the distortion due to patient movement that occurs in line scanning VKG, and provides both spatial and temporal information in one image, so intuitive analysis is possible [11,25]. In addition, because objective parameters such as F0, OQ, CQ, speed quotient, PSI, ASI, mucosal wave magnitude difference, and glottal area asymmetry index, and subjective parameters such as mucosal wave, cycle-to-cycle variability, absence of vibration, interference of surroundings, lateral peak, medial peak, and cycle aberration can be measured, it can be used for evidence-based intervention [12]. These numerical results can be used not only for the analysis of normal vocal folds, but also for precise evaluation of vocal fold diseases and for objective comparison of treatment outcomes.

In previous studies, quantitative analysis using various objective parameters in 2D VKG images was attempted and was effective in evaluating aperiodic vocal fold vibration [12,23,26]. However, existing post-analysis studies performed quantitative analysis on a single still image, which made it difficult to evaluate dynamic vocal fold movement, and clinical use was limited because the analysis relied on a hand-operated method.

To compensate for this, the program in this study imports the entire video only once and automatically extracts the outline of the kymogram with the threshold segmentation method, rather than importing and analyzing several black-and-white still images individually. A tracking algorithm was used to obtain an image of the desired area. In addition, a function for manually modifying points on the extracted glottal edge was added, and parameters were calculated automatically from the extracted area.

Quantitative evaluation of the existing objective parameters is important, but subjective parameters judged from the actual movement of the kymogram are also very important: cycle-to-cycle variability, absence of vibration of the vocal fold, interference from the surroundings of the vocal fold, the medial and lateral boundaries of the vocal fold, the pattern of the mucosal wave, and cycle aberrations. The interface for the subjective evaluation items was therefore built so that the examiner can easily enter these items, and the results can also be output and stored.

To assess the validity of the automatic analysis, its results were compared with those of the automated analysis in which frames with observed segmentation errors were corrected by the manual plotting technique; the automated analysis allowed a more accurate analysis. 97.37% of the images were recognized, but there was a gross error in 21.05% [22]. In the study of Moukalled [22], the success rate of image recognition by kymography was 98%, whereas the post-analysis of the recognized images showed 76% accuracy. However, there was no statistically significant difference in the comparison of quantitative analysis. When an area measure such as the GAW is used, factors such as the mucosa strongly affect the overall result, whereas parameters measured from the width at the level of a vertical pixel row were relatively less affected by the error.

Extracting glottal edges with a computer vision based method and automatically processing everything through parameter analysis can improve clinical usability by enabling rapid analysis. However, threshold segmentation methods may fail to separate the pixel intensities of the glottal area properly because of brightness differences when the resolution is low or when the upper and lower lips of the vocal fold appear simultaneously, and errors can occur when vocal fold mucus is recognized as vocal fold tissue [21,22,27].

When the vocal fold vibrations of patients who underwent transoral laser microsurgery were examined using the automated analysis method, ASI, PSI, and OQ differed from those of the normal group. In patients who underwent transoral laser microsurgery, the mucosal wave decreases because the epithelium of the vocal ligament is dissected. These patients mainly showed incomplete closure of the vocal fold, a decreased mucosal wave, and within-cycle asymmetries in phase, amplitude, and closure axis compared to the normal group [28,29]. However, for type I or type II cordectomy, it has been reported that the voice may improve after 6 months and recover to the same level as after radiation treatment [30].

The evaluation using 2D VKG was effective in evaluating the irregularity of vocal fold vibration, but because of the characteristics of the rolling shutter camera, more cycles could be tracked only when the vocal fold was exposed to the camera as fully as possible. In addition, depending on the frame rate of the camera, the vocal fold overlapped in the image, causing errors in the segmentation of the glottal area [31]. In future studies, it is necessary to apply a method to quantify vertical phase differences, and to improve the accuracy of automatic analysis by applying an image processing method that can discriminate the mucosa of the anterior commissure.

CONCLUSION

Quantitative analysis of 2D VKG images was effective in evaluating the asymmetric vibration pattern of the entire vocal fold tissue. In addition, the threshold segmentation using Otsu’s method properly recognized the glottal area, and the automated analysis with the addition of the manual plotting technique improved the accuracy of the inspection by accurately segmenting the glottal edge. Therefore, 2D VKG will be useful as a laryngeal imaging technique to visualize the structure and function of the vocal fold, and automated analysis is expected to increase the clinical usability of 2D VKG.

References

1. Moukalled HJ. Segmentation of laryngeal high-speed videoendoscopy in temporal domain using paired active contours Columbia, SC: University of South Carolina; 2009. p. 1000–1004.
2. Yamauchi A, Yokonishi H, Imagawa H, Sakakibara K-I, Nito T, Tayama N, et al. Quantitative analysis of digital videokymography: a preliminary study on age-and gender-related difference of vocal fold vibration in normal speakers. Journal of Voice 2015;29:109–19.
3. Park HJ, Sohn UJ, Kim GH, Bae IH, Wang SG, Cho JK, et al. Clinical application of endoscopy using smartphone. Head and Neck Surgery 2016;27:103–111.
4. Manfredi C, Bocchi L, Cantarella G, Peretti G. Videokymographic image processing: objective parameters and user-friendly interface. Biomedical Signal Processing and Control 2012;7:192–201.
5. Sercarz JA, Berke GS, Arnstein D, Gerratt B. A new technique for quantitative measurement of laryngeal videostroboscopic images. Archives of Otolaryngology-Head & Neck Surgery 1991;117:871–5.
6. Kim GH, Wang SG, Lee BJ, Park HJ, Kim YC, Kim HS, et al. Real-time dual visualization of two different modalities for the evaluation of vocal fold vibration - Laryngeal videoendoscopy and 2D scanning videokymography: Preliminary report. Auris Nasus Larynx 2017;44:174–181.
7. Mannelli G, Cecconi L, Gallo O. Laryngeal preneoplastic lesions and cancer: challenging diagnosis. Qualitative literature review and meta-analysis. Critical Reviews in Oncology/Hematology 2016;106:64–90.
8. Woo P. Quantification of videostrobolaryngoscopic findings-measurements of the normal glottal cycle. The Laryngoscope 1996;106(S79):1–27.
9. Kim GH, Wang SG, Lee BJ, Park HJ, Kim YC, Kim HS, et al. Real-time dual visualization of two different modalities for the evaluation of vocal fold vibration-Laryngeal videoendoscopy and 2D scanning videokymography: Preliminary Report 2016;
10. Tigges M, Wittenberg T, Mergell P, Eysholdt U. Imaging of vocal fold vibration by digital multi-plane kymography. 1999;23:323–330.
11. Ahn D, Park JH, Heo SJ, Park CM, Jung DJ, Nam Y, et al. Result of laser cordectomy in early glottic cancer and observation of malignant transformation from precancerous lesion. Korean Journal of Otorhinolaryngology - Head and Neck Surgery 2010;53:425–429.
12. Stoeckli SJ, Schnieper I, Huguenin P, Schmid S. Early glottic carcinoma: treatment according patient’s preference? Head & Neck: Journal for the Sciences and Specialties of the Head and Neck 2003;25(12):1051–1056.
13. Voigt D, Döllinger M, Yang A, Eysholdt U, Lohscheller J. Automatic diagnosis of vocal fold paresis by employing phonovibrogram features and machine learning methods. Computer Methods and Programs in Biomedicine 2010;99(3):275–288.
14. Krausert CR, Ying D, Zhang Y, Jiang JJ. Quantitative study of vibrational symmetry of injured vocal folds via digital kymography in excised canine larynges. Journal of Speech, Language, and Hearing Research 2011;54:1022–38.
15. Wang SG, Lee BJ, Lee JC, Lim YS, Park YM, Park HJ, et al. Development of two-dimensional scanning videokymography for analysis of vocal fold vibration. The Korean Society of Laryngology, Phoniatrics and Logopedics 2013;24:107–111.
16. Deliyski DD, Hillman RE. State of the art laryngeal imaging: research and clinical implications. Current Opinion in Otolaryngology & Head and Neck Surgery 2010;18:147.
17. Sjögren EV, van Rossum MA, Langeveld TP, Voerman MS, van de Kamp VA, Friebel MO, et al. Voice outcome in T1a midcord glottic carcinoma: laser surgery vs radiotherapy. Archives of Otolaryngology–Head & Neck Surgery 2008;134:965–972.
18. Mehta DD, Deliyski DD, Quatieri TF, Hillman RE. Automated measurement of vocal fold vibratory asymmetry from high-speed videoendoscopy recordings. Journal of Speech, Language, and Hearing Research 2011;54:47–54.
19. Wang SG, Park HJ, Lee BJ, Lee SM, Ko BJ, Lee SM, et al. A new videokymography system for evaluation of the vibration pattern of entire vocal folds. Auris Nasus Larynx 2016;43(3):315–321.
20. Schutte HK, Švec JG, Šram F. Videokymography: research and clinical issues. Logopedics Phoniatrics Vocology 1997;22(4):152–156.
21. Park HJ, Cha Wjae, Kim GH, Jeon GR, Lee BJ, Shin BJ, et al. Imaging and analysis of human vocal fold vibration using two-dimensional (2D) scanning videokymography. Journal of Voice 2016;30:345–353.
22. Krausert CR, Olszewski AE, Taylor LN, McMurray JS, Dailey SH, Jiang JJ. Mucosal wave measurement and visualization techniques. Journal of Voice 2011;25:395–405.
23. Lohscheller J, Toy H, Rosanowski F, Eysholdt U, Döllinger M. Clinically evaluated procedure for the reconstruction of vocal fold vibrations from endoscopic digital high-speed videos. Medical Image Analysis 2007;11:400–413.
24. Hirano M, Bless DM. Videostroboscopic examination of the larynx. Singular 1993;
25. Patel R, Dailey S, Bless D. Comparison of high-speed digital imaging with stroboscopy for laryngeal imaging of glottal disorders. Annals of Otology, Rhinology & Laryngology 2008;117:413–424.
26. Hirose H. High-speed digital imaging of vocal fold vibration. Acta Oto-Laryngologica 1988;105(sup458):151–153.
27. Švec JG, Šram F, Schutte HK. Videokymography in voice disorders: what to look for? Annals of Otology, Rhinology & Laryngology 2007;116:172–180.
28. Mehta DD, Deliyski DD, Zeitels SM, Quatieri TF, Hillman RE. Voice production mechanisms following phonosurgical treatment of early glottic cancer. Annals of Otology, Rhinology & Laryngology 2010;119:1–9.
29. Kang DH, Wang SG, Park HJ, Lee JC, Jeon GR, Choi IS, et al. Real-time simultaneous DKG and 2D DKG using high-speed digital camera. Journal of Voice 2017;31:247.e1–247.e7.
30. Chodara AM, Krausert CR, Jiang JJ. Kymographic characterization of vibration in human vocal folds with nodules and polyps. The Laryngoscope 2012;122:58–65.
31. Wang SG, Park HJ, Cho JK, Jang JY, Lee WY, Lee BJ, et al. The first application of the two-dimensional scanning videokymography in excised canine larynx model. Journal of Voice 2016;30:1–4.

Table 1

Non-inferiority trial between automatic and automated quantitative analysis in patients who underwent laser microsurgery

Parameters                  Method (n=29)   Mean   SD     Paired differences
                                                          t       Sig
Opening quotient            Automatic       0.59   0.14   3.485   .02
                            Automated       0.64   0.12
Phase symmetry index        Automatic       0.50   0.08   2.986   .01
                            Automated       0.48   0.12
Amplitude symmetry index    Automatic       0.68   0.18   1.485   .06
                            Automated       0.72   0.24

Table 2

Quantitative comparison between vocally healthy and transoral laser microsurgery group

Parameters                  Subjects (n=10)   Mean   SD     Paired differences
                                                            t       Sig
Opening quotient            Vocally healthy   0.06   0.02   6.122   .01
                            TLM               0.64   0.12
Phase symmetry index        Vocally healthy   0.10   0.04   4.152   .01
                            TLM               0.48   0.12
Amplitude symmetry index    Vocally healthy   0.08   0.04   2.184   .03
                            TLM               0.72   0.24

TLM, transoral laser microsurgery.