SIMO: An Automatic Speech Recognition System for Paperless Manufactures

Abstract:

Despite general environmental awareness, heavy use of paper is still a fact of life in today's factories. The shorter the production run, the greater the tendency to use paper to support quality tracking of parts, for example to record measurements or nonconformities. This tendency increases drastically in sectors such as aerospace, where typical production rates range between 9 and 18 subassemblies per month. This work presents an automatic speech recognition system meant to replace paper with a digitalized version of the manual writing task. It presents (i) industrial use cases with their benefits and requirements; (ii) the system architecture, including several free Automatic Speech Recognition modules that were tested and their analysis; and (iii) some open-source supporting modules that improve its functionality. The work concludes with several tests showing the system's performance against different kinds of industrial noise, low- to high-quality microphones, and users with different dialects.

Info:

Pages: 129-139

Online since: October 2023

Copyright: © 2023 Trans Tech Publications Ltd. All Rights Reserved
