Voice Input Computer Navigation System

I need advice. I'm a third-year CS student, and for my semester project I want to build a voice assistant to control and navigate a PC, aimed specifically at blind users in Pakistan. It will take Urdu voice input and handle basic tasks like searches and selection, but because it's aimed at blind users, it also needs to handle things like reporting the current cursor position, describing what's on the screen at the moment, reading out articles, re-reading what's already been entered into the search bar, and so on.

I have a basic idea of what I'll need: an Urdu speech recognition library, a text-to-speech converter so the assistant can speak, and the PyAutoGUI library to actually control the cursor and keyboard input. Beyond that, honestly, I'm completely lost. Is this project too ambitious or too basic? What else will I need? Should the assistant only recognise a fixed set of commands like "go back" and "read", or should I try for an assistant that understands the semantics of whatever the user says and works out what to do from that? Is that even possible? Also, should I restrict the scope to just a browser, say as a Chrome extension handling searches and websites, or go for general computer control?
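A minimal sketch of how the fixed-command approach described above might fit together, assuming the `speech_recognition`, `gTTS`, `playsound`, and `pyautogui` packages and Google's free web recognizer (which accepts an Urdu language code). The specific Urdu phrases, the command table, and the file name used for playback are illustrative assumptions, not part of the original post.

```python
# Sketch: fixed Urdu voice commands -> PyAutoGUI actions, spoken feedback via gTTS.
# Assumes speech_recognition, gTTS, playsound, and pyautogui are installed and a
# microphone is available; Google's free recognizer is used only as a prototype backend.
import speech_recognition as sr
import pyautogui
from gtts import gTTS
from playsound import playsound

# Illustrative command table: recognized Urdu phrase -> keyboard/cursor action.
COMMANDS = {
    "واپس جاؤ": lambda: pyautogui.hotkey("alt", "left"),  # "go back" -> browser back
    "نیچے جاؤ": lambda: pyautogui.press("pagedown"),       # "go down" -> scroll down
    "اوپر جاؤ": lambda: pyautogui.press("pageup"),         # "go up"   -> scroll up
}

def speak(text_ur: str) -> None:
    """Convert an Urdu string to speech and play it back."""
    gTTS(text_ur, lang="ur").save("reply.mp3")
    playsound("reply.mp3")

def listen_once(recognizer: sr.Recognizer) -> str:
    """Record one utterance from the microphone and return the recognized Urdu text."""
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source)
        audio = recognizer.listen(source)
    return recognizer.recognize_google(audio, language="ur-PK")

if __name__ == "__main__":
    recognizer = sr.Recognizer()
    heard = listen_once(recognizer)
    action = COMMANDS.get(heard.strip())
    if action:
        action()
        speak("ٹھیک ہے")        # confirm: "okay"
    else:
        speak("سمجھ نہیں آئی")  # "I did not understand"
```

A semantic (free-form) assistant would replace the dictionary lookup with an intent classifier or language model, but the surrounding capture/act/speak loop would stay the same.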
