Braille++ is an app that helps a visually impaired person learn about the environment around them. The app interprets what is happening around the user and describes the scene using sound. It also lets the user move a finger across the frame and describes the object under their finger. An additional feature senses the mood of the scene and plays a matching music track, so the user can engage more fully with their surroundings. For instance, it would play soothing music if the application detected a tranquil natural environment.
One of our team members pitched this idea first during our brainstorming session, and we all decided to work on it: a practical application for visually impaired people. I think the idea of designing a product that aids people with impaired sight through their auditory senses resonated with all of us because it could make their lives much easier. In fact, making lives easier is precisely what new technologies are supposed to do. We were motivated to pursue this project because of its promising real-life applications.
My project uses machine learning to sense the environment and offers the user:
1. The best possible caption describing what is going on in the image
2. A description and list of all the objects in the frame
3. The option to point at objects in the image and identify them from their 2D position in the frame
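The third feature, pointing at an object to hear what it is, can be sketched as a hit test of the finger position against the detected bounding boxes. This is only a minimal illustration, not the app's actual code: the `Detection` class, its field names, and the rule that the smallest overlapping box wins are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Detection:
    """One detected object (hypothetical layout, pixel coordinates)."""
    label: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        # True when the point lies inside the box, edges included.
        return self.left <= x <= self.right and self.top <= y <= self.bottom

    def area(self) -> float:
        return max(0.0, self.right - self.left) * max(0.0, self.bottom - self.top)


def object_under_finger(detections: List[Detection], x: float, y: float) -> Optional[str]:
    """Return the label of the object at (x, y), or None if nothing is there.

    Assumption: when boxes overlap, the smallest one is the most specific
    object (a cup on a table rather than the table).
    """
    hits = [d for d in detections if d.contains(x, y)]
    if not hits:
        return None
    return min(hits, key=Detection.area).label


if __name__ == "__main__":
    scene = [
        Detection("table", 0, 0, 100, 100),
        Detection("cup", 40, 40, 60, 60),
    ]
    print(object_under_finger(scene, 50, 50))
    print(object_under_finger(scene, 10, 10))
```

The returned label would then be handed to text-to-speech. The finger position has to be expressed in the same coordinate space as the boxes for the lookup to be meaningful.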
We used YOLOv3, DarkFlow (Darknet + TensorFlow), and the Google Cloud Vision API for object recognition, and IBM's Show and Tell image caption generator for captioning. We created a Flask server that communicates with the client and returns a JSON file with the necessary details.
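A server of this kind could look roughly like the sketch below. It is an illustration under assumptions rather than our actual server: the `/describe` route, the `image` form field, the JSON field names, and the `analyze` callback (standing in for the captioning and detection models) are all hypothetical. Flask is imported inside the factory only so that the payload helper can be used on its own.

```python
import json


def build_response(caption, detections):
    """Assemble the JSON-serializable payload sent back to the client.

    `detections` is an iterable of (label, confidence, box) tuples, where
    box is (left, top, right, bottom). The field names are assumptions.
    """
    return {
        "caption": caption,
        "objects": [
            {"label": label, "confidence": round(conf, 3), "box": list(box)}
            for label, conf, box in detections
        ],
    }


def create_app(analyze):
    """Create the Flask app.

    `analyze` is a hypothetical callable that takes raw image bytes and
    returns (caption, detections); it stands in for the model calls.
    """
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/describe", methods=["POST"])
    def describe():
        upload = request.files.get("image")
        if upload is None:
            return jsonify({"error": "no image uploaded"}), 400
        caption, detections = analyze(upload.read())
        return jsonify(build_response(caption, detections))

    return app


if __name__ == "__main__":
    # Print an example payload without starting a server.
    example = build_response("a dog on grass", [("dog", 0.91234, (10, 20, 110, 220))])
    print(json.dumps(example, indent=2))
```

The client would post a photo as multipart form data and read the caption and object list from the response.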
1. Captioning an image properly and finding a suitable description of the image
2. Setting up the Google Cloud server connection: sending images taken on a phone and retrieving the required data associated with them
3. Image compression, data latency, and image scaling. Compression made a drastic difference because the normalization of coordinates and anchor points changed entirely
4. TTS in Android Studio is quite buggy on some devices
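One common way to keep box coordinates stable across compression and scaling is to store them as fractions of the image size and convert back to pixels only at display time. The sketch below illustrates that idea; it is not necessarily how we resolved the issue, and the function names are made up for the example.

```python
def normalize_box(box, width, height):
    """Convert a pixel box (left, top, right, bottom) to fractions in [0, 1].

    `width` and `height` are the dimensions of the image the detector
    actually processed, for example the compressed upload.
    """
    left, top, right, bottom = box
    return (left / width, top / height, right / width, bottom / height)


def scale_box(norm_box, width, height):
    """Map a normalized box onto a target image or screen size in pixels."""
    left, top, right, bottom = norm_box
    return (left * width, top * height, right * width, bottom * height)


if __name__ == "__main__":
    # Box found on a 400x200 compressed image, shown on a 1600x800 preview.
    norm = normalize_box((100, 50, 200, 150), 400, 200)
    print(norm)
    print(scale_box(norm, 1600, 800))
```

This only lines up with what the user touches if the preview shows the whole image without cropping or letterboxing; otherwise an offset has to be applied as well.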
I am proud that everyone on the team learned a lot in the past two days. I was a total beginner with the Google Cloud API and am now fairly comfortable with it. Another thing that makes our project stand out is that it is made for a good cause. We are creating a world that is more accessible for everyone using the technology available to us. It feels humbling that we are now capable of creating something people can use to carry out their daily tasks with more ease.
As I said before, I learned a lot about using Google Cloud APIs. I also had to do a lot of research on Darknet and TensorFlow to achieve this. Similarly, I learned a lot about Android development while implementing that part of the project.
Implementing the same for a VR headset.
TensorFlow, MAX Image Caption Generator, YOLO, DarkFlow, Flask, Android Studio