HackUMass VI Project: Braille++

Description:

Braille++ is an app that helps visually impaired people understand the environment around them. The app interprets what is happening around the user and describes the scene using sound. It also lets the user move a finger across the frame and describes the object under their finger. An additional feature senses the mood of the scene and plays a matching music track, so users can engage as fully as possible with their surroundings. For instance, it would play soothing music if the application detected a tranquil natural environment.

Inspiration:

One of our team members pitched this idea first during our brainstorming session. We all decided to work on it: a practical application for visually impaired people. The idea of designing a product that aids people with impaired sight through their sense of hearing resonated with all of us because it could make their lives much easier. Making lives easier is precisely what new technologies are supposed to do, and the promising real-life applications motivated us to pursue this project.

What it does:

Our project uses machine learning to sense the environment and offers users:
1. The best possible caption describing what is going on in the image
2. A description and list of all the objects in the frame
3. The option to point at objects in the image and identify them from their 2D position in the frame
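
A minimal sketch of how the point-at-an-object option could work, assuming each detection comes back as a label plus a pixel bounding box. The `Detection` class and `object_under_finger` function are hypothetical names used for illustration, not taken from our code. When boxes overlap, this sketch picks the smallest one so that the most specific object is announced.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Detection:
    """One detected object (hypothetical structure for this sketch)."""
    label: str
    # Box corners in image pixels: (left, top) and (right, bottom).
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        """True if the point (x, y) lies inside or on the edge of the box."""
        return self.left <= x <= self.right and self.top <= y <= self.bottom

    def area(self) -> float:
        return (self.right - self.left) * (self.bottom - self.top)


def object_under_finger(
    detections: List[Detection], x: float, y: float
) -> Optional[str]:
    """Return the label of the smallest box containing (x, y), or None."""
    hits = [d for d in detections if d.contains(x, y)]
    if not hits:
        return None
    # Prefer the smallest box so a cup on a table is reported as "cup".
    return min(hits, key=Detection.area).label


detections = [
    Detection("table", 0, 200, 640, 480),
    Detection("cup", 300, 220, 360, 300),
]
print(object_under_finger(detections, 320, 250))  # finger on the cup
print(object_under_finger(detections, 50, 400))   # finger on the table only
print(object_under_finger(detections, 10, 10))    # finger on empty space
```

The returned label would then be handed to text-to-speech so the user hears what they are touching.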

How we built it:

We used YOLO v3 with DarkFlow (Darknet + TensorFlow) and the Google Cloud Vision API for object recognition, and IBM's Show and Tell image caption generator for captioning. We created a Flask server that communicates with the client and returns a JSON file with the necessary details.
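
As a rough illustration of the JSON the server might send back, here is a sketch that packs a caption and a list of detections into one response body. The field names (`caption`, `objects`, `label`, `confidence`, `box`) and the `build_response` helper are assumptions made for this example, not our actual schema. It uses only the standard library; inside a Flask route, the same dictionary could be returned with `flask.jsonify` instead of `json.dumps`.

```python
import json
from typing import Dict, List


def build_response(caption: str, detections: List[Dict]) -> str:
    """Serialize a caption and detected objects into a JSON string."""
    payload = {
        "caption": caption,
        "objects": [
            {
                "label": d["label"],
                # Two decimals are enough for the client to rank objects.
                "confidence": round(d["confidence"], 2),
                # [left, top, right, bottom] in image pixels (assumed layout).
                "box": d["box"],
            }
            for d in detections
        ],
    }
    return json.dumps(payload)


# Made-up sample values, only to show the shape of the response.
body = build_response(
    "a dog sitting on a couch",
    [{"label": "dog", "confidence": 0.913, "box": [40, 60, 300, 420]}],
)
print(body)
```

Keeping the caption and the per-object boxes in one response lets the phone speak the caption right away and answer finger lookups locally, without another round trip.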

Challenges we ran into:

1. Captioning an image properly and finding a suitable description of the image.
2. Setting up the Google Cloud server connection to upload images taken on a phone and retrieve the data associated with them.
3. Image compression, data latency, and image scaling. Compression made a drastic difference because it changed the normalization of coordinates and anchor points entirely.
4. TTS on Android is quite buggy on some devices.
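
One way to keep box coordinates stable across compression and scaling is to store them as fractions of the image size and convert to pixels only when drawing or hit-testing. The sketch below assumes the problem was pixel coordinates computed on a resized image being applied to a differently sized display; `normalize_box` and `to_screen` are hypothetical helper names, and this is not necessarily how we solved it.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def normalize_box(box: Box, image_width: int, image_height: int) -> Box:
    """Convert a pixel box into fractions (0..1) of the image size."""
    left, top, right, bottom = box
    return (
        left / image_width,
        top / image_height,
        right / image_width,
        bottom / image_height,
    )


def to_screen(norm_box: Box, screen_width: int, screen_height: int) -> Box:
    """Convert a normalized box into pixels of the target display."""
    left, top, right, bottom = norm_box
    return (
        left * screen_width,
        top * screen_height,
        right * screen_width,
        bottom * screen_height,
    )


# Box detected on a 640x480 compressed upload...
norm = normalize_box((160, 120, 320, 240), 640, 480)
print(norm)  # (0.25, 0.25, 0.5, 0.5)

# ...mapped onto a 1080x810 preview with the same aspect ratio.
print(to_screen(norm, 1080, 810))  # (270.0, 202.5, 540.0, 405.0)
```

This only holds when the upload and the preview share an aspect ratio; cropping or letterboxing would need an extra offset step.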

Accomplishments that we're proud of:

I am proud that everyone on the team learned a lot in the past two days. I was a total beginner with the Google Cloud API and am now pretty good at it. Another thing that makes our project stand out is that it is made for a good cause. We are using the technology available to us to make the world more accessible for everyone. It is humbling that we are now capable of creating something people can use to carry out their daily tasks with more ease.

What we learned:

As I said before, I learned a lot about using Google Cloud APIs. I also had to do a lot of research on Darknet and TensorFlow to get this working. Similarly, I learned a lot about Android development while implementing that part of the project.

What's next:

Implementing the same for a VR headset.

Built with:

TensorFlow, MAX Image Caption Generator, YOLO, DarkFlow, Flask, Android Studio

Prizes we're going for:

Arteck HB030 Portable Keyboard

HAVIT RGB Mechanical Keyboard

Call of Duty: Black Ops 4 (Xbox One)

Intel® Movidius™ Neural Compute Stick

Google Home Mini

$100 Amazon Gift Cards

Hustle Award

Social Entrepreneurship Award

Lutron Caseta Wireless Kit

Oculus Go (32 GB)

Misfit Shine 2

Fujifilm Instax Mini 26

LS20 Gaming Headset

Hexacopter Drone

Raspberry Pis & PiHut Essential Kits

TBI Pro Gaming Headset

DragonBoard 410c

Grand Prize

JetBrains Pro Software

Hacker gear & swag from HERE.com

Blu R2 Plus Smartphones

Raspberry Pi Arcade Gaming Kit

Team Members

Kunal Sheth, Vignesh Kumar, Abhijeet Pradhan, Asif Rahman
