Any opinions, findings, conclusions, or recommendations expressed in this dissertation are those of the authors and do not necessarily reflect the views of UKDiss.com.

IBM Watson Cognitive Service for Visual Recognition

Info: 9684 words (39 pages) Dissertation
Published: 10th Dec 2019

Tags: Medical Technology

Mayo Clinic TeleVision iProject

In recent years, health care professionals have increasingly used mobile devices to manage their time and reduce patient wait times. This paper details our efforts to enhance and analyze one such tool, TeleVision, which is developed in collaboration with Mayo Clinic. The TeleVision iProject involves an ophthalmoscope tool that facilitates first-hand examination of a patient’s anterior segment, posterior segment and frontal structures of the eyelid, iris, lens, and cornea. It is integrated with a mobile application that securely delivers the result of the examination to the Mayo Clinic repository for further assessment. The hardware responsible for capturing images of the human eye stores the corpus of images on the Mayo server. Our research also studies how a neural network can be trained on these images using the IBM Watson cognitive service for Visual Recognition, which can be deployed on the Bluemix platform. This platform-as-a-service brings cognitive capabilities directly into applications, which can help us determine whether a patient is potentially at risk of an illness. Finally, this paper considers the possibility of integrating Artificial Intelligence solutions, spearheaded by IBM Watson, into the healthcare domain.

CCS Concepts: • Information systems → Retrieval on mobile devices • Computing methodologies → Machine learning • Computing methodologies → Neural networks

General Terms: Mobile Application; Telemedicine; Artificial Intelligence

Additional Key Words and Phrases: Camera Device; Ophthalmology; Mayo Clinic; Neural Networks; Deep learning; IBM Watson; Bluemix

PHASE I – Mayo Television Application

1.       INTRODUCTION

Owing to the portable and convenient nature of mobile device technology, there has been a rapid growth in the development of mobile applications in the medical domain. Mobile Health (mHealth) applications include the use of mobile devices in collecting health data, delivery of healthcare information to practitioners, researchers, and patients, real-time monitoring of patient vital signs, and direct provision of care (via mobile telemedicine). TeleMedicine can be broadly defined as the use of telecommunications technologies to provide medical information and services. [1]

Mayo Clinic is one of the world’s most renowned medical research groups, known for the quality of its medical practice. [2] However, securing prompt medical attention from such a leading group can be impractical for patients because of long waiting times between appointments and the restricted availability of schedules in remote hospitals. It is imperative to circumvent this issue, especially for acute eye conditions. Currently, not all medical facilities are equipped with the complex and expensive ophthalmology instruments required to accurately diagnose and triage patients with eye injuries.

To address this need, the Software and Manufacturing Engineering Team at Arizona State University developed an ophthalmoscope called TeleVision under the guidance of Mayo Clinic’s Dr. Dharmendra Patel and ASU Professors Dr. Ashraf Gaffar and Dr. Jerry Gintz. TeleVision is a low-cost, high-resolution, portable solution for imaging the anterior segment of the eye, with a field of view from inner canthus to outer canthus, in order to support diagnosis and triage of eye injury or disease by Mayo Clinic specialists. This solution also brings medical service to remote medical facilities where expensive equipment or the clinical expertise needed for proper diagnosis is not available.

The TeleVision mobile App acts as the conduit between the hardware that houses the eye imaging system and Mayo Clinic’s HIPAA-compliant repository. Through this mobile app, health care professionals can obtain examination reports and triage patients based on the degree of severity. This addresses the time management issues encountered by patients in large medical facilities like Mayo Clinic.

Fig 1.1 Functional Overview of the Application

The TeleVision iOS Application communicates with the hardware securely via WiFi streaming and transmits the captured eye images to Mayo Clinic servers via RESTful web service APIs (Fig 1.1). For system security, the user is prompted to enter valid credentials to access the application’s main screen. Once the user is authenticated, the hardware can transmit eye images taken by the eye imaging system to the TeleVision mobile App. Once an image arrives on the App, the user can either save or discard it. If the Save option is selected, the image is sent to Mayo’s data repository on the server.

2.       BACKGROUND

The TeleVision iProject underwent several phases of development. This section documents the status of the project from its inception to expansion.

2.1       Design

The hardware team was tasked with developing a portable camera system that could be installed in remote areas that may lack the equipment or the clinical expertise required to perform critical eye examinations and provide accurate triage for medical conditions. The camera was designed to be portable and lightweight, and to focus on objects at different distances because patients have different facial contours. Apple devices were chosen as the primary camera device to support HIPAA compliance and to avoid compatibility issues between the hardware and the drivers of the iPad or iPhone. A portable case containing a power source and a lighting system with white light and cobalt blue light was designed to store the entire setup.

  • The table-top device ensures stability and safety during image capture.
  • A refined Exo Labs camera housing facilitates optimal positioning of the device inside the base.
  • The modular, clip-in design allows easy exchange of components.
  • An LED light ring on the device delivers maximum illumination to the entire eye.
  • An installed diffuser disperses light and reduces reflected glare.

Fig 2.1 Table top device design

2.2       Development and Deployment

The TeleMedicine iOS App was developed to communicate with the designed hardware to transmit high quality images to the Mayo servers seamlessly and securely. The workflow of the application is detailed as follows –

Fig 2.2.1 Workflow of the mobile application

  1. User Secure Login: The user logs into the Mayo application over a VPN by entering valid credentials in Edge Client for Mayo. On the login screen, the user enters a username and password. If the ID exists, the user proceeds to the next screen, i.e. the Questionnaire. If the ID doesn’t exist, the user has the option to register with Mayo Clinic and get a new Patient ID.
  2. Enter Valid Patient ID: The logged-in user enters the given Patient ID to move to the Questionnaire screen.
  3. Fill PRO questionnaire: The user is given a list of PRO questions that must be answered before moving on.
  4. Review questionnaire: The user reviews the completed questionnaire before continuing to the next step.

Fig 2.2.2 Mobile application interface

  5. View/Select picture: The user can now choose the desired image from the captured images and decide whether to upload it to Mayo servers. The user can tap View Picture from Library to choose from the photo album, and drag and pinch to zoom in and out of the picture. The user can continue only once a single picture is selected.
  6. Upload: The user can now review the picture and either cancel or upload it. Once the picture is uploaded, the user can go back to Step 2 and enter a different Patient ID to start the process for another patient. After upload, the local copy of the photo is deleted to maintain HIPAA compliance.
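
The six workflow steps above can be sketched as a small state machine. This is only an illustrative Python model, not the app's actual implementation: the `Step` names and the `next_step` helper are hypothetical, and the choice to keep the user on the upload screen after a cancel is an assumption.

```python
from enum import Enum, auto


class Step(Enum):
    LOGIN = auto()
    PATIENT_ID = auto()
    FILL_QUESTIONNAIRE = auto()
    REVIEW_QUESTIONNAIRE = auto()
    SELECT_PICTURE = auto()
    UPLOAD = auto()


# Enum members iterate in definition order, which matches the workflow order.
ORDER = list(Step)


def next_step(current: Step, uploaded: bool = False) -> Step:
    """Return the step that follows `current`.

    After a successful upload the flow loops back to PATIENT_ID so the
    same logged-in user can start on another patient.
    """
    if current is Step.UPLOAD:
        # Assumption: cancelling leaves the user on the upload screen.
        return Step.PATIENT_ID if uploaded else Step.UPLOAD
    return ORDER[ORDER.index(current) + 1]
```

Modelling the loop back to the Patient ID step explicitly makes it clear that login happens once per session while the remaining steps repeat per patient.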

2.3       Expansion (Mobile)

The TeleMedicine iOS device was designed in two versions:

  • Light version: iPhone with Camera attachment.
  • Ultra Light version: iPhone without Camera attachment. (using built-in camera)

The Mayo Clinic TeleVision team, comprising manufacturing engineering students, designed a Light version of the equipment: a low-cost, smartphone-based telemedicine prototype for eye disease and injury triage, as proposed by Mayo Clinic. The capstone project won second place at the 2016 Capstone Design Conference in Columbus, Ohio.

Fig 2.3 The Mayo Clinic TeleVision team’s prototype of an imaging system for eye disease/injury triage. Image courtesy of Mayo Clinic TeleVision team.

The ASU team’s prototype helps patients avoid long waits for a diagnosis and reduces the need for bulky equipment that can cost $4,000 to $15,000. Primary care physicians, nurse practitioners or even physician assistants can instead use a $500 device that attaches to a smartphone, captures an image of the eye and electronically transmits the image to an ophthalmologist for triage.

2.4       Modular Architecture

  • The EyePiece and Light module is the lighting system that emits white light and cobalt blue light, the latter complementing fluorescent dye to highlight certain areas. An LED light ring provides maximum illumination to the eye. A diffuser is also installed to reduce reflected glare in the images.
  • The camera is the hardware that captures high-quality images of the anterior and posterior segments of the eye, which are later transmitted to Mayo servers via the mobile app.

Fig 2.4 Modular Architecture of TeleVision

  • The Image Acquisition software is HIPAA compliant and is responsible for deleting the image once it has been successfully uploaded to the Mayo data repository.
  • The Local Connectivity over Client Device module uses the mobile app to communicate with the camera hardware and obtain images.
  • The Patient Report Outcome module involves the PRO questionnaire, which the user fills out to gather information about the patient.
  • The Web Transmission Module uses RESTful services to transmit images to the Mayo DICOM repository.
  • The Mayo Web Service Portal is the gateway that validates the data to be stored on Mayo servers.
  • The Mayo Server stores the patient’s information, related images and the PRO questionnaire corresponding to the Patient ID in the Mayo DICOM repository.
  • Other Patients Records is the repository of each patient’s record (images and metadata).
  • The Mayo Ophthalmology Application is the interface through which the clinician evaluates existing records, triages the patient and diagnoses medical conditions.

3.       FEATURES

3.1       HIPAA Compliant

HIPAA, the Health Insurance Portability and Accountability Act, sets the standard for protecting sensitive patient data. The application is configured not to store any confidential patient data apart from the images, which in themselves are not patient identifiable. The app implements a workflow that automatically deletes the local copy of the image from the device in order to maintain HIPAA compliance. If the user chooses to proceed to the next patient without uploading the picture, the application notifies the user with a pop-up that the image will be deleted.
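
The delete-after-upload behavior can be illustrated with a short Python sketch. It is not the app's code: `upload_then_delete` and the `upload` callable are hypothetical stand-ins for the real web service call, and the sketch assumes the local copy should be removed whether or not the upload succeeds, in line with the pop-up behavior described above.

```python
import os
import tempfile
from typing import Callable


def upload_then_delete(image_path: str, upload: Callable[[bytes], bool]) -> bool:
    """Upload the image, then remove the local copy whatever the outcome.

    `upload` is a hypothetical callable standing in for the real web
    service call; it returns True when the server accepted the image.
    """
    try:
        with open(image_path, "rb") as fh:
            return upload(fh.read())
    finally:
        # Runs after success, failure, or an exception raised by `upload`.
        if os.path.exists(image_path):
            os.remove(image_path)


# Demo with a throwaway file and a fake uploader that always succeeds.
fd, demo_path = tempfile.mkstemp(suffix=".png")
os.close(fd)
accepted = upload_then_delete(demo_path, lambda data: True)
```

Placing the deletion in a `finally` clause means no code path can leave an image on the device, which is the property the compliance requirement depends on.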

3.2       FDA Ready

The Food and Drug Administration (FDA or USFDA) is a federal agency of the United States Department of Health and Human Services responsible for protecting and promoting public health through the control and supervision of, among other things, prescription and over-the-counter pharmaceutical drugs (medications) and medical devices. The application is considered a mobile medical app and is ready for FDA regulation.

3.3       Clinically Tested at Mayo ER (IRB)

The TeleVision application has been clinically tested in the Mayo Emergency Room and approved by the Institutional Review Board (IRB).

3.4       Web Service APIs

The Web Application Description Language (WADL) document describes most of the DCAMS Imaging Import RESTful web service APIs, which can be readily consumed internally by any Mayo imaging system application. Service requests must be sent in application/json format, and the response is returned in JSON (JavaScript Object Notation) format.
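
As a rough illustration of such a JSON exchange, the Python sketch below builds a request with the application/json content type and decodes a JSON response body. The endpoint host, the payload fields (`patientId`, `imageName`) and the helper names are all assumptions made for illustration; the actual DCAMS schema is defined by the WADL document and is not reproduced here.

```python
import json
import urllib.request

# Hypothetical endpoint and payload fields, for illustration only.
ENDPOINT = "https://example.invalid/ImagingImportServices/serviceRequest"


def build_request(patient_id: str, image_name: str) -> urllib.request.Request:
    """Build (but do not send) a POST request with a JSON body."""
    body = json.dumps({"patientId": patient_id, "imageName": image_name}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "Accept": "application/json"},
    )


def parse_response(raw: bytes) -> dict:
    """Decode a JSON response body into a dictionary."""
    return json.loads(raw.decode("utf-8"))


req = build_request("P-0001", "eye.png")
```

The request is only constructed, never sent, so the sketch can be run without network or VPN access.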

3.5       VPN Authentication

The application requires a VPN connection to log in. The VPN is established through three-tier authentication using ‘vlink.mayo.edu’ or the EdgeClient application. First, a registered Patient ID and password are entered; second, a secure code, which is shared only with valid users, has to be provided; third, a dynamically generated password has to be entered. Once the VPN is established, the user can enter a patient ID to move to the questionnaire screen. A user without a patient ID has to register with Mayo Clinic to get a new patient ID for further access.

Since Mayo switches servers and services on a regular basis, new server updates and domain configuration changes are inevitable. To adapt to these changes and maintain security, the application needs to be updated constantly. Safeguarding patient confidentiality is a major concern, and this authentication serves that purpose.

The TeleVision hardware uses a Lightning connector to send images to its iOS application. The application then uses Mayo web services to upload images to the DICOM repository safely and securely over the VPN connection. This workflow is depicted in Figure 3.5.

Fig 3.5 VPN Connectivity workflow

4.       GOAL

The existing TeleVision application can be enhanced to establish a downlink that facilitates the delivery of feedback/prescription from the medical expert to the patient.  The feedback can be accessed by the nurse and/or the patient corresponding to the Patient ID.

The table-top version of the device involves the following steps:

  • The patient visits a nurse (clinician) located in a remote region where no expert practitioner is available.
  • The nurse uses the table-top module to capture high-quality eye images of the patient.
  • The nurse loads the images into the application and securely uploads them to the DICOM repository on the Mayo server.
  • The downlink feature allows the doctor to triage the patient and send the diagnosis back to the nurse.

The mobile version of the device involves the following steps:

  • The patient downloads the Mayo application and, once registration with Mayo Clinic is approved, logs into the system with a unique Patient ID.
  • The patient uploads an eye picture and fills out the PRO questionnaire detailing the symptoms of the disease.
  • The picture is sent to the Mayo server and stored securely in the data repository.
  • The downlink feature allows the doctor to triage the patient and send the diagnosis back to the patient.

Fig 4 Modular Architecture with Downlink

4.1       TeleDermoscopy App

The TeleVision iOS application inherits its architecture from TeleDermoscopy, a dermatology iOS application built in-house at Mayo Clinic and used for triaging case profiles based on a captured image of the patient’s affected skin. The Derma app, which is currently in production at Mayo Clinic, uses an underlying architecture of REST-based web services similar to that of the TeleVision app. These existing web services are also used in TeleVision to provide patient authentication and content upload functionality.

4.2       Approach

  1. The RESTful web service APIs consumed by Mayo imaging system applications require service requests in application/json format and return responses in JSON (JavaScript Object Notation) format. JSON is commonly used for representing structured data and for data interchange in client-server applications, serving as an alternative to XML. Many everyday services have JSON-based APIs.

To implement the downlink, the data can be fetched in JSON format as an NSData object and deserialized into an array.

The following Objective-C code can be used in the iOS application.

APPROACH 1: Array

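// Fetch the JSON payload (synchronous; run off the main thread in production code).
NSString *urlString = @"http://javadev.mayo.edu/ImagingImportServices/serviceRequest";
NSData *data = [NSData dataWithContentsOfURL:[NSURL URLWithString:urlString]];

// Deserialize only if data arrived; JSONObjectWithData: raises an exception on nil data.
// Without NSJSONReadingMutableContainers the result is immutable, hence NSArray.
NSError *error = nil;
NSArray *json = data ? [NSJSONSerialization JSONObjectWithData:data options:kNilOptions error:&error] : nil;
NSLog(@"json: %@, error: %@", json, error);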

 

  2. Another approach gives access to every node: we can create a dictionary object from the output of the NSJSONSerialization method and iterate over it. The following Objective-C code can be added to the application. [3]

APPROACH 2: Dictionary

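// Fetch the JSON payload (synchronous; run off the main thread in production code).
NSURL *urlPath = [NSURL URLWithString:@"http://javadev.mayo.edu/ImagingImportServices/serviceRequest"];
NSData *jsonData = [NSData dataWithContentsOfURL:urlPath];

// Deserialize only if data arrived; JSONObjectWithData: raises an exception on nil data.
NSError *error = nil;
NSDictionary *dataDictionary = jsonData ? [NSJSONSerialization JSONObjectWithData:jsonData options:0 error:&error] : nil;
NSLog(@"%@", dataDictionary);

// CCStatus is an app-defined model class; build one entry from the firstName node.
_ccStatus = [NSMutableArray array];
CCStatus *status = [CCStatus statusWithTitle:dataDictionary[@"firstName"]];
if (status) { [_ccStatus addObject:status]; }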
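
The difference between the two approaches comes down to whether the top-level JSON value is an array or an object. The following language-neutral Python sketch shows one way a client could accept either shape; the `feedback_items` helper is hypothetical, and only the `firstName` key is taken from the code above.

```python
import json
from typing import Any, Dict, List


def feedback_items(raw: str) -> List[Dict[str, Any]]:
    """Normalize a JSON payload to a list of records.

    A top-level array is returned as is (Approach 1); a top-level object
    is wrapped in a one-element list (Approach 2).
    """
    parsed = json.loads(raw)
    if isinstance(parsed, list):
        return parsed
    if isinstance(parsed, dict):
        return [parsed]
    raise ValueError("expected a JSON array or object")
```

Checking the type of the parsed value before use avoids the crash that would occur if a payload of one shape were treated as the other.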

 

5.       MAYO COMMITTEE INTERNAL STRUCTURE

Mayo Clinic follows a stringent code protection policy, and releasing a copy of the existing PhotoExam App required a series of approvals from Mayo Clinic committees. The project required a sandbox version of the existing application source code in order to build new screens. HIPAA compliance and a Non-Disclosure Agreement were requirements to be fulfilled before the students could get access to the code.

The code release went through several phases of discussion and approval. The committees involved were the Clinical Digital Imaging Committee (CDIS), Security/Privacy/Architecture/Data (SPAD), the Enterprise Technology Oversight Group (ETOG) and the Clinical Systems Oversight (CSO) Group. Of six Mayo committees, five had granted approval for the code release by the end of the Spring 2017 session. The final approval was to come from the CSO Group, and the case was set for a formal review by that group on 5/9/2017.

The timeline of the approvals received is depicted in Figure 5.

Fig 5. Timeline of Mayo Clinic Committee Approvals for code release

5.1       Timeline of Approvals –

  • 12/5/2016: MC Clinical Digital Imaging Committee (CDIS) Exec Topic Discussion
    • The Clinical Digital Imaging Committee approved the access on the condition that the request be submitted and approved through the Mayo Security Privacy Architecture Data (SPAD) process. The process typically takes about 4 weeks and can be expedited for a non-development project. Thereafter, the SPAD Questionnaire for PhotoExam TeleVision was submitted.
  • 1/30/2017: Security/Privacy/Architecture/Data (SPAD) Review
  • 2/7/2017: Security/Privacy/Architecture/Data (SPAD) Approval
    • The IT PMO forwarded the initiative to the Enterprise Technology Oversight Group (ETOG) for Endorsement to Execute. The SPAD Operations Group noted that the project must comply with Master Data Management (MDM) standards. The initiative was required to return to SPAD once the project was slated to go into production.
  • 2/14/2017: Enterprise Technology Oversight Group (ETOG) Endorsement to Execute
  • 2/27/2017: MC Clinical Digital Imaging Committee (CDIS) Exec Review and Approval
  • 3/6/2017: Teleconference with the Legal Department regarding the Mutual Confidentiality Agreement
  • 3/31/2017: The Project Proposal Intake Form was filled out and returned to Legal, which uses it to create a new Statement of Work Agreement
  • TBD: Clinical Systems Oversight (CSO) Group Review and Approval

PHASE II – IBM WATSON VISUAL RECOGNITION

6.       MACHINE LEARNING HEALTHCARE SOLUTIONS

Machine learning is a branch of Artificial Intelligence (AI) in which the computer is gradually trained as it is exposed to new data. Vision tagging, speech recognition, text detection and sentiment analysis are some of the powerful data analytics applications being developed by companies like IBM, Google and Microsoft. Machine learning is used to learn and establish baseline behavioral profiles for various entities, which are then used to find meaningful anomalies.

In the healthcare domain, artificial intelligence can be very advantageous: it can facilitate meaningful use of complex medical data by fostering collaboration between computer scientists and clinical experts. Machine learning can be used to improve diagnostics and predict outcomes, assisting the clinician in triaging patients with the most up-to-date information. In recent years, cognitive analytics has been used to develop programs that diagnose patients using patterns observed through deep learning algorithms, which increases the chances of achieving an accurate diagnosis. Predictive analytics tools, like IBM’s Pathway Genomics and Lumiata, can also derive insights to determine the likelihood of a disease. These tools can further guide the clinician in treating the patient by drawing up clinical practice guidelines that suggest the best course of treatment for the detected ailment; IBM Watson’s CareEdit tool is one example. Another issue this technology addresses is hospital readmission, by frequently reminding patients to take the right medications based on their responses and reporting information back to the doctor. Apps like AiCure, NextIT and Cafewell Concierge, the last of which uses IBM Watson’s Natural Language Processing (NLP) service, are some of the technologies developed for this purpose.

The TeleVision iProject stands to benefit from these advances in machine learning. The images produced by the in-house camera hardware form an informative corpus that can be used to train a neural network for visual pattern recognition. A deep learning algorithm can identify similarities among human eye images in order to flag a patient for possible disease. Using real-life images of the human eye, the model can be trained for a particular purpose, such as classifying an image into a healthy-patient or affected-patient category. In general, the more data the model receives, the more accurately it can determine whether the patient is healthy. Several image analytics APIs have been developed by tech giants like IBM, Microsoft, Google and Amazon:

6.1       Google Cloud Vision API:

This powerful API lets users understand the content of a given image through a REST API backed by machine learning models. It can classify the image into categories based on the available data, detect faces and printed text within images, and analyze the sentiment depicted by the image. The image can either be uploaded by the user or read from Google Cloud Storage.

6.2       Amazon Rekognition:

The service offered by Amazon lets users add image analysis to their applications. Amazon uses its highly scalable, neural-network-based deep learning technology to analyze, detect and label thousands of objects and scenes in images. The Rekognition API lets one build powerful search and discovery features into applications.

6.3       Microsoft Computer Vision API:

The API offered by Microsoft is another visual content recognition tool that can identify content by using tagging, description and domain-specific models. It labels the image based on confidence level from the accumulated data.

6.4       IBM Watson:

IBM offers a powerful service that lets the user not only analyze images but also train the service using custom models and create classifiers restricted to specific domains. Using a customized collection of image data, the classifiers can be trained to produce the desired classification of an image. The Watson API is a general-purpose platform that offers services for language, speech, vision and data processing. This technology can be useful in the TeleVision iProject because it allows the model to be trained with our custom classifiers.

7.       ARTIFICIAL INTELLIGENCE BACKGROUND

The field of Artificial Intelligence (AI) is an important asset for developing systems that possess cognitive capabilities and can be trained to comprehend the world around them. It facilitates the development of computers capable of human-like intelligent functions such as reasoning, problem solving and learning. Artificial Intelligence systems are characterized by symbolic processing of data as a whole rather than as numbers or letters, so that the system can comprehend the relations between abstract symbols. AI is also non-algorithmic in nature and does not necessarily follow a step-by-step procedure to reach a solution. This has many applications in healthcare and can be used to assist clinical experts in making better decisions. [4]

The evolution of the AI field has seen several milestones. 1950 marked the introduction of the Turing Test: in “Computing Machinery and Intelligence”, Alan Turing proposed “the imitation game”, in which a machine is considered to be performing intelligently if an interrogator cannot distinguish its responses from those of a human. This resulted in the development of general problem-solving methods in the field of AI. Thereafter, AI was recognized as a research field in 1960, with the inception of knowledge-based expert systems. In 1970, decision support systems and transaction processing started using AI, which led to the commercialization of AI. In 1980, artificial neural networks, computing systems modeled on the complex interconnected neuron structure of the human brain, were introduced. The timeline is depicted in Figure 7.

Fig. 7. Artificial Intelligence Timeline

Computer vision is the subset of AI that deals with automating tasks the human visual system is capable of doing. It is one of the most advanced AI sensory systems, simulating human senses to interpret digital images or videos by acquiring, processing and analyzing them to extract high-dimensional data and turn it into meaningful information, e.g., in the form of decisions. It is also known as visual scene recognition. AI and computer vision share topics like learning methods and pattern recognition. One closely related field is neurobiology, which involves extensive study of the eyes, neurons and brain structures to ascertain how real vision systems operate; mimicking their behavior in turn led to the development of neural network and deep learning based methods.

Image processing and image analysis are related fields of computer vision that deal with the processing of 2D images. Imaging is another field that deals with the analysis of data; medical imaging is the branch used in medical applications to analyze image data. [5] These fields can be used extensively in telemedicine applications similar to TeleVision’s PhotoExam App.

7.1       Expert Systems

An important applied area of AI is the field of expert systems, the simplest form of artificial intelligence. An expert system has a substantial knowledge base about the problem domain and uses an inference engine (reasoning) to solve problems that would otherwise require human expertise. The problem-specific knowledge is stored in the knowledge base, which holds derived heuristic knowledge along with factual expert knowledge. The role of an expert system is to assist decision makers. Applications include diagnosis systems that detect disease from observable data, monitoring systems, and systems that prescribe behavior, schedule and plan actions, or classify objects.

Depending on the knowledge representation method, the knowledge base is organized to represent the actions to be taken under given circumstances, time, dependencies and other high-level concepts. Frame-based systems are used to build powerful expert systems that specify frames for complex objects with different relationships. Rule-based expert systems represent knowledge as production rules: a set of rules from which inferences are drawn in different situations, expressed as IF (condition or premise) THEN (action or conclusion) statements. [6]

Fig. Expert Systems AI

The inference engine is responsible for combining the facts of the problem with the rules in the knowledge base to arrive at a recommendation. It also controls the order in which rules are applied and uses reasoning to resolve conflicts when multiple rules are applicable at the same time. It can also query the user to acquire missing information. The facts are stored in working memory, to which the inference engine adds new information based on the rules until a goal state or termination criterion is met. Forward chaining is a data-driven strategy in which the inference engine works from the facts toward a goal; it suits open-ended problems like design and planning. Backward chaining starts from an assumed conclusion and tries to match it against the conclusions of the rules; it suits problems that have a fixed number of possible solutions. The limitation of a rule-based system is that, although faster than humans, it should be used for a relatively small domain, because a large number of rules can make the expert system inefficient.
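
Forward chaining over IF-THEN production rules can be sketched in a few lines of Python. The rules below are toy examples invented purely for illustration and are not clinical guidance; the `Rule` type and the `forward_chain` function are likewise hypothetical names, not part of any system described in this paper.

```python
from typing import FrozenSet, List, Set, Tuple

Rule = Tuple[FrozenSet[str], str]  # (IF conditions, THEN conclusion)

# Toy rules invented for illustration; they are not clinical guidance.
RULES: List[Rule] = [
    (frozenset({"red_eye", "discharge"}), "possible_infection"),
    (frozenset({"possible_infection", "pain"}), "refer_to_specialist"),
]


def forward_chain(facts: Set[str], rules: List[Rule]) -> Set[str]:
    """Fire rules until no new fact can be added to working memory."""
    memory = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires when all its conditions are in working memory.
            if conditions <= memory and conclusion not in memory:
                memory.add(conclusion)
                changed = True
    return memory
```

Starting from the facts `red_eye`, `discharge` and `pain`, the first rule adds `possible_infection` to working memory, which in turn lets the second rule fire. The repeated scan over every rule also hints at why very large rule sets make such systems slow.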

7.2       Neural Networks

Neural networks are computing systems modeled on the human brain’s interconnected network of processing elements, called neurons. They are intended to emulate the brain’s complex functional behavior, although neural networks are much simpler. These networks can be trained to recognize patterns and solve problems autonomously, especially in the healthcare domain, where complex logical diagnosis is required. [7] The knowledge in a neural network is represented by the pattern of connections among the elements and their weights. However, neural networks cannot always produce an explanation for the conclusions they reach.
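
To make the role of weights concrete, here is a minimal Python sketch of a single artificial neuron. The sigmoid activation is an assumption chosen for simplicity; real networks combine many such units and may use other activation functions.

```python
import math
from typing import Sequence


def neuron(inputs: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Weighted sum of the inputs followed by a sigmoid activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))
```

Everything the neuron "knows" is contained in `weights` and `bias`, which is also why its output is hard to explain in human terms.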

These artificial neural networks have discrete layers, connections and directions of data propagation. However, until recent years, even the most basic neural networks were considered computationally intensive and impractical. Although the deployment of GPUs and parallelized algorithms helped, the chances of a large neural network arriving at the correct solution were low. The networks needed more tuning and training, which is where deep learning comes into the picture.

7.3       Deep learning

To maximize the utility of a neural network, it must be trained on thousands or even millions of images so that the neuron weights are tuned precisely enough to give accurate answers. Andrew Ng, then at Google, made a breakthrough by building very large neural networks with many more neurons and layers, feeding them large amounts of data, and training them over and over. Deep learning of this kind is what allows image recognition features to identify faces on Facebook. Deep learning has enabled many practical applications of machine learning, going as far as identifying indicators of cancer in blood and tumors in MRI scans with high accuracy. [8] Research in the medical field has been aided by deep convolutional networks.
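
The idea of tuning weights through repeated training can be shown at toy scale. The Python sketch below fits a single sigmoid neuron with batch gradient descent on the logical OR function; the data set, learning rate and epoch count are assumptions chosen only to keep the example small, and a real deep network would have many layers and far more data.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train(samples: Sequence[Tuple[Sequence[float], int]],
          epochs: int = 2000, lr: float = 0.5) -> Tuple[List[float], float]:
    """Fit one sigmoid neuron with batch gradient descent on log loss."""
    n_features = len(samples[0][0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in samples:
            # Prediction error for this sample drives the weight update.
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


def predict(x: Sequence[float], weights: Sequence[float], bias: float) -> int:
    return int(sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) >= 0.5)


# Toy data: logical OR, chosen only because it is linearly separable.
OR_DATA = [((0.0, 0.0), 0), ((0.0, 1.0), 1), ((1.0, 0.0), 1), ((1.0, 1.0), 1)]
```

Each pass over the data nudges the weights in the direction that reduces the error, which is the same principle deep networks apply across millions of weights.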

7.4       Traditional Computer Vision Technique without AI

Initially, image recognition algorithms used traditional computer vision techniques to detect objects in images. [9] An algorithm for face detection invented by Paul Viola and Michael Jones in 2001 was considered the most efficient of its time. [10] In such an approach, an image classifier takes an image and produces data describing its content. A two-class classifier therefore needs to be trained with both positive and negative examples. These techniques were later overshadowed by more advanced ones. Deep learning, for instance, learns its features directly from the data and skips the hand-crafted feature extraction step completely. That said, traditional computer vision approaches still power many applications, for example through the OpenCV library.
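As an illustration of hand-crafted feature extraction, the sketch below computes an integral image (summed-area table) and one two-rectangle, Haar-like feature, the building blocks used by the Viola-Jones detector [10]. It is a simplified sketch only: the function names and the 4x4 patch are made up for the example, and the boosting and cascade stages of the full detector are omitted.

```python
def integral_image(img):
    """Return a summed-area table with one extra row and column of zeros.

    ii[r][c] holds the sum of all pixels above and to the left of (r, c),
    so any rectangle sum needs only four lookups.
    """
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Horizontal two-rectangle feature: left half minus right half."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


# Tiny made-up grayscale patch: bright on the left, dark on the right.
PATCH = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
ii = integral_image(PATCH)
feature = two_rect_feature(ii, 0, 0, 4, 4)
```

A large feature value signals a strong vertical edge in the patch; a classifier trained on positive and negative examples then learns which of these features separate the two classes.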

7.5       Comparison

Image recognition techniques have evolved over the years to offer more efficient solutions. However, each retains characteristics that keep it from becoming obsolete. Convolutional neural networks (CNNs) are the most recent development in image recognition and have been shown to give results with about 85 percent accuracy. [11]

CNNs are an advanced approach to analyzing data and have many characteristics that make them more efficient than traditional and expert systems. Some points of comparison are outlined below:

  • Traditional systems are simple and can be suitable for applications, such as binary classification, that do not require the advanced features of neural networks. Neural networks, however, can deliver results with much higher accuracy.
  • Expert systems, although accurate, require complex rules to be established for a particular application. For maximum accuracy, they require a larger knowledge base from experts. As the complexity of the system increases, more computing resources are required, which slows the system down. A neural network, on the other hand, models a system that programs itself and learns on its own without the need for an expert.
  • An expert system processes data sequentially and logically, aided by rules and calculations, whereas an artificial neural network processes inputs such as images in parallel.
  • Expert systems provide knowledge and reasoning to users as a learning utility and can provide domain-specific expertise that would otherwise be too costly. They are ideal for situations where rules don’t overlap and where less expertise is required from end users, as in the healthcare domain, where software can monitor patients’ reactions to medications.
  • Neural network applications don’t require a knowledge base and can streamline the process to use resources efficiently with maximum throughput. They have a learning algorithm and perceptron structure that allow them to learn autonomously. Training is required for every purpose, but they are highly tolerant of inaccurate data and find applications in financial forecasting and risk management.
  • Overall, neural networks provide more advantages than traditional computing and expert system techniques. That said, they still need constant training and are susceptible to mistakes.

Fig. 7.5. Traditional learning without AI Vs. Deep Neural Network learning with AI

Visual recognition using neural networks has vast potential applications. We can reasonably expect the field of telemedicine to benefit greatly from these advancements. Several companies have proposed powerful tools for this purpose. IBM Watson is one such tool; it can annotate photos and videos and also provides custom training functionality. It is also one of the few systems exploring intention-driven interfaces, which respond to vague user input by discerning the user’s intent and acting on those insights, as opposed to processing the user’s literal queries. These capabilities make IBM Watson an efficient choice for our project.

  8. WATSON AND AI – OVERVIEW

IBM Watson can be described as a cognitive system that can decode unstructured information and draw inferences with human-like accuracy, potentially at a greater speed and scale than any human being is capable of. Watson is capable of inferring the context of a phrase, rather than processing individual words.

Its breakthrough came when Watson competed on Jeopardy! against former winners and won the first-place prize of $1 million. This IBM supercomputer was capable of storing terabytes of data and was able to draw accurate inferences from unstructured data.

8.1       IBM Watson Architecture

Watson parses the problem to extract its major features and then generates hypotheses by searching the corpus for similar problems that contain a valuable solution. It then applies deep learning techniques and reasoning algorithms to compare the problems, and the scores these algorithms produce are associated with each potential solution. Each score is weighed against a statistical model that captures how well the algorithm performed during the training period. From this, a level of confidence is computed for the potential solution. Watson runs the same process for all potential solutions to find the best candidate, as illustrated in Figure 8.1.

Fig. 8.1. IBM Watson response architecture [12]
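The scoring and ranking steps can be pictured with the toy sketch below. It is not Watson’s actual implementation: the logistic combination, the algorithm names, the weights, and the candidate answers are all assumptions made for illustration. The weights stand in for the statistical model that records how reliable each scoring algorithm was during training.

```python
import math


def confidence(scores, weights, bias=0.0):
    """Combine per-algorithm scores into one confidence value in (0, 1).

    An algorithm that proved more reliable during training is assumed to
    carry a larger weight.
    """
    z = bias + sum(weights[name] * value for name, value in scores.items())
    return 1.0 / (1.0 + math.exp(-z))


def best_candidate(candidates, weights):
    """Score every candidate and return (answer, confidence) for the best one."""
    ranked = [
        (answer, confidence(scores, weights))
        for answer, scores in candidates.items()
    ]
    return max(ranked, key=lambda pair: pair[1])


# All names and numbers below are hypothetical, chosen only for illustration.
WEIGHTS = {"text_match": 2.0, "source_reliability": 1.0}
CANDIDATES = {
    "answer_a": {"text_match": 0.9, "source_reliability": 0.8},
    "answer_b": {"text_match": 0.2, "source_reliability": 0.5},
}

answer, score = best_candidate(CANDIDATES, WEIGHTS)
```

The same `confidence` function is applied to every candidate, mirroring the statement that Watson runs one process across all potential solutions before choosing the best.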

8.2       Components of Watson

  • Watson Discovery: The language API lets users build a cognitive search and analytics engine. Watson helps developers use data to find patterns, enabling better decision making.
  • Watson Speech: The speech API allows the deployment of bots or virtual agents across different platforms, including mobile, messaging or even physical robots, to facilitate natural conversations between users and applications.
  • Watson Vision: The vision API comprehends the visual content of an image, enabling tagging, face detection, and classification. The service can be trained with concepts customized to the user’s needs.
  • Watson Data Insights: The data insights API provides analytics that help users make decisions and monitor trends by comparison with historical data.

8.3       Watson Health – AI Based Healthcare Solutions

AI can be revolutionary for healthcare because its processing capacity exceeds that of any human doctor, and IBM aims to spearhead this development with its AI technology, Watson. Its ability to process vast stores of data and patterns makes it a suitable fit for medical applications. In fact, Watson’s first application was utilization management in lung cancer treatment, in partnership with Memorial Sloan Kettering Cancer Center in New York City. Currently, Watson is venturing into genomics and oncology and assists with cancer diagnosis. [13]

The medical field accounts for almost one-third of the deployment of Watson AI units, which is a testament to IBM’s seriousness about bringing Watson to the healthcare industry. Watson is now addressing a variety of other medical areas, including patient engagement, imaging review, personalized care, and drug discovery.

  9. IBM WATSON VISUAL RECOGNITION

The IBM Watson API offers products and services that are artificial intelligence based solutions. Watson was originally built by IBM as a computing system that answers questions posed in natural language by applying advanced natural language processing, automated reasoning and machine learning methods. It takes unstructured data and converts it into useful intelligence. The products offered by Watson include Natural Language Understanding, Speech Analyzer, Visual Recognition and Tradeoff Analytics, amongst others.

For our purpose, we used the visual recognition service to classify the images obtained from the TeleVision application and so assess the possibility of a disease. The doctor can use this functionality on the server side to get better assistance with the diagnosis. The service allows the user to create custom classifiers and train them with a set collection of images, as shown in Figure 9. Once the classifier is trained, a given image can be submitted to determine which class it belongs to.

Fig. 9 Visual Recognition service workflow

Another advantage of IBM is its Bluemix cloud platform as a service (PaaS), which supports Watson services and has integrated DevOps tooling that can build, run and deploy applications on the cloud. Bluemix is based on Cloud Foundry open technology and runs on SoftLayer infrastructure. It supports IBM, open source and third-party services in its catalog and has Git-compatible source control. It supports several programming languages and frameworks, including Java, Node.js, Go, PHP, Swift, Python, Ruby Sinatra and Ruby on Rails, and can be extended to use other languages. This makes it an all the more convenient choice for the project.

Research is ongoing on this platform, in particular to deliver MBaaS services that can be managed through a web-based portal. [14] This can be beneficial for our TeleVision application in the future.

9.1       Implementation

  1. Create a Bluemix account

A free IBM Bluemix account can be created.

  2. Create the Visual Recognition service

From the catalog of services, choose the Visual Recognition service. Give it a name and click Create.

  3. Get the service credentials

Once the service is created, go to the dashboard and click the newly created service. Open the Service Credentials tab, select the View Credentials dropdown, and copy the {api_key}.

  4. Create the training zip files

Three zip files were created to set up the classifier: diabetic.zip, glaucoma.zip and healthy.zip. High-resolution fundus images of the human eye, available for research purposes, were taken for glaucoma patients, diabetic patients and healthy patients. [15] This publicly available database of healthy and pathological retinas is used to give accurate results. [16]

  5. Create the classifier using the POST method on the Watson platform and begin training

Using a cURL command, the training data files are uploaded to create a classifier named “TBEyeClassifier”. The {api_key} is copied from the credentials saved in Step 3. The class names are entered as {class}_positive_examples, where {class} is replaced with glaucoma, diabetic and healthy for the three classes.

curl -X POST -F "healthy_positive_examples=@healthy.zip" -F "glaucoma_positive_examples=@glaucoma.zip" -F "diabetic_positive_examples=@diabetic.zip" -F "name=TBEyeClassifier" "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/classifiers?api_key=7923cbc94486b09045cbd155ab3bad8b273ae250&version=2016-05-20"

The response received:

{
  "classifier_id": "TBEyeClassifier_1774509133",
  "name": "TBEyeClassifier",
  "owner": "8ac18405-e55b-4809-ad67-0bbd093b481d",
  "status": "training",
  "created": "2017-03-17T12:57:10.028Z",
  "classes": [
    {"class": "healthy"},
    {"class": "glaucoma"},
    {"class": "diabetic"}
  ]
}
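A client would typically read the classifier ID and status out of this response before moving on. The following Python sketch shows one way to do so with the standard `json` module; the helper name `summarize_classifier` is hypothetical, and the response text is the one shown above.

```python
import json

# The creation response shown above, as the raw text a client would receive.
RESPONSE_TEXT = """
{
  "classifier_id": "TBEyeClassifier_1774509133",
  "name": "TBEyeClassifier",
  "owner": "8ac18405-e55b-4809-ad67-0bbd093b481d",
  "status": "training",
  "created": "2017-03-17T12:57:10.028Z",
  "classes": [{"class": "healthy"}, {"class": "glaucoma"}, {"class": "diabetic"}]
}
"""


def summarize_classifier(response_text):
    """Extract the fields needed for the later status and classify calls."""
    data = json.loads(response_text)
    return {
        "classifier_id": data["classifier_id"],
        "status": data["status"],
        "classes": [entry["class"] for entry in data["classes"]],
    }


summary = summarize_classifier(RESPONSE_TEXT)
```

The `classifier_id` value is the one that the status check and the classification request refer to in the following steps.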

  6. Check the training status periodically

Training takes a while to complete, so the following cURL command was used to check the training status. The {api_key} and {classifier_id} are replaced with our own values:

curl -X GET "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/classifiers/TBEyeClassifier_1774509133?api_key=7923cbc94486b09045cbd155ab3bad8b273ae250&version=2016-05-20"
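Checking the status by hand can be replaced by a small polling loop. The sketch below is one possible shape for it, written so that it can be tried without network access: the function name `wait_until_trained` is hypothetical, the status lookup is injected as a callable, and the final status value "ready" is an assumption about what the service reports once training finishes.

```python
import time


def wait_until_trained(fetch_status, interval_seconds=30.0, max_checks=20,
                       sleep=time.sleep):
    """Poll fetch_status() until it stops returning "training".

    fetch_status is any zero-argument callable that returns the "status"
    string from the classifier endpoint; it is injected so the loop can be
    exercised without a real HTTP request.
    """
    for _ in range(max_checks):
        status = fetch_status()
        if status != "training":
            return status
        sleep(interval_seconds)
    raise TimeoutError("classifier still training after %d checks" % max_checks)


# Stand-in for the real GET request: two "training" replies, then "ready".
# The "ready" value is an assumption about the service's finished state.
_replies = iter(["training", "training", "ready"])
final_status = wait_until_trained(lambda: next(_replies),
                                  sleep=lambda seconds: None)
```

In real use, `fetch_status` would issue the GET request shown above and return the `status` field of the JSON reply.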

  7. Classify an image using the trained classifier

To check the classifier, I took one of the training images from the glaucoma category and tested whether Watson classifies it into the correct class.

It is also possible to check the image against Watson’s default classifier and get an estimate of its correlation with the entire Watson database. A JSON file listing all the desired {classifier_id} values can be passed as a parameter using the cURL option -F "parameters=@myparams.json". The JSON file will look like this:

{
  "classifier_ids": ["dogs_1941945966", "default"]
}

Upload the image to be classified and obtain a URL that points to it. Because the image is referenced by URL, we can use the GET method to test the classifier with this image using the following cURL command:

curl -X GET -H "Accept-Language: en" "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/classify?api_key=7923cbc94486b09045cbd155ab3bad8b273ae250&url=http://i.imgur.com/VAxIpXD.jpg&classifier_ids=TBEyeClassifier_1774509133&owners=me&threshold=0&version=2016-05-20"
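When the same request is assembled in a program rather than typed by hand, it is safer to let a library encode the query string, since the image URL itself contains characters such as ":" and "/". The sketch below uses Python’s standard `urllib.parse` for this; the function name `build_classify_url` is hypothetical and "YOUR_API_KEY" is a placeholder, not a real credential.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/classify"


def build_classify_url(api_key, image_url, classifier_id, threshold=0,
                       version="2016-05-20"):
    """Assemble the classify request URL with a properly encoded query string."""
    params = {
        "api_key": api_key,
        "url": image_url,
        "classifier_ids": classifier_id,
        "owners": "me",
        "threshold": threshold,
        "version": version,
    }
    return BASE_URL + "?" + urlencode(params)


# "YOUR_API_KEY" is a placeholder, not a real credential.
classify_url = build_classify_url(
    "YOUR_API_KEY",
    "http://i.imgur.com/VAxIpXD.jpg",
    "TBEyeClassifier_1774509133",
)

# Decoding the query again recovers the original parameter values.
query = parse_qs(urlsplit(classify_url).query)
```

The parameter names are the ones used in the cURL command; only the encoding step is added.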

The response includes the classes and, for each class, a confidence score indicating how strongly the image correlates with that class. The score ranges from 0 to 1, with a higher score indicating greater similarity.

Here, the results show that the sample image was scored against the three custom classes we created. Based on its training, Watson assigned the image a score of 0.0502 for the healthy category, 0.0529 for the diabetic category and 0.830 for the glaucoma category.

As is evident, the test image is classified into the glaucoma class with the highest score, 0.830943 (about 83 percent confidence), predicting it to be an image of a glaucoma patient. Fig 9.1 depicts the results obtained, which agree with the known label of the test image.

Fig 9.1 Response received for the Custom classifier
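Selecting the predicted class from such a response amounts to taking the class with the highest score. The sketch below illustrates this in Python. The nesting of the response (images, then classifiers, then classes) is an assumption about the response layout; only the three scores and the classifier ID are taken from the text, and the function name `top_class` is hypothetical.

```python
# Assumed response shape; only the three scores and the classifier ID
# are taken from the text.
RESPONSE = {
    "images": [
        {
            "classifiers": [
                {
                    "classifier_id": "TBEyeClassifier_1774509133",
                    "name": "TBEyeClassifier",
                    "classes": [
                        {"class": "healthy", "score": 0.0502},
                        {"class": "diabetic", "score": 0.0529},
                        {"class": "glaucoma", "score": 0.830943},
                    ],
                }
            ]
        }
    ]
}


def top_class(response, classifier_id):
    """Return (class name, score) for the highest-scoring class, or None."""
    for image in response["images"]:
        for classifier in image["classifiers"]:
            if classifier["classifier_id"] == classifier_id:
                best = max(classifier["classes"], key=lambda entry: entry["score"])
                return best["class"], best["score"]
    return None


label, score = top_class(RESPONSE, "TBEyeClassifier_1774509133")
```

In a clinical workflow, the score would more likely be shown to the doctor alongside the image than used as an automatic decision.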

  10. RESULTS

In summary, the results of this research point towards a bright future for cognitive solutions in the healthcare domain. It is entirely possible for AI to assist individual clinicians with interpretation, display of patient data and treatment recommendations for efficient triage. The TeleVision project is a remarkable step in the field of telemedicine and can help us achieve this goal.

The underlying architecture allows the application to consume and develop web services using RESTful APIs. The DICOM repository stores the collected patient images in a HIPAA-compliant format that can be used to train the Watson Visual Recognition service. When the web service is used for image transfer, the Patient Reported Outcome Report integrated in the TeleVision app is also sent through the same web service in PDF format. This lets the medical expert see the metadata associated with the case for a better diagnosis. The software architecture, showing the model-view-controller structure of the TeleVision application with the downlink functionality and the visual recognition service, is depicted in Fig. 10.

Fig 10. Software MVC Architecture of TeleVision Application with Downlink functionality

  11. CONCLUSIONS

This paper documents the work done during a two-semester individual study pursued by me [Richa Srivastava]. Phase I of the paper covers the research and evaluation process followed during the first semester to document the project from its inception to its expansion. It outlines the various approaches considered while arriving at a solution for the downlink feature and highlights the timeline of the correspondence with key stakeholders from Mayo Clinic and Arizona State University. Phase II of the project explores the possibility of enhancing the TeleVision application by training IBM Watson’s cognitive system to detect symptoms exhibited by patients from their digital eye images. The implementation of the technology and the results of the findings are described in detail to conclude the paper.

The individual study went through several phases, which are detailed below. During the first semester, I worked closely with the Mayo Clinic team of Dr. Peter J. Pallagi, Brian Willaert and Yong Peng, under the aegis of Dr. Ashraf Gaffar, to research extensively and enhance the TeleVision project, which has a long-standing history of development at Arizona State University. We also had frequent teleconferences with Mayo Clinic’s Dr. Dharmedra Patel and team to determine our next steps in the project development. I was also given the opportunity to prepare a poster presenting our findings at the ASU Innovation Showcase. In the following weeks, I was able to document the project and work on implementing a downlink feature for the TeleVision workflow in collaboration with some key members of the software team.

During the second semester, we worked with the Mayo team to initiate the highly confidential code release process and get access to a sandbox version of the existing PhotoExam App code for implementing the downlink feature. Alongside that, it was an informative experience for me to implement IBM Watson’s visual recognition service and evaluate its use in our project. The results obtained demonstrated the possibility of using cognitive systems to assist with accurate medical diagnosis. I was able to collaborate and share ideas with a group of software developers, which helped me develop teamwork and management skills.

REFERENCES

  1. Perednia, Douglas A., and Ace Allen. “Telemedicine technology and clinical applications.” JAMA 273.6 (1995): 483-488.
  2. Robb, Richard A. “The biomedical imaging resource at Mayo Clinic.” IEEE transactions on medical imaging 20.9 (2001): 854-867.
  3. API Reference, JSONSerialization. Retrieved from https://developer.apple.com/reference/foundation/jsonserialization
  4. Dilsizian, Steven E., and Eliot L. Siegel. “Artificial intelligence in medicine and cardiac imaging: harnessing big data and advanced computing to provide personalized medical diagnosis and treatment.” Current cardiology reports 16.1 (2014): 1-8.
  5. L. Lu, J. Bi, S. Yu, Z. Peng, A. Krishnan, and X. S. Zhou. Hierarchical learning for tubular structure parsing in medical imaging: A study on coronary arteries using 3D CT Angiography. In IEEE 12th International Conference on Computer Vision. 2009.
  6. Grosan, Crina, and Ajith Abraham. “Rule-based expert systems.” Intelligent Systems (2011): 149-185.
  7. Bate, Andrew, et al. “A Bayesian neural network method for adverse drug reaction signal generation.” European journal of clinical pharmacology 54.4 (1998): 315-321.
  8. H. C. Shin, H. R. Roth, M. Gao, et al. Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE transactions on medical imaging 35(5) (2016), 1285–1298.
  9. Samaria, F. and Harter, A. 1994. Parametrisation of a stochastic model for human face identification. Paper presented at the 2nd IEEE Workshop on Applications of Computer Vision, Sarasota, FL.
  10. Viola, Paul, and Michael Jones. “Rapid object detection using a boosted cascade of simple features.” Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001 IEEE Computer Society Conference on. Vol. 1. IEEE, 2001.
  11. Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. “Imagenet classification with deep convolutional neural networks.” Advances in neural information processing systems. 2012.
  12. High, Rob. “The era of cognitive systems: An inside look at ibm watson and how it works.” IBM Corporation, Redbooks (2012).
  13. Kohn, M. S., et al. “IBM’s health analytics and clinical decision support.” IMIA Yearbook (2014): 154-162.
  14. Gheith, A., et al. “IBM Bluemix Mobile Cloud Services.” IBM Journal of Research and Development 60.2-3 (2016): 7-1.
  15. High-Resolution Fundus (HRF) Image Database. Retrieved from https://www5.cs.fau.de/research/data/fundus-images/
  16. Odstrcilik, Jan, et al. “Retinal vessel segmentation by improved matched filtering: evaluation on a new high-resolution fundus image database.” IET Image Processing 7.4 (2013): 373-383.
