Chest X-rays allow rapid assessment of the COVID-19 status of patients and can help relieve the strain on scarce medical resources in emergency departments and testing centers. Image classification models built with deep learning can help doctors make better judgments about patients with COVID-19 and related lung diseases. We compare and analyze popular deep learning image classification methods, VGGNet, GoogLeNet, and ResNet, using publicly available chest X-ray COVID-19 datasets from different organizations. Based on the characteristics of chest X-ray images and the classification results of these algorithms, we propose a novel image classification algorithm, CovidXNet. Built on the ResNet model, CovidXNet introduces a hard sample memory pool to improve accuracy and generalization. CovidXNet categorizes chest X-ray images more efficiently and accurately than other popular image classification algorithms, allowing doctors to confirm a patient's diagnosis quickly.
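For readers who want the flavor of the hard sample memory pool, the sketch below shows one minimal way it could sit on top of a ResNet classifier in PyTorch. The pool capacity, loss threshold, and replay ratio are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn as nn
import torchvision

# Minimal sketch of a hard sample memory pool on top of ResNet.
# Pool capacity, loss threshold, and replay ratio are assumptions.
class HardSamplePool:
    def __init__(self, capacity=512):
        self.capacity = capacity
        self.images, self.labels = [], []

    def add(self, x, y, losses, threshold=1.0):
        # Keep samples whose per-sample loss exceeds the threshold.
        hard = losses > threshold
        for xi, yi in zip(x[hard], y[hard]):
            if len(self.images) >= self.capacity:  # drop oldest when full
                self.images.pop(0); self.labels.pop(0)
            self.images.append(xi.cpu()); self.labels.append(yi.cpu())

    def sample(self, n):
        if n <= 0 or not self.images:
            return None
        idx = torch.randint(len(self.images), (min(n, len(self.images)),))
        x = torch.stack([self.images[i] for i in idx])
        y = torch.stack([self.labels[i] for i in idx])
        return x, y

model = torchvision.models.resnet18(num_classes=3)  # e.g. COVID / pneumonia / normal
criterion = nn.CrossEntropyLoss(reduction="none")   # per-sample losses
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
pool = HardSamplePool()

def train_step(x, y):
    optimizer.zero_grad()
    losses = criterion(model(x), y)
    pool.add(x.detach(), y, losses.detach())
    replay = pool.sample(x.size(0) // 4)            # mix hard samples back in
    if replay is not None:
        rx, ry = replay
        losses = torch.cat([losses, criterion(model(rx), ry)])
    losses.mean().backward()
    optimizer.step()
```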
In recent years, the computational power of handheld devices has increased to the point of parity with computers of only a generation ago. The multiple tools integrated into these devices and the steady expansion of cloud storage have created a need for novel compression techniques for both storage and transmission. In this work, a novel L1 principal component analysis (PCA) informed K-means approach is proposed. The technique seeks to preserve the color definition of images compressed with K-means clustering. Efficacy is assessed using the structural similarity index (SSIM).
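As a rough illustration of the quantization and assessment steps, the sketch below runs plain K-means color quantization and scores the result with SSIM; the L1-PCA-informed seeding that distinguishes the proposed method is not reproduced here, and the 16-color palette and test image are arbitrary choices.

```python
import numpy as np
from sklearn.cluster import KMeans
from skimage import data
from skimage.metrics import structural_similarity as ssim

# Minimal sketch of K-means color quantization scored with SSIM.
# Plain K-means is used here; the proposed method seeds it via L1-PCA.
img = data.astronaut().astype(np.float64) / 255.0   # H x W x 3 test image
pixels = img.reshape(-1, 3)

kmeans = KMeans(n_clusters=16, n_init=4, random_state=0).fit(pixels)
quantized = kmeans.cluster_centers_[kmeans.labels_].reshape(img.shape)

# channel_axis tells SSIM the image is color (scikit-image >= 0.19).
score = ssim(img, quantized, channel_axis=2, data_range=1.0)
print(f"SSIM after 16-color quantization: {score:.3f}")
```

A higher SSIM at the same palette size indicates better-preserved structure and color definition, which is the comparison the assessment step relies on.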
Recent years have seen a sharp rise in the amount of data available for analysis in many professional fields. In the medical sector, this increase can help detect and confirm underlying symptoms in patients that would otherwise remain undetected. Machine learning techniques applied in the medical sector can help diagnose irregularities when the system is trained on data from the relevant domain. Leveraging this newfound abundance of big data and advanced diagnostic techniques, higher-dimensional data features can be extracted and analyzed more effectively. The algorithm presented in this paper uses a convolutional neural network to categorize electrocardiogram (ECG) data, preprocessing the raw signal with the fast Fourier transform (FFT) and principal component analysis (PCA) to reduce dimensionality while maintaining performance. The paper proposes three intelligent identification algorithms whose outputs can be fed into another specialized machine learning system or analyzed with traditional diagnostic procedures.
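A minimal sketch of the described preprocessing chain, using synthetic stand-in signals: FFT magnitude features are extracted from fixed-length ECG segments, then reduced with PCA before classification. The segment length and component count are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

# Minimal sketch: FFT magnitude features followed by PCA reduction.
# Segment length, component count, and the random signals are assumptions.
rng = np.random.default_rng(0)
n_beats, seg_len = 200, 256
ecg_segments = rng.standard_normal((n_beats, seg_len))  # stand-in for real beats

# Real-input FFT; keep magnitudes of the non-negative frequencies.
spectra = np.abs(np.fft.rfft(ecg_segments, axis=1))     # shape (200, 129)

# Project onto the leading principal components before the CNN classifier.
pca = PCA(n_components=20)
features = pca.fit_transform(spectra)                   # shape (200, 20)
print(features.shape, pca.explained_variance_ratio_.sum())
```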
Living in a constant news cycle creates the need for automated tracking of events as they happen, which can be achieved by analyzing the textual content of broadcast overlays. A great amount of information must be deciphered from these overlays before further processing, with applications spanning from politics to sports. We use image processing to create mean cropping masks, based on binary slice clustering, that identify areas of interest for intelligent retrieval. This data is handed off to CEIR, which uses the connectionist text proposal network (CTPN) to fine-tune text locations and a convolutional recurrent neural network (CRNN) to recognize the text strings. To improve accuracy and reduce processing time, this novel approach uses a preprocessing mask identification and cropping module to reduce the amount of data passed to the more finely tuned neural networks.
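One plausible reading of the mean cropping mask is sketched below: threshold each frame into a binary slice, average the slices over time, and keep pixels that stay consistently bright, since overlay text is temporally stable. This is an interpretation with assumed threshold values, not the paper's exact procedure.

```python
import numpy as np
import cv2

# Sketch of one reading of the mean cropping mask: average binary slices
# over time and keep temporally stable bright pixels (overlay text).
# The threshold values and stability ratio are illustrative assumptions.
def overlay_mask(frames, pixel_thresh=180, stability=0.9):
    slices = [(cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) > pixel_thresh)
              for f in frames]
    mean_slice = np.mean(slices, axis=0)   # fraction of frames each pixel is "on"
    mask = (mean_slice > stability).astype(np.uint8) * 255
    # Dilate so crops include whole characters, then take bounding boxes.
    mask = cv2.dilate(mask, np.ones((9, 9), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours]  # (x, y, w, h) crops for CTPN
```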
Fusing multispectral sensor data whose sets contain complementary information about the subject of observation yields visualizations that are more easily interpreted by both humans and algorithms. Many applications of feature-level fusion seek to combine edges and textures across the bandwidth of the sensory spectrum. Visualization can be skewed by corruption and by redundancies induced by harmonics. Most image fusion techniques rely on intensity-hue-saturation (IHS) transforms, principal component analysis (PCA), or Gram-Schmidt orthogonalization. PCA lends itself to this application because it removes redundancy from correlated data while preserving variance and resisting color distortion; it also introduces less spectral distortion than IHS and has been found to produce superior image fusion. Neural network techniques have been shown to reproduce results closer to those found by human inference, and growing computational power has allowed neural networks to spread into roles previously carried out by humans, greatly benefiting select advanced image processing techniques. We propose a novel method that uses PCA in conjunction with a neural network to achieve higher-quality image fusion: an autoencoder fuses the information, producing a higher level of data visualization than traditional weighted fusion techniques.
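For context, classical PCA-weighted fusion of two coregistered bands looks roughly like the sketch below; in the proposed method, an autoencoder would replace the final weighted sum. Inputs are assumed normalized to [0, 1], and the random arrays are stand-ins for real sensor bands.

```python
import numpy as np

# Sketch of classical PCA-weighted fusion of two coregistered grayscale
# bands. The proposed autoencoder would replace the final weighted sum.
def pca_fuse(img_a, img_b):
    stacked = np.stack([img_a.ravel(), img_b.ravel()])  # 2 x N observations
    cov = np.cov(stacked)
    eigvals, eigvecs = np.linalg.eigh(cov)
    w = np.abs(eigvecs[:, np.argmax(eigvals)])          # leading component
    w = w / w.sum()                                     # normalized weights
    return w[0] * img_a + w[1] * img_b

a = np.random.rand(128, 128)   # stand-ins for two sensor bands
b = np.random.rand(128, 128)
fused = pca_fuse(a, b)
```

The weighting favors the band that carries more of the shared variance, which is why PCA fusion tends to preserve contrast better than a fixed 50/50 blend.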
As one of the classic fields of computer vision, image classification has boomed with improvements in chip performance and algorithm efficiency. With the rapid progress of deep learning in recent decades, remote sensing land cover and land use image classification has entered a golden period of development. This paper presents a new deep learning classifier for remote sensing land cover and land use images. The approach first uses multi-layer convolutional neural networks to extract image features, which are passed through a fully connected network to compute the per-sample loss. A hard sample memory pool then collects the samples with large losses during training, and a batch of hard samples is randomly drawn from the pool to participate in training the convolutional fully connected model, making the model more robust. Our method is validated on a classic remote sensing land cover and land use dataset. Compared with previously popular classification algorithms, ours classifies images more accurately with fewer training iterations.
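The feature extraction and loss stages might look like the sketch below, with illustrative layer sizes and an assumed 10-class problem: a multi-layer CNN feeds a fully connected head, per-sample losses are computed without reduction, and the highest-loss samples become candidates for a memory pool such as the one sketched after the first abstract.

```python
import torch
import torch.nn as nn

# Sketch: multi-layer CNN feature extractor with a fully connected head
# producing per-sample losses that feed a hard sample memory pool.
# Layer sizes and the 10-class output are illustrative assumptions.
class LandUseNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(128, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = LandUseNet()
criterion = nn.CrossEntropyLoss(reduction="none")   # per-sample loss
x, y = torch.randn(8, 3, 64, 64), torch.randint(0, 10, (8,))
losses = criterion(model(x), y)
hard_batch = x[losses.topk(4).indices]              # candidates for the memory pool
```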
Big data has been driving professional sports over the last decade. In our data-driven world, it is important to find additional methods for analyzing both games and athletes, and there is an abundance of video from professional and amateur sports from which player datasets can be created using computer vision techniques. We propose a novel autonomous masking algorithm that accepts live or previously recorded video footage of sporting events and identifies graphical overlays, optimizing further processing by tracking and text recognition algorithms for real-time analysis.
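One simple way such a masking stage might flag overlays, offered purely as an assumed baseline rather than the paper's algorithm: score bugs and tickers barely change between frames, so pixels with low temporal variance are overlay candidates. The window length and variance threshold below are arbitrary.

```python
import numpy as np
import cv2

# Assumed baseline for overlay detection: overlays are static while the
# game moves, so low per-pixel temporal variance marks overlay candidates.
# The 60-frame window and variance threshold are illustrative assumptions.
def overlay_candidates(video_path, window=60, var_thresh=15.0):
    cap = cv2.VideoCapture(video_path)
    frames = []
    while len(frames) < window:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32))
    cap.release()
    std = np.std(frames, axis=0)                        # per-pixel temporal std-dev
    return (std < var_thresh).astype(np.uint8) * 255    # static-region mask
```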
We propose a new recognition method that extracts effective information from receipts by integrating deep learning algorithms from computer vision and natural language processing. Our method consists of three parts. The first part locates the effective receipt area: by removing noise and extracting the gradient of the receipt image, we determine a threshold for cropping and reshaping the useful region. The second part detects text in the receipt image; we modify and deploy the connectionist text proposal network (CTPN) to locate text regions in the receipt. In the third part, we use connectionist temporal classification with maximum entropy regularization as the loss function for training a convolutional recurrent neural network (CRNN) that recognizes the detected text regions, converting the receipt from an image into text. With this method, the effective information in a receipt can be integrated and utilized. We train and test our system on the scanned receipts optical character recognition and information extraction (SROIE) dataset. The results show that our recognition system identifies receipt information quickly and accurately.
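A minimal sketch of the first stage, assuming OpenCV with illustrative kernel sizes and Otsu thresholding (the abstract does not specify how the threshold is chosen): denoise, take a morphological gradient, threshold, and crop the largest connected region as the receipt area to hand to CTPN.

```python
import cv2
import numpy as np

# Sketch of the receipt-area stage: denoise, gradient, threshold, crop.
# Kernel sizes and the Otsu threshold choice are illustrative assumptions.
def crop_receipt(image):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (5, 5), 0)                 # remove noise
    grad = cv2.morphologyEx(gray, cv2.MORPH_GRADIENT,
                            np.ones((3, 3), np.uint8))       # edge strength
    _, binary = cv2.threshold(grad, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    binary = cv2.morphologyEx(binary, cv2.MORPH_CLOSE,
                              np.ones((9, 9), np.uint8))     # join text regions
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
    return image[y:y + h, x:x + w]                           # region passed to CTPN
```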
Intelligent unmanned vehicles, such as unmanned aircraft and tanks, now take on many complex tasks on the modern battlefield. They form networked intelligent systems with varying degrees of operational autonomy that will be used increasingly on future battlefields. To cope with such a highly unstable environment, intelligent agents must collaborate to explore information and achieve a shared goal. In this paper, we establish a novel comprehensive cooperative deep deterministic policy gradients (C2DDPG) algorithm by designing a special reward function for each agent that encourages collaboration and exploration. Each agent receives state information from its neighboring teammates to achieve better teamwork. The method is demonstrated in StarCraft micromanagement, a real-time strategy setting resembling a battlefield with two groups of units.
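The two cooperative ingredients can be sketched abstractly as below. The blend weight and the damage-based individual reward are invented placeholders, since the abstract does not give the exact reward function C2DDPG uses.

```python
import numpy as np

# Sketch of the two cooperative ingredients described for C2DDPG:
# neighbor-augmented observations and a shaped per-agent reward.
# beta and the damage-based terms are placeholder assumptions.
def build_observation(agent_state, neighbor_states):
    # Concatenate the agent's own state with its teammates' states.
    return np.concatenate([agent_state] + list(neighbor_states))

def shaped_reward(own_damage_dealt, own_damage_taken, team_reward, beta=0.5):
    # Blend an individual term with the shared team objective so each
    # agent is rewarded both for its own play and for team success.
    individual = own_damage_dealt - own_damage_taken
    return (1 - beta) * individual + beta * team_reward
```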