Abstract: Session-based recommendation aims to predict the next most likely interaction item from the current anonymous behavior sequence, and a key research question is how to make effective recommendations for anonymous users using item sequence information. To address the problem that existing session recommendation methods do not fully consider sequence dependency information and global information from other sessions, a session recommendation method combining sequence dependencies and global information, SDGI, is proposed. The method learns sequence dependencies between items through a convolutional time-aware gated recurrent unit network, and constructs local and global graphs with graph neural networks to obtain global item transition information. To address bias and overfitting, a lightweight graph convolutional layer combined with a gating mechanism is introduced to obtain global-level item embeddings, and a focal loss function is applied to handle the imbalance between positive and negative samples. Compared with 12 baseline methods on three public datasets, Diginetica, Tmall, and Yoochoose, the experimental results show that SDGI significantly outperforms the baselines, indicating that combining sequence dependencies with global information can effectively improve session recommendation performance.
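As a rough illustration of the focal loss mentioned above for handling the positive/negative sample imbalance, the following is a minimal PyTorch sketch of a multi-class focal loss; the function name, the gamma/alpha defaults, and the toy data are illustrative assumptions rather than SDGI's actual configuration.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    """Multi-class focal loss: down-weights well-classified (easy) samples.

    logits:  (batch, num_items) unnormalized scores over candidate items
    targets: (batch,) indices of the ground-truth next item
    gamma/alpha are illustrative defaults, not the values used in SDGI.
    """
    log_p = F.log_softmax(logits, dim=-1)                      # log-probabilities
    log_pt = log_p.gather(1, targets.unsqueeze(1)).squeeze(1)  # log p of the true item
    pt = log_pt.exp()
    loss = -alpha * (1.0 - pt) ** gamma * log_pt               # focal modulation
    return loss.mean()

# toy usage
logits = torch.randn(4, 100)            # 4 sessions, 100 candidate items
targets = torch.tensor([3, 17, 42, 8])  # ground-truth next items
print(focal_loss(logits, targets).item())
```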
Abstract: With the proposal of the national "double carbon" goal, energy conservation in public buildings is becoming increasingly important, and short-term prediction of building power data helps in the reasonable regulation of electricity consumption. This paper proposes a combined prediction model of the grey model GM(1,1) and a BP neural network. Firstly, the data are screened by grey correlation analysis, and short-term electricity consumption is predicted by exploiting the grey model's advantage of requiring only a small number of samples. The prediction is then taken as the input of the BP neural network, which learns the difference between the original data and the GM(1,1) predictions in reverse order to improve the prediction accuracy of the model. The model is applied to the electricity consumption data of university buildings to predict consumption for the coming week. Compared with four other models, the combined model has the smallest error and the highest accuracy.
Keywords: grey model; BP neural network; power forecasting; building energy efficiency
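To make the GM(1,1) step above concrete, here is a minimal NumPy sketch of the standard grey model: accumulate the series (1-AGO), fit the development coefficient a and grey input b by least squares, and restore predictions by inverse accumulation. The series values and forecast horizon are made up for illustration, and the BP residual-correction stage of the combined model is not shown.

```python
import numpy as np

def gm11_forecast(x0, horizon):
    """Standard GM(1,1): x0 is the original non-negative series (1-D array)."""
    x0 = np.asarray(x0, dtype=float)
    n = len(x0)
    x1 = np.cumsum(x0)                              # 1-AGO accumulated series
    z1 = 0.5 * (x1[1:] + x1[:-1])                   # background values
    B = np.column_stack([-z1, np.ones(n - 1)])
    Y = x0[1:]
    a, b = np.linalg.lstsq(B, Y, rcond=None)[0]     # development coefficient, grey input
    k = np.arange(n + horizon)
    x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a   # time-response of x1
    x0_hat = np.diff(x1_hat, prepend=x1_hat[0])          # inverse AGO restores x0
    x0_hat[0] = x0[0]
    return x0_hat[:n], x0_hat[n:]                   # fitted values, forecasts

# toy example: weekly electricity consumption (made-up numbers)
history = [523, 541, 560, 552, 575, 590, 602]
fitted, next_week = gm11_forecast(history, horizon=7)
print(next_week)
```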
Abstract: Traditional reinforcement learning methods often generalize poorly, performing badly when directly applied to specific tasks, especially in adversarial two-player game scenarios where the situation is more complex. To address this issue, this paper proposes a reinforcement learning-based strategy training method for two-player games, and further introduces a method that enhances two-party group game strategies by combining reinforcement learning with a rule library. Experimental validation shows that the proposed methods improve the behavioral decision-making of intelligent agents, with the total reward obtained by the agent approaching 14.5, yielding effective decision improvements in simulated capture tasks. Meanwhile, by configuring different rule libraries, the method introduces uncertainty into the simulated environment and better reflects the complexity of real-world environments.
Keywords: reinforcement learning; rule base; group game strategy; agent decision-making
Abstract: Seismic displacement analysis of high-rise shear wall building structures helps engineers evaluate the seismic capacity of buildings and is therefore of great significance. Existing methods all rely on finite element calculation, which incurs high time overhead. This paper therefore proposes a fast prediction method for the seismic displacement of high-rise shear wall structures based on generative adversarial networks. The model uses a generative adversarial network structure to fit the mapping function between shear wall layout and structural seismic displacement. To improve the accuracy and feasibility of the prediction, the model semantically processes the building image, retains the key structural information, and uses it as input to output the predicted structural seismic displacement. Meanwhile, the performance of the trained model is evaluated with a pixel-wise score on the output results. The experimental results show that the SSIM, PSNR, and LPIPS values of the model are about 0.98, 32.1, and 0.02 respectively, which are superior to the pix2pix model, and the LCA and MAE values of the model also perform well. This method can serve as a useful supplement to the finite element method and can be applied in areas such as structural optimization.
Abstract: Aspect-based sentiment analysis is a fine-grained sentiment analysis task that aims to extract aspect terms from sentences and identify their sentiments. Since annotating aspect terms and sentiments is costly, previous work alleviates the data scarcity problem in new domains by transferring shared knowledge across domains. However, these methods use complex models and require expensive multi-level preprocessing. To address these problems, a simple but effective LPIDA model is proposed. Firstly, starting from the middle layer of the BERT model, each layer inserts a set of soft prompts composed of multiple learnable vectors to learn domain-invariant features between the source and target domains. Secondly, a word-level domain classifier is set up, and its results are used for instance adaptation so that the model pays more attention to words in the target domain. Experimental results on four benchmark datasets show that the proposed model achieves an average Micro-F1 of 47.01% on cross-domain end-to-end aspect-based sentiment analysis over 10 domain pairs, and an average Micro-F1 of 52.29% on cross-domain aspect term extraction. It was also tested on 3 Chinese datasets, which further demonstrates the effectiveness of the proposed method.
Abstract: Random noise has always been a key and difficult point in seismic data processing. Traditional random noise suppression methods are prone to artifacts and blurred edge information when processing real seismic data, so it is necessary to develop a deep learning-based suppression method that learns deep image features directly for denoising. Given that the Swin Transformer can effectively extract deep information from images, an improved denoising method based on the Swin Transformer is proposed. The method adopts an encoder-decoder Unet framework, extracts multi-dimensional features in the encoder with two parallel channels, and introduces a new feature fusion mechanism to merge these features; finally, the decoder reconstructs the extracted useful information. Tests on real work-area data show that, compared with current mainstream deep learning models, the proposed method achieves a maximum improvement of 2.33 dB in SNR and 0.07 in SSIM, demonstrating excellent denoising performance.
Keywords: Swin-Transformer; Unet; image denoising; seismic data
Abstract: Fluctuations in the stock market have increasingly become a focal topic in society, making efficient and accurate prediction of stock prices a popular research area. To reduce computational load and improve efficiency, dimensionality reduction is applied to the stock data before forecasting, while stock volatility is also taken into account. This article combines three models, Principal Component Analysis (PCA), Generalized Autoregressive Conditional Heteroskedasticity (GARCH), and Long Short-Term Memory (LSTM) networks, to construct a composite model for stock price prediction. To test the predictive performance of the model, this study takes the Shanghai Composite Index and the CSI 500 Index as examples to predict closing prices. In comparative experiments, the RMSE, MAE, and MAPE values of the proposed PCA-GARCH-LSTM composite model are all lower than those of the other models, indicating the effectiveness of the proposed model.
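A rough sketch of how the dimensionality-reduction and volatility steps of a PCA-GARCH-LSTM pipeline might be wired together, using scikit-learn's PCA and the arch package's GARCH(1,1). The feature table, alignment with the closing-price series, window length, and scaling choices are illustrative assumptions, and the final LSTM stage is only indicated by the shape of the returned arrays.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from arch import arch_model

def build_lstm_inputs(df_features, close, n_components=3, window=20):
    """df_features: DataFrame of technical indicators aligned with `close` (a price Series)."""
    # 1) PCA compresses the indicator set into a few principal components
    pcs = PCA(n_components=n_components).fit_transform(df_features.values)

    # 2) GARCH(1,1) on percentage returns gives a conditional volatility series
    returns = 100 * close.pct_change().dropna()
    garch = arch_model(returns, vol="Garch", p=1, q=1).fit(disp="off")
    vol = garch.conditional_volatility.reindex(close.index).bfill().values

    # 3) stack components + volatility, then slice into sliding windows for the LSTM
    feats = np.column_stack([pcs, vol])
    X = np.stack([feats[i:i + window] for i in range(len(feats) - window)])
    y = close.values[window:]
    return X, y   # X: (samples, window, n_components + 1), y: next closing prices
```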
Abstract: Music enlightenment education plays an important role in shaping children's personalities, improving their listening and memory skills, and enhancing their aesthetic abilities. To this end, a teaching aid method is designed to make it easy for children to learn music. By processing data from a smartphone's built-in sensors, shakes at different angles are mapped to different notes, so that continuous shaking can play a simple melody. The acceleration sensor is selected to capture the directional changes of the phone's movement in space. To improve the accuracy of action recognition and real-time feedback, machine learning is used to train a sensor data classification model and optimize its parameters. The experimental results show that the recognition accuracy of the model exceeds 95%. After the classification model was integrated into the mobile application, and after continuous design iterations, especially the introduction of a human-machine music collaboration mechanism, the feedback on shaking test data was timely and recognition was 100% accurate. Users can simply shake their phones to accurately play the note sequence of the simplified music score.
Abstract: Named entity recognition of Chinese medicinal materials used for gastropathy treatment is an important text mining task in the development of Chinese medicinal materials and one of the fundamental tasks in building a knowledge graph. To better extract entities of Chinese medicinal materials for treating gastropathy, five named entity recognition models were designed for experimental comparison, with different designs at the input layer, neural network layer, and output layer; the more suitable BERT-BILSTM-CRF model was finally chosen. Firstly, BERT is used to generate word vectors for the BILSTM network. Then, BILSTM captures text features in both the forward and backward directions, producing the relevant feature vectors. Finally, CRF is used for decoding and label prediction. The experiments show that the accuracy, recall, and F1 values of the chosen model on a self-built dataset were 85.20%, 85.47%, and 85.33%, respectively, and among the related literature surveyed, it performs best on the relevant label metrics.
Keywords: treatment of gastric diseases with traditional Chinese medicine; named entity recognition; deep learning; BERT; BILSTM-CRF
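The BERT-BILSTM-CRF pipeline described above can be sketched roughly as follows with Hugging Face transformers and the pytorch-crf package; the hidden size, the pretrained checkpoint name, and the tag set size are placeholder assumptions rather than the authors' configuration.

```python
import torch
import torch.nn as nn
from transformers import BertModel
from torchcrf import CRF  # pip install pytorch-crf

class BertBiLSTMCRF(nn.Module):
    def __init__(self, num_tags, bert_name="bert-base-chinese", lstm_hidden=256):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)       # contextual character vectors
        self.lstm = nn.LSTM(self.bert.config.hidden_size, lstm_hidden,
                            batch_first=True, bidirectional=True)  # forward/backward context
        self.fc = nn.Linear(2 * lstm_hidden, num_tags)          # emission scores per tag
        self.crf = CRF(num_tags, batch_first=True)              # label-transition decoding

    def forward(self, input_ids, attention_mask, tags=None):
        h = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        h, _ = self.lstm(h)
        emissions = self.fc(h)
        mask = attention_mask.bool()
        if tags is not None:                          # training: negative log-likelihood
            return -self.crf(emissions, tags, mask=mask)
        return self.crf.decode(emissions, mask=mask)  # inference: best tag sequence per sentence
```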
Abstract: An adaptive neural network backstepping control scheme is designed based on the backstepping method to address the control problems of Buck converters under load switching and input voltage fluctuations. Firstly, a radial basis function neural network (RBFNN) is used to design an adaptive law that approximates online the nonlinear function containing the load resistance term, so as to improve the recovery speed and stability of the output voltage. Then, a generalized proportional integral observer and the neural network adaptive law are introduced to handle different types of disturbances and enhance the anti-interference ability of the controller. Finally, by combining backstepping control with neural network adaptive techniques, an adaptive neural network backstepping controller (ANNBC) for the Buck converter is designed. Experiments show that the proposed scheme reduces the maximum voltage deviation during load switching from 0.87 V to 0.35 V, verifying its effectiveness and superiority.
Abstract: With the continuous development of the textile and apparel industry, product traceability has become a crucial element in ensuring product quality and enhancing brand reputation. Traditional traceability solutions face challenges such as information asymmetry, lack of transparency, and vulnerability to tampering. To address these issues, a textile and apparel traceability scheme based on a dual-chain structure and Schnorr threshold signatures is proposed. The scheme establishes a mapping relationship between a private chain and a consortium chain through smart contracts, simplifying data interaction and significantly improving the efficiency of traceability data queries. By incorporating Schnorr threshold signature technology, it effectively mitigates the risk of data tampering and ensures the secure recording of traceability data. Experimental results indicate that the Schnorr threshold signature technology reduces the completion time of transactions on the private chain by about 5.13%. Moreover, as the volume of data increases, the query efficiency of this scheme improves by 87.5%, a remarkable effect.
Abstract: Compared to traditional centralized systems, the throughput of decentralized blockchain networks remains relatively low. As a result, many studies have explored blockchain scalability through consensus mechanisms, sharding, and other techniques. However, as throughput increases, the demands on the efficiency of the underlying message broadcasting network in blockchain systems continue to rise. This study focuses on the underlying P2P message broadcasting network and introduces a neighbor evaluation mechanism into the Gossip algorithm, proposing the NE-Gossip broadcasting algorithm. Based on the evaluation results, NE-Gossip selects nodes with better message forwarding capabilities during message relay. Experimental results indicate that under the same broadcast redundancy, the NE-Gossip algorithm achieves better broadcast coverage.
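A toy Python sketch of the idea behind NE-Gossip's relay selection: each node keeps a score for its neighbors (e.g., derived from observed forwarding delay and delivery success) and, when relaying, prefers the higher-scoring ones instead of purely random peers. The scoring formula, moving-average weights, and fanout below are invented for illustration and are not the paper's exact evaluation mechanism.

```python
import random

class Node:
    def __init__(self, node_id, neighbors):
        self.id = node_id
        self.neighbors = neighbors          # list of Node objects
        self.scores = {}                    # neighbor id -> evaluation score
        self.seen = set()                   # message ids already relayed

    def update_score(self, neighbor_id, delay_ms, delivered):
        # illustrative score: reward successful, low-latency forwarding
        old = self.scores.get(neighbor_id, 0.5)
        obs = (1.0 if delivered else 0.0) / (1.0 + delay_ms / 100.0)
        self.scores[neighbor_id] = 0.8 * old + 0.2 * obs   # exponential moving average

    def relay(self, msg_id, fanout=3):
        if msg_id in self.seen:
            return []                        # suppress redundant re-broadcast
        self.seen.add(msg_id)
        # prefer neighbors with better forwarding capability, break ties randomly
        ranked = sorted(self.neighbors,
                        key=lambda n: (self.scores.get(n.id, 0.5), random.random()),
                        reverse=True)
        return ranked[:fanout]               # relays selected for this round
```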
Abstract: Due to the harsh environments in which Tibetan medicinal plants grow, identifying them manually is very difficult. A lightweight detection algorithm, LTP-YOLO, based on an improved YOLOv8 is proposed to detect Tibetan medicinal plants outdoors. Firstly, the YOLOv8 feature extraction network is replaced with MobileViT to reduce the number of parameters and the computational complexity. Secondly, the content-aware feature reassembly upsampling operator CARAFE is introduced to help the algorithm perceive contextual information during upsampling. Thirdly, a multi-scale fusion attention mechanism, MFA, is proposed to establish local cross-channel interaction and improve detection accuracy. Experiments show that on a self-built Tibetan medicinal plant image dataset, the proposed algorithm reduces the parameter size from 3.02 MB to 1.28 MB and the computational complexity from 8.2 GFLOPs to 5.8 GFLOPs. Compared to YOLOv8, it achieves a higher mAP@.5, showing that it can meet the high-precision, low-computation deployment requirements of mobile devices and has broad application prospects in various dense plant detection tasks.
Abstract: In computer graphics, many modeling operations result in non-manifold surfaces. Although non-manifold surfaces have more complex topological properties and stronger geometric descriptive power, many mesh processing algorithms in graphics, including mesh simplification and subdivision, require their input meshes to be two-manifold. Therefore, to ensure compatibility with existing graphics algorithms, this paper proposes a method for converting non-manifold surfaces into geometrically similar manifold topological structures, aiming to bridge the gap between non-manifold mesh surfaces and traditional digital geometry processing techniques. To demonstrate the universality of the algorithm framework, this paper applies it to three key application scenarios involving non-manifold surfaces: the computation of geodesic distance fields, mesh simplification, and farthest point sampling. Through an in-depth analysis of these application examples, the robustness and accuracy of the algorithm in different scenarios are verified. Experimental results demonstrate that the algorithm is effective in each application scenario, further confirming its potential in practical applications.
Keywords: non-manifold surfaces; doubly linked face list; geodesic field; mesh simplification; farthest point sampling
Abstract: The complex and multi-scale rendering of elements in thangka images affects the accuracy of object detection. Therefore, a thangka element object detection method that optimizes the YOLOv8 model is proposed. Firstly, a cascaded fusion network is used to extract image features, and the feature extraction parameters are reused in subsequent feature fusion to increase parameter utilization efficiency. Secondly, drawing on the idea of the bidirectional feature pyramid network, an additional path is added in the feature transmission layer of the same level to achieve cross-scale connections and enhance the model's feature fusion capability. Finally, EIoU Loss and CIoU Loss are introduced into the regression loss function of the detection head, taking into account the various factors of bounding box regression and combining width-height and aspect-ratio parameters to improve the efficiency and accuracy of target localization. Experiments show that the optimized YOLOv8 model reduces the parameter count and computational complexity by 7.21% and 7.23% respectively compared to the original model, while mAP50 and mAP50-95 increase by 3.72% and 4.55% respectively; it also has significant advantages over other object detection algorithms, and ablation experiments verify the positive effects of the different improvement modules.
Keywords: thangka image object detection; YOLOv8 model; cascaded fusion network; cross-scale connection; regression loss function
Abstract: A research focus of efficient super-resolution is improving deep networks built on small-kernel convolutions to reduce model complexity and enhance efficiency. However, small receptive fields limit the network's ability to reconstruct details, while large-kernel convolutions provide larger receptive fields and better reconstruction quality at an excessive computational cost. To reduce the number of parameters while achieving efficient super-resolution reconstruction, a symmetric visual attention network (SVAN) is proposed. Firstly, the large-kernel convolution is decomposed into three different lightweight and efficient convolutions; the receptive field sizes of the different convolutions form a bottleneck structure, which is combined with an attention mechanism into a bottleneck attention module that strengthens the network's focus on features. Secondly, the bottleneck attention modules are arranged symmetrically to form symmetric large-kernel attention blocks, further enhancing the network's ability to extract deep features. Experiments show that, compared with other lightweight super-resolution methods, the proposed model significantly improves quantitative metrics, and the reconstructed images have richer texture details with only 183K parameters. It is a competitive lightweight, high-quality super-resolution model that provides a new solution for efficient super-resolution.
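As a stand-in for the decomposition described above, the following PyTorch sketch shows the well-known large-kernel attention decomposition (depthwise convolution, depthwise dilated convolution, pointwise convolution) used as an attention map; the kernel sizes, dilation, and module name are illustrative and may differ from SVAN's exact bottleneck attention module.

```python
import torch
import torch.nn as nn

class LargeKernelAttention(nn.Module):
    """Approximates a large receptive field with three cheap convolutions and
    uses the result to re-weight the input features (attention)."""
    def __init__(self, channels):
        super().__init__()
        self.dw = nn.Conv2d(channels, channels, 5, padding=2, groups=channels)        # depthwise
        self.dw_dilated = nn.Conv2d(channels, channels, 7, padding=9,
                                    groups=channels, dilation=3)                       # depthwise dilated
        self.pw = nn.Conv2d(channels, channels, 1)                                     # pointwise

    def forward(self, x):
        attn = self.pw(self.dw_dilated(self.dw(x)))
        return x * attn          # feature map re-weighted by the learned attention

x = torch.randn(1, 32, 64, 64)
print(LargeKernelAttention(32)(x).shape)   # torch.Size([1, 32, 64, 64])
```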
Abstract: Automatic lung image segmentation is a key step for computer-aided diagnosis systems to detect diseases such as lung cancer. However, the diversity of lung cells easily introduces local noise in the lung region of CT images, and interventions by organs such as the heart often blur the lung boundaries. To address these issues, this paper proposes a novel U-Net framework that combines a local context relation learning module and an adaptive perception module. To reduce the influence of local noise, this paper constructs a multi-level context relationship for lung feature extraction by exploiting the surrounding information of the specific lung region, which enhances U-Net's ability to recognize diverse lung cells. To address blurred lung boundaries, this paper proposes an adaptive perception learning module working at the skip connections, which includes a mixed attention mechanism guiding the model to focus more on the lung area along both the channel and spatial dimensions. In addition, the designed bottom-up feature fusion path further enhances the robustness of the learned lung features. The proposed method achieves 98.58% and 97.68% accuracy on the LUNA and SHCXR datasets, which is 0.34% and 0.25% higher than other segmentation methods on average. The proposed approach can support further analysis of lung diseases.
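As a minimal stand-in for the mixed channel-and-spatial attention at the skip connections, the sketch below uses a CBAM-style block in PyTorch; the reduction ratio, kernel size, and module structure are assumptions and may differ from the paper's actual adaptive perception module.

```python
import torch
import torch.nn as nn

class MixedAttention(nn.Module):
    """Channel attention followed by spatial attention (CBAM-style stand-in)."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.channel_mlp = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid())
        self.spatial = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3), nn.Sigmoid())

    def forward(self, x):
        x = x * self.channel_mlp(x)                        # emphasize informative channels
        avg_map = x.mean(dim=1, keepdim=True)              # spatial statistics
        max_map = x.amax(dim=1, keepdim=True)
        x = x * self.spatial(torch.cat([avg_map, max_map], dim=1))  # emphasize the lung region
        return x
```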
Abstract: In recent years, artificial intelligence and image recognition technology have played an increasingly important role in monitoring and managing dust pollution sources. To achieve lightweight, high-precision detection of dust pollution sources, this paper proposes a lightweight dust pollution source detection method based on CenterNet. Firstly, to address the missed detection of some targets in the image, a dust pollution source detection method based on one-shot aggregation dilation is proposed: the one-shot aggregation dilation module replaces conventional convolution, enlarging the receptive field while obtaining features with different receptive domains and reducing the average missed detection rate. Secondly, to meet the lightweight requirements of monitoring networks, MobileNet is introduced on top of the above work and a centralized feature pyramid is added for feature fusion, yielding the lightweight pollution source detection method mCFP-CenterNet. Finally, this paper constructs a dust pollution source image dataset and conducts experiments on both this dataset and the PASCAL VOC 2007 dataset. The experimental results show that the method outperforms other methods on evaluation metrics such as inference time, computational complexity, parameter count, and average missed detection rate, and can meet the requirements of lightweight, high-precision applications.
Keywords: object detection; CenterNet; fugitive dust source; feature fusion; lightweight model
Abstract: Automatically and accurately locating large numbers of cells in biomedical images is of great significance for biomedical research. Existing image processing methods have low accuracy when locating densely distributed and adherent cells, and their parameter settings are highly sensitive to the data. Therefore, a cell localization method based on density peak clustering with adaptively optimized parameters is proposed. Firstly, a deep learning model for cell segmentation is built to improve clustering performance. Then, the trends of the local density of the foreground region and of the minimum distance to higher-density points are analyzed, and the density and distance thresholds for selecting cluster centers are optimized automatically, achieving automatic localization of dense and adherent cells. Compared with five commonly used algorithms, the proposed method achieves a detection rate of 0.89 and an accuracy of 0.81 on a fluorescence micro-optical sectioning tomography mouse dataset. It outperforms the comparison methods on more complex datasets, providing a new high-precision automated approach to cell localization with good prospects in computer applications for biomedical image processing.
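A small NumPy sketch of the density-peak step referred to above: for each foreground point, compute the local density ρ and the distance δ to the nearest higher-density point, then take points whose ρ and δ both exceed thresholds as cell centers. The Gaussian cutoff distance and the fixed thresholds here are placeholders, whereas the paper selects these parameters adaptively.

```python
import numpy as np
from scipy.spatial.distance import cdist

def density_peak_centers(points, dc, rho_thr, delta_thr):
    """points: (N, 2) foreground pixel coordinates; dc: cutoff distance."""
    d = cdist(points, points)
    rho = np.exp(-(d / dc) ** 2).sum(axis=1) - 1.0       # Gaussian local density (minus self)
    delta = np.empty(len(points))
    for i in range(len(points)):
        higher = rho > rho[i]
        # distance to the nearest point of higher density; max distance for the densest point
        delta[i] = d[i, higher].min() if higher.any() else d[i].max()
    centers = np.where((rho > rho_thr) & (delta > delta_thr))[0]
    return points[centers], rho, delta

# toy example: two clumps standing in for adherent cell regions
pts = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 8])
centers, _, _ = density_peak_centers(pts, dc=1.0, rho_thr=10, delta_thr=3)
print(centers)
```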
Abstract: With the development of 3D scanning technology, a large amount of point cloud data is generated, and how to process and analyze this data effectively has become an important issue. Point cloud registration is a crucial step in point cloud processing, and 3DSC is a feature-based registration algorithm widely used in computer vision. The algorithm has high registration accuracy, but registration is slow on large-scale point clouds or data with severe noise, making it unsuitable for scenarios requiring both high accuracy and high efficiency. To solve this problem, the 3DSC algorithm is optimized by extracting voxel centers for downsampling before registration: octree subdivision is applied recursively to the complete point cloud until the maximum recursion depth is reached or further subdivision is impossible; voxel centers are then extracted from the data points within every 8 adjacent minimum units; finally, all voxel centers are assembled into a new point cloud for registration. Compared with the traditional 3DSC algorithm, the optimized 3DSC has higher registration efficiency and greatly reduces registration time.
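A compact NumPy sketch of the voxel-center extraction used for downsampling before 3DSC registration: points are bucketed into voxels of a chosen size and each occupied voxel is replaced by its geometric center. The recursive octree bookkeeping and the 3DSC feature matching itself are omitted, and the voxel size below is an arbitrary illustrative value.

```python
import numpy as np

def voxel_center_downsample(points, voxel_size):
    """points: (N, 3) array. Returns one center point per occupied voxel."""
    origin = points.min(axis=0)
    idx = np.floor((points - origin) / voxel_size).astype(np.int64)  # voxel index per point
    occupied = np.unique(idx, axis=0)                                # one entry per occupied voxel
    centers = origin + (occupied + 0.5) * voxel_size                 # geometric voxel centers
    return centers

cloud = np.random.rand(100000, 3) * 10.0     # synthetic 10 m cube of points
small = voxel_center_downsample(cloud, voxel_size=0.5)
print(cloud.shape, "->", small.shape)        # far fewer points are fed to 3DSC registration
```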
Abstract: An improved FCOS defect detection method, FCOS-TCA, is proposed to address the difficulty of detecting small crack defects in magnetic particle inspection on current chain link production lines, as well as the low accuracy of defect recognition against complex backgrounds. Firstly, the Swin Transformer is introduced as the feature extraction network to enhance the ability to extract features from small targets. Secondly, a CA-PAFPN module is proposed based on the path aggregation feature pyramid network and the coordinate attention mechanism to promote the fusion of high- and low-dimensional, multi-scale feature information. Finally, the bounding box regression loss is replaced with the EIoU_Loss function to accelerate model convergence and improve regression accuracy. Validation on a self-built TL dataset shows that the average precision of the FCOS-TCA model is 83.7%, which is 6.5% higher than the original FCOS model. The method has reference value for advancing the detection of surface and subsurface defects in chain links.
Abstract: Change detection obtains surface changes by analyzing remote sensing images of the same location from two different periods. Convolutional neural networks struggle with change detection tasks in complex scenes. The BIT network concatenates a Transformer after the convolutions to improve detection performance by capturing global information; however, directly concatenating the Transformer weakens the representation of local features, leading to missed detections. Therefore, based on the BIT network, a residual connection is used to fuse the local and global information before and after the Transformer, compensating for the loss of local information caused by direct concatenation and improving accuracy. Meanwhile, the Dice loss is added to the training objective to handle the imbalance between changed and unchanged classes in remote sensing images. Experimental results show that on the publicly available LEVIR-CD and BCDD datasets, the improved network raises the F1 score by 0.51% and 0.69% respectively, with only a 0.02M increase in parameters, and raises the recall by 0.45% and 0.50% respectively.
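Below is a minimal PyTorch sketch of a Dice loss of the kind added here for the changed/unchanged class imbalance; the smoothing constant and whether it is combined with a cross-entropy term are assumptions, not the paper's exact setup.

```python
import torch

def dice_loss(pred_logits, target, smooth=1.0):
    """pred_logits: (B, 1, H, W) raw change scores; target: (B, 1, H, W) binary change mask."""
    prob = torch.sigmoid(pred_logits)
    inter = (prob * target).sum(dim=(1, 2, 3))
    union = prob.sum(dim=(1, 2, 3)) + target.sum(dim=(1, 2, 3))
    dice = (2 * inter + smooth) / (union + smooth)   # overlap measure per image
    return 1.0 - dice.mean()                         # less sensitive to class imbalance

pred = torch.randn(2, 1, 64, 64)
mask = (torch.rand(2, 1, 64, 64) > 0.9).float()      # sparse "changed" pixels
print(dice_loss(pred, mask).item())
```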
Abstract: Wetlands are an important barrier for maintaining the water quality of Dianchi Lake, and quickly and accurately obtaining the distribution of its wetland types is of great significance for water quality protection. However, research on the different wetland types of Dianchi Lake is still scarce. To this end, Sentinel-1 and Sentinel-2 image data on the GEE platform were used to classify the Dianchi wetland, and the performance of three machine learning methods (SVM, CART, and RF) with different combinations of classification features was compared. The results indicate that RF outperforms CART and SVM. When spectral bands, spectral indices, radar features, terrain features, and texture features are all used as classification features, the overall classification accuracy and the swamp wetland classification accuracy are the highest, with an overall accuracy of 86.81%, a Kappa coefficient of 0.84, and an F-score of 85.83%; the F-scores for woody swamps and herbaceous swamps are 85.71% and 78.69%, respectively.
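The RF comparison could be reproduced offline with scikit-learn roughly as follows, once per-pixel sample feature vectors (spectral bands, indices, SAR backscatter, terrain, and texture) have been exported from GEE; the feature dimensions, sample counts, class labels, and hyperparameters below are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, cohen_kappa_score, f1_score
from sklearn.model_selection import train_test_split

# placeholder feature matrix standing in for samples exported from GEE: each row is a
# pixel with spectral bands, spectral indices, SAR, terrain, and texture features
X = np.random.rand(2000, 25)
y = np.random.randint(0, 6, size=2000)      # 6 wetland / land-cover classes (illustrative)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
pred = rf.predict(X_te)

print("OA:", accuracy_score(y_te, pred))
print("Kappa:", cohen_kappa_score(y_te, pred))
print("F1 (macro):", f1_score(y_te, pred, average="macro"))
```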
Abstract: Against the background of new engineering disciplines, the rapid development of the big data industry has placed higher demands on practical education in data science and big data technology majors. This paper examines the current situation and problems of practical teaching in big data majors and proposes an innovative practice platform construction plan based on actual conditions to improve teaching quality. Years of experience in running the program show that practical teaching suffers from problems such as disconnection from industry development, an incomplete curriculum system, and insufficient software and hardware resources. To this end, an innovative practice platform integrating virtualization and container technologies is constructed, which uses high-performance computing resources and large-capacity storage to support diverse practical needs such as virtual experimental environments, project training, innovation and entrepreneurship, and competitions. This significantly enhances students' practical abilities and innovative thinking and broadens their technical horizons. Practice shows that the innovative practice platform not only optimizes the allocation of teaching resources and improves the effectiveness of practical teaching in big data majors, but also provides strong support for cultivating high-quality big data talent in line with engineering accreditation, while offering a useful reference for practical teaching in related disciplines.
Abstract: In the context of the digital transformation of education, mobile learning has become the new norm of digital learning, and language learning mobile applications (apps) have emerged as crucial tools for second language learners. This paper adopts an evidence-based, data-driven evaluation perspective and proposes an educational app evaluation technique combining pretrained language models and big data mining. The study collects user review data from the top 20 language learning apps on mainstream app markets and employs a pretrained language model to compute sentiment scores for the review texts. Building on the text information, it uses techniques such as topic modeling to analyze learners' needs and preferences when using language learning apps, extracting multiple indicators for evaluating them. Finally, the study integrates the analytical results into an accurate and objective language learning app evaluation system. Visual analysis of the collected review information for selected apps leverages the value of pretrained language models and data elements to contribute to the scientific governance of digital language teaching resources.
Keywords: APP; language learning; data mining; pre-trained language model; evaluation visualization; sentiment analysis
Abstract: Scientific computation and visualization are cornerstone technologies underpinning scientific research, while case-based teaching is a pivotal teaching method. This paper discusses the design of teaching cases for scientific computation and visualization courses on the MWORKS platform, and presents a representative case study on sudoku image recognition and solving. The case integrates a wide array of knowledge, including image preprocessing techniques such as correction and segmentation from "Digital Image Processing," classifier and deep learning concepts from "Introduction to Machine Learning," backtracking and pruning optimization algorithms from "Algorithm Design and Analysis," and the scientific computation and visualization tools offered by MWORKS. The case covers the entire system workflow, from image preprocessing and digit recognition to sudoku solving and result visualization. This holistic approach strengthens students' systematic practical abilities, enabling them to apply what they have learned to comprehensive thinking and problem solving. The teaching method has proven highly effective and shows substantial application value.
Keywords: scientific computation and visualization; MWORKS platform; systematic practice; sudoku image recognition and solving
Abstract: Generative artificial intelligence has long been a hot topic in machine learning and is widely used in fields such as text generation and computer vision. Maximum likelihood estimation establishes the training objective of generative models, but it still falls short of meeting users' personalized needs. In recent years, reinforcement learning has shown potential for building high-performance models by introducing new training signals, such as human-defined evaluation mechanisms, bringing breakthrough progress to the design and application of generative AI models. This paper provides a comprehensive and systematic review of the latest developments in generative artificial intelligence, classifying and summarizing the various models and their application scenarios from a cross-disciplinary perspective. It focuses on the rapidly developing large-model technologies and explores current limitations and future directions. By analyzing the weaknesses of these models, it aims to offer a comprehensive and in-depth understanding of the theoretical foundations and practical status of generative AI, so as to guide industry practice and provide a reference for future research.
Abstract: With the surge of personal opinions on online platforms, sentiment analysis has become crucial; it can help institutions better understand users' emotional tendencies, optimize products and services, support market decision-making, and more accurately track public opinion trends. This paper summarizes the latest developments in sentiment analysis, including preprocessing techniques, feature extraction methods, classification techniques, and commonly used datasets, and discusses the limitations and future research directions of the field, providing a valuable resource for relevant researchers and practitioners.
Abstract: Facial expression recognition is an important research direction in artificial intelligence. To survey it, the CiteSpace scientific knowledge mapping software was used to statistically analyze 702 articles in the field of facial expression research indexed in Peking University core journals and CSSCI on China National Knowledge Infrastructure from 2014 to 2023. Research strength, hot topics, and frontiers in the field were analyzed in terms of time distribution, main journal sources, core authors and their collaborations, institutional cooperation, keyword co-occurrence and burst, and keyword clustering. The analysis shows that research in the field mainly focuses on the application and innovation of deep learning models, multimodal fusion and cross-disciplinary techniques, micro-expression recognition, and facial expression recognition for special groups. On this basis, theoretical and application problems that still need to be studied and solved in facial expression recognition are proposed, providing scholars in the field with a comprehensive view of the research status and frontiers as well as a reference for future research directions.
Abstract: Image captioning is an important and challenging research field that aims to generate natural language descriptions for static images. In recent years, the development of deep learning and vision-language pretraining techniques has propelled advances in image captioning. This survey provides a comprehensive classification framework and extensively discusses the application of deep learning methods to image captioning. Various deep learning models and algorithms, including recurrent neural networks (RNNs), convolutional neural networks (CNNs), generative adversarial networks, and Transformers, are introduced, along with the latest vision-language pretraining techniques. These models play a crucial role in image captioning by learning the associations between images and language, enabling the generation of accurate and fluent descriptions. The survey also emphasizes the challenges in this field, such as object hallucination, missing context, lighting conditions, contextual understanding, and referential expressions, which require models to possess reasoning abilities. In the future, the performance and quality of image caption generation are expected to improve further, promoting its use in practical applications.
Keywords: image captioning; deep learning; cross-modal; computer vision; natural language processing